Adding Monitored Hosts and Monitoring Items

1. Adding a monitored host
To add a monitored host, append a definition such as the following to localhost.cfg:
define host{
        use                     linux-server
        host_name               test
        alias                   LinkStation Test Server
        address                 192.168.0.71
        }

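If the alert recipients or the notification method differ for this host, define a separate template earlier in the file and replace the use line so that it names that template:

        use                     XXXXXXXXXXXX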

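Every host you define also needs at least one service, so add one in the services part of the file, for example a PING service. Here too, if the stock template does not fit the kind of monitoring you need, create a separate template and reference it.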

define service{
        use                              local-service
        host_name                        test
        service_description              PING
        check_command                    check_ping!100.0,20%!500.0,60%
        }
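
The rule that every host needs at least one service can be checked mechanically before restarting Nagios. The following is only a minimal sketch, not a replacement for Nagios's own configuration check: the helper names parse_objects and hosts_without_services are made up for this illustration, and the parser assumes the simple "define type{ ... }" layout used on this page. It ignores templates, hostgroup-based services, and inline ";" comments.

```python
import re

def parse_objects(text):
    """Collect (object_type, {directive: value}) pairs from simple Nagios-style text."""
    objects = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and whole-line comments
        start = re.match(r"define\s+(\w+)\s*\{", line)
        if start:
            current = (start.group(1), {})
        elif line.startswith("}"):
            if current is not None:
                objects.append(current)
            current = None
        elif current is not None:
            parts = line.split(None, 1)  # directive name, then the rest of the line
            current[1][parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return objects

def hosts_without_services(text):
    """Return host_name values that no service definition refers to."""
    objects = parse_objects(text)
    hosts = [d["host_name"] for t, d in objects if t == "host" and "host_name" in d]
    covered = set()
    for t, d in objects:
        if t == "service":
            covered.update(n.strip() for n in d.get("host_name", "").split(","))
    return [h for h in hosts if h not in covered]

SAMPLE = """
define host{
        use                     linux-server
        host_name               test
        alias                   LinkStation Test Server
        address                 192.168.0.71
        }
define service{
        use                     local-service
        host_name               test
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%
        }
"""

# An empty list means every host in SAMPLE has at least one service.
print(hosts_without_services(SAMPLE))
```

If the service block were removed from SAMPLE, the function would report the host test as uncovered.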

As far as I recall, the host also had to be registered in a hostgroup.
define hostgroup{
        hostgroup_name  HomeServer
        alias           Home Servers
        members         LS-GL,test,yahoo
        }

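2. Adding a monitoring item to an already registered host (adding web server monitoring)
Add the following at the appropriate place in /usr/local/nagios/etc/localhost.cfg. Templates are used in the same way as for the PING service above.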

define service{
        use                              local-service
        host_name                        LS-GL
        service_description              HTTP
        check_command                    check_http
        }

The chain of references behind a service is as follows, using PING as the example.

In localhost.cfg, the PING service is specified like this:
define service{
        use                              local-service
        host_name                        LS-GL
        service_description              PING
        check_command                    check_ping!100.0,20%!500.0,60%
        }

The check_ping named here refers to the command whose command_name is check_ping in command.cfg:
# 'check_ping' command definition
define command{
        command_name    check_ping
        command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
        }
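
How the "!"-separated check_command value ends up in command_line can be illustrated with a few lines of Python. This is only a sketch of the substitution idea, not Nagios's actual macro processor. The function expand_command and the two dictionaries are invented for this example, and it assumes that $USER1$ points at /usr/local/nagios/libexec and that LS-GL has the address 192.168.0.62.

```python
def expand_command(check_command, command_lines, macros):
    """Split check_command on '!' and substitute $ARGn$ and other macros."""
    name, *args = check_command.split("!")
    line = command_lines[name]
    values = dict(macros)
    # The first value after the command name becomes $ARG1$, the next $ARG2$, ...
    for i, arg in enumerate(args, start=1):
        values[f"$ARG{i}$"] = arg
    for macro, value in values.items():
        line = line.replace(macro, value)
    return line

# command_line copied from the check_ping definition above
COMMANDS = {
    "check_ping": "$USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5",
}
# Assumed macro values for this illustration
MACROS = {
    "$USER1$": "/usr/local/nagios/libexec",
    "$HOSTADDRESS$": "192.168.0.62",
}

print(expand_command("check_ping!100.0,20%!500.0,60%", COMMANDS, MACROS))
```

With these assumed values, 100.0,20% fills $ARG1$ and 500.0,60% fills $ARG2$, giving a complete check_ping command line for the host.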

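The command that is actually executed is the check_ping plugin in /usr/local/nagios/libexec.
Running it with only -H, without any thresholds, makes it fail and print its usage, which shows how the parameters are structured:
        test:/usr/local/nagios/libexec# ./check_ping -H 192.168.0.62
        <wrta> was not set
        check_ping: Could not parse arguments
        Usage:check_ping -H <host_address> -w <wrta>,<wpl>% -c <crta>,<cpl>%
        [-p packets] [-t timeout] [-4|-6]
Here rta is the round-trip average time in milliseconds and pl is the packet loss percentage; -p 5 means the result is averaged over 5 packets.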

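With that as a guide, running the plugin with the full set of parameters from the LS-GL definition above gives:
        test:/usr/local/nagios/libexec# ./check_ping -H 192.168.0.62 -w 100.0,20% -c 500.0,60% -p 5
        PING OK - Packet loss = 0%, RTA = 7.00 ms

Here -c 500.0,60% means that a critical alert is raised when rta is 500 ms or more, or when packet loss is 60% or more.
In the example above neither the w nor the c threshold is reached, so no alert is raised.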
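
The way the -w and -c pairs are read can be sketched as follows. This is an illustration of the threshold logic described above, not the plugin's source code. The function names parse_threshold and ping_state are made up, and the "or more" comparison follows the description on this page.

```python
def parse_threshold(spec):
    """Turn a string such as '100.0,20%' into (rta_ms, loss_percent)."""
    rta, loss = spec.split(",")
    return float(rta), float(loss.rstrip("%"))

def ping_state(rta_ms, loss_pct, warn, crit):
    """Classify a measurement against the warning and critical pairs."""
    wrta, wpl = parse_threshold(warn)
    crta, cpl = parse_threshold(crit)
    # Either value reaching its limit is enough to trigger that level.
    if rta_ms >= crta or loss_pct >= cpl:
        return "CRITICAL"
    if rta_ms >= wrta or loss_pct >= wpl:
        return "WARNING"
    return "OK"

# Values from the run above: RTA = 7.00 ms, packet loss = 0%
print(ping_state(7.0, 0.0, "100.0,20%", "500.0,60%"))
```

For the measurement shown above (7.00 ms, 0% loss) the sketch returns OK, matching the plugin's result.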

Next, let's add a disk check.
First, look at the parameters of the check_disk command.
test:/usr/local/nagios/libexec# ./check_disk --help
check_disk v1848 (nagios-plugins 1.4.11)
Copyright (c) 1999 Ethan Galstad <nagios@nagios.org>
Copyright (c) 1999-2006 Nagios Plugin Development Team
        <nagiosplug-devel@lists.sourceforge.net>

This plugin checks the amount of used disk space on a mounted file system
and generates an alert if free space is less than one of the threshold values

Usage: check_disk -w limit -c limit [-W limit] [-K limit] {-p path | -x device}
[-C] [-E] [-e] [-g group ] [-k] [-l] [-M] [-m] [-R path ] [-r path ]
[-t timeout] [-u unit] [-v] [-X type]

Options:
-h, --help
    Print detailed help screen
-V, --version
    Print version information
-w, --warning=INTEGER
    Exit with WARNING status if less than INTEGER units of disk are free
-w, --warning=PERCENT%
    Exit with WARNING status if less than PERCENT of disk space is free
-c, --critical=INTEGER
    Exit with CRITICAL status if less than INTEGER units of disk are free
-c, --critical=PERCENT%
    Exit with CRITCAL status if less than PERCENT of disk space is free
-W, --iwarning=PERCENT%
    Exit with WARNING status if less than PERCENT of inode space is free
-K, --icritical=PERCENT%
    Exit with CRITICAL status if less than PERCENT of inode space is free
-p, --path=PATH, --partition=PARTITION
    Path or partition (may be repeated)
-x, --exclude_device=PATH <STRING>
    Ignore device (only works if -p unspecified)
-C, --clear
    Clear thresholds
-E, --exact-match
    For paths or partitions specified with -p, only check for exact paths
-e, --errors-only
    Display only devices/mountpoints with errors
-g, --group=NAME
    Group pathes. Thresholds apply to (free-)space of all partitions together
-k, --kilobytes
    Same as '--units kB'
-l, --local
    Only check local filesystems
-L, --stat-remote-fs
    Only check local filesystems against thresholds. Yet call stat on remote filesystems
    to test if they are accessible (e.g. to detect Stale NFS Handles)
-M, --mountpoint
    Display the mountpoint instead of the partition
-m, --megabytes
    Same as '--units MB'
-A, --all
    Explicitly select all pathes. This is equivalent to -R '.*'
-R, --eregi-path=PATH, --eregi-partition=PARTITION
    Case insensitive regular expression for path/partition (may be repeated)
-r, --ereg-path=PATH, --ereg-partition=PARTITION
    Regular expression for path or partition (may be repeated)
-I, --ignore-eregi-path=PATH, --ignore-eregi-partition=PARTITION
    Regular expression to ignore selected path/partition (case insensitive) (may be repeated)
-i, --ignore-ereg-path=PATH, --ignore-ereg-partition=PARTITION
    Regular expression to ignore selected path or partition (may be repeated)
-t, --timeout=INTEGER
    Seconds before connection times out (default: 10)
-u, --units=STRING
    Choose bytes, kB, MB, GB, TB (default: MB)
-v, --verbose
    Show details for command-line debugging (Nagios may truncate output)
-X, --exclude-type=TYPE
    Ignore all filesystems of indicated type (may be repeated)

Examples:
check_disk -w 10% -c 5% -p /tmp -p /var -C -w 100000 -c 50000 -p /
    Checks /tmp and /var at 10% and 5%, and / at 100MB and 50MB
check_disk -w 100M -c 50M -C -w 1000M -c 500M -g sidDATA -r '^/oracle/SID/data.*$'
    Checks all filesystems not matching -r at 100M and 50M. The fs matching the -r regex
    are grouped which means the freespace thresholds are applied to all disks together
check_disk -w 100M -c 50M -C -w 1000M -c 500M -p /foo -C -w 5% -c 3% -p /bar
    Checks /foo for 1000M/500M and /bar for 5/3%. All remaining volumes use 100M/50M

Send email to nagios-users@lists.sourceforge.net if you have questions
regarding use of this software. To submit patches or suggest improvements,
send email to nagiosplug-devel@lists.sourceforge.net

The definition in command.cfg:
# 'check_local_disk' command definition
define command{
        command_name    check_local_disk
        command_line    $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
        }

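The following example is already in localhost.cfg; adapt it and add it to your configuration. In this example a warning (w) is raised when free space falls below 20%, and a critical alert (c) when it falls below 10%.
define service{
        use                             local-service
        host_name                       localhost
        service_description             Root Partition
        check_command                   check_local_disk!20%!10%!/
        }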

As a test, run the actual command with these values.
        test:/usr/local/nagios/libexec# ./check_disk -w 20% -c 10% -p /
        DISK OK - free space: / 1984 MB (71% inode=86%);| /=794MB;2342;2635;0;2928
        test:/usr/local/nagios/libexec# ./check_disk -w 20% -c 10% -p /mnt
        DISK OK - free space: /mnt 30392 MB (93% inode=99%);| /mnt=2013MB;27312;30726;0;34140

Actual state of the disks (df output):
        Filesystem           1K-blocks      Used Available Use% Mounted on
        /dev/sda2              2998824    813864   2032628  29% /
        tmpfs                    63088         0     63088   0% /lib/init/rw
        tmpfs                    63088         0     63088   0% /dev/shm
        /dev/root.old            13303     10998      2305  83% /initrd
        /dev/sda1               197657     12457    174994   7% /boot
        /dev/sda6             34959480   2061500  31122140   7% /mnt

The free space on / and /mnt is detected correctly. Neither threshold is crossed, however, so no alert is raised.
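
The free-space percentages can be cross-checked against the df figures with a small calculation. This is a rough sketch under assumptions: the function disk_state is invented for this example, it computes the free share as available / (used + available), and the plugin's own arithmetic and rounding may differ.

```python
def disk_state(used_kb, avail_kb, warn_pct, crit_pct):
    """Return (state, free_percent) for a filesystem, given df-style figures."""
    free_pct = 100.0 * avail_kb / (used_kb + avail_kb)
    # Per the help text, an alert fires when LESS than the given percentage is free.
    if free_pct < crit_pct:
        state = "CRITICAL"
    elif free_pct < warn_pct:
        state = "WARNING"
    else:
        state = "OK"
    return state, free_pct

# Used and Available columns (in KB) taken from the df output above
FILESYSTEMS = [("/", 813864, 2032628), ("/mnt", 2061500, 31122140)]

for mount, used, avail in FILESYSTEMS:
    state, free_pct = disk_state(used, avail, 20, 10)
    print(f"{mount}: {state}, about {free_pct:.1f}% free")
```

Both filesystems come out far above the 20% warning level, which is consistent with the DISK OK results shown above.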
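Note that check_disk only monitors the host on which Nagios itself is installed; it cannot, of course, monitor other hosts. Resource monitoring on other hosts is done with software called NRPE, which is described on a separate page.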
