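Nagios stopped starting, so this post records what the failure looked like and how I dealt with it.
The cause was that NDOUtils 1.x does not support Nagios 4.x. However, I had rebooted the server several times since updating to Nagios 4.0.2 in January, and Nagios started normally each time.
I don't know why it stopped working at this particular moment (I got lazy and didn't investigate).
Also, the recovery procedure is fairly rough and heavy-handed.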
■ Environment
・CentOS 6.5
・Nagios 4.0.2
・NDOUtils 1.5.2
■ State at the time of the failure
# /etc/rc.d/init.d/nagios configtest
Nagios Core 4.0.2
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 11-25-2013
License: GPL
Website: http://www.nagios.org
Reading configuration data...
~~~ omitted ~~~
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
Object precache file created:
/usr/local/nagios/var/objects.precache
Note: "/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg" works as well.
# /etc/rc.d/init.d/nagios status
nagios dead but subsys locked
# cat /var/log/messages
~~~ omitted ~~~
Warning: use_embedded_perl_implicitly is deprecated and will be removed.
Warning: enable_embedded_perl is deprecated and will be removed.
Warning: p1_file is deprecated and will be removed.
Warning: sleep_time is deprecated and will be removed.
Warning: external_command_buffer_slots is deprecated and will be removed. All commands are always processed upon arrival
Warning: command_check_interval is deprecated and will be removed. Commands are always handled on arrival
Nagios 4.0.3rc1 starting... (PID=18306)
Local time is Sun Feb 02 00:00:41 JST 2014
LOG VERSION: 2.0
qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
qh: core query handler registered
nerd: Channel hostchecks registered successfully
nerd: Channel servicechecks registered successfully
nerd: Channel opathchecks registered successfully
nerd: Fully initialized and ready to rock!
wproc: Successfully registered manager as @wproc with query handler
wproc: Registry request: name=Core Worker 18310;pid=18310
wproc: Registry request: name=Core Worker 18311;pid=18311
wproc: Registry request: name=Core Worker 18309;pid=18309
wproc: Registry request: name=Core Worker 18308;pid=18308
Error: Could not load module '/usr/local/nagios/bin/ndomod.o' -> /usr/local/nagios/bin/ndomod.o: undefined symbol: servicedependency_list
Error: Failed to load module '/usr/local/nagios/bin/ndomod.o'.
Error: Module loading failed. Aborting.
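The line that matters in the log above is the "undefined symbol" error. As a rough sketch, a small helper like the one below could pull the failing module path and the missing symbol out of a log file. The function name find_module_errors is made up for illustration, and the log path in the usage comment is only what this environment happens to use.

```shell
#!/bin/sh
# Hypothetical helper: print "module-path missing-symbol" for every
# event broker module that failed to load because of an undefined symbol.
find_module_errors() {
    logfile="$1"
    # Matches lines of the form:
    #   Error: Could not load module '/path/mod.o' -> /path/mod.o: undefined symbol: NAME
    sed -n "s/.*Could not load module '\([^']*\)'.*undefined symbol: \([A-Za-z0-9_]*\).*/\1 \2/p" "$logfile"
}

# Usage (log path is an assumption; adjust to your syslog setup):
# find_module_errors /var/log/messages
```

An empty result only means no such line was found in that file; it does not prove the module loaded.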
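■ Recovery
Using NDOUtils with Nagios 4.x requires ndomod-4x.o (installed as /usr/local/nagios/bin/ndomod.o), but NDOUtils 1.x does not include it.
So I ended up using source code that has not been published as a regular release.
1. Get the source code to compile
Go to the following URL and get the source code.
Clicking "Download Snapshot" at the top right starts the download.
http://sourceforge.net/p/nagios/ndoutils/ci/ndoutils-2-0/tree/
2. Compile and install the source code
Extract the downloaded ZIP file on the Linux host and compile it.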
# mv nagios-ndoutils.zip /usr/local/src/
# cd /usr/local/src/
# unzip nagios-ndoutils.zip
~~~ omitted ~~~
# chmod 775 ./*
# make
cd ./src && make
make[1]: Entering directory `/usr/local/src/nagios-ndoutils/src'
gcc -fPIC -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -c -o io.o io.c
gcc -fPIC -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -c -o utils.o utils.c
gcc -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -o file2sock file2sock.c io.o utils.o -lm -lnsl
gcc -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -o log2ndo log2ndo.c io.o utils.o -lm -lnsl
make ndo2db-2x
make[2]: Entering directory `/usr/local/src/nagios-ndoutils/src'
gcc -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -c -o db.o db.c
gcc -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -D BUILD_NAGIOS_2X -c -o dbhandlers-2x.o dbhandlers.c
gcc -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -D BUILD_NAGIOS_2X -o ndo2db-2x queue.c ndo2db.c dbhandlers-2x.o io.o utils.o db.o -lnsl -rdynamic -L/usr/lib64/mysql -lmysqlclient -lz -lcrypt -lnsl -lm -lssl -lcrypto -lm
make[2]: Leaving directory `/usr/local/src/nagios-ndoutils/src'
make ndo2db-3x
make[2]: Entering directory `/usr/local/src/nagios-ndoutils/src'
gcc -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -D BUILD_NAGIOS_3X -c -o dbhandlers-3x.o dbhandlers.c
gcc -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -D BUILD_NAGIOS_3X -o ndo2db-3x queue.c ndo2db.c dbhandlers-3x.o io.o utils.o db.o -lnsl -rdynamic -L/usr/lib64/mysql -lmysqlclient -lz -lcrypt -lnsl -lm -lssl -lcrypto -lm
make[2]: Leaving directory `/usr/local/src/nagios-ndoutils/src'
make ndo2db-4x
make[2]: Entering directory `/usr/local/src/nagios-ndoutils/src'
gcc -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -I ../include/nagios-4x -D BUILD_NAGIOS_4X -c -o dbhandlers-4x.o dbhandlers.c
gcc -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -D BUILD_NAGIOS_4X -o ndo2db-4x queue.c ndo2db.c dbhandlers-4x.o io.o utils.o db.o -lnsl -rdynamic -L/usr/lib64/mysql -lmysqlclient -lz -lcrypt -lnsl -lm -lssl -lcrypto -lm
make[2]: Leaving directory `/usr/local/src/nagios-ndoutils/src'
make ndomod-2x.o
make[2]: Entering directory `/usr/local/src/nagios-ndoutils/src'
gcc -fPIC -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -D BUILD_NAGIOS_2X -o ndomod-2x.o ndomod.c io.o utils.o -shared -lnsl
make[2]: Leaving directory `/usr/local/src/nagios-ndoutils/src'
make ndomod-3x.o
make[2]: Entering directory `/usr/local/src/nagios-ndoutils/src'
gcc -fPIC -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -D BUILD_NAGIOS_3X -o ndomod-3x.o ndomod.c io.o utils.o -shared -lnsl
make[2]: Leaving directory `/usr/local/src/nagios-ndoutils/src'
make ndomod-4x.o
make[2]: Entering directory `/usr/local/src/nagios-ndoutils/src'
gcc -fPIC -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -I ../include/nagios-4x -D BUILD_NAGIOS_4X -o ndomod-4x.o ndomod.c io.o utils.o -shared -lnsl
make[2]: Leaving directory `/usr/local/src/nagios-ndoutils/src'
gcc -g -O2 -I/usr/include/mysql -DHAVE_CONFIG_H -o sockdebug sockdebug.c io.o utils.o -lm -lnsl
make[1]: Leaving directory `/usr/local/src/nagios-ndoutils/src'
# make install
cd ./src && make install
make[1]: Entering directory `/usr/local/src/nagios-ndoutils/src'
/usr/bin/install -c -m 775 -o nagios -g nagios -d /usr/local/nagios/bin
/usr/bin/install -c -m 755 -o nagios -g nagios ndo2db-4x /usr/local/nagios/bin/ndo2db
/usr/bin/install -c -m 755 -o nagios -g nagios ndomod-4x.o /usr/local/nagios/bin/ndomod.o
/usr/bin/install -c -m 774 -o nagios -g nagios file2sock /usr/local/nagios/bin
/usr/bin/install -c -m 774 -o nagios -g nagios log2ndo /usr/local/nagios/bin
/usr/bin/install -c -m 774 -o nagios -g nagios sockdebug /usr/local/nagios/bin
Hint: NDOUtils Installation against Nagios v4.x completed.
If you want to install NDOUtils for Nagios v3.x
please type 'make install-3x
If you want to install NDOUtils for Nagios v2.x
please type 'make install-2x
Next step should be the database initialization/upgrade
cd into the db/ directory and either:
./installdb (for a new installation) or:
./upgradedb (for an existing one)
make[1]: Leaving directory `/usr/local/src/nagios-ndoutils/src'
Main NDOUtils components installed
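Before restarting anything, it may be worth confirming that the files named in the "make install" output actually landed in the bin directory. The following is a minimal sketch; the function name missing_ndoutils_files is hypothetical, and the file list is simply copied from the install lines above.

```shell
#!/bin/sh
# Hypothetical helper: print the name of each file that "make install"
# reported installing but that is absent from the given bin directory.
missing_ndoutils_files() {
    bindir="$1"
    for f in ndo2db ndomod.o file2sock log2ndo sockdebug; do
        # Print the file name only when it does not exist.
        [ -e "$bindir/$f" ] || echo "$f"
    done
}

# Usage (install prefix taken from the output above):
# missing_ndoutils_files /usr/local/nagios/bin
```

No output means all five files are present. It says nothing about whether ndomod.o is the 4.x build, which only the Nagios startup log can confirm.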
3. Start Nagios (ndoutils)
# /etc/rc.d/init.d/ndoutils start
# /etc/rc.d/init.d/nagios start
4. Confirm that Nagios (ndoutils) started
# cat /var/log/messages
Feb 2 01:00:36 HOGE nagios: Warning: use_embedded_perl_implicitly is deprecated and will be removed.
Feb 2 01:00:36 HOGE nagios: Warning: enable_embedded_perl is deprecated and will be removed.
Feb 2 01:00:36 HOGE nagios: Warning: p1_file is deprecated and will be removed.
Feb 2 01:00:36 HOGE nagios: Warning: sleep_time is deprecated and will be removed.
Feb 2 01:00:36 HOGE nagios: Warning: external_command_buffer_slots is deprecated and will be removed. All commands are always processed upon arrival
Feb 2 01:00:36 HOGE nagios: Warning: command_check_interval is deprecated and will be removed. Commands are always handled on arrival
Feb 2 01:00:36 HOGE nagios: Nagios 4.0.2 starting... (PID=21747)
Feb 2 01:00:36 HOGE nagios: Local time is Sun Feb 02 01:00:36 JST 2014
Feb 2 01:00:36 HOGE nagios: LOG VERSION: 2.0
Feb 2 01:00:36 HOGE nagios: qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
Feb 2 01:00:36 HOGE nagios: qh: core query handler registered
Feb 2 01:00:36 HOGE nagios: nerd: Channel hostchecks registered successfully
Feb 2 01:00:36 HOGE nagios: nerd: Channel servicechecks registered successfully
Feb 2 01:00:36 HOGE nagios: nerd: Channel opathchecks registered successfully
Feb 2 01:00:36 HOGE nagios: nerd: Fully initialized and ready to rock!
Feb 2 01:00:36 HOGE nagios: wproc: Successfully registered manager as @wproc with query handler
Feb 2 01:00:36 HOGE nagios: wproc: Registry request: name=Core Worker 21750;pid=21750
Feb 2 01:00:36 HOGE nagios: wproc: Registry request: name=Core Worker 21751;pid=21751
Feb 2 01:00:36 HOGE nagios: wproc: Registry request: name=Core Worker 21749;pid=21749
Feb 2 01:00:36 HOGE nagios: wproc: Registry request: name=Core Worker 21752;pid=21752
Feb 2 01:00:36 HOGE nagios: ndomod: NDOMOD 2.0.0 (10-30-2012) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Feb 2 01:00:36 HOGE nagios: ndomod: Successfully connected to data sink. 0 queued items to flush.
Feb 2 01:00:36 HOGE nagios: ndomod registered for process data
Feb 2 01:00:36 HOGE nagios: ndomod registered for timed event data
Feb 2 01:00:36 HOGE nagios: ndomod registered for log data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for system command data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for event handler data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for notification data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for service check data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for host check data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for comment data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for downtime data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for flapping data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for program status data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for host status data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for service status data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for adaptive program data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for adaptive host data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for adaptive service data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for external command data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for aggregated status data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for retention data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for contact data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for contact notification data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for acknowledgement data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for state change data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for contact status data'
Feb 2 01:00:36 HOGE nagios: ndomod registered for adaptive contact data'
Feb 2 01:00:36 HOGE nagios: Event broker module '/usr/local/nagios/bin/ndomod.o' initialized successfully.
Feb 2 01:00:36 HOGE nagios: Warning: failure_prediction_enabled is obsoleted and no longer has any effect in host type objects (config file '/usr/local/nagios/etc/templates.cfg', starting at line 13)
Feb 2 01:00:36 HOGE nagios: Warning: failure_prediction_enabled is obsoleted and no longer has any effect in host type objects (config file '/usr/local/nagios/etc/templates.cfg', starting at line 32)
Feb 2 01:00:36 HOGE nagios: Warning: failure_prediction_enabled is obsoleted and no longer has any effect in service type objects (config file '/usr/local/nagios/etc/templates.cfg', starting at line 51)
Feb 2 01:00:36 HOGE nagios: Warning: failure_prediction_enabled is obsoleted and no longer has any effect in service type objects (config file '/usr/local/nagios/etc/templates.cfg', starting at line 77)
Feb 2 01:00:36 HOGE nagios: Warning: failure_prediction_enabled is obsoleted and no longer has any effect in service type objects (config file '/usr/local/nagios/etc/templates.cfg', starting at line 103)
Feb 2 01:00:36 HOGE ndo2db: Warning: Retrying message send. This can occur because you have too few messages allowed or too few total bytes allowed in message queues. You are currently using 64 of 7658 messages and 65536 of 65536 bytes in the queue. See README for kernel tuning options.
Feb 2 01:00:36 HOGE nagios: Successfully launched command file worker with pid 21755
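Rather than reading the whole log, the one line that confirms the fix is the "Event broker module ... initialized successfully." message. A minimal sketch of that check follows; the function name broker_module_ok is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: exit status 0 only if the log contains the
# "initialized successfully" message for the given broker module.
broker_module_ok() {
    logfile="$1"
    module="$2"
    # -F matches the text literally, -q suppresses output.
    grep -Fq "Event broker module '$module' initialized successfully." "$logfile"
}

# Usage (paths taken from this environment):
# broker_module_ok /var/log/messages /usr/local/nagios/bin/ndomod.o && echo "ndomod loaded"
```

Because it searches the whole file, an old success message from an earlier start would also match, so it is best run against recent log lines only.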
==================================================
I suspect I ought to run upgradedb, but running it produces the following error...
# cd /usr/local/src/nagios-ndoutils/db
# ./upgradedb -u root -p PASSWORD -h localhost -d nagios
Current database version: 1.5.2
** DB upgrade required for 2.0.0
Using mysql-upgrade-2.0.0.sql for upgrade...
ERROR 1060 (42S21) at line 8: Duplicate column name 'minimum_importance'
Upgrade from mysql-upgrade-2.0.0.sql failed at ./upgradedb line 106.
Looking inside mysql-upgrade-2.0.0.sql, it runs "ALTER TABLE * ADD" statements, so I suspect it fails because the columns already exist in the current tables.
Since everything is working, I've left it as it is.
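One way to test the "column already exists" guess would be to ask MySQL which tables already have the column named in the error. The sketch below only builds the query; the function name column_exists_sql is hypothetical, and the usage line assumes the mysql client plus the same host and user passed to upgradedb.

```shell
#!/bin/sh
# Hypothetical helper: print a query that lists every table in the
# given schema that already contains the named column.
column_exists_sql() {
    schema="$1"
    column="$2"
    printf "SELECT table_name FROM information_schema.columns WHERE table_schema = '%s' AND column_name = '%s';\n" "$schema" "$column"
}

# Usage (assumes the mysql client; -p prompts for the password):
# column_exists_sql nagios minimum_importance | mysql -u root -p -h localhost
```

If the query returns table names, the ALTER TABLE ... ADD at line 8 of the upgrade script would indeed collide with an existing column.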