Monday, January 4, 2016

Nagios Core 4 service fails to start on CentOS 7 (EPEL release)

A quick post on fixing a Nagios Core 4 startup failure on CentOS 7.

After installing Nagios Core 4 from EPEL and doing some basic configuration, the service would not start.

Package details of the installed Nagios:

Name : nagios
Arch : x86_64
Version : 4.0.8
Release : 1.el7

The system is CentOS Linux release 7.2.1511 (Core), with all yum packages updated as of today, January 4, 2016.


Starting nagios.service fails, and the status output shows the following:

systemctl status nagios
● nagios.service - Nagios Network Monitoring
Loaded: loaded (/usr/lib/systemd/system/nagios.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Mon 2016-01-04 09:32:45 EST; 7min ago
Docs: http://www.nagios.org/documentation
Process: 10738 ExecStart=/usr/sbin/nagios /etc/nagios/nagios.cfg (code=exited, status=1/FAILURE)
Process: 10736 ExecStartPre=/usr/sbin/nagios -v /etc/nagios/nagios.cfg (code=exited, status=0/SUCCESS)
Main PID: 10738 (code=exited, status=1/FAILURE)

Jan 04 09:32:45 localhost.localdomain systemd[1]: Starting Nagios Network Monitoring...
Jan 04 09:32:45 localhost.localdomain systemd[1]: Started Nagios Network Monitoring.
Jan 04 09:32:45 localhost.localdomain systemd[1]: nagios.service: main process exited, code=exited, status=1/FAILURE
Jan 04 09:32:45 localhost.localdomain systemd[1]: Unit nagios.service entered failed state.
Jan 04 09:32:45 localhost.localdomain systemd[1]: nagios.service failed.

Looking through journalctl, I could narrow the problem down to:

Jan 04 09:41:25 localhost.localdomain nagios[11123]: Nagios 4.0.8 starting... (PID=11123)
Jan 04 09:41:25 localhost.localdomain nagios[11123]: Local time is Mon Jan 04 09:41:25 EST 2016
Jan 04 09:41:25 localhost.localdomain nagios[11123]: qh: Failed to init socket '/var/log/nagios/rw/nagios.qh'. bind() failed: No such file or directory
Jan 04 09:41:25 localhost.localdomain nagios[11123]: Error: Failed to initialize query handler. Aborting
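To pull just these lines out of a long journal, a small filter can help. This is only a minimal sketch: the function name find_qh_errors is my own invention, and the two patterns are copied from the error lines above.

```shell
# Hypothetical helper: keep only the query handler errors from log text on stdin.
find_qh_errors() {
    grep -E "qh: Failed to init socket|Failed to initialize query handler"
}

# Usage on the affected host (not run here):
#   journalctl -u nagios.service --no-pager | find_qh_errors
```

If the filter prints nothing, the start failure probably has a different cause than the one described in this post.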

An Internet search led me to the solution in the Red Hat Bugzilla:
bugzilla.redhat.com

Here are my detailed steps for fixing this:

What happened?
The current Nagios EPEL package, version 4.0.8, refuses to start because it cannot create and access its query handler socket at /var/log/nagios/rw/nagios.qh. The Nagios SELinux policy on CentOS 7 is missing the rules that would allow this.
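Before changing the policy, it can be worth confirming the state of the socket directory, since the "No such file or directory" from bind() suggests the rw directory itself may be absent (the fix further down also creates it). The following is a sketch under that assumption; check_rw_dir is a hypothetical helper name, and the default path is the one from the log message.

```shell
# Hypothetical helper: report whether the query handler socket directory exists.
# The default path is taken from the error message in the journal.
check_rw_dir() {
    dir="${1:-/var/log/nagios/rw}"
    if [ -d "$dir" ]; then
        echo "present: $dir"
    else
        echo "missing: $dir"
        return 1
    fi
}

# On the affected host one might also look at the SELinux side (not run here):
#   getenforce                   # prints Enforcing, Permissive or Disabled
#   ausearch -m avc -c nagios    # AVC denials logged for the nagios command, needs root
```

Calling check_rw_dir without an argument checks the default path; passing another directory checks that one instead.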

This should be fixed soon (perhaps in nagios-4.0.8-2.el7?), but until then we need to build our own local policy module as a workaround:

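Create the file /tmp/nagios-socket.te (for example with vi /tmp/nagios-socket.te) and put in the following content:
module nagios-socket 1.0;

# Types and permissions this module refers to
require {
    type nagios_t;
    type nagios_log_t;
    class sock_file { write create unlink };
    class unix_stream_socket connectto;
}

# Let Nagios create, write to and remove socket files labeled nagios_log_t
allow nagios_t nagios_log_t:sock_file { write create unlink };
# Let Nagios connect to its own UNIX stream socket
allow nagios_t self:unix_stream_socket connectto;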

Next, install the policycoreutils-python package if it is not already on your system, then run the following commands to compile, package and install the new local policy module. The install -d line makes sure the /var/log/nagios/rw directory exists and is owned by the nagios user. After the restart, Nagios should start normally:
yum install policycoreutils-python -y
cd /tmp
checkmodule -M -m -o nagios-socket.mod nagios-socket.te
semodule_package -o nagios-socket.pp -m nagios-socket.mod
semodule -i nagios-socket.pp
install -d -m 755 -o nagios -g nagios /var/log/nagios/rw
systemctl restart nagios && systemctl status nagios
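To double-check that the module was actually loaded, the output of semodule -l can be searched for its name. A minimal sketch, assuming that listing has one module per line with the name in the first column; module_listed is a hypothetical helper, not part of any SELinux tool.

```shell
# Hypothetical helper: succeed if the named module appears in `semodule -l`
# style output (one module per line, name in the first column) read from stdin.
module_listed() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Usage on the host, as root (not run here):
#   semodule -l | module_listed nagios-socket && echo "policy module loaded"
#   ls -ldZ /var/log/nagios/rw    # owner and SELinux context of the socket directory
```

Matching on the whole first column avoids a false hit on other modules whose names merely contain "nagios".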

Done!