This how-to shows how to install the open source software NAGIOS to monitor network devices such as switches, routers, servers, firewalls and UPSs, and to send alerts when they have problems.
The base installation will use Ubuntu Server 11.04 (latest at time of writing) - due to compatibility with the VMware CLI as detailed later.
For the initial install, use Ubuntu Server 11.04 on a Virtual Machine (VM) with at least 512MB of RAM and an 8GB virtual disk, which can be thin provisioned.
After selecting your language, press F4 to select installing a 'Minimal Virtual Machine' if using VMware. Once setup is complete, install VMware tools.
Install NAGIOS according to this guide using the default options, but skip Step 9.
Note: If you plan to monitor memory or the pagefile on Windows Server 2008 or later, you will need the second (text-based) version of this plugin, which includes 'checkmem08'. Don't forget to change the owner to nagios:nagios (chown) and mark the file as executable (chmod +x).
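The ownership and permission change from the note above can be sketched as a small shell helper. The function name fix_plugin_perms is made up for illustration, and the plugin path in the usage comment is the one used in the command definitions of this how-to; adjust it if your plugin lives elsewhere.

```shell
#!/usr/bin/env bash
# Hypothetical helper: hand a plugin file to the nagios user and make it executable.
# The second argument (owner) defaults to nagios:nagios as described in the note.
fix_plugin_perms() {
  local plugin=$1 owner=${2:-nagios:nagios}
  chown "$owner" "$plugin" && chmod +x "$plugin"
}

# Usage (run as root, path assumed from the command definitions in this how-to):
#   fix_plugin_perms /usr/local/nagios/libexec/check_wmi
```

Run ls -l on the plugin afterwards to confirm the owner and the executable bit.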
A big thanks and all credit for being able to monitor Windows via WMI from NAGIOS to Matthieu Thibault!
First, install the check_wmi plugin according to Matthieu's blog:
Second, if you are using a Microsoft Active Directory domain -
1. Create a new Group called "No Access"
2. Create a new User called "nagios_svc" and set a secure, non-expiring password
3. Make nagios_svc a member of the "No Access" group and, for security reasons, remove it from Domain Users (set "No Access" as the primary group first, otherwise Domain Users cannot be removed)
NOTE: To monitor an Active Directory domain controller, the user must be a member of the Domain 'Administrators' group. Consider the security consequences first!
On the Windows server you wish to monitor, make this user a member of the "Local Administrators" group.
On the NAGIOS monitoring server -
Edit: /usr/local/nagios/etc/nagios.cfg
Uncomment: cfg_file=/usr/local/nagios/etc/objects/windows.cfg
Edit: /usr/local/nagios/etc/objects/commands.cfg and add the following lines:
#Check Windows Drivesize
define command{
command_name wmi_drv
command_line /usr/local/nagios/libexec/check_wmi -H $HOSTADDRESS$ -u YOURDOMAIN/nagios_svc -p <Password> -m checkdrivesize -a $ARG1$ -w $ARG2$ -c $ARG3$
}
#Check Windows CPU
define command{
command_name wmi_cpu
command_line /usr/local/nagios/libexec/check_wmi -H $HOSTADDRESS$ -u YOURDOMAIN/nagios_svc -p <Password> -m checkcpu -w $ARG1$ -c $ARG2$
}
#Check Windows Memory <= 2003
define command{
command_name wmi_mem
command_line /usr/local/nagios/libexec/check_wmi -H $HOSTADDRESS$ -u YOURDOMAIN/nagios_svc -p <Password> -m checkmem -a $ARG1$ -w $ARG2$ -c $ARG3$
}
#Check Windows Memory >= 2008
define command{
command_name wmi_mem08
command_line /usr/local/nagios/libexec/check_wmi -H $HOSTADDRESS$ -u YOURDOMAIN/nagios_svc -p <Password> -m checkmem08 -a $ARG1$ -w $ARG2$ -c $ARG3$
}
#Check Windows Eventlog
define command{
command_name wmi_eventlog
command_line /usr/local/nagios/libexec/check_wmi -H $HOSTADDRESS$ -u YOURDOMAIN/nagios_svc -p <Password> -m checkeventlog -a $ARG1$,$ARG2$,$ARG3$
}
#Check Windows Services
define command{
command_name wmi_service
command_line /usr/local/nagios/libexec/check_wmi -H $HOSTADDRESS$ -u YOURDOMAIN/nagios_svc -p <Password> -m checkservice -a $ARG1$
}
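Each command above receives its parameters through $ARGn$ macros, which NAGIOS fills from the "!"-separated fields of a service's check_command. The following is only an illustration of that substitution written in bash, not NAGIOS's own code; the function name expand_args is hypothetical.

```shell
#!/usr/bin/env bash
# Illustration only: mimic how $ARG1$, $ARG2$, ... are filled in from a
# check_command value such as "wmi_drv!C:!85%!95%".
expand_args() {
  local template=$1 check=$2
  local IFS='!'
  local -a parts
  # parts[0] is the command name, parts[1..n] are the arguments
  read -r -a parts <<< "$check"
  local i
  for ((i = 1; i < ${#parts[@]}; i++)); do
    template=${template//"\$ARG${i}\$"/${parts[i]}}
  done
  printf '%s\n' "$template"
}

expand_args '-m checkdrivesize -a $ARG1$ -w $ARG2$ -c $ARG3$' 'wmi_drv!C:!85%!95%'
# prints: -m checkdrivesize -a C: -w 85% -c 95%
```

Expanding a check_command by hand like this is a quick way to see which value ends up behind -a, -w and -c before running the plugin manually.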
Edit the file /usr/local/nagios/etc/objects/windows.cfg by deleting default contents and replacing with:
#Windows Test Host 2008 R2
define host{
use windows-server ; Inherit default values from a template
host_name Windows-Test-Host-A-2008R2 ; The name we're giving to this host
alias Test 2008 R2 Windows Server A ; A longer name associated with the host
address 10.3.11.8 ; IP address of the host
}
define service{
use generic-service
host_name Windows-Test-Host-A-2008R2
service_description WinMemory08R2
check_command wmi_mem08!physical!80%!90%
}
define service{
use generic-service
host_name Windows-Test-Host-A-2008R2
service_description WinMemory08R2Pagefile
check_command wmi_mem08!page!70%!85%
}
define service{
use generic-service
host_name Windows-Test-Host-A-2008R2
service_description WinDriveC
check_command wmi_drv!C:!85%!95%
}
#Windows Test Host A
define host{
use windows-server ; Inherit default values from a template
host_name Windows-Test-Host-A-2003 ; The name we're giving to this host
alias Test 2003 Windows Server A ; A longer name associated with the host
address 10.3.11.32 ; IP address of the host
}
define service{
use generic-service
host_name Windows-Test-Host-A-2003
service_description WinDriveC
check_command wmi_drv!C:!5%!15%
}
define service{
use generic-service
host_name Windows-Test-Host-A-2003
service_description WinMemory
check_command wmi_mem!physical!70%!80%
}
define service{
use generic-service
host_name Windows-Test-Host-A-2003
service_description WinPagefile
check_command wmi_mem!page!5%!85%
}
#Windows Test Host B
define host{
use windows-server ; Inherit default values from a template
host_name Windows-Test-Host-B-2003 ; The name we're giving to this host
alias Test 2003 Windows Server B ; A longer name associated with the host
address 10.3.11.31 ; IP address of the host
}
define service{
use generic-service
host_name Windows-Test-Host-B-2003
service_description WinDriveC
check_command wmi_drv!C:!85%!95%
}
define service{
use generic-service
host_name Windows-Test-Host-B-2003
service_description WinService-Printspooler
check_command wmi_service!Spooler
}
define service{
use generic-service
host_name Windows-Test-Host-B-2003
service_description WinService-FileReplication
check_command wmi_service!NtFrs
}
define service{
use generic-service
host_name Windows-Test-Host-B-2003
service_description WinMemory
check_command wmi_mem!physical!80%!90%
}
define service{
use generic-service
host_name Windows-Test-Host-B-2003
service_description WinPagefile
check_command wmi_mem!page!30%!70%
}
# Define a hostgroup for Windows machines
define hostgroup{
hostgroup_name windows-servers ; The name of the hostgroup
alias Windows Servers ; Long name of the group
members Windows-Test-Host-B-2003,Windows-Test-Host-A-2003
}
define service{
use generic-service
hostgroup windows-servers
service_description WinCPU
check_command wmi_cpu!5%!15%
}
define service{
use generic-service
hostgroup windows-servers
service_description WinMemory
check_command wmi_mem!physical!30%!35%
}
define service{
use generic-service
hostgroup windows-servers
service_description WinPagefile
check_command wmi_mem!page!5%!15%
}
define service{
use generic-service
hostgroup windows-servers
service_description WinEventlogSystem
check_command wmi_eventlog!system!1!24
}
define service{
use generic-service
hostgroup windows-servers
service_description WinEventlogApplication
check_command wmi_eventlog!application!1!24
}
define service{
use generic-service
hostgroup windows-servers
service_description WinEventlogSecurity
check_command wmi_eventlog!security!1!24
}
Comment: Drive space thresholds can be defined once for a whole hostgroup, but the right value depends on the volume: 3% free space may be acceptable for a 2TB data partition but probably isn't for a 20GB boot partition. Memory is the same: 98% memory utilisation might be OK for a SQL server, but not for a file server.
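To make the comparison concrete, a percentage can be converted into an absolute amount of space. This is a rough sketch using integer arithmetic (results are truncated), and the helper name pct_to_mb is invented for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: how many megabytes a given percentage represents
# on a partition of the given size in gigabytes.
pct_to_mb() {
  local size_gb=$1 pct=$2
  echo $(( size_gb * 1024 * pct / 100 ))
}

pct_to_mb 2048 3   # 2TB data partition: prints 62914 (roughly 61GB)
pct_to_mb 20 3     # 20GB boot partition: prints 614 (roughly 0.6GB)
```

The same 3% leaves tens of gigabytes of headroom on the large volume and well under a gigabyte on the small one, which is why per-host thresholds are often the safer choice.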
Verify your configuration:
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
And restart NAGIOS: /etc/init.d/nagios restart
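The two steps above can be chained so that NAGIOS is only restarted when the configuration check succeeds. This is a minimal sketch that assumes the check exits with a non-zero status when it finds errors; the function name verify_then_restart is hypothetical and the default paths are the ones used above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run the pre-flight check and restart only if it passes.
# All three paths can be overridden; the defaults match this how-to.
verify_then_restart() {
  local nagios_bin=${1:-/usr/local/nagios/bin/nagios}
  local nagios_cfg=${2:-/usr/local/nagios/etc/nagios.cfg}
  local init_script=${3:-/etc/init.d/nagios}
  if "$nagios_bin" -v "$nagios_cfg" >/dev/null; then
    "$init_script" restart
  else
    echo "Configuration check failed; not restarting" >&2
    return 1
  fi
}

# Usage (as root):
#   verify_then_restart
```

If the check fails, run the verify command on its own to see the full error output.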
The new hosts and their checks should now appear under 'Current Status' > 'Services'.
Initially, it's suggested you tune the monitoring to show a "sea of green" - assuming the infrastructure you wish to monitor is functioning normally.
The final step is to enable email alerting. This tutorial assumes you have an SMTP server elsewhere on your network that NAGIOS can use to relay.
apt-get install nullmailer
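During configuration, enter the host/domain that the mail should appear to come from and the SMTP server that allows relaying.
Then define the contacts, for example in /usr/local/nagios/etc/objects/contacts.cfg (the contacts file of a default install):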
define contact{
contact_name admin1
use generic-contact
service_notification_options c,r
host_notification_options d,u,r,f,s
alias Admin1
email admin1@example.int
}
define contact{
contact_name admin2
use generic-contact
service_notification_options c,r
host_notification_options d,u,r,f,s
alias Admin2
email admin2@example.int
}
define contactgroup{
contactgroup_name admins
alias Nagios Administrators
members admin1,admin2
}
Add the following line to each host definition in /usr/local/nagios/etc/objects/windows.cfg:
contacts admin1,admin2
Add the same line to each service definition that should send email notifications:
contacts admin1,admin2
Restart NAGIOS. Email notifications should then be sent when a host or service reaches one of the configured conditions (e.g. c = critical).
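Before waiting for a real alert, the mail relay can be checked by hand. The sketch below only builds a small test message; the usage comment assumes that nullmailer's sendmail-compatible command is on the PATH, and the function name build_test_mail is made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a minimal test message for the given recipient.
build_test_mail() {
  local to=$1
  printf 'To: %s\nSubject: NAGIOS relay test\n\nTest message sent via nullmailer.\n' "$to"
}

# Usage (assumption: nullmailer provides a sendmail-compatible wrapper):
#   build_test_mail admin1@example.int | sendmail admin1@example.int
```

If the test message does not arrive, check the relay settings entered during the nullmailer configuration before looking at the NAGIOS contact definitions.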
Hello,
The link to the check_wmi plugin seems to be broken.
Do you have any way to put the plugin somewhere else?
Thank you,

Hello Fred, try this:
http://www.edcint.co.nz/checkwmiplus/node/32

Amazing knowledge sharing, please keep it up. Please also share something on CPU temperature, hardware failure, shared folder status, cluster, SAN and RAID monitoring.
Thanks