6. Writing your own Nagios plugins

Plugins are executable files run by Nagios to determine the status of a host or service. By default, Nagios comes with a very rich set of official plugins that should cover most people's needs; in addition, you can find lots of contributed plugins on the Monitoring Exchange website, some of which are also available via OpenBSD's packages and ports system.

However, despite the abundance of plugins, there may be occasions in which no existing plugin is suitable for monitoring a particular service, thus forcing you to write a fully custom plugin, tailored to your exact needs. Luckily, this is a very simple task!

Nagios doesn't bind you to a specific programming language: plugins may be either compiled C programs or interpreted scripts, in Perl, shell, Python or any other language. Nagios doesn't mess with the internals of plugins; however, it asks developers to follow a few basic guidelines, just for standard's sake.

6.1 Command line options

A plugin's command line must follow some specific requirements:

6.2 Plugin return codes

Nagios determines the status of a host or service based on the return code of the plugin. Valid return codes are:

Numeric value Service/Host status Service Status description Host status description
0 Ok/Up The plugin was able to check the service and it seemed to work correctly The host is up and replied in acceptable time
1 Warning The plugin was able to check the service, but it didn't seem to work correctly or it exceeded some "warning" threshold The host is up, but some "warning" threshold was exceeded
2 Critical/Down The service was not running or it exceeded some "critical" threshold The host is down or some "critical" threshold was exceeded
3 Unknown Invalid command line arguments were supplied or an internal error occurred Invalid command line arguments were supplied or an internal error occurred

The warning and critical thresholds are usually set via command line options (see above).

6.3 A sample plugin script

Just a couple of notes before moving to a practical example:

Well, so let's see, as an example, what a plugin to monitor the amount of free memory on the local machine could look like:

/usr/local/libexec/nagios/check_free_mem.sh
#!/bin/ksh

################################################################################
# Sample Nagios plugin to monitor free memory on the local machine             #
# Author: Daniele Mazzocchio (http://www.kernel-panic.it/)                     #
################################################################################

VERSION="Version 1.0"
AUTHOR="(c) 2007-2009 Daniele Mazzocchio (danix@kernel-panic.it)"

PROGNAME=`/usr/bin/basename $0`

# Constants
BYTES_IN_MB=$(( 1024 * 1024 ))
KB_IN_MB=1024

# Exit codes
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

# Helper functions #############################################################

function print_revision {
   # Print the revision number
   echo "$PROGNAME - $VERSION"
}

function print_usage {
   # Print a short usage statement
   echo "Usage: $PROGNAME [-v] -w <limit> -c <limit>"
}

function print_help {
   # Print detailed help information
   print_revision
   echo "$AUTHOR\n\nCheck free memory on local machine\n"
   print_usage

   /bin/cat <<__EOT

Options:
-h
   Print detailed help screen
-V
   Print version information

-w INTEGER
   Exit with WARNING status if less than INTEGER MB of memory are free
-w PERCENT%
   Exit with WARNING status if less than PERCENT of memory is free
-c INTEGER
   Exit with CRITICAL status if less than INTEGER MB of memory are free
-c PERCENT%
   Exit with CRITICAL status if less than PERCENT of memory is free
-v
   Verbose output
__EOT
}

# Main #########################################################################

# Total memory size (in MB)
tot_mem=$(( `/sbin/sysctl -n hw.physmem` / BYTES_IN_MB))
# Free memory size (in MB)
free_mem=$(( `/usr/bin/vmstat | /usr/bin/tail -1 | /usr/bin/awk '{ print $5 }'` / KB_IN_MB ))
# Free memory size (in percentage)
free_mem_perc=$(( free_mem * 100 / tot_mem ))

# Verbosity level
verbosity=0
# Warning threshold
thresh_warn=
# Critical threshold
thresh_crit=

# Parse command line options
while [ "$1" ]; do
   case "$1" in
       -h | --help)
           print_help
           exit $STATE_OK
           ;;
       -V | --version)
           print_revision
           exit $STATE_OK
           ;;
       -v | --verbose)
           : $(( verbosity++ ))
           shift
           ;;
       -w | --warning | -c | --critical)
           if [[ -z "$2" || "$2" = -* ]]; then
               # Threshold not provided
               echo "$PROGNAME: Option '$1' requires an argument"
               print_usage
               exit $STATE_UNKNOWN
           elif [[ "$2" = +([0-9]) ]]; then
               # Threshold is a number (MB)
               thresh=$2
           elif [[ "$2" = +([0-9])% ]]; then
               # Threshold is a percentage
               thresh=$(( tot_mem * ${2%\%} / 100 ))
           else
               # Threshold is neither a number nor a percentage
               echo "$PROGNAME: Threshold must be integer or percentage"
               print_usage
               exit $STATE_UNKNOWN
           fi
           [[ "$1" = *-w* ]] && thresh_warn=$thresh || thresh_crit=$thresh
           shift 2
           ;;
       -?)
           print_usage
           exit $STATE_OK
           ;;
       *)
           echo "$PROGNAME: Invalid option '$1'"
           print_usage
           exit $STATE_UNKNOWN
           ;;
   esac
done

if [[ -z "$thresh_warn" || -z "$thresh_crit" ]]; then
   # One or both thresholds were not specified
   echo "$PROGNAME: Threshold not set"
   print_usage
   exit $STATE_UNKNOWN
elif [[ "$thresh_crit" -gt "$thresh_warn" ]]; then
   # The warning threshold must be greater than the critical threshold
   echo "$PROGNAME: Warning free space should be more than critical free space"
   print_usage
   exit $STATE_UNKNOWN
fi

if [[ "$verbosity" -ge 2 ]]; then
   # Print debugging information
   /bin/cat <<__EOT
Debugging information:
  Warning threshold: $thresh_warn MB
  Critical threshold: $thresh_crit MB
  Verbosity level: $verbosity
  Total memory: $tot_mem MB
  Free memory: $free_mem MB ($free_mem_perc%)
__EOT
fi

if [[ "$free_mem" -lt "$thresh_crit" ]]; then
   # Free memory is less than the critical threshold
   echo "MEMORY CRITICAL - $free_mem_perc% free ($free_mem MB out of $tot_mem MB)"
   exit $STATE_CRITICAL
elif [[ "$free_mem" -lt "$thresh_warn" ]]; then
   # Free memory is less than the warning threshold
   echo "MEMORY WARNING - $free_mem_perc% free ($free_mem MB out of $tot_mem MB)"
   exit $STATE_WARNING
else
   # There's enough free memory!
   echo "MEMORY OK - $free_mem_perc% free ($free_mem MB out of $tot_mem MB)"
   exit $STATE_OK
fi