Once you get used to writing plug-ins for Nagios and the plug-ins you write grow more complex, you may run into this error:
service check timed out.
If some of your service checks have this problem, you can narrow it down to these 3 values:
1. how slow is the plugin
- This is the first thing you should check. Find out how much time your plugin needs to finish checking and return an exit status. Log on to the server you’re monitoring and run the plugin locally. Use the time command to measure:
$ time /usr/lib/nagios/plugins/check_service
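If you would rather capture the run time and the exit status together, for example to log them, the same measurement can be sketched in Python. This is only a minimal sketch: the helper name time_plugin is made up for illustration, and the demo command is a stand-in so the snippet runs anywhere; swap in the path to your own plugin.

```python
import subprocess
import sys
import time


def time_plugin(argv, limit=None):
    """Run a plugin command; return (elapsed_seconds, exit_status, output).

    time_plugin is a hypothetical helper name, not part of Nagios or NRPE.
    limit is an optional cap in seconds, passed to subprocess.run as timeout.
    """
    start = time.perf_counter()
    result = subprocess.run(argv, capture_output=True, text=True, timeout=limit)
    elapsed = time.perf_counter() - start
    return elapsed, result.returncode, result.stdout.strip()


# Stand-in for a real plugin: prints one status line and exits 0 (OK).
demo = [sys.executable, "-c", "print('OK - demo check')"]
elapsed, status, output = time_plugin(demo)
print(f"exit status {status} after {elapsed:.2f}s: {output}")
```

To measure a real plugin, pass its path instead, for example time_plugin(["/usr/lib/nagios/plugins/check_service"]). The elapsed value is the number to carry into the next step.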
2. how short is NRPE’s patience
- Once you have the value (in seconds) from step #1, check the NRPE configuration on that same server. The default location of NRPE’s configuration is
- Find the command_timeout parameter. Its value, in seconds, must be greater than the value you got in step #1.
- Once the parameter has been set, restart the NRPE service (service nrpe restart).
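The comparison in this step can also be scripted. The sketch below assumes the NRPE configuration uses plain key=value lines with # comments; the function name read_command_timeout, the sample fragment, and the 75-second plugin time are all made up for illustration. If the directive appears more than once, this sketch keeps the last value, which may not match what NRPE itself does.

```python
def read_command_timeout(config_text):
    """Return command_timeout from NRPE config text as an int, or None if unset.

    read_command_timeout is a hypothetical helper, not part of NRPE.
    """
    value = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, val = line.partition("=")
        if sep and key.strip() == "command_timeout":
            value = int(val.strip())
    return value


# Illustrative fragment only, not a complete NRPE configuration.
sample = """
# illustrative fragment
server_port=5666
command_timeout=60
"""

plugin_seconds = 75.0  # hypothetical result from step #1
timeout = read_command_timeout(sample)
if timeout is None:
    print("command_timeout is not set in this file")
elif timeout <= plugin_seconds:
    print(f"command_timeout={timeout} is too low for a {plugin_seconds:.0f}s plugin")
else:
    print(f"command_timeout={timeout} leaves enough room")
```

In practice you would read the text from your own NRPE configuration file rather than from the sample string.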
3. how short is Nagios’ patience
- Nagios executes a command, check_nrpe, to connect to an NRPE service. check_nrpe has a timeout parameter, -t. This parameter must have a bigger value than the one you set in step #2.
- Log on to your Nagios server, open the commands configuration file, find the check_nrpe command definition, and set the -t parameter in its command_line. For instance, if you want the timeout value to be 500 seconds, it will look like this:
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -t 500
- Restart the Nagios service afterwards (service nagios restart).
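Putting the three values side by side makes the required ordering easy to verify: plugin run time, then command_timeout, then the -t value, each larger than the one before. The following is a rough sketch under a few assumptions: -t is given as a plain integer number of seconds, the helper names are hypothetical, and the 75 and 120 second figures are example inputs rather than measurements.

```python
def check_nrpe_timeout(command_line):
    """Return the integer after -t in a command_line, or None if absent.

    Hypothetical helper; assumes -t is followed by a plain number of seconds.
    """
    tokens = command_line.split()
    for i, token in enumerate(tokens[:-1]):
        if token == "-t":
            return int(tokens[i + 1])
    return None


def timeout_problems(plugin_seconds, nrpe_timeout, nagios_timeout):
    """List which of the ordering rules from steps 2 and 3 are violated."""
    problems = []
    if nrpe_timeout <= plugin_seconds:
        problems.append("command_timeout must be greater than the plugin run time")
    if nagios_timeout <= nrpe_timeout:
        problems.append("check_nrpe -t must be greater than command_timeout")
    return problems


command_line = "$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -t 500"
nagios_timeout = check_nrpe_timeout(command_line)

# Example inputs: a 75s plugin and command_timeout=120 (both made up).
for problem in timeout_problems(75.0, 120, nagios_timeout):
    print("problem:", problem)
```

An empty list means the three values are in the right order; any message points at the step that still needs a larger value.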
In most cases, these 3 steps should be enough 🙂