nagios-api - presents a REST-like JSON interface to Nagios
This program provides a simple REST-like interface to Nagios. Run this on your Nagios host and then sit back and enjoy a much easier, more straightforward way to accomplish things with Nagios. You can use the bundled nagios-cli, but you may find it easier to write your own system for interfacing with the API.
Usage is pretty easy:
nagios-api -p 8080 -c /var/lib/nagios3/rw/nagios.cmd \
-s /var/cache/nagios3/status.dat -l /var/log/nagios3/nagios.log
You must provide at least the status file option. If you don’t provide the other options, then we will disable that functionality and return an error to clients who request it.
The server speaks JSON. You can either GET data from it or POST data to it to take an action. It’s pretty straightforward. Here’s an idea of what you can do from the command line:
curl http://localhost:6315/state
That calls the state method and returns the JSON result.
curl -d '{"host": "web01", "duration": 600}' \
http://localhost:6315/schedule_downtime
This POSTs the given JSON object to the schedule_downtime method. You
will note that all objects returned follow a predictable format:
{"content": <object>, "result": <bool>}
The result field is always true or false, allowing you to
determine at a glance if the command succeeded. The content field may
be any valid JSON value: an int, string, null, bool, hash, list,
etc. What is returned depends on the method being called.
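If you would rather talk to the API from a script than from curl, the GET/POST pattern and the response envelope described above can be sketched in Python roughly like this. It is only a sketch: the helper names (build_request, unwrap, call) are made up for illustration, the address is the one used in the curl examples, and it assumes failures come back as a normal envelope with result set to false.

```python
import json
import urllib.request

BASE_URL = "http://localhost:6315"  # assumption: same address as the curl examples


def build_request(method, payload=None, base_url=BASE_URL):
    """Return a GET request, or a POST carrying `payload` as a JSON body."""
    url = f"{base_url}/{method}"
    if payload is None:
        return urllib.request.Request(url)
    body = json.dumps(payload).encode("utf-8")
    # Supplying data makes urllib issue a POST, much like `curl -d`.
    return urllib.request.Request(url, data=body)


def unwrap(raw):
    """Parse a {"content": ..., "result": ...} envelope and return the content."""
    envelope = json.loads(raw)
    if not envelope["result"]:
        raise RuntimeError(f"nagios-api call failed: {envelope['content']!r}")
    return envelope["content"]


def call(method, payload=None):
    """Send the request and unwrap the reply (needs a running server)."""
    with urllib.request.urlopen(build_request(method, payload)) as resp:
        return unwrap(resp.read().decode("utf-8"))
```

With a server running, call("state") would fetch the state object and call("schedule_downtime", {"host": "web01", "duration": 600}) would mirror the second curl example.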
Once your API server is up and running you can access it through the included nagios-cli script. Here are some examples:
nagios-cli state
This dumps a very large, raw JSON output. It is hard to read, so I recommend that you do this instead:
nagios-cli state | python -mjson.tool
That should give you an output that shows all of the relevant information to construct a status display. Hosts, services, comments, downtimes, and state information for everything. You can manipulate this information with several commands:
nagios-cli schedule_downtime host=web01 duration=600
This schedules a 10 minute downtime for the host web01. You can do the same with a service:
nagios-cli schedule_downtime host=web01 service=Apache duration=600 \
author=mark "comment=Some comment to make."
The syntax is kind of awkward if you want to put spaces in your comments. Please note that this CLI script is extremely preliminary. ;)
Cancelling a downtime works the same way:
nagios-cli cancel_downtime host=web01
That removes any downtimes that are on the host. Note that you will need to also remove service downtimes separately. The return code will let you know how things went. Also note that you can’t cancel a downtime until Nagios has acknowledged it being created!
What does this mean? If you have a script that creates a downtime, does some work, and finally cancels the downtime, and that work finishes too quickly, we will not be able to cancel the downtime because it hasn’t actually been created yet. Nagios is an old and slow system, and we don’t know the ID of the downtimes we create until it updates the status file, which could take tens of seconds or longer.
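One way to live with this delay in a script is to retry the cancel a few times before giving up. The sketch below is an assumption-laden illustration, not part of nagios-api: cancel_when_ready is a hypothetical helper, the attempt count and delay are arbitrary, and it assumes the cancel call reports failure (a false result) while the downtime is not yet known.

```python
import time


def cancel_when_ready(cancel, attempts=10, delay=5.0, sleep=time.sleep):
    """Call `cancel()` until it reports success or the attempts run out.

    `cancel` is any zero-argument callable that returns True once the
    downtime was cancelled, for example a wrapper around a POST to
    cancel_downtime that returns the "result" field of the response.
    """
    for attempt in range(attempts):
        if cancel():
            return True
        # Give Nagios time to rewrite its status file before retrying.
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

The sleep function is a parameter only so the loop can be exercised without actually waiting.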
If the log file is enabled, you can get the last 1000 items in the Nagios log like this:
nagios-cli log
For now that output is an unparsed list. In the future I want to transform this into a logical system that tells you what is happening in a way that is easy to use so we don’t have to duplicate Nagios log parsing code in every project.
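Until that exists, a client can do a rough split of each entry itself. The sketch below assumes each item in the list is a raw Nagios log line in the usual "[epoch] TYPE: field;field;..." shape; that assumption, and the parse_log_line helper itself, are illustrative and not something nagios-api provides.

```python
import re
from datetime import datetime, timezone

# Nagios log lines start with a Unix timestamp in square brackets.
LINE_RE = re.compile(r"^\[(\d+)\]\s+(.*)$")


def parse_log_line(line):
    """Split one raw log line into time, type and fields; None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    when = datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
    kind, sep, detail = match.group(2).partition(": ")
    if not sep:
        # No "TYPE: " prefix, so keep the whole message as a single field.
        kind, detail = None, match.group(2)
    return {"time": when, "type": kind, "fields": detail.split(";")}
```

This is deliberately naive: any free-form message containing ": " is treated as having a type, so take it as a starting point only.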
- -p, --port=PORT
-
Listen on port PORT for HTTP requests.
- -c, --command-file=FILE
-
Use FILE to write commands to Nagios. This is where external commands are sent. If your Nagios installation does not allow external commands, do not set this option.
- -s, --status-file=FILE
-
Set FILE to the status file where Nagios stores its status information. This is where we learn about the state of the world and is the only required parameter.
- -l, --log-file=FILE
-
Point FILE to the location of Nagios’s log file if you want to allow people to subscribe to it.
- -o, --allow-origin=ORIGIN
-
Modern web browsers implement the Cross-Origin Resource Sharing specification from W3C. This spec allows you to host your JavaScript/HTML on one host and have it access an endpoint on a different service. This requires setting a header on the endpoint, which this option allows you to do.
You can simply set this header to * and not worry about it if you want to allow all access. For more information see the CORS specification.
- -q, --quiet
-
If present, we will only print warning/critical messages. Useful if you are running this in the background.
This program currently supports only a subset of the Nagios API. More is being added as it is needed. If you need something that isn’t here, please consider submitting a patch!
This section is organized into methods and sorted alphabetically. Each method is specified as a URL and may include an integer component on the path. Most data is passed as JSON objects in the body of a POST.
Very simply, cancel_downtime immediately lifts a downtime that is currently in
effect on a host or service. If you know the downtime_id, you can
specify it on the URL path like this:
curl -d "{}" http://localhost:6315/cancel_downtime/15
That would cancel the downtime with downtime_id of 15. Most of the
time you will probably not have this information and so we allow you to
cancel by host/service as well.
- host=STRING [required]
-
Which host to cancel downtime from. This must be specified if you are not using the downtime_id directly.
- service=STRING
-
Optional. If specified, cancel any downtimes on this service.
- services_too=BOOL
-
Optional. If true and you have not specified a specific service, then we will cancel all downtimes on this host and all of the services it has.
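The two ways of cancelling (by downtime_id on the path, or by host/service in the body) can be sketched as a small helper that picks the path and JSON body for you. The function name cancel_downtime_call is hypothetical, and the rule that service takes priority over services_too is this sketch's reading of the parameter descriptions above.

```python
def cancel_downtime_call(downtime_id=None, host=None, service=None,
                         services_too=False):
    """Return (path, payload) for a cancel_downtime POST."""
    if downtime_id is not None:
        # Cancel by ID: the ID goes on the path and the body is an empty object.
        return f"/cancel_downtime/{int(downtime_id)}", {}
    if not host:
        raise ValueError("host is required when no downtime_id is given")
    payload = {"host": host}
    if service is not None:
        payload["service"] = service
    elif services_too:
        # Only meaningful when no specific service was named.
        payload["services_too"] = True
    return "/cancel_downtime", payload
```

For example, cancel_downtime_call(downtime_id=15) gives the same path and body as the curl command shown earlier.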
The log method simply returns the most recent 1000 items in the Nagios event log. These are currently unparsed. There is a plan to parse this in the future and return event objects.
The schedule_downtime method is a general-purpose method for creating fixed-length downtimes. It can be used on hosts and services. You are allowed to specify the author and comment to go with the downtime, too. The JSON parameters are:
- host=STRING [required]
-
Which host to schedule a downtime for. This must be specified.
- duration=INTEGER [required]
-
How many seconds this downtime will last for. Downtimes begin immediately and continue for duration seconds before ending.
- service=STRING
-
Optional. If specified, we will schedule a downtime for this service on the above host. If not specified, then the downtime will be scheduled for the host itself.
- services_too=BOOL
-
Optional. If true and you have not specified a specific service, then we will schedule a downtime for the host and all of the services on that host. Potentially many downtimes are scheduled.
- author=STRING
-
Optional. The name of the author. This is useful in UIs if you want to disambiguate who is doing what.
- comment=STRING
-
Optional. As above, useful in the UI.
The result of this method is a text string that indicates whether or
not the downtimes have been scheduled or if a different error occurred.
We do not have the ability to get the downtime_id that is generated,
unfortunately, as that would require waiting for Nagios to regenerate
the status file.
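Since duration is a plain number of seconds, it can be convenient to build the request body from a timedelta. The helper below is a hypothetical sketch of that: the name schedule_downtime_payload and the check that the duration is positive are assumptions of this example, not rules stated by the API.

```python
from datetime import timedelta


def schedule_downtime_payload(host, duration, service=None, services_too=False,
                              author=None, comment=None):
    """Build the JSON body for schedule_downtime from a timedelta."""
    seconds = int(duration.total_seconds())
    if not host:
        raise ValueError("host is required")
    if seconds <= 0:
        # Assumption of this sketch: a downtime must last at least one second.
        raise ValueError("duration must be positive")
    payload = {"host": host, "duration": seconds}
    if service is not None:
        payload["service"] = service
    elif services_too:
        payload["services_too"] = True
    if author is not None:
        payload["author"] = author
    if comment is not None:
        payload["comment"] = comment
    return payload
```

Calling schedule_downtime_payload("web01", timedelta(minutes=10)) produces the same body as the 600 second curl example near the top.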
The state method takes no parameters. It returns a large JSON object containing all of the active state from Nagios. Included are all hosts, services, downtimes, comments, and other things that may be in the global state object.
If you are using passive service checks or you just want to submit a result for a check, you can use this method to submit your result to Nagios.
- host=STRING [required]
-
The host to submit a result for. This is required.
- service=STRING
-
Optional. If specified, we will submit a result for this service on the above host. If not specified, then the result will be submitted for the host itself.
- status=INTEGER [required]
-
The status code to set this host/service check to. If you are updating a host’s status: 0 = OK, 1 = DOWN, 2 = UNREACHABLE. For service checks, 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
- output=STRING [required]
-
The plugin output to be displayed in the UI and stored. This is a single line of text, normally returned by checkers.
The response indicates if we successfully wrote the command to the log.
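The status codes above differ between hosts and services, which is easy to get wrong. As a rough sketch, they can be captured in two enums and checked while building the request body. The names HostStatus, ServiceStatus and result_payload are made up for this example, and trimming the output to its first line is this sketch's way of honoring the single-line requirement.

```python
from enum import IntEnum


class HostStatus(IntEnum):
    OK = 0
    DOWN = 1
    UNREACHABLE = 2


class ServiceStatus(IntEnum):
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def result_payload(host, status, output, service=None):
    """Build the JSON body for submitting a host or service check result."""
    table = HostStatus if service is None else ServiceStatus
    status = table(status)  # raises ValueError for a code outside the table
    lines = output.strip().splitlines()
    if not lines:
        raise ValueError("output is required")
    payload = {"host": host, "status": int(status), "output": lines[0]}
    if service is not None:
        payload["service"] = service
    return payload
```

For instance, result_payload("web01", ServiceStatus.CRITICAL, "Connection refused", service="Apache") builds a body for a failing service check, while passing status 3 without a service raises ValueError because hosts have no such code.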
Written by Mark Smith while under the employ of Bump Technologies, Inc.