Wednesday, 12 August 2015

Interlude: Monitoring UrbanCode Deployments with nmon

In Part 1, Part 2 and Part 3 of my blog series "Monitoring UrbanCode Deployments with Docker, Graphite, Grafana, collectd and Chef!" I have been using collectd as the engine for collecting metrics on nodes and feeding them to Graphite. There is another option for collecting metrics and feeding them into Graphite: nmon (short for Nigel's performance Monitor; one could say we're Making Plans for Nigel's performance monitor).

There is really no great reason to use nmon over collectd in most cases: collectd can collect metrics from far more sources and has a secure client/server architecture for data transmission. So why use nmon at all? Although it's supported on Linux, there isn't much reason to use it there; its greatest benefit is on AIX, especially if LPARs are being used. LPARs (logical partitions) are a type of virtualization where multiple AIX OS instances can run on one physical machine. nmon provides per-LPAR metrics for things like memory and CPU consumption, which can be useful.

In fact, you can run collectd AND nmon together on the same node and get the best of both worlds. What would an nmon topology with UrbanCode look like?

I haven't created any UrbanCode processes for deploying this nmon solution, but it's definitely feasible, if not easier, as nmon is usually packaged as part of AIX and is readily available for Linux. There is also a Chef recipe for it.

How do we get nmon to feed Graphite? There is a script called nmon2graphite that runs as a daemon, connects via a pipe to an nmon daemon, and converts the nmon language to the Graphite language.
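To make "the Graphite language" a little more concrete: Graphite's plaintext listener (port 2003 by default) accepts one sample per line in the form "metric.path value timestamp". The sketch below only formats such a line. The function name and the metric path are made up for illustration and are not taken from nmon2graphite, whose actual naming scheme may differ.

```shell
#!/bin/sh
# Format one sample in Graphite's plaintext protocol:
#   <metric.path> <value> <epoch-seconds>
# graphite_line is a hypothetical helper, not part of nmon2graphite.
graphite_line() {
    path=$1
    value=$2
    # Default to the current time if no timestamp is given
    # (assumes a date command that supports +%s).
    timestamp=${3:-$(date +%s)}
    printf '%s %s %s\n' "$path" "$value" "$timestamp"
}

# Example with an invented metric path and a fixed timestamp.
graphite_line "nmon.myhost.cpu_all.user" "12.5" "1439337600"
# prints: nmon.myhost.cpu_all.user 12.5 1439337600
```

A line like this could then be piped to a tool such as nc pointed at the Graphite host and port 2003 for a quick manual test.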

You can download this script for AIX here:

Note that this is part of a broader project hosted here on the nmon2graphite home page, which also augments the Graphite engine to add a custom page. This page is optional. I have already augmented the bkuschel/graphite-grafana image with the suggested tweaks so it will be able to collect from nmon2graphite. You are free to modify the image to add the custom page, if desired.

The version of this script that I modified for Linux can be downloaded here:

(The Linux version of this script does not work with the custom page found on the nmon2graphite home page without heavy modification.)

To get this script to automatically start and connect to Graphite on port 2003, you need to start it from a cron job. This is outlined in the "Client Side" topic on the nmon2graphite home page. I'll outline the basic steps here. They need to be done on every agent machine, so this would be a great candidate for an UrbanCode Deploy script step in a process that deploys this solution.

As root, make a directory to save the nmon2graphite script into, copy the script there, and make it executable by root. I put mine in /opt/nmon2graphite/:

mkdir /opt/nmon2graphite
chmod u+x /opt/nmon2graphite/nmon2graphite

Create another directory called "nmon" inside the directory in which you saved the script:

mkdir /opt/nmon2graphite/nmon
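Since these steps have to be repeated on every agent machine, it may help to wrap them in something that is safe to re-run. Here is a minimal sketch, assuming the same layout as above; the function name setup_nmon2graphite is my own invention for illustration, and the script itself still has to be copied into place separately.

```shell
#!/bin/sh
# Idempotent version of the directory setup above.
# setup_nmon2graphite is a hypothetical helper; the base directory
# defaults to the path used in this post.
setup_nmon2graphite() {
    base_dir=${1:-/opt/nmon2graphite}
    # -p creates both the base directory and the nmon subdirectory,
    # and does not fail if they already exist.
    mkdir -p "$base_dir/nmon" || return 1
    # Only mark the script executable once it has been copied in place.
    if [ -f "$base_dir/nmon2graphite" ]; then
        chmod u+x "$base_dir/nmon2graphite"
    fi
}

# Usage (as root): setup_nmon2graphite /opt/nmon2graphite
```

Because it only uses mkdir -p and a guarded chmod, running it a second time changes nothing, which is what you would want from a repeatable deployment step.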

As the root user, edit the crontab:

crontab -e

You can also sudo it:

sudo crontab -u root -e

Add these lines to the end of it. Change graphitehost to the Docker host; you can leave port 2003 unless you bound the Docker container's port 2003 to a different host port.

0 0 * * * /usr/bin/mkfifo /opt/nmon2graphite/nmon/$(date +\%Y-\%m-\%d-\%H-\%M).$
0 0 * * *  sleep 10 ; /opt/nmon2graphite/nmon2graphite -i graphitehost -p 2003 -l $
0 1 * * * find /opt/nmon2graphite/nmon -type f -mtime +30 | xargs rm -f >/dev/n$
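The first crontab line creates a named pipe with mkfifo. If named pipes are unfamiliar, the following self-contained sketch shows the basic mechanism: one process writes into the pipe and another reads from it. It uses a temporary directory and an invented sample line rather than real nmon output, and it does not try to reproduce the parts of the crontab entries that are cut off above.

```shell
#!/bin/sh
# Demonstrate a writer and a reader talking through a named pipe.
# fifo_demo and the sample data are illustrative only.
fifo_demo() {
    dir=$(mktemp -d) || return 1
    pipe="$dir/demo.fifo"
    mkfifo "$pipe" || return 1
    # Writer runs in the background; opening the pipe for writing
    # blocks until a reader opens the other end.
    printf 'CPU_ALL,T0001,12.5\n' > "$pipe" &
    # Reader: consume one line from the pipe.
    IFS= read -r line < "$pipe"
    wait
    rm -rf "$dir"
    printf '%s\n' "$line"
}

fifo_demo
# prints: CPU_ALL,T0001,12.5
```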

That's it. After this cron job executes, nmon should be feeding the Graphite container. (You can also execute these commands manually.) There should be a section for nmon: