Monday, 10 August 2015

Monitoring UrbanCode Deployments with Docker, Graphite, Grafana, collectd and Chef! (Part 1)

Monitoring an UrbanCode Deploy server (or several, in HA setups) and its agents requires keeping track of resource utilization across multiple environments, on the UrbanCode Deploy server(s) themselves, and on the links between them (i.e. the network).

Typical resources include:

  • CPU
  • Memory
  • Java Heap
  • Threads
  • Disk
  • Network
  • Virtual environment (hypervisor)

In addition to resource utilization, log files should also be monitored for abnormal activity and traffic. There are commercial offerings that do these things, but since UrbanCode Deploy is itself a deployment solution, it can be used to deliver monitoring to the nodes. All that's needed is monitoring agents, a collector, and a means to configure and connect it all together.

In this post I'll demonstrate a quick bootstrap solution for system and JVM resource monitoring using UrbanCode Deploy. It provides an "out of the box" monitoring dashboard in Grafana, drawing on data stored in Graphite (both running in a Docker container). The metrics are collected by collectd, which is installed on the nodes by a Chef recipe that is deployed through UrbanCode Deploy. The end result looks something like this:

Fig. 1 Monitoring Topology

For the time being this solution is solely for Linux environments (RHEL, Ubuntu and variants), but it can be adapted to other operating systems, as many of the components have counterparts for Windows, AIX and others.

So how do we get there? Well, one approach is to set it up manually, quite an operation if you have 1000s of agents, so we'll need to do better.

First, the assets need to be installed.

Fig. 2 Installing the Solution
We will need:

  1. An UrbanCode Deploy server with a few agents. You'll also need to install the Chef plugin from here: https://developer.ibm.com/urbancode/plugin/chef
  2. I also created a plugin in Groovy that adds two additional steps for components. One step gets the latest version of a component, and the other gets the ID of a version in the component. You can see the source code here; it's quite simple and a good example of how to create a custom plugin.
    Plugin:
    http://www.boriskuschel.com/downloads/ComponentPlus.zip
    Source
  3. Import a component from IBM BlueMix DevOps Services Git, found here:
    https://hub.jazz.net/git/kuschel/monitorucd/contents/master/Collectd+Chef+Cookbook.json
    Import it from the Components tab:

    You should now see it listed:


    The component is preconfigured to connect to IBM BlueMix DevOps Services Git, pull the recipe periodically, and create a new version. You can change this behaviour in Basic Settings by unchecking the Import Versions Automatically setting.
    All you need to do now is supply a BlueMix username and password in the component properties page. You may need to enter a jazz.net username, if applicable (without a domain).

  4. Now you need to import a generic process (from the top-level Processes tab, not the component's!) that will be used to deploy the latest version of the component deployment package onto agent nodes. This process is kept in IBM BlueMix DevOps Services Git and can be found here: https://hub.jazz.net/git/kuschel/monitorucd/contents/master/Install_collectd.json


    Or, you can quickly import this into UrbanCode Deploy by using curl:
    curl -k -X POST -F file=@Install_collectd.json https://<user>:<pass>@<ucd host>/rest/process/import
    
    NOTE: I noticed that after importing the generic process, the Secure Property Value field of the versionName step in the Install_collectd process design (Design tab) contained three bullets ("•"). Clear that field so it is blank; otherwise fetching the latest version will fail when no version is specified.
    
  5. We need a metrics collector to store the metrics and a graphing engine to visualize them. We'll be using a Docker image of Graphite/Grafana that I put together. You will need the ability to build and run a Docker container, using either boot2docker or the native support available in Linux.
    I have put the image up on the public Docker registry as bkuschel/graphite-grafana, but you can also build it from the Dockerfile in IBM BlueMix DevOps Services Git at https://hub.jazz.net/git/kuschel/monitorucd/contents/master/Dockerfile
  6. To get the image, run:

    docker pull bkuschel/graphite-grafana

    Now run the image and bind ports 80 and 2003 of the Docker container to the host's ports.

    docker run -p 80:80 -p 2003:2003 -t bkuschel/graphite-grafana

    By default, each time you restart the container it starts with a fresh database, which has its advantages for testing. If you want the collector's database to persist, you can mount file volumes into the container to hold it. You can also supply configurations other than the defaults provided. Look at the Dockerfile for the available volumes.
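
With the container running and port 2003 bound, a quick way to check that Graphite is accepting data is to push a single test datapoint at it. The sketch below assumes the image exposes Carbon's standard plaintext listener on port 2003 (one "path value timestamp" line per datapoint). The host name and metric name are placeholders, not values from this setup, and the /dev/tcp redirection is a bash feature that may not be available in other shells.

```shell
#!/usr/bin/env bash
# Sketch: push one test datapoint to Graphite's plaintext listener.
# The host and metric names used in the example call are placeholders.

# Build one line in Carbon's plaintext format: "<metric> <value> <epoch-seconds>".
# The timestamp defaults to the current time when the third argument is omitted.
format_metric() {
    local metric="$1" value="$2" timestamp="${3:-$(date +%s)}"
    printf '%s %s %s\n' "$metric" "$value" "$timestamp"
}

# Send the formatted line over TCP using bash's /dev/tcp redirection.
# Arguments: host, port, then the same arguments as format_metric.
send_metric() {
    local host="$1" port="$2"
    shift 2
    format_metric "$@" > "/dev/tcp/${host}/${port}"
}

# Example (uncomment and replace the host with your Docker host):
# send_metric "docker-host.example.com" 2003 "test.smoke.value" 42
```

If the datapoint is accepted, test.smoke.value should appear in the Graphite metrics tree shortly afterwards.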

Once the solution is installed all that needs to be done is to execute the process on UrbanCode. Yes, it's that easy.

Go to the Processes tab in the UrbanCode Deploy server and click Run next to the "Install_collectd" process.

A dialog will pop up asking for a series of parameters. These will be explained in more depth in a later post about the Chef recipe I created. (You can find it here)
  • Component Name: Should be set to the name of the component we imported earlier
  • Version Name (Optional): You can specify the name of a specific version of the component to use, otherwise it will use the latest
  • Is this a collectd Server?: If you look at Fig. 1, you'll see that many collectd clients connect to a central collectd server. If this node is the central collectd server, this should be checked. Generally, this should be the main agent for the UrbanCode Deploy server, usually co-located with the server.
  • Collectd Install Directory: The default is good
  • Collectd Username: You can leave this as default. This username is the one used to encrypt traffic between collectd clients and servers.
  • Collectd Password: Set any password. This password is the one used to encrypt traffic between collectd clients and servers. It's a good idea to encrypt this password with the htpasswd utility before pasting it here. For example, to set the password for the admin user, pass the username as the first parameter and the password as the second. The output contains the username, a colon, then the encrypted password. Paste that value into this property:

    > htpasswd -bnm admin admin
    admin:$apr1$qSfx7.W2$xf/2k1mDHnksPXZlrU.b90
    
    
  • Collectd Server (client)/Graphite host (server): If "Is this a collectd Server?" is checked, this is the Graphite server host, i.e. the host running the Docker container. Otherwise, this is the collectd server host.
  • UCD Server: The installation directory of the UrbanCode Deploy server (e.g. /opt/ibm_ucd/server), if this collectd is to be installed on a node with a server.
  • Java Monitoring Template: If UCD Server is set and the server is installed on Tomcat, select tomcat.conf.erb; otherwise select java.conf.erb.
  • Resource: Select the agent (the host) that this process should be executed on.

Once you click Submit, this should happen:
Fig. 3 Deploy Process
At this point, all the collectd daemons should be started and collecting. Navigate to the Docker host at http://<Docker host>/. You should see a tree with metrics, like this:

You can also navigate to Grafana at http://<Docker Host>/grafana. Note that the username and password for both Graphite and Grafana are admin/admin.
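
The same check can be done from the command line through Graphite's render API, which is handy when many agents are being rolled out. This is only a sketch: the host name and the target pattern are placeholders (the actual metric paths depend on how the recipe configures collectd), and the -u option assumes the admin/admin login is enforced with HTTP basic authentication, which may not match the image.

```shell
#!/usr/bin/env bash
# Sketch: build a Graphite render API URL and fetch recent datapoints as JSON.
# Host name and target pattern in the example call are placeholders.

# Build a render URL for a target over a relative time window (default: last 5 minutes).
# Arguments: host, target pattern, optional window such as "10min" or "1h".
render_url() {
    local host="$1" target="$2" window="${3:-5min}"
    printf 'http://%s/render?target=%s&from=-%s&format=json\n' "$host" "$target" "$window"
}

# Example (uncomment, then replace the host and the target pattern):
# curl -s -u admin:admin "$(render_url docker-host.example.com 'collectd.*')"
```

An empty JSON array in the response would suggest that no series match the pattern yet; browsing the tree in the Graphite UI is the easiest way to find the real metric paths.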

This is quite a mouthful for one blog post, and there are still many aspects to cover, such as:
  • The UrbanCode Deploy Process, how does it work?
  • The Chef Recipe.
  • Collectd Collection Options (and the nmon option!).
  • How to create useful graphs in Graphite and dashboards in Grafana.
I will cover these in subsequent posts. In the meantime, try setting it up and see how it goes. If you're lucky, you'll end up playing with some cool metrics and graphs in Graphite/Grafana.