Tuesday, 11 August 2015

Monitoring UrbanCode Deployments with Docker, Graphite, Grafana, collectd and Chef! (Part 2: The UCD Process)

Following on from Part 1, in Part 2 I'll cover the UrbanCode Deploy process for deploying collectd. Actually, I'm going to throw in another process as well. If you were able to get Part 1 working, you'll have noticed that you can only execute the process on one agent at a time. Not cool, especially if you have thousands of agents. There has to be a better way! There is, but first, let's go over the per-agent process. The input parameters for this process were described in Part 1.


The best way to go about this is to describe each step.


  1. Set Defaults: This sets up some in-process defaults. For now there is just one, nullProperty, which is used in conditional forks to represent an empty string. In other words, an unset property.
  2. component: This gets the extended information for the component. In particular, the component id is required by steps further down. This step fetches it given a component name (passed in).
  3. IsVersionNameSet: This checks whether the user passed in a specific version to deploy. If so, it skips to step 6.
  4. LatestVersion: Get the latest version from the component. This is a custom step from the ComponentPlus plugin I created. It takes a component name and returns a version name.
  5. versionName: This sets the request level version name (the one usually passed in) from step 4. Step 6 expects the request level parameter.
  6. version: Given the component name and version name get the version id. This is also a custom step I created using the ComponentPlus plugin.
  7. DownloadArtifacts: Download the artifacts from the component and version passed in. This fetches the Chef cookbook.
  8. Install Chef: A simple bash script that downloads and installs Chef:

    echo $JAVA_HOME
    curl -L https://www.opscode.com/chef/install.sh | bash
    
    
  9. Server or Client: Are we installing a collectd server or client?
  10. Create collectd Server Node: This creates a configuration file for the Chef recipe that is specific to the collectd server configuration. We will cover the contents of this in a later post.
  11. Create collectd config Directory: Create a directory to store the file created in the next step.
  12. Create auth file: A collectd server authentication file that contains the username and password that collectd clients use to authenticate against the server. This file is in the htpasswd format. The username and password passed in as request properties are used to construct this file.
  13. Create collectd Client Node: This creates a configuration file for the Chef recipe that is specific to the collectd client configuration. We will cover the contents of this in a later post.
  14. Clean old Collectd: Clean up any old collectd configurations.
  15. Install Collectd: Execute the Chef recipe using the configuration file created in earlier steps.
  16. Deploy Server Set: Is the UCD Server request property set to something?
  17. Update UrbanCode Deploy Server JMX Settings: Update the UrbanCode Deploy server JVM so that remote JMX is enabled:

    # Add the JMX options only if they are not already present.
    if ! grep -q com.sun.management.jmxremote.port ${p:deploy_server_dir}/bin/set_env ; then
      sed -i 's/\(java.awt.headless=true\)/\1 -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9010 -Dcom.sun.management.jmxremote.local.only=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false/' ${p:deploy_server_dir}/bin/set_env
    fi
    
    
  18. Manual Task: The JMX settings require a restart of the UrbanCode server. Restarting an UrbanCode server should be done independently of the process, so it's a manual task.
  19. Get Agent Home: Find the agent's home directory.
  20. Update Agent Worker JMX Settings: Update the UrbanCode Deploy agent worker JVM so that remote JMX is enabled:

    # Append the JMX options after the java.security.properties line, once only.
    if ! grep -q com.sun.management.jmxremote ${p:AGENT_HOME}/bin/worker-args.conf ; then
      sed -i '/java.security.properties/a-Dcom.sun.management.jmxremote\n-Dcom.sun.management.jmxremote.port=9010\n-Dcom.sun.management.jmxremote.local.only=true\n-Dcom.sun.management.jmxremote.authenticate=false\n-Dcom.sun.management.jmxremote.ssl=false' ${p:AGENT_HOME}/bin/worker-args.conf
    fi
    
    
  21. Restart Agent: The JMX settings require a restart of the UrbanCode agent:

    ${p:AGENT_HOME}/bin/agent stop ; ${p:AGENT_HOME}/bin/agent start

That's it! Most of the heavy lifting of installing and configuring collectd is done by Chef, which we will cover in a later blog post.
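
The two JMX steps share one pattern: check whether the option is already in the file and only edit it if not, so re-running the process does not pile up duplicate options. Here is a minimal sketch of that pattern for the agent's worker-args.conf. It is not the step's actual script: it swaps the sed one-liner from step 20 (which appears to rely on GNU sed) for awk and a temporary file, and the function name add_jmx_args is made up for illustration. The option list simply mirrors the one in step 20.

```shell
#!/bin/sh
# Sketch only: add_jmx_args is a hypothetical helper, not part of UrbanCode Deploy.
# Adds the JMX options after the java.security.properties line of the given
# worker-args.conf, unless JMX options are already present in the file.
add_jmx_args() {
    conf="$1"

    # Already configured: leave the file untouched.
    if grep -q 'com.sun.management.jmxremote' "$conf"; then
        return 0
    fi

    tmp="$conf.tmp.$$"
    # Copy every line through; after the marker line, emit the JMX options.
    if awk '
        { print }
        /java\.security\.properties/ {
            print "-Dcom.sun.management.jmxremote"
            print "-Dcom.sun.management.jmxremote.port=9010"
            print "-Dcom.sun.management.jmxremote.local.only=true"
            print "-Dcom.sun.management.jmxremote.authenticate=false"
            print "-Dcom.sun.management.jmxremote.ssl=false"
        }
    ' "$conf" > "$tmp"; then
        # Write back in place so the original file keeps its owner and mode.
        cat "$tmp" > "$conf"
    fi
    rm -f "$tmp"
}
```

In a UCD shell step you would call it with the agent's file, for example add_jmx_args followed by the worker-args.conf path under the agent home. Calling it a second time is a no-op, which is the whole point of the grep guard.
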
As mentioned earlier, this only executes on a per-agent basis. It would be a good idea to get it to execute on multiple agents. To do this, we need to set up an UrbanCode application with an application process.

This is self-explanatory: for each agent assigned to the application, run the generic process to install the collectd client. You can download the application with this process, called Install Collected Client On Agents, from:
https://hub.jazz.net/git/kuschel/monitorucd/contents/master/Install_collectd_app.json

There are a few gotchas with this application. It creates an environment called Collectd, and this environment needs to be bound to the resource group that contains the agents to be provisioned with collectd. By default, this is the /Agents/Collectd resource group. I have configured this resource group to automatically include agents that have the deploy_collectd property set to true. This lets you include agents in, or exclude them from, the deploy process.


One thing I also ran into is duplicate properties on agents. I had two agent properties differing only in case: agent.HOSTNAME and agent.Hostname. This caused problems with the agent loop step in the process. You will see the process fail, the link to the child process will be missing from the request history, and the UrbanCode server's deployserver.out log file will contain something like:

2015-08-11 08:52:30,860 ERROR WorkflowRuntime-7: (wfid=1b143582-e0a3-4a20-9150-d4a7fcc82420) org.hibernate.util.JDBCExceptionReporter - Duplicate entry 'iteration/agent/Hostname-611b38dc-cc5d-4342-828c-6a94cc23b881' for key 'ps_prop_val_uci'

Delete the duplicate property to get this unstuck.
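
To spot this before the loop trips over it, you can check a list of property names for entries that collide once case is ignored. The sketch below assumes you already have one agent's property names as plain text, one per line (how you export them from UrbanCode Deploy is up to you and not shown here). The helper name case_collisions is hypothetical.

```shell
#!/bin/sh
# Sketch only: case_collisions is a hypothetical helper.
# Reads names on stdin, one per line, and prints (in lower case) every name
# that appears more than once when case is ignored.
case_collisions() {
    sort -u |                       # drop exact duplicates first
        tr '[:upper:]' '[:lower:]' | # fold case
        sort |
        uniq -d                      # keep only names that now repeat
}
```

For example, feeding it the three names agent.HOSTNAME, agent.Hostname and agent.id prints agent.hostname, which tells you which property to clean up.
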

Update: I have created a custom plugin that allows a property to be deleted from all agents. I use this to delete the HOSTNAME property. I created another generic process called "Remove HOSTNAME Property" containing this one step, configured with the HOSTNAME property. The generic process requires a default Resource request parameter, but it is not used, so you can put any valid resource there.
I then added a "Run Generic Process" step to the application process, before the "For Every Agent..." loop, that points to Remove HOSTNAME Property. If you supplied a default in the generic process, set the Resource Path parameter to blank for this step. That will fix the "Duplicate entry" exception.

If you run it and it fails complaining about a missing resource id, try putting the Resource Path that the environment is bound to into the Install collectd step in the application process. For example:
/Collectd Enabled/${p:iteration/agent.name}

Once the resource group is bound to the Collectd environment and there are agents in it, execute the environment's application process:

This will bring up a dialog containing properties similar to those of the generic process. In this case we are installing clients, so properties pertaining to a collectd server install are omitted.

To recap:
  • Only Changed Versions: This is ignored as we have our own version logic in the process.
  • Snapshot: Leave Blank
  • Component Name: Should be set to the name of the component.
  • Version Name (Optional): You can specify the name of a specific version of the component to use, otherwise it will use the latest.
  • Collectd Install Directory: The default is good
  • Collectd Username: You can leave this as default. This username is the one used to encrypt traffic between collectd clients and servers.
  • Collectd Password: Set any password. This password is the one used to encrypt traffic between collectd clients and servers. It's a good idea to encrypt this password with the htpasswd utility before pasting it here. In the example below, which sets the admin password, the first parameter is the username and the second is the password. The output contains the username, a colon, then the encrypted password. Paste that value into this property:

    > htpasswd -bnm admin admin
    admin:$apr1$qSfx7.W2$xf/2k1mDHnksPXZlrU.b90
    
    
  • Collectd Server: The collectd server host.
  • Schedule Deployment: Leave this unchecked if you want to install now
  • Description (Optional): Describe the execution of this install
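
Since the Collectd Password property expects the whole htpasswd output line rather than a bare password, a quick format check before pasting can catch mistakes. This is only a sketch: it assumes the entry was produced with htpasswd's -m (apr1/MD5) mode as in the example above, and the function name is_apr1_entry is hypothetical.

```shell
#!/bin/sh
# Sketch only: is_apr1_entry is a hypothetical helper.
# Succeeds when the argument looks like "user:$apr1$<salt>$<hash>",
# i.e. the line printed by "htpasswd -bnm user password".
is_apr1_entry() {
    printf '%s\n' "$1" |
        grep -Eq '^[^:]+:\$apr1\$[./0-9A-Za-z]{1,8}\$[./0-9A-Za-z]{22}$'
}
```

Remember to single-quote the value when testing it in a shell, because it contains dollar signs. The check only looks at the shape of the entry, it cannot tell you whether the hash matches the password you intended.
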
That's it!

Part 3 is next where I examine the Chef recipe in more detail.