Where does the distribution end?

Yesterday I had an inspiring Twitter conversation with Miah Johnson. The conversation was long, branchy, and restricted by the 140-character limit. It kept me thinking. It seems our main difference was about where the distribution ends and the userspace begins.

It’s reasonable to expect that if a distribution has a mechanism for preconfiguring packages, automated installation and configuration, and (kind of) configuration management, then one can use it as an end tool to configure the system, or at least as one of the tools in the pipeline. Why reinvent the wheel?

For both me and Miah, the experience of trying to get things done with the Debian/Ubuntu toolchain turned out to be an uphill battle. Up a steep hill made of yak hair and duct tape, to be precise. Our conclusions were different, though: Miah wants to use the distribution’s toolchain, or switch to a distribution that has usable tools. This is how stuff should work, after all. I respect and admire that, because myself… I just gave up.

I find the clunky duct-tape automation and the distro’s idiosyncratic solutions workable, but by that I only mean that 98% of the time I can just ignore them, and the remaining 2% needs just a small nudge to disable a piece of setup or to tell the system that I really, really want to do stuff myself, yes, thank you, I know what I’m doing.

Case in point: debconf-set-selections, which started the whole conversation. The only time I needed it was when I used Ubuntu’s MySQL package, to set the initial root password. Nowadays I prefer Percona Server, which doesn’t set an initial password, so I can make Chef set it right after package installation. Otherwise, the only nudge I need is to disable the automatic start of services when a package is installed, so that Chef can configure the service and start it when it’s ready.

Case in point: Python and Ruby libraries. In my view, the distribution’s packages of Python libraries and Ruby gems are not meant to be used in users’ applications – they exist only as dependencies for packaged applications written in Python or Ruby. For my applications, I just take the base language from a package (with Ruby I prefer Brightbox’s patched version), and use Bundler or Virtualenv to install the libraries my application needs.

Case in point: the init system. Until systemd arrives, if I need to manage a service that is not already packaged (such as my application’s processes), I don’t even try to write init scripts or Upstart configuration. I just install Supervisor or Runit and work from there. Systemd may change that, though, and I can’t wait until it’s supported in a stable distro.

And so on. The distribution’s mechanisms are there, but the way I see it, they are there for internal use within distribution packages, not for poking at and configuring the system from the outside. I can enjoy a wide range of already built software that more or less fits together, security patches, and a wide user base (which means that the base system is well tested, and much of the time, if I have a problem, the solution is a search box away). If I need to, I can package my own stuff with FPM, ignoring this month’s preferred toolkit for Debian packagers. As of recently, I can keep my sanity points when I internally publish custom packages and pull other packages from a patchwork of PPAs and projects’ official repositories by using Aptly. I can run multiple instances and versions of a service contained by Docker. And I can happily ignore most of the automation that the distribution purportedly provides, because I simply gave up on it – Chef gets that job done.


Access Chef node’s attributes by JSONPath

I have just released chef-helpers 0.0.7. The gem is just a bag of helpers for Chef recipes, without any common theme. This release adds one feature that may be really useful: it overloads Chef::Node#[] to allow deep indexing with JSONPath.

When is it useful? A code snippet is worth a thousand words.

Accessing a nested attribute

Often I need to get the value of a deeply nested attribute – which may not exist. If it doesn’t exist, its parent and grandparent may not be there either. I need to code defensively (or resort to a blanket rescue clause) to take care of that:
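A minimal sketch of that defensive style, with a plain Hash standing in for the node object. The attribute names (mysql, server, port) and the mysql_port helper are made up for illustration:

```ruby
# Plain Hashes standing in for Chef node attributes (hypothetical names).
full_node  = { 'mysql' => { 'server' => { 'port' => 3306 } } }
empty_node = {}

# Check every level before descending, so that a missing parent yields nil
# instead of raising NoMethodError on nil.
def mysql_port(node)
  node['mysql'] && node['mysql']['server'] && node['mysql']['server']['port']
end

puts mysql_port(full_node).inspect   # prints 3306
puts mysql_port(empty_node).inspect  # prints nil
```

Every extra level of nesting adds another `&&` link to the chain, which is exactly the noise I would like to get rid of.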

Accessing by JSONPath makes it simpler:

Avoiding traversal when I don’t know the path upfront

When I generate random passwords, and I want to be chef-solo compatible, I want to be able to say something like:

This is supposed to raise an error when I run on chef-solo and the attribute is not defined, and to set it to a secure password by default when I have a Chef server to store it. The syntax could be a bit nicer and DRYer, but it’s already better than enumerating the attributes line by line. The implementation, though, is quite hairy – the inject method is powerful, but it’s not obvious what is going on when you read the code:
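A minimal sketch of the kind of inject traversal meant here, assuming a plain Hash in place of the node. The dig_path helper name and the attribute paths are invented for illustration:

```ruby
# Walk a nested Hash along a list of keys. Once a level is missing (or is
# not a Hash), the accumulator becomes nil and stays nil to the end.
# `dig_path` is a hypothetical helper name.
def dig_path(hash, path)
  path.inject(hash) do |current, key|
    current.is_a?(Hash) ? current[key] : nil
  end
end

attrs = { 'mysql' => { 'server_root_password' => 's3cret' } }

puts dig_path(attrs, %w[mysql server_root_password]).inspect  # prints "s3cret"
puts dig_path(attrs, %w[postgresql password postgres]).inspect # prints nil
```

The accumulator of inject is the "current position" in the hash, which is clever but takes a moment to decode when reading the recipe.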

With JSONPath, I can replace this line simply by:

Selecting attributes by value

I need to check whether a partition is mounted at a given path. The code used to look like this:
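A sketch of what the manual version can look like, assuming an Ohai-style layout where the filesystem attribute maps each device to a hash with a mount key. The sample data and the mounted_at? helper name are made up:

```ruby
# Sample data shaped like an Ohai filesystem attribute (assumed layout):
# device name => details, with the mount point stored under 'mount'.
FILESYSTEM = {
  '/dev/xvda1' => { 'fs_type' => 'ext4', 'mount' => '/' },
  '/dev/xvdf'  => { 'fs_type' => 'xfs',  'mount' => '/srv' },
  '/dev/xvdg'  => { 'fs_type' => 'ext4' } # known device, not mounted
}.freeze

# Hypothetical helper: visit every entry by hand and compare mount points.
def mounted_at?(filesystem, path)
  filesystem.values.any? { |details| details['mount'] == path }
end

puts mounted_at?(FILESYSTEM, '/srv')  # prints true
puts mounted_at?(FILESYSTEM, '/data') # prints false
```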

With JSONPath, I can focus on specifying what I expect to find, rather than how to traverse the nested hash:

I hope it will be useful – it certainly simplifies many similar pieces of code I needed to write in my recipes. Happy hacking!


Announcing knife-briefcase

One of the stumbling blocks for teamwork with Chef is sharing secrets: there are files and other data that are sensitive and should not be committed to the version control repository, yet the admins (or administration scripts) need them in daily work. Such files include secrets for encrypted data bags, private SSL keys, AWS keys (and other access keys and passwords), or private SSH keys to log into newly created EC2 instances. Until now, I kept these in the .chef directory, and either had some crude script to pack them in a GPG-encrypted tarball stored somewhere (Dropbox, a file accessible via SSH, whatever), or was just sending around GPG-encrypted keys and files over e-mail.

Then I figured, hey, Chef server already has facilities to store and retrieve small pieces of data. Data bags don’t have to be used by recipes! What if I could just GPG-encrypt the files as I used to, but had a plugin to keep these in the Chef server?

This turns out to work quite nicely. The “knife briefcase” plugin (because you don’t store sensitive paperwork in a bag – you need a proper briefcase!) is available at https://github.com/3ofcoins/knife-briefcase and soon on Rubygems (I want to give it more real-world testing first). It uses GPG (via ruby-gpgme) to encrypt and sign the content that is then stored in a data bag – and can be retrieved and decrypted. If the list of authorized people changes (as you add a new team member, or somebody leaves), knife briefcase reload will re-encrypt all the items to make the change easy. Simple and effective.

I am aware that there’s already a plugin that serves a similar purpose: chef-vault. I even use it in some of my cookbooks. But I don’t like it for this particular use case, for two reasons.

The first is minor: the command line requires me to always provide a node search query. If I want to share a secret only between admins, I need to provide a fake query that will return no results. It also requires some plumbing to re-encrypt the items. It’s not that much of an issue – it just requires some hacking. If it were the only reason, I’d happily send a pull request rather than write a separate new tool.

The second reason is that it’s complicated to set up a new person. Chef-vault encrypts the data with the administrators’ Chef API keys. This means that before the data can be re-encrypted for a new team member, they need to create their API user first, which requires a back-and-forth:

  1. New user gives me public SSH and GPG keys (which are needed anyway)
  2. I configure SSH access for the provided key and ping them to say they can start
  3. New user is able to create Chef API user now
  4. They ping me, or someone else who has access to the data, to re-encrypt it
  5. They are blocked until this is done

With knife-briefcase, the flow is much more async:

  1. New user gives me SSH and GPG public key (which they need to do anyway)
  2. I configure SSH access for the provided key, use GPG key to re-encrypt the data, and ping them to say they can start
  3. New user is able to create Chef API user and start work right away

This way, there’s one back-and-forth less, and the whole setup feels much simpler to the newbie, who can focus on getting to know the project and getting some work done rather than waiting for access to be configured. Complicated setup is my pet peeve, and over the last few months I did quite a bit of work to simplify the agile administrator’s command center. I think I’m getting close to a nice setup – I should sit down and describe it in a separate blog post someday.

Meanwhile, have fun with knife-briefcase!


Chef vs Puppet – my take on the holy war

I’m often asked why I chose Chef over Puppet for my day-to-day configuration management work. Let me start by stating the now-obvious: the answer to the “Chef or Puppet?” question is “Yes.”

I don’t have much first-hand experience with Puppet. My evaluation is based mostly on feature set and personal preference. In the long run, both ecosystems do pretty much the same thing – the main differences are philosophy and some of the features.

Here’s why Chef’s approach works better for me:

Fixed order of execution

Puppet orders the resources it applies by explicitly declared dependencies. Chef executes run lists and recipes top to bottom, branching only on explicitly declared notifications/subscriptions. While Puppet’s model has a nicer theory behind it, in practice I prefer the stability of Chef’s approach. Puppet can, for example, reorder resources seemingly at random after a new one is added, due to hashing details, which makes me a bit afraid of unexpected side effects.

Native Ruby

When I was choosing, the Ruby DSL for Puppet was a new, unstable, and incomplete feature; it may be better now, but the basic language for writing manifests is still a separate declarative language. As before, this is the cleaner theory, but it forces me to write configuration in a separate, limited language. Even though it’s supposedly Turing-complete (if you’re a wizard), it can still lead to either hairy code (I think I’ve heard you can do loops with recursion, Scheme-style) or copy-and-paste coding. Chef recipes are plain Ruby, so things like looping and mapping over lists or custom library additions are possible.
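For example, a recipe can loop over a list with ordinary Ruby. The sketch below stubs package with a plain method so that it runs outside Chef; in a real recipe, package is Chef’s own resource, and the package names here are arbitrary:

```ruby
# Stand-in for Chef's `package` resource so this sketch runs outside
# chef-client. It only records what a recipe would declare.
DECLARED = []

def package(name)
  DECLARED << "package[#{name}]"
end

# Ordinary Ruby iteration, written the way it would appear in a recipe.
%w[git curl htop].each do |pkg|
  package pkg
end

puts DECLARED
```

Inside an actual recipe only the each loop would remain; the stub and the DECLARED list exist purely to make the example self-contained.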

Chef still has clear separation between “wizard code” (definitions, resources and providers, libraries), and “code for mortals” (recipes themselves, templates, roles), which makes it easy for non-specialist programmers to pick up and modify code.

Data-driven approach and orchestration

I’ve had trouble explaining what the deal is to Puppet people. They usually resort to “we have facts” (which seem to be equivalent to Chef’s “automatic attributes”), and to “there are probably add-ons for this”. I guess this part of Chef’s out-of-the-box feature set is not trivial to get running with Puppet. Let me explain in detail:

Chef server itself is a thin API over a document database (CouchDB) and a full-text search engine (Solr). This makes the server a bit tricky to set up and manage by yourself, as it depends on quite a few services (CouchDB, Solr, RabbitMQ, a bunch of processes of the Chef server itself), but it has a few deep implications. The first is that once you get the idea that Chef server is just a searchable document database, its internal model gets very simple and consistent. What’s a Node? It’s a searchable JSON document. Role? Searchable JSON document. Environment? You guessed it.

Data bags? These are also searchable JSON documents, but they are a different beast. They are completely custom. A “data bag” is a named bucket for JSON documents – “data bag items”. They are searchable, and can be used by any recipe. An example item from the “users” data bag in one of my projects looks like this (in YAML format for simplicity):

The generic-users cookbook does a search on the users data bag to create shell accounts on the machines and populate their ~/.ssh/authorized_keys files. The Nagios and Munin cookbooks search the users data bag for items with group:sysadmin to configure the list of allowed OpenIDs; Nagios also sets up notifications based on these items. The Jenkins cookbook does the same, but for all users, not only sysadmins. I have a central place to configure users, and all other cookbooks pull data from it – setting up a new employee is as simple as putting their data in one place and running chef-client on all machines.
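To make the pattern concrete, here is a sketch with invented users, where plain Ruby hashes stand in for data bag items. In a recipe the lookup would be a search against the Chef server rather than an in-memory select; the field names (groups, openid, ssh_keys) and both helper names are assumptions made for illustration:

```ruby
# Invented items standing in for a "users" data bag (hypothetical schema).
USERS = [
  { 'id' => 'alice', 'groups' => %w[sysadmin],
    'openid' => 'https://openid.example.com/alice',
    'ssh_keys' => ['ssh-rsa AAAA...alice'] },
  { 'id' => 'bob', 'groups' => %w[developer],
    'openid' => 'https://openid.example.com/bob',
    'ssh_keys' => ['ssh-rsa AAAA...bob'] }
].freeze

# What a monitoring cookbook would ask for: only members of one group.
def in_group(users, group)
  users.select { |user| user.fetch('groups', []).include?(group) }
end

# What a shell-accounts cookbook would render into authorized_keys.
def authorized_keys(users)
  users.flat_map { |user| user.fetch('ssh_keys', []) }.join("\n")
end

puts in_group(USERS, 'sysadmin').map { |user| user['openid'] }
puts authorized_keys(USERS)
```

Both consumers read the same items and differ only in the filter and in what they do with the result, which is the whole point of keeping users in one place.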

That was just a simple example – this can get much more involved. I use a deployments data bag to configure all the projects I’m deploying (usernames, API access keys, etc.); Opscode has an “application” cookbook that achieves quite deep magic this way, including continuous deployment. Well-written cookbooks make for a data-driven setup: most minor changes mean updating the data bag(s) without even touching the recipes themselves.

In Chef 0.10, there is also the option of encrypted data bags: with a secret key shared between the node that needs to know the data and the sysadmin who uploads the data bag, the Chef server doesn’t even need to know the sensitive details. I’ve used it to protect e.g. AWS access keys with permissions to bring up and destroy RDS database instances.

The second aspect of Chef server being a searchable document database is orchestration. A node can search for other nodes using Solr’s query language. This way, the frontend webserver for a “production” deployment will know the addresses of the application servers, which in turn will know the address of the database to contact. And the other way around: the application server knows the IP of the frontend server, and the database server knows the IPs of the application servers, allowing them to configure firewall rules automatically. Nodes know each other’s public SSH keys, populating /etc/ssh/known_hosts automatically and avoiding host key warnings. Munin and Nagios servers know which clients are out there and what services to expect. And so on, and so on…

Search is also good for looking at and selecting sub-groups of nodes (see the Opscode blog post on finding recently created hosts with Chef). There’s also a “knife ssh” command built on top of search that opens parallel SSH connections to the nodes it finds – or opens a screen/tmux window to each of them.

Downside

Chef has one downside: there is no good web UI for Chef server. There is a panel written by Opscode, but it’s clunky and user-unfriendly. While I sometimes use it to browse server state (usually to debug changes made by others and differences between the live state and the Git repository), I can’t seriously use it for anything more involved.