Author: Dave Eddy - Operations Engineer

Here at Voxer, Chef is our configuration management software of choice. We introduced Chef into our infrastructure a little less than a year ago, and we now rely on it to provision our machines with as little manual intervention as possible. Since introducing Chef, we have gone from 1 barely functioning cookbook on a small set of nodes to 30 cookbooks on ~200 nodes.

To minimize the operational cost of rolling out Chef, we used Hosted Chef instead of running our own Chef server. This worked great; Hosted Chef scaled with us, had very little downtime, and made it easy to get started working with Chef. However, the more nodes you add to Hosted Chef, the more money you owe at the end of the month. Seeing the email come in with our receipt every month made us consider whether it would be worth it to run our own Chef Server.

From Hosted Chef To Chef Server

A couple of weeks ago we decided to take on the challenge of migrating our Chef setup from Hosted Chef to a private, self-hosted setup. In the process we learned a lot about Chef and the multiple services that make up a Chef server.

For most companies and individuals, setting up a Chef server is as simple as running a few apt-get install commands, firing off a few /etc/init.d scripts, and calling it a day. At Voxer, our infrastructure is hosted on Smart Machines in the Joyent Cloud. As such, the operating system we use is SmartOS, which is maintained by Joyent and built on the illumos kernel, a "fully open community fork of the OpenSolaris operating system". A lot of software written today assumes that it is running on Linux, or at least with a GNU userland. Because of this, installing software that treats non-Linux operating systems as second-class citizens often requires manual steps: finding the correct libraries to link against and patching the source code.

Luckily for us, we were spared this work: a couple of days before we decided to migrate, Joyent released an official Chef Server image on their cloud. A couple of clicks through a web UI and we had a new server provisioned, complete with our public keys and every component that Chef needs (CouchDB, Solr, RabbitMQ, etc.) already configured.

Setup

We followed the instructions in Joyent's documentation and had a Chef server up and running with a test node in no time. One of the first things we set out to do was restrict the Chef API and the web UI to localhost and put them behind NGINX. With this setup we could tuck both services behind SSL for security and serve both over the same port! To make that work, NGINX has to inspect the request headers to tell API traffic from web UI traffic; with the help of this blog post, it was trivial. I've attached the relevant part of the config file at the bottom of this post.

Also, we cheated a little. Restricting these services to listen only on localhost turned out to be more difficult than we thought. Instead, we just set up firewall rules to block everything except ports 443 and 22 externally. You can do the same by placing the config below in /etc/ipf/ipf.conf on your machine (replacing <externalip> with your external IP)...

# Allow all outgoing connections
pass out from <externalip> to any keep state

# Allow ports
pass in quick from any to <externalip> port=22
pass in quick from any to <externalip> port=443

# Block the rest
block in from any to <externalip>

...and enabling ipfilter with

svcadm enable ipfilter
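
If you manage more than one machine, filling in <externalip> by hand gets tedious. The rules above can be generated with a few lines of Python. This is only a sketch: the function name render_ipf_rules and the default port list are our own illustration, not part of ipfilter or Chef, and the output simply mirrors the rule format shown above.

```python
import ipaddress


def render_ipf_rules(external_ip, allowed_ports=(22, 443)):
    """Return ipf.conf text that allows only allowed_ports inbound.

    Hypothetical helper: it reproduces the rule layout from this post
    and does not talk to ipfilter itself.
    """
    # ip_address() raises ValueError if the address is malformed.
    ip = str(ipaddress.ip_address(external_ip))

    lines = [
        "# Allow all outgoing connections",
        f"pass out from {ip} to any keep state",
        "",
        "# Allow ports",
    ]
    for port in allowed_ports:
        if not 1 <= port <= 65535:
            raise ValueError(f"invalid port: {port}")
        lines.append(f"pass in quick from any to {ip} port={port}")

    lines += ["", "# Block the rest", f"block in from any to {ip}"]
    return "\n".join(lines) + "\n"
```

Calling render_ipf_rules("203.0.113.10") (a documentation-only address) returns the text to write to /etc/ipf/ipf.conf before enabling ipfilter.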

Migration

Now that we had a private Chef Server instance running with one test node, it was time to migrate the data off of our Hosted Chef instance. With a little searching we found Knife Hacks, a series of scripts to massage Chef into doing what we wanted. In the Knife Plugins section of that repository there are two scripts, backup_export.rb and backup_restore.rb, that we used.

First, we ran backup_export.rb against our Hosted Chef instance to dump all of the information stored on the server into JSON files on the filesystem. Next, we ran backup_restore.rb pointed at our private Chef server to load all of that data onto it. Finally, we pushed the new validation.pem file out to all nodes, ran a sed across our cluster to point client.rb at the new Chef server, and initiated a chef-client run. With that we were up and running, with all of our nodes checked into the new server.
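
For readers who would rather not hand-craft a sed expression, the client.rb rewrite can be sketched in Python. This is not the command we ran; the function name point_client_at is hypothetical, and the sketch assumes client.rb sets the server with a single quoted chef_server_url line, which is the usual layout.

```python
import re

# Matches a line such as: chef_server_url "https://old.example.com"
# Group 1 is the setting name and spacing, group 2 the quote character,
# group 3 anything that trails the closing quote (e.g. a comment).
_URL_LINE = re.compile(
    r"""^([ \t]*chef_server_url[ \t]+)(["']).*?\2(.*)$""",
    re.MULTILINE,
)


def point_client_at(client_rb_text, new_url):
    """Return client.rb text with chef_server_url replaced by new_url."""

    def _swap(match):
        prefix, quote, rest = match.group(1), match.group(2), match.group(3)
        return f"{prefix}{quote}{new_url}{quote}{rest}"

    updated, count = _URL_LINE.subn(_swap, client_rb_text)
    if count == 0:
        # Refuse to return an unchanged file silently.
        raise ValueError("no chef_server_url line found")
    return updated
```

Every other setting in the file is left untouched, so the function is safe to run on each node's client.rb before kicking off chef-client.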

Thoughts

  • If your Chef server is set up correctly from the beginning, migration is relatively painless
  • Use the status page in the Web UI to ensure that all nodes are checking in successfully
  • Use Knife Hacks when migrating; it'll save you a lot of time
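
The check-in test from the second point can also be scripted. The sketch below is an assumption-heavy illustration: it takes node data as plain dicts shaped like Chef's node JSON, with the time of the last run stored as an epoch timestamp under automatic.ohai_time, and the function name stale_nodes and the one-hour threshold are our own choices.

```python
import time


def stale_nodes(nodes, max_age_seconds=3600, now=None):
    """Return the sorted names of nodes that have not checked in recently.

    Assumes each node is a dict like Chef's node JSON, with the last
    run time at node["automatic"]["ohai_time"] (seconds since epoch).
    """
    now = time.time() if now is None else now
    stale = []
    for node in nodes:
        last_seen = node.get("automatic", {}).get("ohai_time")
        # A node with no recorded run counts as stale.
        if last_seen is None or now - last_seen > max_age_seconds:
            stale.append(node.get("name", "<unnamed>"))
    return sorted(stale)
```

An empty result after the cutover means every node has completed a chef-client run against the new server within the window.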

Chef Part 2 - Performance

Check out Part 2 where we go over our performance analysis of Chef on Joyent, and speed up our Chef runs by 16x!

Appendix

/opt/local/etc/nginx/nginx.conf

http {
    upstream chef_api_local {
        server localhost:4000;
    }
    upstream chef_webui_local {
        server localhost:4040;
    }

    server {
        ssl on;
        ssl_certificate     /path/to/thing.crt;
        ssl_certificate_key /path/to/thing.key;
        listen 443;

        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto https;

        ssl_session_timeout 5m;

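        # Protocols and ciphers as we deployed them. SSLv2, SSLv3, and the
        # LOW/EXP cipher classes are known to be weak; tighten both lists
        # if your clients allow it.
        ssl_protocols SSLv2 SSLv3 TLSv1;
        ssl_ciphers ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP;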
        ssl_prefer_server_ciphers on;

        root /var/www;
        location / {
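            # API request incoming: Chef API clients sign their requests
            # and send an X-Ops-Timestamp header
            if ( $http_x_ops_timestamp != "" ) {
                proxy_pass http://chef_api_local;
                break;
            }
            # webui request incoming: everything else is treated as a browser
            proxy_pass http://chef_webui_local;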
        }
    }
}
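
The routing rule in the location block above is easier to reason about when written out as a plain function. The Python below is only a model of that decision for illustration; nothing like it runs on our server, and the function name choose_upstream is hypothetical. The upstream names match the ones in the config.

```python
def choose_upstream(headers):
    """Model the nginx rule: signed API requests go to the Chef API,
    everything else goes to the web UI."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}

    # Same test as `$http_x_ops_timestamp != ""` in the nginx config:
    # a missing header and an empty header are treated alike.
    if normalized.get("x-ops-timestamp", "") != "":
        return "chef_api_local"
    return "chef_webui_local"
```

A request from knife or chef-client carries X-Ops-Timestamp and lands on port 4000; a browser request does not and lands on port 4040.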

Tools