Thursday, July 25, 2013

Deployment is not source control

(This is the first post in a series on deployment. See part 2 and part 3.)

A source control manager (or SCM) is the second most-important tool an application team can use, right after a good editor. It preserves history, maintains context, and makes perfect julienne fries, every time. Everything should go into source control - source code, tests, requirements, configuration, build scripts, deployment tools. Everything. Building an application that isn't managed in source control is like trying to cross the Grand Canyon on a high wire - without the high wire.

But, as much as source control is a phenomenal tool, it is not the right tool for every purpose. No one would replace vim or Sublime with Git or Mercurial. That just makes no sense. Which is why I'm always baffled when I come into a team and see deployments managed with Git branches.

Deployment is the process of taking a product from environment A to environment B, usually from test (or beta) to staging (or user-acceptance), then to production. An environment isn't just the application code that lives on a single server. It's the entire stack of processes, such as the database, application(s), third-party libraries, configuration, background jobs, and services that go into providing the features of your product. Ensuring that all the different pieces of that stack are in sync at all times is the major function of deployment.

In order to do this, the deployment tool must understand dependencies. Dependencies between application code and third-party libraries on the same server are just the start. Dependencies across server groups, between the application code and the database version, and even between configuration changes all have to be tracked. And everything has to move in lockstep.
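To make "understanding dependencies" a little more concrete, here is a minimal sketch in Python of working out a safe deployment order from a dependency map. The component names and the `deploy_order` helper are made up for illustration; a real package manager does far more than this (version constraints, conflicts, rollback).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical stack: each component maps to the components it depends on.
stack = {
    "app": {"database-schema", "third-party-libs", "config"},
    "background-jobs": {"app"},
    "database-schema": set(),
    "third-party-libs": set(),
    "config": set(),
}


def deploy_order(dependencies):
    """Return the components ordered so each one comes after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = deploy_order(stack)
print(order)
```

Whatever order comes out, `database-schema` lands before `app`, and `app` lands before `background-jobs`. That ordering is exactly the knowledge a branch in source control has no way to express.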

To my knowledge, no single tool manages the entire stack in this holistic fashion. But an application team can make life a lot simpler for itself by doing one simple thing - deploying with OS packages and not source control.

OS packaging tools (such as RPM and APT) have been around for decades. They are the way to deploy libraries and applications to Linux (and Windows, thanks to Chocolatey). They manage dependencies, put everything in the right place, update configuration, verify compatibility, and do everything else necessary to make sure that, when they're done, the requested thing works. Often, this means setting specific compilation switches (or even pre-compiling for specific architectures). They encode knowledge that is often hard-won and difficult to rediscover. And, finally, they let a user ask the server what is installed, revert to a previous version, or even uninstall the package (and all downstream dependencies) altogether.

Source control does not do any of those things. Source control is designed to do one and only one thing - track and manage changes between versions of groups of text files. Modern SCMs (such as Git and Mercurial) do this very, very well.

Managing a deployment requires a package. When QA approves a specific deployment within their test environment, operations needs to "make prod look like test". The way ops can ensure that production will look exactly like test is to build production exactly as test was built. Server build tools (like Puppet and Chef) help ensure that the servers (or VMs) are built exactly the same every time. The application (and its configuration) needs to have the same treatment.
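One way to picture "make prod look like test" is as a comparison of package manifests. The sketch below assumes each environment can be described as a plain mapping of package name to installed version; the package names, version strings, and the `manifest_diff` helper are all hypothetical. In practice the mapping would come from asking the package manager what is installed.

```python
def manifest_diff(test, prod):
    """Return {package: (test_version, prod_version)} for every mismatch.

    A version of None means the package is missing from that environment.
    """
    diff = {}
    for name in sorted(set(test) | set(prod)):
        test_version, prod_version = test.get(name), prod.get(name)
        if test_version != prod_version:
            diff[name] = (test_version, prod_version)
    return diff


# Hypothetical manifests for the two environments.
test_env = {"myapp": "1.4.2-1", "libfoo": "2.0.1-3", "myapp-config": "1.4.2-1"}
prod_env = {"myapp": "1.4.1-1", "libfoo": "2.0.1-3"}

print(manifest_diff(test_env, prod_env))
```

Here the diff reports that `myapp` is behind in production and that `myapp-config` is missing there entirely. When the diff is empty, production looks like test, at least as far as packages are concerned.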

So, I recommend the following process:

  1. Do your development as you normally do right now. (I will have thoughts on the rest of this later, but those are another set of posts.)
  2. Once a changeset is merged into the primary branch (master for Git, or default for Mercurial):
    1. It is tagged with the name of the changeset.
    2. An OS package is built and uploaded to the test package repository.
  3. The OS package is deployed to the test environment.
    1. This can happen either automatically or as a result of a user action.
  4. QA verifies the build.
    1. If it fails, issues are opened and the development process begins anew.
    2. If it fails catastrophically, the environment is reverted.
  5. When QA passes the build, the package is copied into the production package repository.
    1. The commit that was used to build this package is tagged with the date it was promoted to production.
  6. The package is applied to the production environment at the appropriate time.

At the point of merging into the primary branch, the SCM has finished its job. It's now the job of the package manager to replicate that branch out to the various environments in the correct order with the correct dependencies.
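
The build-once, promote-the-same-artifact flow in the steps above can be sketched in a few lines of Python. This is a toy model under loud assumptions: the `Package` and `Repository` classes, the `promote` function, and the `prod-<date>` tag format are all invented for illustration, and the "repositories" are just in-memory lists rather than real RPM or APT repositories.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Package:
    name: str
    version: str
    commit: str  # changeset the package was built from


@dataclass
class Repository:
    name: str
    packages: list = field(default_factory=list)

    def upload(self, package):
        self.packages.append(package)


def promote(package, qa_passed, prod_repo, tags, promoted_on):
    """Copy the artifact QA approved into the production repository (step 5)."""
    if not qa_passed:
        return False  # step 4.1: issues are opened, nothing is promoted
    prod_repo.upload(package)  # the same package object, never a rebuild
    # Step 5.1: tag the source commit with the promotion date (hypothetical format).
    tags.setdefault(package.commit, []).append(f"prod-{promoted_on}")
    return True


test_repo = Repository("test")
prod_repo = Repository("prod")
tags = {}

pkg = Package("myapp", "1.4.2", "abc123")  # step 2.2: built once, after the merge
test_repo.upload(pkg)
promoted = promote(pkg, qa_passed=True, prod_repo=prod_repo,
                   tags=tags, promoted_on="2013-07-25")
```

The point of the sketch is that production receives the very artifact QA tested. Nothing is rebuilt from a branch on the way to production, and the only thing that flows back to source control is a tag.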