Thursday, August 27, 2015

The Packager DSL - The second user story

  1. DSLs - An Overview
  2. The Packager DSL - The first user story
  3. The Packager DSL - The second user story
Our main user is an odd person. So far, all we've made is a DSL that can create empty packages with a name and a version. No files, no dependencies, no before/after scripts - nothing. But, instead of asking for anything useful, the first response we get is "Huh - I didn't think 'abcd' was a legal version number." Some people just think in weird ways.

Our second user story:
Throw an error whenever the version isn't an acceptable version string by whatever criteria are normally employed.
Before we dig into the user story itself, let's talk about why this isn't a "bug" or a "defect" or any of the other terms normally bandied about whenever deployed software doesn't meet user expectations. Every time the user asks us to change something, it doesn't matter whether we call it a "bug", "defect", "enhancement", or any other word. It's still a change to the system as deployed. Underneath all the fancy words, we need to treat every single change with the same processes. Bugfixes don't get a special pass to production. Hotfixes don't get a special pass to production. Everything is "just a change", nothing less and nothing more.

In addition, the "defect" wasn't in our implementation. It was in the first user story if it was anywhere. The first user story didn't provide any restrictions on the version string, so we didn't place any upon it. And that was correct - you should never do more than the user story requires. If you think that a user story is incomplete in its description, you should go back to the user and negotiate the story. Otherwise, the user doesn't know what they're going to receive. Even worse, you might add something to the story that the user does not want.

Really, this shouldn't be considered a defect in specification, either. That concept assumes an all-knowing specifier that is able to lay out fully-formed and fully-correct specifications that never need updating. Which is ridiculous on its face. No-one can possibly be that person and no-one should ever be forced to try. This much tighter feedback loop - from specification to production to the next specification - is one of the key concepts behind Agile development. (The original name for devops was agile systems administration or agile operations.)

All of this makes sense. When you first conceive of a thing, you have a vague idea of how it should work. So, you make the smallest possible thing that could work and use it. While you start with a few ideas of where to go next, the moment you start using it, you realize other things that you never thought of. All of them (older ideas and newer realizations) become user stories, and we (the developers and the users together) agree on what each story means, then prioritize them in collaboration. Maybe the user wants X, but the developers point out Y would be really quick to do, so everyone agrees to do Y first to get it out of the way. It's an ongoing conversation, not a series of dictates.

All of which leads us back to our user story about bad version strings. The first question I have is "what makes a good or bad version string?" The answer is "whatever criteria are normally employed". That means it's up to us to come up with a first pass. And, given that we're building this in a Ruby world, the easiest thing would be to see what Ruby does.

Ruby's interface with version strings would be in its packages - gems. Since everything in Ruby is an object, then we would look at Gem and its version-handling class Gem::Version. Reading through that documentation, it looks like the Ruby community has given a lot of thought to the issue. More thought than I would have realized was necessary, but it's good stuff. More importantly, if we use Gem::Version to do our version string validation, then we have a ready-documented worldview of how we expect version strings to be used.
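A quick look at Gem::Version in irb shows why it's a good fit (a sketch; the exact set of strings accepted can vary slightly between RubyGems releases):

```ruby
require 'rubygems'  # provides Gem::Version

# Well-formed version strings parse and normalize cleanly.
Gem::Version.new('1.2.3').to_s               # => "1.2.3"
Gem::Version.new('1.0.0.pre.1').prerelease?  # => true

# Strings that don't look like versions raise ArgumentError,
# which is exactly the hook our DSL type will rescue.
begin
  Gem::Version.new('abcd')
rescue ArgumentError
  puts 'rejected'
end
```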

Granted, we will have to conform to whatever the package formats our users will want to generate require. FreeBSD may require something different from RedHat and maybe neither is exactly what Gem::Version allows. At that point, we'll have failing tests we can write from user stories saying things like "For Debian, allow this and disallow that." For now, let's start with throwing an error on values like "bad" (and "good"). Anything more than that will be another user story.

As always, the first thing is to write a test. Because we can easily imagine needing to add more tests for this as we get more user stories, let's make a new file at spec/dsl/version_spec.rb. That way, we have a place to add more variations.

describe Packager::DSL do
    context "has version strings that" do
        it "rejects just letters" do
            expect {
                Packager::DSL.execute_dsl {
                    package {
                        name 'foo'
                        version 'abcd'
                        type 'whatever'
                    }
                }
            }.to raise_error("'abcd' is not a legal version string")
        end
    end
end

Once we have our failing test, let's think about how to fix this. We have three possible places to put this validation. We may even choose to put pieces of it in multiple places.
  1. The first is one we've already done - adding a validation to the :package entrypoint. That solution is good for doing validations that require knowing everything about the package, such as the version string and the type together.
  2. The second is to add a Packager::Validator class, similar to the DSL and Executor classes we already have. This is most useful for doing validation of the entire DSL file, for example if multiple packages need to be coordinated.
  3. The third is to create a new type, similar to the String coercion type we're currently using for the version.
Because it's the simplest and is sufficient for the user story, let's go with option #3. I'm pretty sure that, over time, we'll need to exercise option #1 as well, and possibly even #2. But, YAGNI. If we have to change it, we will. That's the beauty of well-built software - it's cheap and easy to change as needed.

class Packager::DSL < DSL::Maker
    add_type(VersionString = {}) do |attr_name, *args|
        unless args.empty?
            begin
                ___set(attr_name, Gem::Version.new(args[0]).to_s)
            rescue ArgumentError
                raise "'#{args[0]}' is not a legal version string" 
            end
        end
        ___get(attr_name)
    end

    add_entrypoint(:package, {
        ...,
        :version => VersionString,
        ...,
    }) ...
end

Note how we use Gem::Version to do the validation, but we don't save it as a Gem::Version object. We could keep the value as such, but there's no real reason (yet) to do so. ___get() and ___set() (with three underscores each) are provided by DSL::Maker to do the sets and gets. The attr_name is provided for us. So, if we wanted, we could reuse this type for many different attributes.
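If DSL::Maker's internals feel opaque, the underlying pattern is simple to sketch in plain Ruby (the VersionedThing class here is hypothetical, purely for illustration): validate at assignment time, store the normalized string.

```ruby
require 'rubygems'  # provides Gem::Version

# Hypothetical class showing the validate-on-set pattern our
# VersionString type uses: the check runs the moment the value is
# assigned, not in some later validation pass.
class VersionedThing
  attr_reader :version

  def version=(value)
    # Normalize through Gem::Version; reject anything it can't parse.
    @version = Gem::Version.new(value).to_s
  rescue ArgumentError
    raise "'#{value}' is not a legal version string"
  end
end

thing = VersionedThing.new
thing.version = '1.2.44'    # stored as "1.2.44"
begin
  thing.version = 'abcd'    # raises immediately, at assignment time
rescue RuntimeError => e
  puts e.message
end
```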

We could add more tests, documenting exactly what we've done. But, I'm happy with this. Ship it!

There's something else we haven't discussed about this option in particular. Type validations happen immediately when the item is set, which means a value is already known to be valid by the time it's reused elsewhere in the definition. For example (once we have defined how to include files in our packages), we could have something like:

package {
    name 'some-product'
    version '1.2.44'
    files {
        source "/downloads/vendor/#{name}/#{version}/*"
        dest "/place/for/files"
    }
}

If we defer validation until the end, we aren't sure we've caught all the places an invalid value may have been used in another value. This is why we attempt a "defense in depth", a concept normally seen in infosec circles. We want to make sure the version value is as correct as we can make it with just the version string. Then, later, we want to make sure it's even more correct once we can correlate it with the package format (assuming we ever get a user story asking us to do so).

Wednesday, August 26, 2015

Creating the Packager DSL - Retrospective

  1. Why use a DSL?
  2. Why create your own DSL?
  3. What makes a good DSL?
  4. Creating your own DSL - Parsing
  5. Creating your own DSL - Parsing (with Ruby)
  6. Creating the Packager DSL - Initial steps
  7. Creating the Packager DSL - First feature
  8. Creating the Packager DSL - The executor
  9. Creating the Packager DSL - The CLI
  10. Creating the Packager DSL - Integration
  11. Creating the Packager DSL - Retrospective
We've finished the first user-story. To review:
I want to run a script, passing in the name of my DSL file. This should create an empty package by specifying the name, version, and package format. If any of them are missing, print an error message and stop. Otherwise, an empty package of the requested format should be created in the directory I am in.
Our user has a script to run and a DSL format to use. While this is definitely not the end of the project by any means, we can look back on what we've done so far and learn a few things. We're also likely to find a list of things we need to work on and refactor as we move along.

It helps to come up with a list of what we've accomplished in our user story. As we get further along in the project, this list will be much smaller. But, the first story is establishing the walking skeleton. So far, we have:

  1. The basics of a Ruby project (gemspec, Rakefile, Bundler, and project layout)
  2. A DSL parser (using DSL::Maker as the basis)
    1. Including verification of what is received
  3. An intermediate data structure representing the packaging request (using Ruby Structs)
  4. An Executor (that calls out to FPM to create the package)
  5. A CLI handler (using Thor as the basis)
    1. Including verification of what is received
  6. Unit-test specifications for each part (DSL, Executor, and CLI)
    1. Unit-tests by themselves provide 100% code coverage
  7. Integration-test specifications to make sure the whole functions properly
  8. A development process with user stories, TDD, and continuous integration
  9. A release process with Rubygems
That's quite a lot. We should be proud of ourselves. But, there's always improvements to be made.

The following improvements aren't new features. Yes, an empty package without dependencies is almost completely worthless. Those improvements will come through user stories. These improvements are ones we've seen in how we've built the guts of the project. Problems we've noticed along the way that, left unresolved, will become technical debt. We need to list them out now so that, as we work on user stories, we can do our work with an eye to minimizing these issues. In no particular order:
  • No error handling in the call to FPM
    • Calling an external command can be fraught with all sorts of problems.
  • Version strings aren't validated
  • There's no whole-package validator between the DSL and the Executor
  • We probably need a Command structure to make the Executor easier to work with
  • We probably want to create some shared contexts in our specs to reduce boilerplate
    • This should be done as we add more specs
It's important to be continually improving the codebase as you complete each new user story. Development is the act of changing something. Good developers make sure that the environment they work in is as nimble as possible. Development becomes hard when all the pieces aren't working together.

Over the next few iterations, we'll see how this improved codebase works in our favor when we add the next user stories (validating the version string, dependencies, and files).

Tuesday, August 25, 2015

Packager DSL 0.0.1 Released

The Packager DSL has been released to Rubygems. (I was going to name it "packager", but that was already taken.) Please download it and take it for a spin. It'll be under pretty active development, so any bugs you find or missing features you need should be addressed rapidly.

Monday, August 24, 2015

Creating the Packager DSL - Integration

  1. Why use a DSL?
  2. Why create your own DSL?
  3. What makes a good DSL?
  4. Creating your own DSL - Parsing
  5. Creating your own DSL - Parsing (with Ruby)
  6. Creating the Packager DSL - Initial steps
  7. Creating the Packager DSL - First feature
  8. Creating the Packager DSL - The executor
  9. Creating the Packager DSL - The CLI
  10. Creating the Packager DSL - Integration
Our user story:
I want to run a script, passing in the name of my DSL file. This should create an empty package by specifying the name, version, and package format. If any of them are missing, print an error message and stop. Otherwise, an empty package of the requested format should be created in the directory I am in.
Our progress:

  • We can parse the DSL into a Struct. We can handle name, version, and package format. If any of them are missing, we raise an appropriate error message.
  • We can create a package from the parsed DSL
  • We have a script that executes everything
So, we're done, right? Not quite. We have no idea if it actually works. Sure, you can run it manually and see the script works. But, that's useful only when someone remembers to do it and remembers how to interpret it. Much better is a test that runs every single time in our CI server (so no-one has to remember to do it) and knows how to interpret itself. In other words, a spec.

We have tests for each of the pieces in isolation. That was deliberate - we want to make sure that each piece works without involving more complexity than is necessary. But, our user story doesn't care about that. It says the user wants to execute a script and get their package. The user doesn't care about the Parser vs. the Executor (or anything else we've written). Those distinctions are for us, the developers, to accommodate the inevitable changes that will happen. A developer's job (and raison d'etre, frankly) is to create change. Without change, a developer has nothing to do. So, we organize our project to make it as easy as possible to make change.

But, at the end of the day, it's the integration of these pieces that matters. So, we need to create end-to-end integration tests that show how all the pieces will work together. Where the unit tests we've written so far test the inner workings of each unit, the integration tests will test the coordination of all the units together. We are interested in checking that the output of one piece fits the expected input of the next piece.

Said another way, our unit tests should provide 100% code coverage (which you can see with rspec spec/{cli,dsl,executor}). The integration tests will provide 100% user-expectation coverage.

As always, first thing we do is write a test. We have a subdirectory in spec/ for the unit tests for each component. Let's add another one called spec/integration with a file called empty_spec.rb and contents of

require 'tempfile'

describe "Packager integration" do
    let(:dsl_file) { Tempfile.new('foo').path }
    it "creates an empty package" do
        append_to_file(dsl_file, "
            package {
                name 'foo'
                version '0.0.1'
                package_format 'dir'
            }
        ")

        Packager::CLI.start(['create', dsl_file])

        expect(File).to exist('foo.dir')
        expect(Dir['foo.dir/*'].empty?).to be(true)
    end
end

Take a file with an empty package definition, create a package in the directory we're in, then verify. Seems pretty simple. We immediately run into a problem - no package is created. If you remember back to when we were creating the executor, we never actually call out to FPM. It's relatively simple to add an #execute method to the Executor which does a system() call. That should make this test pass.
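A minimal sketch of what that #execute method could look like (this is an assumption, not the project's real Executor; the flag set is the bare minimum using fpm's "empty" source type, and the command is built separately so it can be tested without shelling out):

```ruby
class Packager
  class Executor
    # Build the fpm argument list separately so it can be inspected
    # in tests without actually invoking fpm.
    def fpm_command(package)
      ['fpm',
       '-s', 'empty',        # no input files yet
       '-t', package.type,   # e.g. 'dir', 'deb', 'rpm'
       '-n', package.name,
       '-v', package.version]
    end

    # Shell out to fpm. system() with a list bypasses the shell (no
    # quoting problems) and returns false/nil on failure.
    def execute(package)
      system(*fpm_command(package)) or raise "fpm failed for #{package.name}"
    end
  end
end
```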

But, that's not enough. After you run it, do a git status. You'll immediately see another problem - the package was created in the directory you ran rspec in. Which sucks. But, it's fixable.

In the same way we have tempfiles, we have temp directories. Sysadmins used to bash are familiar with mktemp and mktemp -d. Ruby has Tempfile and Dir.mktmpdir, respectively. So, let's run the test within a temporary directory - that should solve the problem.

require 'tempfile'
require 'tmpdir'

describe "Packager integration" do
    let(:dsl_file) { Tempfile.new('foo').path }

    it "creates an empty package" do
        Dir.mktmpdir do |tempdir|
            Dir.chdir(tempdir) do
                # Rest of test
            end
        end
    end
end

That keeps the mess out of the main directory. Commit and push.

Though, when I look at what's been written, the tempdir handling is both done manually (the author has to remember to do it every time) and creates two more levels of indentation. The manual part means it's possible for someone to screw up. The indentation part means it's harder to read what's happening - there's boilerplate in every test. Which is somewhat ironic, given that the whole point of this process is to create a DSL - removing the boilerplate. We can do better. Red-Green-Refactor doesn't just apply to the code. (Or, put another way, tests are also code.)

RSpec allows us to do things before-or-after all-or-each of the specs in a context. Let's take advantage of this to ensure that every integration test will always happen within a tempdir.

require 'fileutils'
require 'tempfile'
require 'tmpdir'

describe "Packager integration" do
    let(:dsl_file) { Tempfile.new('foo').path }

    let(:workdir) { Dir.mktmpdir }
    before(:all)  { @orig_dir = Dir.pwd }
    before(:each) { Dir.chdir workdir }
    after(:each) {
        Dir.chdir @orig_dir
        FileUtils.remove_entry_secure workdir
    }

    it "creates an empty package" do
        # Rest of test
    end
end

A few notes here.

  1. When we used the block versions of Dir.mktmpdir and Dir.chdir, the block cleaned up whatever we did (e.g., changed back to the original directory). When we use the direct invocation, we have to do our own cleanup.
  2. before(:all) will always run before before(:each) (guaranteed by RSpec).
  3. We don't want to use let() for the original directory. let() is lazy, meaning it only gets set the first time it's invoked. Instead, we set an attribute of the test (as provided helpfully to us by RSpec).
    1. We could have used let!() instead (which is eager), but it's too easy to overlook the !, so I don't like to use it. Sometimes, subtlety obscures more than it helps.
  4. Tests should be runnable in any order. And this includes all the other tests in all the other spec files. You should never assume that any two tests will ever run in a specific order or even that any test will run in a specific test run. So, we always make sure to change directory back to the original directory (whatever that was). There's nothing here that assumes anything about the setup.
  5. FileUtils has many ways to remove something. #remove_entry_secure is the most conservative, so the best for something that needs to be accurate more than it needs to be fast.
  6. We need to leave the tempdir that we're in before trying to remove it. On some OSes, the OS will refuse to remove a directory if a process has it as its working directory.


Friday, August 21, 2015

Creating the Packager DSL - The CLI

  1. Why use a DSL?
  2. Why create your own DSL?
  3. What makes a good DSL?
  4. Creating your own DSL - Parsing
  5. Creating your own DSL - Parsing (with Ruby)
  6. Creating the Packager DSL - Initial steps
  7. Creating the Packager DSL - First feature
  8. Creating the Packager DSL - The executor
  9. Creating the Packager DSL - The CLI
Our user story:
I want to run a script, passing in the name of my DSL file. This should create an empty package by specifying the name, version, and package format. If any of them are missing, print an error message and stop. Otherwise, an empty package of the requested format should be created in the directory I am in.
Our progress:

  • We can parse the DSL into a Struct. We can handle name, version, and package format. If any of them are missing, we raise an appropriate error message.
  • We can create a package from the parsed DSL
We still need to:
  • Provide a script that executes everything
Writing this script, on its face, looks pretty easy. We need to:
  1. Receive the filename from the commandline arguments
  2. Pass the contents of that filename to Packager::DSL.parse_dsl()
  3. Pass the contents of that to Packager::Executor
A rough version (that works) could look like:

#!/usr/bin/env ruby

require 'packager/dsl'
require 'packager/executor'

filename = ARGV[0]
items = Packager::DSL.parse_dsl(IO.read(filename))
Packager::Executor.new.create_package(items)

You can create a file with a package declaration (such as the one in our spec for the DSL) and pass it to this script and you will have an empty package created. All done, right?

Not quite.

The first problem is that testing executables is hard. Unlike classes and objects, which live in the same process, the testing boundary of a script is a process boundary. Process boundaries are much harder to work with. Objects can be invoked and inspected over and over, in any order. Invocations of an executable are one-shot. Outside the results, there's nothing to inspect once you've invoked the script. If we could minimize the script and move most of the logic into objects, that would make testing so much easier. And we could measure our code coverage of it.

The second (and bigger) problem is that writing good executables is hard. Very, very hard. Good executables have options, error-handling, and all sorts of other best practices. It is nearly impossible to write a good executable that handles all the things, even if you're an expert.

Again, the good Ruby opensource developers have provided a solution - Thor. With Thor, we can move all the logic into a Packager::CLI class and our executable in bin/packager becomes

#!/usr/bin/env ruby

$:.unshift File.expand_path('../../lib', __FILE__)

require 'rubygems' unless defined? Gem
require 'packager/cli'

Packager::CLI.start

Almost all of that is cribbed from other Ruby projects, meaning we can directly benefit from their experience. The executable is now 8 lines (including whitespace). We can visually inspect this and be extremely certain of its correctness. Which is good because we really don't want to have to test it. The actual CLI functionality moves into classes and objects, things we can easily test.

First things first - we need a test. Thor scripts tend to function very similarly to git, with invocations of the form "<script> <command> <flags> <parameters>". So, in our case, we're going to want "packager create <DSL filenames>". This translates into the #create method on the Packager::CLI class. The filenames will be passed in as the arguments to the #create method. We don't have any flags, so we'll skip that part (for now).

A note on organization - we have tests for the DSL, the Executor, and now the CLI. We can see wanting to write many more tests for each of those categories as we add more features, so let's take the time right now to reorganize our spec/ directory. RSpec will recursively descend into subdirectories, so we can create spec/dsl, spec/executor, and spec/cli directories. git mv the existing DSL and Executor specs into the appropriate directories (renaming them to be more meaningful), run RSpec to make sure everything is still found, then commit the change. You can pass rspec the name of a file or a directory, if you want to run just a subset of the tests. So, if you're adding just a DSL piece, you can run those tests to make sure they pass without having to do the entire thing.

Back to the new CLI test. The scaffolding for this looks like

describe Packager::CLI do
    subject(:cli) { Packager::CLI.new }

    describe '#create' do
    end
end

The nested describe works exactly as you'd expect. (RSpec provides many ways to organize things, letting you choose which works best for the situation at hand.)

The first test, as always, is the null test. What happens if we don't provide any filenames? Our script should probably print something and stop, ideally setting the exit code to something non-zero. In Thor, the way to do that is to raise Thor::Error, "Error string". (I wish they'd call that Thor::Hammer, but you can't have everything.) So, the first test should expect the error is raised.

    it "raises an error with no filenames" do
        expect {
            cli.create
        }.to raise_error(Thor::Error, "No filenames provided for create")
    end

Run that, see it fail, then let's create packager/cli.rb to look like

class Packager
    class CLI < Thor
        def create()
            raise Thor::Error, "No filenames provided for create"
        end
    end
end

Again, we're writing just enough code to make the tests pass. Now, let's pass in a filename to #create. Except, where do we get the file?

One possibility is to create a file with what we want, save it somewhere inside spec/, add it to the project, then reference that filename as needed.

There are a few subtle problems with that approach.

  1. The file contents are separated from the test(s) using them.
  2. These files have to be packaged with the gem in order for client-side tests to work.
  3. There could be permission issues with writing to files inside the installation directory.
  4. Developers end up wanting to keep the number of these files small, so they shoehorn as many cases as possible into each file.
Fortunately, there's a much better approach. Ruby, like most languages, has a library for creating and managing tempfiles. Ruby's is called Tempfile. Adding this to our test file results in

require 'tempfile'

describe Packager::CLI do
    subject(:cli) { Packager::CLI.new }
    let(:package_file) { Tempfile.new('foo').path }

    def append_to_file(filename, contents)
        File.open(filename, 'a+') do |f|
            f.write(contents)
            f.flush
        end
    end

    describe '#create' do
        it "raises an error with no filenames" do
            expect {
                cli.create
            }.to raise_error(Thor::Error, "No filenames provided for create")
        end

        it "creates a package with a filename" do
            append_to_file(package_file, "
                package {
                    name 'foo'
                    version '0.0.1'
                    type 'dir'
                }
            ")

            cli.create(package_file)
        end
    end
end

We create a tempfile and store its filename in the package_file 'let' variable. That's just an empty file, though. We then want to put some stuff in it, so we create the append_to_file helper method. (This highlights something important - we can add methods as needed to our tests.) Then, we use it to fill the file with stuff and pass the filename to Packager::CLI#create.

Note: We have to flush to disk to ensure that when we read from the file, the contents are actually in the file instead of the output buffer.
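The buffering behavior is easy to demonstrate with Tempfile directly (a sketch; whether an unflushed write is visible to a reader depends on buffer size and platform, which is exactly why we never rely on it):

```ruby
require 'tempfile'

file = Tempfile.new('flush-demo')

file.write("package { name 'foo' }")
# The bytes may still be sitting in Ruby's output buffer here, so a
# separate read of the same path can come back empty or truncated.

file.flush
# After the flush, any reader sees the full contents.
puts IO.read(file.path)  # package { name 'foo' }

file.close
file.unlink
```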

We have our input filename (and its contents) figured out. What should we expect to happen? We could look at whether a package was created in the directory we invoked the CLI. That is what our user story requires. And, we will want to have that sort of integration test, making sure everything actually functions the way a user will expect it to function. But, that's not this test. (Not to mention the Executor doesn't actually invoke FPM yet!)

These tests are meant to exercise each class in isolation - these are unit tests. Unit tests exercise the insides of one and only one class. If we were to see if a package is created, we're actually testing three classes - the CLI as well as the DSL and Executor classes. That's too many moving parts to quickly figure out what's gone wrong when something fails. By having tests which only focus on the internals of the CLI, DSL, and Executor classes by themselves as well as the integration of all the parts, we can easily see which part of our system is in error when tests start to fail. Is it the integration and CLI tests? Is it just the integration tests? Is it just the DSL? All of these scenarios immediately point out the culprit (or culprits).

Given that the CLI is going to invoke the DSL and Executor, we want to catch the invocation of the #parse_dsl and #create_package methods. We don't want to actually do what those methods do as part of this test. Frankly, the CLI object doesn't care what those methods do. It only cares that the methods function, whatever that means.

RSpec has a concept called stubbing. (This is part of a larger concept in testing called "mocking". RSpec provides mocks, doubles, and stubs, as do many other libraries like mocha.) For our purposes, what we can do is say "The next time method X is called on an instance of class Y, do <this> instead." Stubs (and mocks and doubles) will be uninstalled at the end of every spec, so there's no danger of it leaking or affecting anything else. With stubs, our happy-day test now looks like

        it "creates a package with a filename" do
            contents = "
                package {
                    name 'foo'
                    version '0.0.1'
                    type 'dir'
                }
            "
            append_to_file(package_file, contents)

            expect(Packager::DSL).to receive(:parse_dsl).with(contents).and_return(:stuff)
            expect_any_instance_of(Packager::Executor).to receive(:create_package).with(:stuff).and_return(true)

            cli.create(package_file)
        end

This looks like an awful mouthful. And, it may seem odd to create expectations before we call cli.create. But, if you think about it for a second and read the two expectations out loud, it can make sense. All of our tests so far have been "We expect X is true." What we're now saying is "We expect X will be true." Which works out.
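Under the hood, a stub is just a temporary method replacement. The mechanics can be sketched in plain Ruby with define_singleton_method (the Notifier class here is hypothetical, purely for illustration; RSpec's stubs add automatic uninstallation and call verification on top of this idea):

```ruby
# Hypothetical collaborator we don't want to really invoke in a test.
class Notifier
  def deliver(message)
    # imagine a slow network call here
    "delivered: #{message}"
  end
end

notifier = Notifier.new

# "When #deliver is called on this instance, do this instead."
notifier.define_singleton_method(:deliver) do |message|
  "stubbed: #{message}"
end

puts notifier.deliver('hi')     # stubbed: hi
puts Notifier.new.deliver('hi') # delivered: hi - other instances untouched
```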

As for the formatting, you can do the following:

      expect(Packager::DSL).to(
          receive(:parse_dsl).
          with(contents).
          and_return(:stuff)
      )

Note the new parentheses for the .to() method and the periods at the end of one line (instead of the beginning of the next). These are required for how Ruby parses. You could also use backslashes, but I find those too ugly. This, to me, is a good compromise. Please feel free to experiment - the goal is to make it readable for you and your maintainers, not me or anyone else in the world.

Our #create method now changes to

        def create(filename=nil)
            raise Thor::Error, "No filenames provided for create" unless filename
            items = Packager::DSL.parse_dsl(IO.read(filename))
            Packager::Executor.new.create_package(items)
        end

and we're done. Remember to do a git status to make sure you're adding all the new files we've been creating to source control.


Tuesday, August 18, 2015

Creating the Packager DSL - The executor

  1. Why use a DSL?
  2. Why create your own DSL?
  3. What makes a good DSL?
  4. Creating your own DSL - Parsing
  5. Creating your own DSL - Parsing (with Ruby)
  6. Creating the Packager DSL - Initial steps
  7. Creating the Packager DSL - First feature
  8. Creating the Packager DSL - The executor
Our user story:
I want to run a script, passing in the name of my DSL file. This should create an empty package by specifying the name, version, and package format. If any of them are missing, print an error message and stop. Otherwise, an empty package of the requested format should be created in the directory I am in.
Our progress:

  • We can parse the DSL into a Struct. We can handle name, version, and package format. If any of them are missing, we raise an appropriate error message.
We still need to:
  • Create a package from the parsed DSL
  • Provide a script that executes everything
Since the script is the umbrella, creating the package is the next logical step. To create the package, we'll defer to FPM. FPM doesn't have a Ruby API - it is designed to be used by sysadmins and requires you to build a directory of what you want and invoke a script.

The first seemingly-obvious approach is to embed the actions directly in the parser, right where we are currently creating the Package struct. That way, we do things right away instead of building some intermediate thing we're just going to throw away. Sounds like a great idea. And it would be horrible.

The best programs are written in reusable chunks that each do one and only one thing and do it well. This is as true for operating systems as it is for individual programs. In software, we measure this as coupling - the degree to which one unit is inextricably linked to other units. And we want our units to be coupled as little as possible.

Our one unit right now (the parser) handles understanding the DSL as a string. We have two other responsibilities - creating a package and handling the interaction with the command-line. Unless we have a good reason otherwise, let's treat each of those as a separate unit. (There are occasionally good reasons to couple things together, but it's best to know why the rule's there before you go about breaking it.)

Now, we have two different units that, when taken together in the right order, will work together to take a DSL string and create a package. They will need to communicate one to the next so that the package-creation unit creates the package described by the DSL-parsing unit. We could come up with some crazy communication scheme, but the parser already produces something (the Package struct). That should be sufficient for now. When that changes, we can refactor with confidence because we will keep our 100% test coverage.
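The hand-off can be sketched in miniature. The parser and executor here are trivial stand-ins (the real Packager classes are built over the next steps, and these method bodies are illustrative only), but the shape of the communication - a struct flowing from one unit to the other - is the point:

```ruby
# Hypothetical miniature of the two units; the method bodies are
# stand-ins, not the real Packager implementation.
Package = Struct.new(:name, :version, :package_format)

# Unit 1 (the parser): DSL string in, Package struct out.
def parse_dsl(contents)
  name, version, format = contents.split
  Package.new(name, version, format)
end

# Unit 2 (the executor): Package struct in, FPM invocation out.
def create_package(item)
  ['fpm', '--name', item.name, '--version', item.version,
   '-s', 'empty', '-t', item.package_format]
end

# The only coupling between the units is the struct itself:
create_package(parse_dsl('foo 0.0.1 rpm'))
```

Because the struct is the entire interface, either unit can be rewritten (or tested) without touching the other.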

Before anything else, we'll need to install FPM. So, add it to the gemspec and bundle install.
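Assuming FPM is published as the fpm gem (it is on rubygems.org; any version constraint here is illustrative), the gemspec addition is a single line:

```ruby
s.add_dependency 'fpm'
```

and bundle install pulls it in along with its dependencies.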

Next, we need to write a test (always write the test first). The first part of the test (the input) is pretty easy to see - we want to create a Packager::Struct::Package with a name, version, and format set. Figuring out what the output should be is . . . a little more complicated. We don't want to test how FPM works - we can assume (barring counterexample) that FPM works exactly as advertised. But, at some point, we need to make sure that our usage of FPM is what we want it to be. So, we will need to test that the output FPM creates from our setup is what we want.

The problem here is that FPM delegates the actual construction of the package to the OS tools. So, it uses RedHat's tools to build RPMs, Debian's tools to build DEBs, etc. More importantly, the tools to parse those package formats only exist on those systems. Luckily for this test, we can ignore this problem. The command for creating an empty package is extremely simple - you can test it easily yourself on the command line. But, we need to keep it in the back of our mind for later - sooner rather than later.

Since we're testing the executor (vs. the parser), we should put our test in a second spec file. The test would look something like:

describe Packager::Executor do
    it "creates an empty package" do
        executor = Packager::Executor.new(:dry_run => true)
        input = Packager::DSL::Package.new('foo', '0.0.1', 'unknown')
        executor.create_package(input)
        expect(executor.command).to eq([
            'fpm',
            '--name', input.name,
            '--version', input.version,
            '-s', 'empty',
            '-t', input.package_format,
        ])
    end
end

A few things here:
  1. Unlike Packager::DSL where we run with class methods (because of how DSL::Maker works), we're creating an instance of the Packager::Executor class to work with. This allows us to set some configuration to control how the executor will function without affecting global state.
  2. FPM does not support the "unknown" package format. We're testing that we get out what we put in.
  3. The FPM command already looks hard to work with. Arrays are positional, but the options to FPM don't have to be. We will want to change that to be more testable.
  4. Creating that Packager::DSL::Package object is going to become very confusing very quickly for the same reasons as the FPM command - it's an Array. Positional arguments become hard to work with over time.
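The positional-argument problem is easy to demonstrate. This sketch is illustrative (build_package is not part of the Packager API); it shows how a keyword-argument constructor makes call sites self-documenting where positional construction fails silently:

```ruby
# Positional construction is fragile: transpose two arguments and
# nothing complains - you just get silently-wrong data.
Package = Struct.new(:name, :version, :package_format)

oops = Package.new('0.0.1', 'foo', 'rpm') # name and version swapped
oops.name # => "0.0.1", and no error is raised

# One conventional fix (hypothetical helper, not the Packager API) is a
# keyword-argument constructor, so every call site names its values:
def build_package(name:, version:, package_format:)
  Package.new(name, version, package_format)
end

pkg = build_package(name: 'foo', version: '0.0.1', package_format: 'rpm')
```

Now swapping two arguments at the call site is either harmless (keywords can appear in any order) or an immediate ArgumentError.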
You should run the spec to make sure it fails. The Packager::Executor code in lib/packager/executor.rb would look like:

class Packager::Executor
    attr_accessor :dry_run, :command

    def initialize(opts = {})
        @dry_run = opts[:dry_run] ? true : false
        @command = [] # Always initialize your attributes
    end

    def create_package(item)
        @command = [
            'fpm',
            '--name', item.name,
            '--version', item.version,
            '-s', 'empty',
            '-t', item.package_format,
        ]

        return true
    end
end

Make sure to add the appropriate require statement in either lib/packager.rb or spec/spec_helper.rb and rake spec should pass. Add and commit everything, then push. We're not done, but we're one big step closer.


Monday, August 17, 2015

Creating the Packager DSL - Initial steps

  1. Why use a DSL?
  2. Why create your own DSL?
  3. What makes a good DSL?
  4. Creating your own DSL - Parsing
  5. Creating your own DSL - Parsing (with Ruby)
  6. Creating the Packager DSL - Initial steps
  7. Creating the Packager DSL - First feature
The Packager DSL is, first and foremost, a software development project. In order for a software project to function properly, there's a lot of scaffolding that needs to be set up. This post is about that. (Please feel free to skip it if you are comfortable setting up a Ruby project.) When this post is completed, we will have a full walking skeleton. We can then add each new feature quickly. We'll stop when we're just about to add the first DSL-ish feature.

If you're creating your own DSL with your own project name, just substitute your project's name everywhere you see packager and Packager.

Repository choice

Distributed version control systems (aka DVCS) provide significant capabilities for managing developer contributions over previous centralized systems (such as Subversion and CVS). Namely, Git, Mercurial, and Bazaar have extremely lightweight branches and very good merging strategies. There's a ton of good work on the net about choosing one of these systems and how to use it.

I use Git. For my opensource work, I use GitHub. If I'm within an enterprise, Gitlab or Stash are excellent repository management solutions. The minimum scaffolding for any project in Git is:
  • .gitignore file
    • Lists files that are never to be checked into the repository - usually intermediate files for various developer purposes (coverage, build artifacts, etc.). The contents of this file are usually language-specific.
  • .gitattributes file
    • Ensure line-endings are consistent between Windows, Linux, and OSX.
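A minimal .gitattributes that normalizes line endings (a common default; add entries for any binary formats you track) looks like:

```
# Normalize all text files to LF in the repository,
# converting on checkout as each platform requires.
* text=auto
```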

Repository scaffolding

The minimum scaffolding for a Ruby project is:
  • lib/ directory
    • Where our library code lives. Leave this empty (for now).
  • spec/ directory
    • Where our tests (or specifications) live. Leave this empty (for now).
  • bin/ directory
    • Where our executables live. Leave this empty (for now).
  • .rspec file
    • Default options for rspec (Ruby's specification runner).
  • Rakefile file
    • Ruby's version of Makefile
  • packager.gemspec file
    • How to package our project.
  • Gemfile file
    • Dependency management and repository location. I write mine so that it will delegate to the gemspec file. I hate repeating myself, especially when I don't have to.
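A Gemfile that delegates to the gemspec can be as short as this - Bundler's gemspec directive reads the dependency list straight out of packager.gemspec, so it is specified in exactly one place:

```ruby
source 'https://rubygems.org'

# Pull all dependencies from packager.gemspec instead of
# repeating them here.
gemspec
```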
You are welcome to copy any or all of these files from the ruby-packager project and edit them accordingly, or create them yourself. There are ridiculous amounts of documentation on this process and each of the tools (Rake, Rspec, Gem, and Bundler) all over the web. Ruby, more so than most software communities, has made a virtue of clear, clean, and comprehensive documentation.

You also need to have installed Bundler using gem install bundler. Bundler will then make sure your dependencies are always up-to-date with the simple bundle install command. I tend to install RVM first, just to make sure I can upgrade (or downgrade!) my version of Ruby as needed.

At this point, go ahead and add all of these files to your checkout and commit. I like using the message "Create initial commit". (If you use git, the empty lib/ and spec/ directories won't be added. Don't worry - they will be added before we're finished with Day 1.)

Note: If your DSL is for use within an enterprise, please make sure that you know where to download and install your gems from. Depending on the enterprise, you probably already have an internal gems repository and a standard process for requesting new gems and new versions. You can set the location of that internal gems repository within the Gemfile instead of https://rubygems.org. If you have to set a proxy in order to reach rubygems.org, please talk to your IT administrator first.

Documentation

It may seem odd to consider documentation before writing anything to document (and we will revisit this when we have an actual API to document), but there are two documentation files we should create right away:
  • README.md
    • This is the file GitHub (and most other repository systems) will display when someone goes to the repository. This should describe your project and give a basic overview of how to use it.
  • Changes / Changelog
    • Development is the act of changing software. Let's document what we've changed and when. I prefer the name Changes, but several other names will work.
I prefer to use Markdown (specifically, GitHub-flavored Markdown) in my README files. Hence, the .md suffix. Please use whatever markup language you prefer and which will work within your repository system.

Your changelog should be a reverse-sorted (newest-first) list of all the changes grouped by version. The goal is to provide a way for users to determine when something was changed and why. So, wherever possible, you should provide a link to the issue or task that describes what changed in detail. The best changelog is just a list of versions with issue descriptions linking to the issue.
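For example (the versions, dates, and issue numbers below are invented for illustration), a Changes file in that style might read:

```
0.0.2  2015-08-27
  - Reject invalid version strings (#2)

0.0.1  2015-08-18
  - Initial release: parse name, version, and format;
    create an empty package (#1)
```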

Finally, git add . && git commit -am "Create documentation stubs"

SDLC

Before we write any software, let's talk quickly about how we're going to get the changes into the software and out the door. While the first set of changes are going to be driven by what we want, we definitely want a place for our users to open requests. GitHub provides a very simple issue tracker with every project, so packager will use that. If you're within an enterprise, you probably already have a bug tracker to use. If you don't, I recommend Jira (and the rest of the Atlassian suite) if you're a larger organization and Trac if you're not.

Either way, every change to the project should have an associated issue in your issue tracker. That way, there's a description of why the change was made associated with the blob that is the change.

Speaking of changes, use pull requests wherever possible. Pull requests do two very important things:
  1. They promote code reviews, the single strongest guard against bugs.
  2. They make it easy to have a blob that is the change for an issue.
I will confess, however, that when I'm working by myself on a project, I tend to commit to master without creating issues. But, I like knowing that the process can be easily changed to accommodate more than one developer.

Release process

Our release process will be pretty simple - we'll use the gem format. We can add a couple lines to our Rakefile and Ruby will handle all the work for us.
require 'rubygems/tasks'
Gem::Tasks.new
We will also need to add it to our gemspec
s.add_development_dependency 'rubygems-tasks', '~> 0'
and bundle install. That provides us with the release Rake task (among others). Please read the documentation for what else it provides and what it does.

Note: If your DSL is for use within an enterprise, please make sure that you know where and how you are going to release your gem. This will likely be the place you will download your dependencies from, but not always. Please talk to your IT administrator first.

First passing spec

I am a strong proponent of TDD. Since we're starting from scratch, let's begin as we mean to continue. So, first, we add a spec/spec_helper.rb file that looks like:
require 'simplecov'
SimpleCov.configure do
    add_filter '/spec/'
    add_filter '/vendor/'
    minimum_coverage 100
    refuse_coverage_drop
end
SimpleCov.start

require 'packager'
Then, add a spec/first_spec.rb file with:
describe "DSL" do
    it "can compile" do
        expect(true).to be(true)
    end
end
(This is a throwaway test, and that's okay. We will get rid of it when we have something better to test.)

rake spec fails right away because simplecov isn't installed. You'll need to add simplecov to the packager.gemspec and run bundle install to install it.
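The gemspec addition has the same shape as any other development dependency (the version constraint here is illustrative):

```ruby
s.add_development_dependency 'simplecov', '~> 0'
```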

We haven't created lib/packager.rb yet, so when we execute rake spec, it will fail (compile error). So, let's create lib/packager.rb with:
class Packager
end
rake spec now passes. git add . && git commit -am "Create initial test" to mark the success.

We also have 100% test coverage from the start. While this isn't a silver bullet that cures cancer, it does tell us when we're writing code that may not always work like we think it will. By forcing every line of code to be executed at least once by something in our test suite (which is what 100% coverage guarantees), we are forced to write at least one test that reaches every nook and cranny. Hopefully, we'll be adult enough to recognize when there's a use-case that isn't tested, even if the code in question is executed.
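That last caveat is easy to demonstrate with a contrived example (hypothetical code, not part of Packager). Both branches of a ternary share one line, so a single call marks the line as covered even though one path is never exercised:

```ruby
Package = Struct.new(:name, :version)

def version_string(pkg)
  # Two use-cases live on this one line: a real package, and nil.
  pkg ? pkg.version : 'unknown'
end

version_string(Package.new('foo', '0.0.1')) # the line is now 100% "covered"
version_string(nil) # yet this use-case still deserves its own expectation
```

Line coverage says the code ran; only a deliberate test says the nil path behaves as intended.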

Continuous Integration

Finally, before we start adding actual features, let's end with continuous integration. If you're on GitHub, then use Travis. Just copy the .travis.yml in ruby-packager and it will work properly (JRuby needs some special handling for code coverage). Travis will even run the test suite on pull requests and note the success (or lack thereof) right there. In order to enable Travis to run against your GitHub repository, you will need to register with Travis and point-and-click on its site to set things up. There is plenty of documentation (including StackOverflow Q&A) on the process.

Otherwise, talk to your devops or automation team in order to set up integration with Jenkins, Bamboo, or whatever you're using in your enterprise. Whatever you do, it should be set up to run the whole test suite on every single push on every single branch. More importantly, it should act as a veto for pull requests (if that's supported by your tooling).

Summary

It may not seem like we've actually done anything, but we've done quite a lot. A development project isn't about writing code (though that's a necessary part). It's about managing requests for change and delivering them in a sane and predictable fashion. Everything we've done here is necessary to support exactly that.