Wednesday, August 5, 2015

What makes a good DSL?

  1. Why use a DSL?
  2. Why create your own DSL?
  3. What makes a good DSL?
  4. Creating your own DSL - Parsing
  5. Creating your own DSL - Parsing (with Ruby)
  6. Creating the Packager DSL - Initial steps
  7. Creating the Packager DSL - First feature
If someone wanted to create a human language which was really good at managing reindeer herds, it would make more sense to just use Sami. It has a lot of busat (expressive terseness) in that domain. Likewise, it makes the most sense to just use Inuit terms when building a human language focused on dealing with matters of snow and ice. These domain-specific languages (DSLs) focus their primary areas of terseness on specific areas (reindeer, snow/ice, etc.), rendering them less suitable for other areas (tropical seafaring, web application development, etc.).

The same goes for programming languages. If you work on modern web applications, you already use several DSLs, such as SQL and CSS (for set manipulation and defining visitors on trees, respectively). These DSLs have two qualities that elevate them to the top of the DSL game. They are:

  1. Intensely Focused (Do only one thing, but do it well)
  2. Expressively Terse (Just the facts)

Intensely Focused

SQL is the standard language for interacting with data in a relational database, such as Oracle or MySQL. Relational databases store their data in sets. At its heart, SQL is a set manipulation DSL. It has roughly 40-60 keywords (depending on the dialect and how you count). But every single one of those keywords has a single purpose focusing on one (and only one) of the basic set operations. Where SQL has keywords that do non-set things (such as collations or engine-specific hints), that's where people complain the most about how complicated SQL is. While set theory can be difficult to master, no one complains about how SQL implements it.
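To make the "set manipulation" point concrete, here is a minimal sketch using SQLite through Python's standard sqlite3 module. The table names (web_devs, dbas) and the rows in them are made up for illustration. It only shows that three SQL keywords line up with the union, intersection, and difference operations on Python sets.

```python
import sqlite3

# In-memory database with two hypothetical single-column tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE web_devs (name TEXT);
    CREATE TABLE dbas (name TEXT);
    INSERT INTO web_devs VALUES ('ann'), ('bob'), ('cy');
    INSERT INTO dbas VALUES ('bob'), ('cy'), ('dee');
""")


def names(query):
    """Run a single-column query and return the values as a Python set."""
    return {row[0] for row in conn.execute(query)}


web_devs = names("SELECT name FROM web_devs")
dbas = names("SELECT name FROM dbas")

# Each SQL keyword corresponds to exactly one set operation.
union = names("SELECT name FROM web_devs UNION SELECT name FROM dbas")
intersection = names("SELECT name FROM web_devs INTERSECT SELECT name FROM dbas")
difference = names("SELECT name FROM web_devs EXCEPT SELECT name FROM dbas")

assert union == web_devs | dbas
assert intersection == web_devs & dbas
assert difference == web_devs - dbas
```

The Python set operators and the SQL keywords say the same thing. The SQL version just says it in the vocabulary of the database.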

CSS is even more focused. There are hundreds of properties, each with its own specific set of acceptable values (potentially depending on what type of node is affected), and that's what most people see in CSS. But CSS isn't a DSL for setting values for properties - it's a DSL for creating actions to take when walking trees. In short, CSS is a massive Visitor pattern definer.

But, CSS doesn't allow you to take just any action - you are only allowed to set properties on nodes. It isn't a generic visitor pattern definer - it is focused on one type of visitor action. There may be hundreds of properties, but they all follow the exact same pattern of name: value [modifier];. This allows it to be more generic when it comes to the matching rules for which nodes in the tree are affected, which is the true power of CSS.
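One way to picture the "visitor" idea is a toy tree walker in Python. This is only a sketch under heavy assumptions: Node, matches, and apply_rules are hypothetical names, the selector support is limited to a bare tag or a single .class, and there is no real cascade or specificity (later rules simply win). It is not how a browser implements CSS, but it shows the shape: a rule is a matcher plus a set of properties to assign on each node visited.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node: a tag, some classes, children, computed style."""
    tag: str
    classes: set = field(default_factory=set)
    children: list = field(default_factory=list)
    style: dict = field(default_factory=dict)


def matches(node, selector):
    """Tiny selector subset (an assumption): 'tag' or '.class' only."""
    if selector.startswith("."):
        return selector[1:] in node.classes
    return node.tag == selector


def apply_rules(node, rules):
    """Visit every node; the only action a rule may take is setting properties."""
    for selector, declarations in rules:
        if matches(node, selector):
            node.style.update(declarations)  # later rules override earlier ones
    for child in node.children:
        apply_rules(child, rules)


# Each rule pairs a selector with name: value declarations.
rules = [
    ("p", {"color": "black"}),
    (".warning", {"color": "red", "font-weight": "bold"}),
]

tree = Node("body", children=[
    Node("p"),
    Node("p", classes={"warning"}),
])
apply_rules(tree, rules)

for child in tree.children:
    print(child.tag, sorted(child.classes), child.style)
```

Notice that all of the variety lives in matches. The action side never changes, which is exactly why the matching side can afford to grow.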

Expressively Terse

Terseness is the quality of using as few words as possible to say what you want to say. It's expressive only if every word we use is exactly the right word for the job or, put another way, if the concept expressed by each word is exactly the right concept. To get there, you have to truly understand what your DSL is focused on doing.

Both SQL and CSS are extremely terse. There is one and only one keyword or operand for each concept. Every concept is mapped clearly and cleanly to the problem space at hand. If you removed any keyword or operand, you would cripple the DSL's ability to solve problems.

CSS, in particular, is extremely terse. Syntactically, the only interesting things happen in the selectors. But even with that complexity, there is only one way to specify a specific path to a node. (Depending on your structure, you may be able to specify multiple paths to a node, but there's only one way to write each path.)
