Clojure Guides: Growing a DSL with Clojure
Lisps like Clojure are well suited to creating rich DSLs that integrate seamlessly into the language.
You may have heard Lisps boasting about code being data and data being code. In this article we will
define a DSL that benefits handsomely from this fact.
We will see our DSL evolve from humble beginnings, using successively more of Clojure’s powerful and
unique means of abstraction.
The Mission
Our goal will be to define a DSL that allows us to generate various scripting languages. The DSL code
should look similar to regular Clojure code.
For example, we might use this Clojure form to generate either Bash or Windows Batch script output:
(if (= 1 2)
  (println "a")
  (println "b"))

if [ 1 -eq 2 ]; then
  echo "a"
else
  echo "b"
fi

IF 1==2 (
  ECHO a
) ELSE (
  ECHO b
)
We might, for example, use this DSL to dynamically generate scripts to perform maintenance tasks on
server farms.
To start, we need to expose some parallels between Clojure’s core types and our domain language.
Strings and numbers should simply return their string representation, so we will start with those.
Let’s define a function emit-bash-form that takes a Clojure form and returns a string that represents the
equivalent Bash script.
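A minimal sketch of such a function, assuming a cond over string? and number? and an ex-info fall-through for anything else:

```clojure
;; Sketch only: the clause order and the error message are assumptions.
(defn emit-bash-form
  "Returns a string of Bash script equivalent to the Clojure form `a`."
  [a]
  (cond
    (string? a) a            ; strings are emitted unchanged
    (number? a) (str a)      ; numbers become their string representation
    :else (throw (ex-info "Fell through" {}))))
```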
#'cljs.user/emit-bash-form
The cond expression handles cases for strings and numbers or throws an exception.
(emit-bash-form 1)
"1"
(emit-bash-form "a")
"a"
(emit-bash-form {})
Execution error.
ERROR: #error {:message "Fell through", :data {}}
Now if we want to add some more dispatches, we just need to add a new clause to our cond expression.
Bash prints to the screen using echo. You’ve probably seen it if you’ve spent any time with a Linux shell.
clojure.core also contains a function println that has similar semantics to Bash's echo.
(println "asdf")
asdf
nil
To make an analogy with Java, imagine calling this Java code and expecting the first argument to equal
System.out.println("asdf").
foo( System.out.println("asdf") );
Java evaluates the arguments before you can even blink, resulting in a function call to println. How can we
stop this evaluation and return the raw code?
Indeed, this is an impossible task in Java. Even if it were possible, what could we expect to do with the raw
code?
Whatever "type" the raw code System.out.println("asdf") has, it’s not meant to be known by anyone
but compiler writers.
To get to the actual raw code at all, Clojure provides a mechanism to stop evaluation via quote.
Prepending a quote to a code form prevents its evaluation and returns the raw Clojure form.
'(println "a")
(println "a")
(type '(println "a"))
cljs.core/List
It's a list!
We can now interrogate the raw code as if it were any old Clojure list (because it is!).
(first '(println "a"))
println
(second '(println "a"))
"a"
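Because the quoted form is ordinary data, the usual sequence functions and type predicates apply to it. The sketch below uses only clojure.core functions to show that the head of the list is a symbol, and that name turns it into a string, which is convenient for matching on the operation.

```clojure
(def form '(println "a"))   ; quoted, so println is never called

(list? form)            ; => true
(symbol? (first form))  ; => true, the head is the symbol println
(name (first form))     ; => "println"
(rest form)             ; => ("a")
```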
(emit-bash-form
  '(println "a"))
Let’s add this feature to emit-bash-form. We need to add a new clause to the cond form. Which type
should the dispatch value be?
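One possible answer, sketched under the assumption that a list? check is an acceptable test and that println is the only operation supported so far:

```clojure
;; Sketch only: testing with list? and matching on the symbol's name are
;; assumptions, mirroring the multimethod version later in the article.
(defn emit-bash-form
  [a]
  (cond
    (list? a)   (case (name (first a))
                  "println" (str "echo " (second a)))
    (string? a) a
    (number? a) (str a)
    :else (throw (ex-info "Fell through" {}))))
```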
#'cljs.user/emit-bash-form
(emit-bash-form '(println "a"))
"echo a"
(emit-bash-form '(println "hello"))
"echo hello"
Currently, to extend our implementation we add to our function emit-bash-form. Eventually this function will
be too large to manage; we need a mechanism to split this function into more manageable pieces.
Essentially emit-bash-form is dispatching on the type of its argument. This dispatch style is a perfect fit for
an abstraction Clojure provides called a multimethod.
The multimethod handles dispatch in a similar way to cond, but without having to actually write each
case. Let’s compare this multimethod with our previous cond expression. defmulti is used to create a
new multimethod, and associates it with a dispatch function.
(defmulti emit-bash
  (fn [form]
    (cond
      (list? form)   :list
      (string? form) :string
      (number? form) :number
      ;; ex-info expects a map as its second argument
      :else (throw (ex-info "Unknown type" {:form form})))))
#'cljs.user/emit-bash
defmethod is used to add methods to an existing multimethod. Here :string is the dispatch value, and
the method returns the form as is.
(defmethod emit-bash
  :string
  [form]
  form)
#object[cljs.core.MultiFn]
(defmethod emit-bash
  :number
  [form]
  (str form))
(defmethod emit-bash
  :list
  [form]
  (case (name (first form))
    "println" (str "echo " (second form))))
#object[cljs.core.MultiFn]
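Adding new methods has the same result as extending our cond expression, except that the multimethod
handles the dispatch for us and new methods can be added without touching existing ones.
So how can we use emit-bash? Calling a multimethod is just like calling any Clojure function.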
(emit-bash '(println "a"))
"echo a"
(defmulti emit-batch
  (fn [form]
    (cond
      (list? form)   :list
      (string? form) :string
      (number? form) :number
      ;; ex-info expects a map as its second argument
      :else (throw (ex-info "Unknown type" {:form form})))))
(defmethod emit-batch
  :list
  [form]
  (case (name (first form))
    "println" (str "ECHO " (second form))
    nil))
(defmethod emit-batch
  :string
  [form]
  form)
(defmethod emit-batch
  :number
  [form]
  (str form))
#object[cljs.core.MultiFn]
We can now use emit-batch and emit-bash when we want Batch and Bash script output respectively.
(emit-batch '(println "a"))
"ECHO a"
(emit-bash '(println "a"))
"echo a"
Ad-hoc Hierarchies
Comparing the two implementations reveals many similarities. In fact, the only method that differs is
the one for :list!
We can tackle this with a simple mechanism Clojure provides to define global, ad-hoc hierarchies.
When I say this mechanism is simple, I mean non-compound; inheritance is not compounded into the
mechanism to define classes or namespaces but rather is a separate functionality.
Contrast this to languages like Java, where inheritance is tightly coupled with defining a hierarchy of
classes.
We can derive relationships from names to other names, and between classes and names. Names can be
symbols or keywords. This is both very general and powerful!
We will use (derive child parent) to establish a parent/child relationship between two keywords.
isa? returns true if the first argument is derived from the second in a global hierarchy.
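As a small illustration, using the hypothetical keywords ::child and ::parent (derive requires namespace-qualified names when no explicit hierarchy is passed):

```clojure
;; ::child and ::parent are placeholder names for illustration only.
(derive ::child ::parent)

(isa? ::parent ::child)   ; => false, the relationship is directional
(isa? ::child ::child)    ; => true, every value isa? itself
(isa? ::child ::parent)   ; => true
```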
true
Let’s define a hierarchy in which the Bash and Batch implementations are siblings.
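A sketch of that hierarchy, assuming a shared parent keyword named ::common:

```clojure
;; ::common is the shared parent; ::bash and ::batch become siblings.
(derive ::bash ::common)
(derive ::batch ::common)
```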
nil
(parents ::bash)
#{:cljs.user/common}
(parents ::batch)
#{:cljs.user/common}
The dispatch function returns a vector of two items: the current implementation (either ::bash or
::batch), and the type of our form as a keyword (like emit-bash's dispatch function).
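A minimal sketch of that dispatch setup. It assumes a dynamic var named *current-implementation* (the name used by with-implementation later on) and reuses the keyword-based type test from emit-bash. Leaving the var without a root value is also an assumption.

```clojure
;; Holds the implementation in effect; meant to be rebound with `binding`.
(def ^:dynamic *current-implementation*)

(defmulti emit
  (fn [form]
    ;; Dispatch value: [implementation type-keyword]
    [*current-implementation*
     (cond
       (list? form)   :list
       (string? form) :string
       (number? form) :number
       :else (throw (ex-info "Unknown type" {:form form})))]))
```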
#'cljs.user/*current-implementation*
In our hierarchy, ::common is the parent, which means it should provide the methods in common with its
children. Let's fill in these common implementations.
Remember the dispatch value is now a vector, notated with square brackets. In particular, in each
defmethod the first vector is the dispatch value (the second vector is the list of formal parameters).
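A sketch of the shared methods follows. The var, the hierarchy, and the multimethod are restated compactly so the block runs on its own; their exact shape is an assumption.

```clojure
;; Setup, restated so this sketch is self-contained.
(def ^:dynamic *current-implementation*)
(derive ::bash ::common)
(derive ::batch ::common)
(defmulti emit
  (fn [form]
    [*current-implementation*
     (cond (list? form)   :list
           (string? form) :string
           (number? form) :number)]))

;; Methods shared by every implementation that derives from ::common.
(defmethod emit [::common :string]
  [form]
  form)

(defmethod emit [::common :number]
  [form]
  (str form))
```

Because ::bash and ::batch both derive from ::common, a dispatch value such as [::bash :string] is matched by the [::common :string] method.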
#object[cljs.core.MultiFn]
This should look familiar. The only methods that need to be specialized are those for
:list, as we identified earlier. Notice the first item in the dispatch value is ::bash or
::batch instead of ::common.
(defmethod emit [::bash :list]
  [form]
  (case (name (first form))
    "println" (str "echo " (second form))
    nil))
(defmethod emit [::batch :list]
  [form]
  (case (name (first form))
    "println" (str "ECHO " (second form))
    nil))
#object[cljs.core.MultiFn]
The ::common implementation is intentionally incomplete; it merely exists to manage any common
methods between its children.
We can test emit by rebinding *current-implementation* to the implementation of our choice with
binding.
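(binding [*current-implementation* ::common]
  (emit "a"))
"a"
(binding [*current-implementation* ::batch]
  (emit '(println "a")))
"ECHO a"
(binding [*current-implementation* ::bash]
  (emit '(println "a")))
"echo a"
(binding [*current-implementation* ::common]
  (emit '(println "a")))
Execution error.
ERROR: Error: No method in multimethod 'cljs.user/emit' for dispatch value: [:clj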
Because we didn’t define an implementation for [::common :list], the multimethod falls through and
throws an exception.
Multimethods offer great flexibility and power, but with power comes great responsibility. Just because we
can put our multimethods all in one namespace doesn’t mean we should. If our DSL becomes any bigger,
we would probably separate all Bash and Batch implementations into individual namespaces.
This small example, however, is a good showcase for the flexibility of decoupling namespaces and
inheritance.
Icing on the Cake
We’ve built a nice, solid foundation for our DSL using a combination of multimethods, dynamic vars, and
ad-hoc hierarchies, but it’s a bit of a pain to use.
(binding [*current-implementation* ::bash]
  (emit '(println "a")))
"echo a"
The binding expression is a good candidate. We can reduce the chore of rebinding
*current-implementation* by introducing with-implementation (which we will define soon).
(with-implementation ::bash
  (emit '(println "a")))
That’s an improvement. But there’s another improvement that’s not as obvious: the quote used to delay
our form’s evaluation. Let’s use script, which we will define later, to get rid of this boilerplate:
(with-implementation ::bash
  (script
    (println "a")))
This looks great, but how do we implement script? Clojure functions evaluate all their arguments before
evaluating the function body, exactly the problem the quote was designed to solve.
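To make the problem concrete, here is a sketch using a hypothetical function named script-fn that simply returns its argument. It is not part of the DSL and only illustrates eager argument evaluation.

```clojure
;; Hypothetical function version, for contrast only.
(defn script-fn [form]
  form)

;; The argument is evaluated first: "a" is printed and the function receives nil.
(script-fn (println "a"))    ; prints a, => nil

;; Quoting at the call site preserves the form, at the cost of the boilerplate.
(script-fn '(println "a"))   ; => (println "a")
```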
To hide this detail we must wield one of Lisp’s most unique forms: the macro.
The macro’s main drawcard is that it doesn’t implicitly evaluate its arguments. This is a perfect fit for an
implementation of script.
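A minimal sketch of such a macro, assuming it only needs to quote its single argument and hand the result to emit:

```clojure
;; Sketch only: expands (script form) into (emit 'form).
;; `emit` is looked up when the expansion is evaluated, not when the macro is defined.
(defmacro script
  [form]
  `(emit '~form))
```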
#'cljs.user/script
To get an idea what is happening, here’s what a call to script returns and then implicitly evaluates.
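One way to inspect this is macroexpand-1 from clojure.core. The sketch assumes script simply wraps its quoted argument in a call to emit; the namespace prefix in the expansion depends on where the macro is defined.

```clojure
;; `script` is restated here so the sketch runs on its own.
(defmacro script [form]
  `(emit '~form))

;; macroexpand-1 returns the code the macro produces, without evaluating it.
;; The result has the shape (<current-ns>/emit (quote (println "a"))).
(macroexpand-1 '(script (println "a")))
```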
It isn’t crucial that you understand the details, rather appreciate the role that macros play in cleaning up
the syntax.
We will also implement with-implementation as a macro, but for different reasons than with script. To
evaluate our script form inside a binding form we need to drop it in before evaluation.
(defmacro with-implementation
  [impl & body]
  `(binding [cljs.user/*current-implementation* ~impl]
     ~@body))
#'cljs.user/with-implementation
Roughly, here is the lifecycle of our DSL, from the sugared wrapper to our unsugared foundations.
(with-implementation ::bash
  (script
    (println "a")))
=>
(with-implementation ::bash
  (emit
    '(println "a")))
=>
(binding [*current-implementation* ::bash]
  (emit
    '(println "a")))
(with-implementation ::bash
  (script
    (println "a")))
"echo a"
(with-implementation ::batch
  (script
    (println "a")))
"ECHO a"
It’s easy to see how a few well-placed macros can put the sugar on top of strong foundations. Our DSL
really looks like Clojure code!
Conclusion
We have seen many of Clojure’s advanced features working in harmony in this DSL, even though we
incrementally incorporated many of them. Generally, Clojure helps us switch our implementation strategies
with minimum fuss.
This is notable when you consider how much our DSL evolved.
We initially used a simple cond expression, which was converted into two multimethods, one for each
implementation. As multimethods are just ordinary functions, the transition was seamless for any existing
testing code. (In this case I renamed the function for clarity.)
We then merged these multimethods, utilizing a global hierarchy for inheritance and dynamic vars to select
the current implementation.
Finally, we devised a pleasant syntactic interface with two simple macros, eliminating that last bit of
boilerplate that other languages would have to live with.
I hope you have enjoyed following the evolution of our little DSL. This DSL is based on a simplified version
of Stevedore (https://github.com/pallet/stevedore) by Hugo Duncan (http://hugoduncan.org/). If you are
interested in how this DSL can be extended, you can do no better than browsing the source code of
Stevedore (https://github.com/pallet/stevedore).
Copyright
Copyright Ambrose Bonnaire-Sergeant, 2013