Clojure Guides_ Growing a DSL with Clojure

Download as pdf or txt
Download as pdf or txt
You are on page 1of 13

Growing a DSL with Clojure

Lisps like Clojure are well suited to creating rich DSLs that integrate seamlessly into the language.

You may have heard Lisps boasting about code being data and data being code. In this article we will
define a DSL that benefits handsomely from this fact.

We will see our DSL evolve from humble beginnings, using successively more of Clojure’s powerful and
unique means of abstraction.

The Mission
Our goal will be to define a DSL that allows us to generate various scripting languages. The DSL code
should look similar to regular Clojure code.

For example, we might use this Clojure form to generate either Bash or Windows Batch script output:

Input (Clojure form):

(if (= 1 2)
(println "a")
(println "b"))

Output (Bash script):

if [ 1 -eq 2 ]; then
echo "a"
else
echo "b"
fi

Output (Windows Batch script):

IF 1==2 (
ECHO a
) ELSE (
ECHO b
)

We might, for example, use this DSL to dynamically generate scripts to perform maintenance tasks on
server farms.

Baby Steps: Mapping to Our Domain Language


I like Bash, so let’s start with a Bash script generator.

To start, we need to expose some parallels between Clojure’s core types and our domain language.

So which Clojure types have simple analogues in Bash script?

Strings and numbers should just simply return their String representation, so we will start with those.
Let’s define a function emit-bash-form that takes a Clojure form and returns a string that represents the
equivalent Bash script.

(defn emit-bash-form [a]


"Returns a String containing the equivalent Bash script
to its argument."
(cond
(string? a ) a
(number? a) (str a)
:else (throw (ex-info "Fell through" a))))

#'cljs.user/emit-bash-form

The cond expression handles cases for strings and numbers or throws an exception.

(emit-bash-form 1)

"1"

(emit-bash-form "a")

"a"

(emit-bash-form {})

Execution error.
ERROR: #error {:message "Fell through", :data {}}

Now if we want to add some more dispatches, we just need to add a new clause to our cond expression.

Echo and Print


Let’s add a feature.

Bash prints to the screen using echo . You’ve probably seen it if you’ve spent any time with a Linux shell.

ambrose@ambrose-desktop> echo asdf


asdf

clojure.core also contains a function println that has similar semantics to Bash's echo .

(println "asdf")

asdf
nil

Wouldn’t it be cool if we could pass (println "a") to emit-bash-form ?


(emit-bash-form (println "asdf"))

At first, this seems like asking the impossible.

To made an analogy with Java, imagine calling this Java code and expecting the first argument to equal
System.out.println("asdf") .

foo( System.out.println("asdf") );

(Let’s ignore the fact that System.out.println(...) returns a void).

Java evaluates the arguments before you can even blink, resulting in a function call to println. How can we
stop this evaluation and return the raw code?

Indeed this is an impossible task in Java. Even if this were possible, what could we expect do with the raw
code?(!)

System.out.println("asdf") is not a Collection, so we can’t iterate over it; it is not a String , so we


can’t partition it with regular expressions.

Whatever "type" the raw code System.out.println("asdf") has, it’s not meant to be known by anyone
but compiler writers.

Lisp turns this notion on its head.

Lisp Code Is Data


A problem with raw code forms in Java (assuming it is possible to extract them) is the lack of facilities to
interrogate them. How does Clojure get around this limitation?

To get to the actual raw code at all, Clojure provides a mechanism to stop evaluation via quote.
Prepending a quote to a code form prevents its evaluation and returns the raw Clojure form.

'(println "a")

(println "a")

So what is the type of our result?

(type '(println "a"))

cljs.core/List

It's a list!

We can now interrogate the raw code as if it were any old Clojure list (because it is!).

(first '(println "a"))

println
(second '(println "a"))

"a"

This is a result of Lisp’s remarkable property of code being data.

A Little Closer to Clojure


Using quote, we can get halfway to a DSL that looks like Clojure code.

(emit-bash-form
'(println "a"))

Let’s add this feature to emit-bash-form . We need to add a new clause to the cond form. Which type
should the dispatch value be?

So let’s add a clause for lists.

(defn emit-bash-form [a]


"Returns a String containing the equivalent Bash script
to its argument."
(cond
(list? a)
(case (name (first a))
"println" (str "echo " (second a)))

(string? a) a
(number? a) (str a)
:else (throw (ex-info "Fell through" a))))

#'cljs.user/emit-bash-form

As long as we remember to quote the argument, this is not bad.

(emit-bash-form '(println "a"))

"echo a"

(emit-bash-form '(println "hello"))

"echo hello"

Multimethods to Abstract the Dispatch


We’ve made a good start, but I think it’s time for some refactoring.

Currently, to extend our implementation we add to our function emit-bash-form. Eventually this function will
be too large to manage; we need a mechanism to split this function into more manageable pieces.
Essentially emit-bash-form is dispatching on the type of its argument. This dispatch style is a perfect fit for
an abstraction Clojure provides called a multimethod.

Let’s define a multimethod called emit-bash .

The multimethod handles dispatch in a similar way to cond , but without having to actually write each
case. Let’s compare this multimethod with our previous cond expression. defmulti is used to create a
new multimethod, and associates it with a dispatch function.

(defmulti emit-bash
(fn [form]
(cond
(list? form) :list
(string? form) :string
(number? form) :number
:else (throw (ex-info "Unknown type" form)))))

#'cljs.user/emit-bash

defmethod is used to add methods to an existing multimethod. Here :string is the dispatch value, and
the method returns the form as is.

(defmethod emit-bash
:string
[form]
form)

#object[cljs.core.MultiFn]

Similar for numbers and lists:

(defmethod emit-bash
:number
[form]
(str form))

(defmethod emit-bash
:list
[form]
(case (name (first form))
"println" (str "echo " (second form))))

#object[cljs.core.MultiFn]

Adding new methods has the same result as extending our cond expression, except:

multimethods handle the dispatch, instead of writing it manually


anyone can extend the multimethod at any point, without disturbing existing code

So how can we use emit-bash ? Calling a multimethod is just like calling any Clojure function.
(emit-bash '(println "a"))

"echo a"

The dispatch is silently handled under the covers by the multimethod.

Extending our DSL for Batch Script


Let’s say I’m happy with the Bash implementation. I feel like starting a new implementation that generates
Windows Batch script. Let’s define a new multimethod, emit-batch.

(defmulti emit-batch
(fn [form]
(cond
(list? form) :list
(string? form) :string
(number? form) :number
:else (throw (ex-info "Unknown type" form)))))

(defmethod emit-batch
:list
[form]
(case (name (first form))
"println" (str "ECHO " (second form))
nil))

(defmethod emit-batch
:string
[form]
form)

(defmethod emit-batch
:number
[form]
(str form))

#object[cljs.core.MultiFn]

We can now use emit-batch and emit-bash when we want Batch and Bash script output respectively.

(emit-batch '(println "a"))

"ECHO a"

(emit-bash '(println "a"))


"echo a"

"echo a"
Ad-hoc Hierarchies
Comparing the two implementations reveals many similarities. In fact, the only dispatch that differs is
clojure.lang.PersistentList!

Some form of implementation inheritance would come in handy here.

We can tackle this with a simple mechanism Clojure provides to define global, ad-hoc hierarchies.

When I say this mechanism is simple, I mean non-compound; inheritance is not compounded into the
mechanism to define classes or namespaces but rather is a separate functionality.

Contrast this to languages like Java, where inheritance is tightly coupled with defining a hierarchy of
classes.

We can derive relationships from names to other names, and between classes and names. Names can be
symbols or keywords. This is both very general and powerful!

We will use (derive child parent) to establishes a parent/child relationship between two keywords.
isa? returns true if the first argument is derived from the second in a global hierarchy.

(derive ::child ::parent)



(isa? ::child ::parent)

true

Let’s define a hierarchy in which the Bash and Batch implementations are siblings.

(derive ::bash ::common)


(derive ::batch ::common)

nil

Let’s test this hierarchy.

(parents ::bash)

#{:cljs.user/common}

(parents ::batch)

#{:cljs.user/common}

Utilizing a Hierarchy in a Multimethod


We can now define a new multimethod emit that utilizes our global hierarchy of names.
(defmulti emit
(fn [form]
[*current-implementation*
(cond
(list? form) :list
(string? form) :string
(number? form) :number
:else (throw (ex-info "Unknown type" form)))]))

WARNING: Use of undeclared Var cljs.user/*current-implementation* at line 3


#'cljs.user/emit

The dispatch function returns a vector of two items: the current implementation (either ::bash or
::batch ), and the class of our form (like emit-bash 's dispatch function).

*current-implementation* is a dynamic var, which can be thought of as a thread-safe global variable.

(def ^{:dynamic true}


;; The current script language implementation to generate
*current-implementation*)

#'cljs.user/*current-implementation*

In our hierarchy, ::common is the parent, which means it should provide the methods in common with its
children. Let's fill in these common implementations.

Remember the dispatch value is now a vector, notated with square brackets. In particular, in each
defmethod the first vector is the dispatch value (the second vector is the list of formal parameters).

(defmethod emit [::common :string]


[form]
form)

(defmethod emit [::common :number]
[form]
(str form))

#object[cljs.core.MultiFn]

This should look familiar. The only methods that needs to be specialized are those for
clojure.lang.PersistentList, as we identified earlier. Notice the first item in the dispatch value is ::bash or
::batch instead of ::common .
(defmethod emit [::bash :list]
[form]
(case (name (first form))
"println" (str "echo " (second form))
nil))

(defmethod emit [::batch :list]
[form]
(case (name (first form))
"println" (str "ECHO " (second form))
nil))

#object[cljs.core.MultiFn]

The ::common implementation is intentionally incomplete; it merely exists to manage any common
methods between its children.

We can test emit by rebinding *current-implementation* to the implementation of our choice with
binding.

(binding [*current-implementation* ::common]


(emit "a"))

"a"

(binding [*current-implementation* ::batch]


(emit '(println "a")))

"ECHO a"

(binding [*current-implementation* ::bash]


(emit '(println "a")))

"echo a"

(binding [*current-implementation* ::common]


(emit '(println "a")))

Execution error.
ERROR: Error: No method in multimethod 'cljs.user/emit' for dispatch value: [:clj

Because we didn’t define an implementation for [::common :list] , the multimethod falls through and
throws an Exception.

Multimethods offer great flexibility and power, but with power comes great responsibility. Just because we
can put our multimethods all in one namespace doesn’t mean we should. If our DSL becomes any bigger,
we would probably separate all Bash and Batch implementations into individual namespaces.

This small example, however, is a good showcase for the flexibility of decoupling namespaces and
inheritance.
Icing on the Cake
We’ve built a nice, solid foundation for our DSL using a combination of multimethods, dynamic vars, and
ad-hoc hierarchies, but it’s a bit of a pain to use.

(binding [*current-implementation* ::bash]


(emit '(println "a")))

"echo a"

Let’s eliminate the boilerplate. But where is it?

The binding expression is an good candidate. We can reduce the chore of rebinding current-
implementation by introducing with-implementation (which we will define soon).

(with-implementation ::bash
(emit '(println "a")))

That’s an improvement. But there’s another improvement that’s not as obvious: the quote used to delay
our form’s evaluation. Let’s use script, which we will define later, to get rid of this boilerplate:

(with-implementation ::bash
(script
(println "a")))

This looks great, but how do we implement script? Clojure functions evaluate all their arguments before
evaluating the function body, exactly the problem the quote was designed to solve.

To hide this detail we must wield one of Lisp’s most unique forms: the macro.

The macro’s main drawcard is that it doesn’t implicitly evaluate its arguments. This is a perfect fit for an
implementation of script.

(defmacro script [form]


`(emit '~form))

#'cljs.user/script

To get an idea what is happening, here’s what a call to script returns and then implicitly evaluates.

(macroexpand '(script (println "a")))

(cljs.user/emit '(println "a"))

It isn’t crucial that you understand the details, rather appreciate the role that macros play in cleaning up
the syntax.

We will also implement with-implementation as a macro, but for different reasons than with script. To
evaluate our script form inside a binding form we need to drop it in before evaluation.
(defmacro with-implementation
[impl & body]
`(binding [cljs.user/*current-implementation* ~impl]
~@body))

#'cljs.user/with-implementation

Roughly, here is the lifecyle of our DSL, from the sugared wrapper to our unsugared foundations.

(with-implementation ::bash
(script
(println "a")))
=>
(with-implementation ::bash
(emit
'(println "a"))
=>
(binding [*current-implementation* ::bash]
(emit
'(println "a")))

Let's see it in action for Bash:

(with-implementation ::bash
(script
(println "a")))

"echo a"

And for Windows:

(with-implementation ::batch
(script
(println "a")))

"ECHO a"

It’s easy to see how a few well-placed macros can put the sugar on top of strong foundations. Our DSL
really looks like Clojure code!

Conclusion
We have seen many of Clojure’s advanced features working in harmony in this DSL, even though we
incrementally incorported many of them. Generally, Clojure helps us switch our implementation strategies
with minimum fuss.

This is notable when you consider how much our DSL evolved.

We initially used a simple cond expression, which was converted into two multimethods, one for each
implementation. As multimethods are just ordinary functions, the transition was seamless for any existing
testing code. (In this case I renamed the function for clarity).
We then merged these multimethods, utilizing a global hierachy for inheritance and dynamic vars to select
the current implementation.

Finally, we devised a pleasant syntactic interface with a two simple macros, eliminating that last bit of
boilerplate that other languages would have to live with.

I hope you have enjoyed following the evolution of our little DSL. This DSL is based on a simplified version
of Stevedore (https://github.com/pallet/stevedore) by Hugo Duncan (http://hugoduncan.org/). If you are
interested in how this DSL can be extended, you can do no better than browsing the source code of
Stevedore (https://github.com/pallet/stevedore).

Copyright
Copyright Ambrose Bonnaire-Sergeant, 2013

« Parsing XML in Clojure (/articles/cookbooks/parsing_xml_with_zippers/)

Links
About (/articles/about/)
Table of Contents (/articles/content/)
Getting Started (/articles/tutorials/getting_started/)
Introduction to Clojure (/articles/tutorials/introduction/)
Clojure Editors (/articles/tutorials/editors/)
Clojure Community (/articles/ecosystem/community/)
Basic Web Development (/articles/tutorials/basic_web_development/)
Language: Functions (/articles/language/functions/)
Language: clojure.core (/articles/language/core_overview/)
Language: Collections and Sequences (/articles/language/collections_and_sequences/)
Language: Namespaces (/articles/language/namespaces/)
Language: Java Interop (/articles/language/interop/)
Language: Polymorphism (/articles/language/polymorphism/)
Language: Concurrency and Parallelism (/articles/language/concurrency_and_parallelism/)
Language: Macros (/articles/language/macros/)
Language: Laziness (/articles/language/laziness/)
Language: Glossary (/articles/language/glossary/)
Ecosystem: Library Development and Distribution (/articles/ecosystem/libraries_authoring/)
Ecosystem: Web Development (/articles/ecosystem/web_development/)
Ecosystem: Generating Documentation (/articles/ecosystem/generating_documentation/)
Building Projects: tools.build and the Clojure CLI (/articles/cookbooks/cli_build_projects/)
Data Structures (/articles/cookbooks/data_structures/)
Strings (/articles/cookbooks/strings/)
Mathematics with Clojure (/articles/cookbooks/math/)
Date and Time (/articles/cookbooks/date_and_time/)
Working with Files and Directories in Clojure (/articles/cookbooks/files_and_directories/)
Middleware in Clojure (/articles/cookbooks/middleware/)
Parsing XML in Clojure (/articles/cookbooks/parsing_xml_with_zippers/)
Growing a DSL with Clojure

Copyright © 2024 Multiple Authors


Powered by Cryogen (https://cryogenweb.org)

You might also like