DSLs in Boo: Domain Specific Languages in .NET
Ebook, 748 pages, 6 hours

About this ebook

A general-purpose language like C# is designed to handle all programming tasks. By contrast, the structure and syntax of a domain-specific language (DSL) are designed to match a particular application area. A DSL is designed for readability and easy programming of recurring problems. Using the innovative Boo language, it's a breeze to create a DSL for your application domain that works on .NET and does not sacrifice performance.

DSLs in Boo shows you how to design, extend, and evolve DSLs for .NET by focusing on approaches and patterns. You learn to define an app in terms that match the domain, and to use Boo to build DSLs that generate efficient executables. And you won't deal with the awkward XML-laden syntax many DSLs require. The book concentrates on writing internal (textual) DSLs that allow easy extensibility of the application and framework. And if you don't know Boo, don't worry: you'll learn right here all the techniques you need.

Purchase of the print book comes with an offer of a free PDF, ePub, and Kindle eBook from Manning. Also available is all code from the book.
Language: English
Publisher: Manning
Release date: Dec 31, 2009
ISBN: 9781638354215
    Book preview

    DSLs in Boo - Oren Eini

    Copyright

    For online information and ordering of this and other Manning books, please visit www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact

    Special Sales Department

    Manning Publications Co.

    Sound View Court 3B

    Greenwich, CT 06830

    Email: [email protected]

    ©2010 by Manning Publications Co. All rights reserved.

    No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.

    Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.

    Recognizing the importance of preserving what has been written, it is Manning’s policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.

    Printed in the United States of America

    1 2 3 4 5 6 7 8 9 10 – MAL – 15 14 13 12 11 10

    Dedication

    For Mom who told me it would take longer than I expected

    Brief Table of Contents

    Copyright

    Brief Table of Contents

    Table of Contents

    Preface

    Acknowledgments

    About this Book

    About the Author

    About the Cover Illustration

    Chapter 1. What are domain-specific languages?

    Chapter 2. An overview of the Boo language

    Chapter 3. The drive toward DSLs

    Chapter 4. Building DSLs

    Chapter 5. Integrating DSLs into your applications

    Chapter 6. Advanced compiler extensibility approaches

    Chapter 7. DSL infrastructure with Rhino DSL

    Chapter 8. Testing DSLs

    Chapter 9. Versioning DSLs

    Chapter 10. Creating a professional UI for a DSL

    Chapter 11. DSLs and documentation

    Chapter 12. DSL implementation challenges

    Chapter 13. A real-world DSL implementation

    Appendix A. Boo basic reference

    Appendix B. Boo language syntax

    Index

    List of Figures

    List of Tables

    List of Listings

    Table of Contents

    Copyright

    Brief Table of Contents

    Table of Contents

    Preface

    Acknowledgments

    About this Book

    About the Author

    About the Cover Illustration

    Chapter 1. What are domain-specific languages?

    1.1. Striving for simplicity

    1.1.1. Creating simple code

    1.1.2. Creating clear code

    1.1.3. Creating intention-revealing code

    1.2. Understanding domain-specific languages

    1.2.1. Expressing intent

    1.2.2. Creating your own languages

    1.3. Distinguishing between DSL types

    1.3.1. External DSLs

    1.3.2. Graphical DSLs

    1.3.3. Fluent interfaces

    1.3.4. Internal or embedded DSLs

    1.4. Why write DSLs?

    1.4.1. Technical DSLs

    1.4.2. Business DSLs

    1.4.3. Automatic or extensible DSLs

    1.5. Boo’s DSL capabilities

    1.6. Examining DSL examples

    1.6.1. Brail

    1.6.2. Rhino ETL

    1.6.3. Bake (Boo Build System)

    1.6.4. Specter

    1.7. Summary

    Chapter 2. An overview of the Boo language

    2.1. Why use Boo?

    2.2. Exploring compiler extensibility

    2.3. Basic Boo syntax

    2.4. Boo’s built-in language-oriented features

    2.4.1. String interpolation

    2.4.2. Is, and, not, and or

    2.4.3. Optional parentheses

    2.4.4. Anonymous blocks

    2.4.5. Statement modifiers

    2.4.6. Naming conventions

    2.4.7. Extension methods

    2.4.8. Extension properties

    2.4.9. The IQuackFu interface

    2.5. Summary

    Chapter 3. The drive toward DSLs

    3.1. Choosing the DSL type to build

    3.1.1. The difference between fluent interfaces and DSLs

    3.1.2. Choosing between a fluent interface and a DSL

    3.2. Building different types of DSLs

    3.2.1. Building technical DSLs

    3.2.2. Building business DSLs

    3.2.3. Building Extensibility DSLs

    3.3. Fleshing out the syntax

    3.4. Choosing between imperative and declarative DSLs

    3.5. Taking a DSL apart—what makes it tick?

    3.6. Combining domain-driven design and DSLs

    3.6.1. Language-oriented programming in DDD

    3.6.2. Applying a DSL in a DDD application

    3.7. Implementing the Scheduling DSL

    3.8. Running the Scheduling DSL

    3.9. Summary

    Chapter 4. Building DSLs

    4.1. Designing a system with DSLs

    4.2. Creating the Message-Routing DSL

    4.2.1. Designing the Message-Routing DSL

    4.3. Creating the Authorization DSL

    4.3.1. Exploring the Authorization DSL design

    4.3.2. Building the Authorization DSL

    4.4. The dark side of using a DSL

    4.5. The Quote-Generation DSL

    4.5.1. Building business-facing DSLs

    4.5.2. Selecting the appropriate medium

    4.6. Summary

    Chapter 5. Integrating DSLs into your applications

    5.1. Exploring DSL integration

    5.2. Naming conventions

    5.3. Ordering the execution of scripts

    5.3.1. Handling ordering without order

    5.3.2. Ordering by name

    5.3.3. Prioritizing scripts

    5.3.4. Ordering using external configuration

    5.4. Managing reuse and dependencies

    5.5. Performance considerations when using a DSL

    5.5.1. Script compilation

    5.5.2. Script execution

    5.5.3. Script management

    5.5.4. Memory pressure

    5.6. Segregating the DSL from the application

    5.6.1. Building your own security infrastructure

    5.6.2. Segregating the DSL

    5.6.3. Considerations for securing a DSL in your application

    5.7. Handling DSL errors

    5.7.1. Handling runtime errors

    5.7.2. Handling compilation errors

    5.7.3. Error-handling strategies

    5.8. Administrating DSL integration

    5.9. Summary

    Chapter 6. Advanced compiler extensibility approaches

    6.1. The compiler pipeline

    6.2. Meta-methods

    6.3. Quasi-quotation

    6.4. AST macros

    6.4.1. The unroll macro

    6.4.2. Building macros with the MacroMacro

    6.4.3. Analyzing the using macro

    6.4.4. Building an SLA macro

    6.4.5. Using nested macros

    6.5. AST attributes

    6.6. Compiler steps

    6.6.1. Compiler structure

    6.6.2. Building the implicit base class compiler step

    6.7. Summary

    Chapter 7. DSL infrastructure with Rhino DSL

    7.1. Understanding a DSL infrastructure

    7.2. The structure of Rhino DSL

    7.2.1. The DslFactory

    7.2.2. The DslEngine

    7.2.3. Creating a custom IDslEngineStorage

    7.3. Codifying DSL idioms

    7.3.1. ImplicitBaseClassCompilerStep

    7.3.2. AutoReferenceFilesCompilerStep

    7.3.3. AutoImportCompilerStep

    7.3.4. UseSymbolsStep

    7.3.5. UnderscoreNamingConventionsToPascalCaseCompilerStep

    7.3.6. GeneratePropertyMacro

    7.4. Batch compilation and compilation caches

    7.5. Supplying external dependencies to our DSL

    7.6. Summary

    Chapter 8. Testing DSLs

    8.1. Building testable DSLs

    8.2. Creating tests for a DSL

    8.2.1. Testing the syntax

    8.2.2. Testing the DSL API

    8.2.3. Testing the DSL engine

    8.3. Testing the DSL scripts

    8.3.1. Testing DSL scripts using standard unit testing

    8.3.2. Creating the Testing DSL

    8.4. Integrating with a testing framework

    8.5. Taking testing further

    8.5.1. Building an application-testing DSL

    8.5.2. Mandatory testing

    8.6. Summary

    Chapter 9. Versioning DSLs

    9.1. Starting from a stable origin

    9.2. Planning a DSL versioning story

    9.2.1. Implications of modifying the DSL engine

    9.2.2. Implications of modifying the DSL API and model

    9.2.3. Implications of modifying the DSL syntax

    9.2.4. Implications of modifying the DSL environment

    9.3. Building a regression test suite

    9.4. Choosing a versioning strategy

    9.4.1. Abandon-ship strategy

    9.4.2. Single-shot strategy

    9.4.3. Additive-change strategy

    9.4.4. Tower of Babel strategy

    9.4.5. Adapter strategy

    9.4.6. The great-migration strategy

    9.5. Applying versioning strategies

    9.5.1. Managing safe, additive changes

    9.5.2. Handling required breaking change

    9.6. DSL versioning in the real world

    9.6.1. Versioning Brail

    9.6.2. Versioning Binsor

    9.6.3. Versioning Rhino ETL

    9.7. When to version

    9.8. Summary

    Chapter 10. Creating a professional UI for a DSL

    10.1. Creating an IDE for a DSL

    10.1.1. Using Visual Studio as your DSL IDE

    10.1.2. Using #develop as your DSL IDE

    10.2. Integrating an IDE with a DSL application

    10.2.1. Extending #develop highlighting for our DSLs

    10.2.2. Adding code completion to our DSL

    10.2.3. Adding contextual code completion support for our DSL

    10.3. Creating a graphical representation for a textual DSL

    10.3.1. Displaying DSL execution

    10.3.2. Creating a UI dialect

    10.3.3. Treating code as data

    10.4. DSL code generation

    10.4.1. The CodeDOM provider for Boo

    10.4.2. Specific DSL writers

    10.5. Handling errors and warnings

    10.6. Summary

    Chapter 11. DSLs and documentation

    11.1. Types of documentation

    11.2. Writing the Getting Started Guide

    11.2.1. Begin with an introduction

    11.2.2. Provide examples

    11.3. Writing the User Guide

    11.3.1. Explain the domain and model

    11.3.2. Document the language syntax

    11.3.3. Create the language reference

    11.3.4. Explain debugging to business users

    11.4. Creating the Developer Guide

    11.4.1. Outline the prerequisites

    11.4.2. Explore the DSL’s implementation

    11.4.3. Document the syntax implementation

    11.4.4. Documenting AST transformations

    11.5. Creating executable documentation

    11.6. Summary

    Chapter 12. DSL implementation challenges

    12.1. Scaling DSL usage

    12.1.1. Technical—managing large numbers of scripts

    12.1.2. Performing precompilation

    12.1.3. Compiling in the background

    12.1.4. Managing assembly leaks

    12.2. Deployment—strategies for editing DSL scripts in production

    12.3. Ensuring system transparency

    12.3.1. Introducing transparency to the Order-Processing DSL

    12.3.2. Capturing the script filename

    12.3.3. Accessing the code at runtime

    12.3.4. Processing the AST at runtime

    12.4. Changing runtime behavior based on AST information

    12.5. Data mining your scripts

    12.6. Creating DSLs that span multiple files

    12.7. Creating DSLs that span multiple languages

    12.8. Creating user-extensible languages

    12.8.1. The basics of user-extensible languages

    12.8.2. Creating the Business-Condition DSL

    12.9. Summary

    Chapter 13. A real-world DSL implementation

    13.1. Exploring the scenario

    13.2. Designing the order-processing system

    13.3. Thinking in tongues

    13.4. Moving from an acceptable to an excellent language

    13.5. Implementing the language

    13.5.1. Exploring the treatment of statement’s implementation

    13.5.2. Implementing the upon and when keywords

    13.5.3. Tracking which file is the source of a policy

    13.5.4. Bringing it all together

    13.6. Using the language

    13.7. Looking beyond the code

    13.7.1. Testing our DSL

    13.7.2. Integrating with the user interface

    13.7.3. Limited DSL scope

    13.8. Going beyond the limits of the language

    13.9. Summary

    Appendix A. Boo basic reference

    A.1. Prerequisites

    A.2. The Boo interactive shell, interpreter, and compiler

    A.2.1. Expressions

    A.2.2. Boolean values and Boolean expressions

    A.3. Comments

    A.4. Control statements

    A.4.1. If statement

    A.4.2. While statement

    A.4.3. For statement

    A.5. Types

    A.5.1. Lists

    A.5.2. Range

    A.5.3. Arrays

    A.5.4. Hashes

    A.5.5. Strings

    A.5.6. Slicing

    A.5.7. Declaring types explicitly

    A.6. Creating real programs

    A.6.1. Methods

    A.6.2. Classes and objects

    A.6.3. Imports

    A.7. Generators

    Appendix B. Boo language syntax

    B.1. Interesting keywords

    B.2. Conditionals

    B.3. Loops and iterations

    B.4. Type declarations

    B.5. Methods, properties, and control structures

    B.6. Useful macros

    Index

    List of Figures

    List of Tables

    List of Listings

    Preface

    In 2007, I gave a talk about using Boo to build your own domain-specific languages (DSLs) at JAOO (http://jaoo.dk), a software conference in Denmark. I had been working with Boo and creating DSLs since 2005, but as I prepared for the talk, I was surprised to see just how easy it was to build DSLs with Boo. (I find that teaching something gives you a fresh perspective on it.)

    That experience, and the audience’s response, convinced me that you don’t have to be a compiler expert or a parser wizard to build your own mini-languages. I realized that I needed to formalize the practices I had been using and make them publicly available.

    One of the most challenging problems in the industry today is finding a way of clearly expressing intent in a particular domain. A lot of time and effort has been spent tackling that problem. A DSL is usually a good solution, but there is a strong perception in the community that writing your own language for a particular task is extremely difficult.

    The truth is different from the perception. Creating a language from scratch would be a big task, but you don’t need to start from scratch. Today, there are lots of tools and plenty of support for creating languages. When you decide to make an internal DSL—one that is hosted inside an existing programming language (such as Boo)—the cost of building that language drops significantly.

    I routinely build new languages during presentations (onstage, within 5 or 10 minutes), because once you understand the basic principles, it is easy. Easy enough that it deserves to be a standard part of your toolset, ready to be used whenever you spot a problem that is suitable for a DSL solution.

    That 2007 JAOO talk was the start of the journey that led to the creation of this book. Finishing up this project took longer than expected, but I am very happy to say that I have been successful in what I set out to do.

    This book is meant to be an actionable guide, not a theory book. I go over the theory in the relevant places, but my goal is that, by the time you are halfway through the book, you’ll be able to write your own DSLs.

    Acknowledgments

    Like most books, this wasn’t a solo effort. I would like to send my heartfelt gratitude to the people who made this book possible.

    Thanks to Rodrigo B. de Oliveira, for creating the Boo language in the first place, and Cedric Vivier, Daniel Grunwald, Dmitry Malyshev, Greg Nagel, Joao Braganca, Martinho Fernandes, Paul Lang, and Avishay Lavie for helping to create such a wonderful language.

    To the people who worked on and extended the Rhino DSL project, Simone Busoli, Nathan Stott, Jason Meckley, Craig Neuwirt, Tobias Hertkorn, Markus Zywitza, Adam Tybor, Paul Barriere, and Leonard Smith, thanks for making my job so much easier.

    To everyone at Manning, especially publisher Marjan Bace and associate publisher Mike Stephens, thanks for your guidance, support, and patience. To development editor Tom Cirtin, copyeditor Andy Carroll, and proofreader Katie Tennant, thanks for being so patient with me, even when I took too long to get things done. Special thanks to technical proofreader Justin Chase for carefully reading the final manuscript once it was in production and for checking the code.

    To the reviewers who read the manuscript numerous times during development, thanks for your comments and valuable feedback: Andrew Glover, Jon Skeet, Derik Whittaker, Freedom Dumlao, Justin Lee, Paul King, Matthew Pope, Craig Neuwirt, Mark Seemann, Steven Kelly, Robert Wenner, Garabed Garo Yeriazarian, and Avishay Lavie.

    About this Book

    This book is meant for intermediate to advanced .NET developers who are interested in using domain-specific languages in their applications.

    If you are new to language-oriented programming, this book will teach you how to create, build, and maintain your own languages.

    If you are experienced with language-oriented programming, this book will give you all the practical knowledge necessary to easily build DSLs using the Boo programming language.

    Note, however, that this book is focused on the practical side of building DSLs. While I talk about the theory underlying this field, I focus on practical aspects. If you are interested in learning more about DSLs, I also recommend reading Martin Fowler’s forthcoming book on the topic: http://www.martinfowler.com/bliki/DomainSpecificLanguage.html. The book isn’t finished yet, but much of the content can already be found on his site.

    Roadmap

    This book has five main sections.

    Chapters 1–2 discuss DSLs in general, introduce the Boo language, and explain why I chose to use it as the basis for my DSL adventures.

    Chapters 3–5 walk through the implementation of several different DSLs, their integration into applications, and all the various concerns you’ll have to deal with when you add a DSL to your project.

    Chapters 6–7 dive into advanced language manipulation and the infrastructure required to build an industry-strength DSL.

    Chapters 8–11 go into the details surrounding a production-worthy DSL implementation: building testable languages and test languages, creating versionable DSLs, working with user interfaces for the languages, and documenting them.

    Chapters 12–13 talk about implementation challenges for DSLs and walk through the steps of building a full real-world DSL example.

    Two appendixes conclude the book. Appendix A is a basic Boo reference, familiarizing you with how to use Boo as a programming language, while appendix B covers the Boo language syntax.

    Code conventions and downloads

    All source code in listings or in text is in a fixed-width font like this to separate it from ordinary text. Source code for all working examples in this book is available for download from the publisher’s website at www.manning.com/DSLsinBoo.

    You can download the binary distribution of Boo from the Boo website at http://boo.codehaus.org. For more information on using Boo once you’ve downloaded it, please see page 23.

    Author Online

    The purchase of DSLs in Boo includes free access to a private web forum run by Manning Publications, where you can make comments about the book, ask technical questions, and receive help from the author and from other users. To access the forum and subscribe to it, point your web browser to www.manning.com/DSLsinBoo. This page provides information about how to get on the forum once you’re registered, what kind of help is available, and the rules of conduct on the forum.

    Manning’s commitment to our readers is to provide a venue where a meaningful dialogue between individual readers and between readers and the author can take place. It’s not a commitment to any specific amount of participation on the part of the author, whose contribution to the book’s forum remains voluntary (and unpaid). We suggest you try asking him some challenging questions, lest his interest stray!

    The Author Online forum and the archives of previous discussions will be accessible from the publisher’s website as long as the book is in print.

    About the Author

    Oren Eini is an independent consultant based in Israel. He is a frequent blogger at www.ayende.com/Blog/ under his pseudonym Ayende Rahien, and he’s an internationally known presenter, having spoken at conferences such as DevTeach, JAOO, Oredev, NDC, and Progressive.NET.

    Oren’s main focus is on architecture and best practices that promote quality software and zero-friction development. He is the author of Rhino Mocks, one of the most popular mocking frameworks on the .NET platform, and he’s also a leading figure in other well-known open source projects, including the Castle project and NHibernate.

    Oren’s hobbies include reading fantasy novels, reviewing code, and writing about himself in the third person. Oren is also a Microsoft MVP, a fact that he tends to forget when writing a bio.

    About the Cover Illustration

    The figure on the cover of DSLs in Boo is captioned Le Dauber, which means art student. The illustration is taken from a 19th-century edition of Sylvain Maréchal’s four-volume compendium of regional dress customs published in France. Each illustration is finely drawn and colored by hand. The rich variety of Maréchal’s collection reminds us vividly of how culturally apart the world’s towns and regions were just 200 years ago. Isolated from each other, people spoke different dialects and languages. In the streets or in the countryside, it was easy to identify where they lived and what their trade or station in life was just by their dress.

    Dress codes have changed since then and the diversity by region, so rich at the time, has faded away. It is now hard to tell apart the inhabitants of different continents, let alone different towns or regions. Perhaps we have traded cultural diversity for a more varied personal life, and certainly for a more varied and fast-paced technological life.

    At a time when it is hard to tell one computer book from another, Manning celebrates the inventiveness and initiative of the computer business with book covers based on the rich diversity of regional life of two centuries ago, brought back to life by Maréchal’s pictures.

    Chapter 1. What are domain-specific languages?

    In this chapter

    Understanding domain-specific languages

    Distinguishing between domain-specific language types

    Why write a domain-specific language?

    Why use Boo?

    Examining domain-specific language examples

    In the beginning, there was the bit. And the bit shifted left, and the bit shifted right, and there was the byte. The byte grew into a word, and then into a double word. And the developer saw the work, and it was good. And the evening and the morning were the first day. And on the next day, the developer came back to the work and spent the whole day trying to figure out what he had been thinking the day before.

    If this story rings any bells, you’re familiar with one of the most fundamental problems in computer science. The computer does what it is told, not what the programmer meant to tell it. Often enough, what the programmer tells it to do is in direct contradiction to what the programmer meant it to do. And that’s a problem. I’ve experienced this myself many times, and I’m not particularly incompetent. How, then, did I reach that point?

    1.1. Striving for simplicity

    Take a look at this piece of code:

    for (p = freelist, oldp = 0;

        p && p != (struct chunk *)brkval;

        oldp = p, p = p->next) {

        if (p->len > nelems) {

            p->len -= nelems;

            q = p + p->len;

            q->next = 0;

            q->len = nelems;

            q++;

            return (void *)q;

        }

        if (p->len == nelems) {

            if (oldp == 0)

                freelist = p->next;

            else

                oldp->next = p->next;

            p->next = 0;

            p++;

            return (void *)p;

        }

    }

    You’re among a decided minority if you can take a single glance at this code and deduce immediately what it’s doing. Most developers would have to decipher this piece of code.

    How does this connect to my difficulty in telling the computer what I want it to do? The problem is the level at which I instruct the computer what to do. If I am working down at the assembly level (or near assembly), I have to instruct the machine what to do in excruciating detail. The preceding piece of code was taken from the FreeBSD boot loader’s malloc method, and there are good reasons it looks the way it does, but writing at this level has a big cost in productivity and flexibility.

    Alternatively, I can instruct the computer to do things in higher-level terms, where it can better interpret what I want it to do.

    As developers, we always want to achieve the simplest, clearest way to talk to the computer, regardless of the task at hand. Different tasks (low-level memory manipulation, for example) require us to work at different levels, but we always strive for readable, easily maintainable code. Within a given context, we may need to sacrifice those goals for other, more important goals (usually performance), but that should only be done very cautiously.

    And when we are not working on low-level code, we will, at some point, have to leave general-purpose programming languages behind to get the desired level of clarity. Building our own languages, each focused specifically on a single task, is a great way to achieve this simplicity and clarity.

    What we’d like to find are clear, concise, and simple ways to instruct the machine what we want it to do, rather than to laboriously micromanage it.

    1.1.1. Creating simple code

    Producing code that’s readable, maintainable, and simple is a great goal. But simple code is much harder to write than complex code. It’s easy to throw code at a problem until it goes away. Simple code, on the other hand, is what you get when you remove all the complexity from the code. That isn’t to say that it’s complicated to write simple code; it’s just that writing complex code is easy. The amount of effort it takes to decipher what a piece of code does is a good indication of how simple the code is.

    Consider these two examples of getting the date in two weeks’ time. Which is more readable?

    C# code: DateTime.Now.AddDays(14);

    C code: time(NULL) + 1209600;

    I don’t think there’s any question about which is more readable. In fact, an even better solution would be this:

    DateTime.Now.AddWeeks(2);

    But this isn’t part of .NET’s Base Class Library (BCL) DateTime API.
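    The same improvement is available at the lower level of the C example. The following is a minimal sketch, assuming that time_t counts seconds (as it does on POSIX systems). The names add_weeks and SECONDS_PER_WEEK are hypothetical and not part of any standard library.

```c
#include <time.h>

/* Seconds in one week: 7 days * 24 hours * 60 minutes * 60 seconds. */
#define SECONDS_PER_WEEK (7L * 24L * 60L * 60L)

/* Hypothetical helper: returns the timestamp `weeks` weeks after `start`.
   Assumes time_t is an arithmetic count of seconds, as on POSIX systems. */
time_t add_weeks(time_t start, int weeks)
{
    return start + (time_t)(weeks * SECONDS_PER_WEEK);
}
```

    A call such as add_weeks(time(NULL), 2) states the intent directly, and the magic number 1209600 is replaced by arithmetic that a reader can check at a glance.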

    Using higher-level concepts means you can concentrate more on what you want to be done, and less on how it should be done at the machine level. When using .NET or Java, for instance, I rarely need to concern myself with memory allocation.

    That’s helpful, but more often than not, you’ll need to do more interesting things than merely calculate the date two weeks from now. You’ll need to express concepts and algorithms in ways that make sense, and you’ll need to be able to use them in projects of significant size and complexity. Having clearer ways to express those concepts translates directly into a more maintainable code base, which means reduced maintenance costs and an easier time changing and growing the system.

    Note

    It’s considered polite to express intent in code in a manner that will make sense to the next developer who works with your code, particularly because that poor person may be you. A good suggestion that I take to heart is to assume that the next developer to touch your code will be an axe murderer who knows where you live and has a short fuse.

    1.1.2. Creating clear code

    Code may be clear about how it’s doing things, but it might not be clear about what it’s doing or why. Because we’re assuming that the next developer will be a vicious killer with a nasty temper, we should make it easy to figure out what we’ve done and what we meant.

    We can make our code easier to understand by using intention-revealing programming and concepts taken from domain-driven design, a design approach that says that your API, code structure, and the code itself should express intent, be expressed in the language of the domain, and generally have a high correlation with the problem domain that the application is trying to solve.

    Even then, we quickly reach a point where our ability to express intent is hampered by the syntax of the language that we’re using.

    1.1.3. Creating intention-revealing code

    Programming languages make it easy to tell the computer what it should do, but they can be less effective at expressing developer intent. For that matter, most general-purpose languages (such as C# or Java) are far less suited for a host of other tasks.

    Let’s consider text processing, for example. Suppose you want to validate an Israeli phone number like this: 03-9876543. You might do this with the code shown in listing 1.1.

    Listing 1.1. Validating a phone number

    public bool ValidatePhoneNumber(string input)

    {

        if (input.Length != 10)

            return false;

        for (int i = 0; i < input.Length; i++)

        {

            if (i == 2)

            {

                // position 2 must hold the dash separator

                if (input[i] != '-')

                    return false;

            }

            else if (char.IsDigit(input[i]) == false)

                return false;

        }

        return true;

    }

    Can you look at this code and understand what input it will accept without deciphering it? If you haven’t noticed by now, I consider the need to decipher code bad.

    Now let’s look at a tool that’s dedicated to text processing: regular expressions. Validating the phone number using a regular expression is as simple as the one-liner in listing 1.2.

    Listing 1.2. Validating a phone number using regular expressions

    public bool ValidatePhoneNumber(string input)

    {

        return Regex.IsMatch(input, @"^\d{2}-\d{7}$");

    }

    In this case, the use of a specialized tool for text processing has made the intent much easier to understand, but you need to understand the tool. Anyone who knows regular expressions can glance at the code and figure out what input it will accept.
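    The same specialized tool is available outside .NET. As a rough sketch, here is the equivalent check in C using the POSIX regex functions regcomp, regexec, and regfree. The function name validate_phone_number is an assumption made for illustration. POSIX extended syntax has no \d shorthand, so the pattern spells the digit class as [0-9].

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper mirroring listing 1.2:
   two digits, a dash, then seven digits, and nothing else. */
bool validate_phone_number(const char *input)
{
    regex_t pattern;
    bool matches;

    /* REG_NOSUB: only a yes/no answer is needed, not match positions. */
    if (regcomp(&pattern, "^[0-9]{2}-[0-9]{7}$", REG_EXTENDED | REG_NOSUB) != 0)
        return false;   /* treat a pattern that fails to compile as "no match" */

    matches = (regexec(&pattern, input, 0, NULL, 0) == 0);
    regfree(&pattern);
    return matches;
}
```

    Even with the extra ceremony that C requires, the accepted format lives in one pattern string rather than being spread across a loop and several conditions.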

    Another approach is to use masked input to define a mask for certain input, which would result in code like this:

    Mask.Validate(input, "##-#######");

    The challenges of specialized tools

    Regular expressions are notorious for being write-only tools because the results can be difficult to read, particularly if you don’t write them carefully.

    Using special tools to handle specialized tasks requires that you understand how to use the tools. If you don’t understand regular expressions, and I hand you listing 1.2, how will you deal with it?

    We’ll touch on this topic later in the book; most of chapter 11 is dedicated to techniques that can help people come to grips with custom languages.

    Assuming you know that # is the character for matching a numeral, the mask is even easier to understand than the regular expression. (The .NET framework doesn’t have any masked-input validation facilities beyond WinForms’ MaskedTextBox.)
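
    Because the framework offers no such helper, here is a minimal sketch of what a mask validator could look like. The Mask class and its Validate method are hypothetical names chosen for illustration, and the sketch assumes the simplest possible rules: # matches any single digit, and every other mask character must appear literally at the same position.

```csharp
using System;

// Hypothetical helper: "Mask" is not part of the .NET Framework.
public static class Mask
{
    // Returns true when input matches mask position by position.
    // Assumption: '#' matches any single digit; any other mask
    // character must appear literally at the same index.
    public static bool Validate(string input, string mask)
    {
        if (input == null || mask == null)
            return false;
        if (input.Length != mask.Length)
            return false;

        for (int i = 0; i < mask.Length; i++)
        {
            if (mask[i] == '#')
            {
                if (!char.IsDigit(input[i]))
                    return false;
            }
            else if (input[i] != mask[i])
            {
                return false;
            }
        }
        return true;
    }
}
```

    With this sketch, a mask of two # characters, a dash, and seven more # characters accepts 03-9876543 and rejects anything with a different length, a misplaced dash, or a non-digit. The point is less the implementation than the call site: the mask reads like the input it describes.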

    Querying and filtering are other situations where code is no longer sufficient. Let’s say we want to retrieve data for all the customers in London. This isn’t a query that you’ll want to handle by yourself. Building an optimized query plan, instructing the data store which section of the data should be scanned, building manual filters for each individual query ... all of that can be quite tedious. It’s quite a complex task, particularly if you want to handle it efficiently and in a transaction-safe manner.

    It’s far easier to send a SQL statement to the database and let it sort out how it wants to handle the request on its own. This allows us to speak at a much higher level of abstraction and ignore the details of how the data is retrieved.
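
    To make the contrast concrete, the sketch below puts a hand-written filter next to the equivalent SQL statement. The Customer class, the Customers table, and the City column are all assumptions made for illustration, and the SQL string is only shown, not executed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record type used only for this illustration.
public class Customer
{
    public string Name;
    public string City;

    public Customer(string name, string city)
    {
        Name = name;
        City = city;
    }
}

public static class CustomerQueries
{
    // Declarative: states what we want and leaves the "how" to the
    // database. Table and column names are assumptions.
    public const string CustomersInLondonSql =
        "SELECT * FROM Customers WHERE City = 'London'";

    // Imperative: scans every record by hand. There is no index, no
    // query plan, and no transaction handling; all of that would be
    // extra code we would have to write ourselves.
    public static List<Customer> FindByCity(IEnumerable<Customer> all, string city)
    {
        List<Customer> matches = new List<Customer>();
        foreach (Customer customer in all)
        {
            if (customer.City == city)
                matches.Add(customer);
        }
        return matches;
    }
}
```

    The manual version works for a list held in memory, but it says nothing about how to fetch the data efficiently or safely. The one-line SQL statement carries the same intent and delegates every one of those decisions.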

    So far, I have been consciously avoiding the use of the term domain-specific languages, but it’s time we started discussing it.

    1.2. Understanding domain-specific languages

    Martin Fowler defines a domain-specific language (DSL) as “a computer language that’s targeted to a particular kind of problem, rather than a general purpose language that’s aimed at any kind of software problem” (http://martinfowler.com/bliki/DomainSpecificLanguage.html).

    Domain-specific languages aren’t a new idea by any means. DSLs have been around since long before the start of computing. People have always developed specialized vocabularies for specialized tasks. That’s why sailors use terms like port and starboard and are not particularly afraid of gallows. Doctors similarly have a vocabulary that is baffling to the uninitiated, and weather forecasters have specific terms for various types of clouds, winds, and storms.

    Regular expressions and SQL are similarly specialized languages:

    Both are languages designed for a narrow domain—text processing and database querying, respectively.

    Both are focused on letting you express what you mean, not how the implementation should work—that’s left to some magic engine in the background.

    The reason these languages are so successful is that the focus they offer is incredibly useful. They reduce the complexity that you need to handle, and they’re flexible in terms of what you can make them do.

    1.2.1. Expressing intent

    From the beginning of computer programming, it was recognized that trying to express what you mean in natural language isn’t a viable approach. A clearer, much more focused, way to express intent was needed—that’s why we have code, which is unambiguous (most of the time) and easy for the computer to understand.

    But while code may be unambiguous to a computer, it can certainly be incomprehensible to people. Understanding code can be a big problem. You tend to write the code once, and read it many more times. Clarity is much more important than brevity. By ensuring that our code is readable, clear, and concise, we make an investment that will benefit us both in the immediate future (producing software that is simpler and easier to change) and in the long term (providing easier maintainability and a clearer path for extensibility and growth).

    But, as we’ve seen, code isn’t always the clearest way to express intent. This is where intention-revealing programming comes into play, and one of the tools in that category is creating a DSL to clearly and efficiently express intent and meaning in code.

    1.2.2. Creating your own languages

    Most people assume that creating your own computer language is a fairly complex matter. This is because most of the literature out there assumes that you want to build a full-blown general language. This puts a lot of burden on you, as the language author.

    It isn’t simple to create a general language, but it’s certainly possible. It just isn’t something you’d want to do on a rainy afternoon or over a long weekend. The experience is out there, but the initial cost remains nontrivial.

    But you don’t always have to write your own language from scratch. You can utilize an existing language (called the host language or base language) to provide built-in language and runtime facilities, and then add more syntax and behavior on top of it. A popular example is Ruby on Rails, which is, in essence, a DSL for building web applications.

    Building your own compiler

    I stated that building your own compiler or interpreter isn’t hard. This is true, to some extent. The main difficulties in going that route are the scope of the work and the fact that most of the work is arcane at worst and tedious at best. This is particularly true if you want to write a full-fledged language.

    Writing a general-purpose language is a big task. You need to deal with the details of the syntax and worry about creating an execution engine (for interpreted languages) or generating IL (Intermediate Language) or machine code (for compiled languages). I don’t consider it to be a complex task, but it is a big one.

    Building a single-purpose language is a far easier (and smaller) task, because the scope is much reduced. A good example of that can be seen in RSpec, a Ruby library for creating behavior-driven specifications. One of its capabilities is a story runner that accepts specifications written in English (http://blog.davidchelimsky.net/articles/2007/10/21/story-runner-in-plain-english). I suggest looking at how it works. It’s quite ingenious in its simplicity.

    The problem with that approach for natural language processing is that you hit its limits quickly. It works only when the statements follow a rigid format, so although it may look like natural language, it is, in fact, nothing of the sort. If you want to make the language more intelligent, you have to accept the additional complexity of building a more full-featured language.

    I once consulted for a company that had built a DSL for defining business rules. They had over 100,000 lines of C++ code that they needed to maintain, and performance was a big concern. It became apparent that they could have switched the whole thing to an internal DSL (a DSL that’s hosted in an existing language, which we’ll talk about shortly) and saved quite a bit of time, effort, and pain.

    The tools for language-oriented programming had been improving for quite a while, but it was the introduction of Ruby on Rails—a wildly popular DSL that was recognized as such—that really started to get things rolling.

    1.3. Distinguishing between DSL types

    In the world of DSLs, we often distinguish between several types:

    External DSLs

    Graphical DSLs

    Fluent interfaces

    Internal or embedded DSLs

    We’ll discuss those types in turn, and look at their properties and uses.

    1.3.1. External DSLs

    When we talk about external DSLs, we’re discussing DSLs that exist outside the confines of an existing language. SQL and regular expressions are two examples of external DSLs.

    Building an external DSL means starting work from a blank slate. You need to define the syntax and required capabilities, and start working from there. This means that you have a lot of power in your hands, but you also need to handle everything yourself. And by everything, I do mean everything, from defining operator precedence semantics to specifying how an if statement works.

    Common tools for building external DSLs include Lex, Yacc, ANTLR, GOLD Parser, and Coco/R, among others. Those tools handle the first stage, translating text in a known syntax to a format that a computer program can consume to produce executable output. The part about producing executable output is usually left as an exercise for the reader. There are few tools to help you with that.

    Note

    One tool that comes to mind for producing executable output is the Dynamic Language Runtime (DLR), a Microsoft project that aims to give us dynamic languages in .NET. One basic underpinning of this project is a set of classes that specify the behavior of a program (the abstract syntax tree, or AST) that the DLR can turn into an executable. There are other such tools, for sure, but the DLR is the only one I know of in the .NET space.

    Building rich external DSLs is similar to building a general purpose language. You need to understand compiler theory before starting on that path. If you’re interested in that, I recommend reading Compilers: Principles, Techniques, and Tools, by Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman, which is a classic book on the subject.

    This book focuses on building languages on top of existing languages, not starting from scratch and going the whole way. Nevertheless, some background in compiler theory is certainly helpful, even when building a DSL that uses an existing language, so let’s take a quick look at the process of building a language from scratch.

    First, the grammar and syntax are often defined using a notation such as BNF (Backus-Naur Form) or a derivative, and then you use a tool to generate a parser. Once you’ve done that, you can run the parser over a code string. The result is an abstract syntax tree (AST): a tree representation of the original string, structured according to your definition of the language.

    An example will make this clearer. Consider the code in listing 1.3, written in a fictional language.

    Listing 1.3. An if statement in a fictional language

    if 1 equals 2:
      print "1 = 2"
    else:
      print "1 != 2"

    The AST that was generated from the code in listing 1.3 is shown in figure 1.1.

    Figure 1.1. A hierarchical representation of the AST generated from a simple if statement
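
    As a rough illustration of what such a tree might look like in code, the sketch below models listing 1.3 with a handful of node classes. Every type name here is hypothetical, the tree is built by hand rather than by a parser, and the sketch assumes that the arguments to print are string literals. It is one plausible shape for an AST, not necessarily the one in the figure.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical node types for the fictional language in listing 1.3.
public abstract class Expression
{
    public abstract int Evaluate();
}

public class Literal : Expression
{
    private readonly int value;
    public Literal(int value) { this.value = value; }
    public override int Evaluate() { return value; }
}

public class EqualsCondition
{
    private readonly Expression left;
    private readonly Expression right;

    public EqualsCondition(Expression left, Expression right)
    {
        this.left = left;
        this.right = right;
    }

    public bool IsTrue() { return left.Evaluate() == right.Evaluate(); }
}

public abstract class Statement
{
    // Runs the statement, appending anything it prints to output.
    public abstract void Execute(List<string> output);
}

public class PrintStatement : Statement
{
    private readonly string text;
    public PrintStatement(string text) { this.text = text; }
    public override void Execute(List<string> output) { output.Add(text); }
}

public class IfStatement : Statement
{
    private readonly EqualsCondition condition;
    private readonly Statement trueBlock;
    private readonly Statement falseBlock;

    public IfStatement(EqualsCondition condition, Statement trueBlock, Statement falseBlock)
    {
        this.condition = condition;
        this.trueBlock = trueBlock;
        this.falseBlock = falseBlock;
    }

    // Walks the tree: evaluate the condition, then run one branch.
    public override void Execute(List<string> output)
    {
        if (condition.IsTrue())
            trueBlock.Execute(output);
        else
            falseBlock.Execute(output);
    }
}

public static class Listing13
{
    // Builds by hand the tree a parser might produce for listing 1.3.
    public static Statement BuildAst()
    {
        return new IfStatement(
            new EqualsCondition(new Literal(1), new Literal(2)),
            new PrintStatement("1 = 2"),
            new PrintStatement("1 != 2"));
    }
}
```

    Calling Execute on the root node walks the tree from the top, which is the essence of a tree-walking interpreter. Collecting printed lines in a list instead of writing to the console is a choice made here only to keep the sketch easy to check.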

    You can then either build an interpreter that understands this AST and can execute it, or output an executable from the AST. Another common approach is to transform the AST into a semantic model that’s easier
