Effective Kotlin Item 34: Consider defining a DSL for complex object creation
This is a chapter from the book Effective Kotlin. You can find it on LeanPub or Amazon.
A set of Kotlin features used together allows us to make a configuration-like Domain Specific Language (DSL). Such DSLs are useful when we need to define more complex objects or a hierarchical structure of objects. They are not easy to define, but once defined, they hide boilerplate and complexity, so developers can express their intentions more clearly.
For instance, Kotlin DSLs are a popular way to express both classic HTML and React HTML.
Views on other platforms can also be defined using DSLs, such as Android views built with the Anko library.
It is similar with desktop applications, where views can be defined in TornadoFX (which is built on top of JavaFX).
DSLs are also often used to define data or configurations. Ktor uses a DSL for API definitions, Kotlin Test uses one for test case specifications, and we can even use the Gradle DSL to define Gradle configuration.
Creating complex and hierarchical data structures is easier with DSLs. Inside these DSLs, we can use everything that Kotlin offers, and we get useful hints because DSLs in Kotlin are fully type-safe (unlike those in Groovy). It is likely that you have already used some Kotlin DSLs, but it is also important to know how to define them yourself so you can use them better and autonomously.
Defining your own DSL
To understand how to make your own DSLs, it is important to understand the notion of function types with a receiver. Before that, we’ll first briefly review the notion of function types themselves. A function type is a type that represents an object that can be used as a function. For instance, the filter function has a parameter of a function type that represents a predicate deciding whether an element should be accepted.
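As a rough illustration, here is a simplified stand-in for such a function. The name filterBy is hypothetical (chosen so that it does not clash with the standard library's filter), and the body is only a sketch; the point is the predicate parameter, whose type (T) -> Boolean is a function type.

```kotlin
// Hypothetical, simplified stand-in for the standard library's filter.
// The parameter `predicate` has the function type (T) -> Boolean.
fun <T> Iterable<T>.filterBy(predicate: (T) -> Boolean): List<T> {
    val result = mutableListOf<T>()
    for (element in this) {
        // Keep only the elements accepted by the predicate.
        if (predicate(element)) result.add(element)
    }
    return result
}

fun main() {
    println(listOf(1, 2, 3, 4).filterBy { it % 2 == 0 }) // prints [2, 4]
}
```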
Here are a few examples of function types:
- ()->Unit - Function with no arguments that returns Unit.
- (Int)->Unit - Function that takes Int and returns Unit.
- (Int)->Int - Function that takes Int and returns Int.
- (Int, Int)->Int - Function that takes two arguments of type Int and returns Int.
- (Int)->()->Unit - Function that takes Int and returns another function. This other function has no arguments and returns Unit.
- (()->Unit)->Unit - Function that takes another function and returns Unit. This other function has no arguments and returns Unit.
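To make these types more concrete, the sketch below assigns a lambda expression to a property of each type. All property names are hypothetical and exist only for this illustration.

```kotlin
// One example value for each function type listed above (names are made up).
val sayHi: () -> Unit = { println("Hi") }
val printInt: (Int) -> Unit = { println(it) }
val twice: (Int) -> Int = { it * 2 }
val sum: (Int, Int) -> Int = { a, b -> a + b }
// Takes an Int and returns another function that prints it when called.
val printerFor: (Int) -> () -> Unit = { i -> { println(i) } }
// Takes another function and calls it two times.
val runTwice: (() -> Unit) -> Unit = { action -> action(); action() }

fun main() {
    sayHi()
    printInt(1)
    println(twice(4))  // prints 8
    println(sum(2, 3)) // prints 5
    printerFor(7)()
    runTwice(sayHi)
}
```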
The basic ways of creating instances of function types are:
- Using lambda expressions
- Using anonymous functions
- Using function references
For instance, think about the following function:
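A minimal function that serves this purpose could be the following one. The name plus and its body are an assumption made for the sake of the example.

```kotlin
// A plain named function that adds two integers.
fun plus(a: Int, b: Int) = a + b

fun main() {
    println(plus(1, 2)) // prints 3
}
```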
Analogous functions can be created in the following ways:
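The three ways listed earlier can be sketched as below. The function plus is repeated so that the snippet is self-contained, and the property names plus1, plus2 and plus3 are chosen only for illustration.

```kotlin
fun plus(a: Int, b: Int) = a + b

// Lambda expression
val plus1: (Int, Int) -> Int = { a, b -> a + b }
// Anonymous function
val plus2: (Int, Int) -> Int = fun(a, b) = a + b
// Function reference
val plus3: (Int, Int) -> Int = ::plus

fun main() {
    println(plus1(1, 2)) // prints 3
    println(plus2(1, 2)) // prints 3
    println(plus3(1, 2)) // prints 3
}
```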
In the above example, property types are specified, therefore argument types in the lambda expression and in the anonymous function can be inferred. However, it could be the other way around: if we specify the argument types, then the function type can be inferred.
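A short sketch of the opposite direction: here the argument types are written explicitly, so the compiler infers the type (Int, Int) -> Int for both properties. The names plus4 and plus5 are again hypothetical.

```kotlin
// The property types are inferred as (Int, Int) -> Int.
val plus4 = { a: Int, b: Int -> a + b }
val plus5 = fun(a: Int, b: Int) = a + b

fun main() {
    println(plus4(1, 2)) // prints 3
    println(plus5(1, 2)) // prints 3
}
```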
Function types are there to represent objects that represent functions. An anonymous function even looks the same as a normal function but it has no name. A lambda expression is a shorter notation for an anonymous function.
However, if we have function types to represent functions, what about extension functions? Can we express them as well?
It was mentioned before that we create an anonymous function in the same way as a normal function but without a name. So, anonymous extension functions are defined the same way as well:
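As a sketch, compare a named extension function with its anonymous counterpart. The name plusNamed is hypothetical and is used only to keep the two declarations apart in one file; myPlus is the name the text refers to.

```kotlin
// A regular (named) extension function on Int.
fun Int.plusNamed(other: Int) = this + other

// The same thing as an anonymous extension function stored in a property.
val myPlus = fun Int.(other: Int) = this + other

fun main() {
    println(1.plusNamed(2)) // prints 3
    println(myPlus(1, 2))   // prints 3
}
```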
What type does myPlus have? The answer is that there is a special type to represent extension functions that is called function type with a receiver. It looks similar to a normal function type, but it additionally specifies the receiver type before its arguments, and they are separated using a dot:
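For myPlus, the declaration with an explicitly written type could look like this small sketch, where Int.(Int) -> Int reads as: receiver Int, one Int argument, Int result.

```kotlin
// Function type with a receiver: the receiver type comes before the dot.
val myPlus: Int.(Int) -> Int = fun Int.(other: Int) = this + other

fun main() {
    println(myPlus(1, 2)) // prints 3
}
```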
Such a function can be defined using a lambda expression, specifically a lambda expression with receiver, since inside its scope the this keyword references the extension receiver (an instance of type Int in this case):
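A sketch of the same property defined with a lambda expression with receiver. Inside the lambda, this is the Int receiver and it is the single explicit argument.

```kotlin
// `this` is the receiver, `it` is the only argument.
val myPlus: Int.(Int) -> Int = { this + it }

fun main() {
    println(myPlus(1, 2)) // prints 3
}
```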
An object created using an anonymous extension function or lambda expression with a receiver can be invoked in three ways:
- Like a standard object, using the invoke method.
- Like a non-extension function.
- Same as a normal extension function.
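The three call styles can be sketched as follows, using a myPlus property defined as in the text (repeated here so the snippet stands on its own).

```kotlin
val myPlus: Int.(Int) -> Int = { this + it }

fun main() {
    println(myPlus.invoke(1, 2)) // like a standard object, prints 3
    println(myPlus(1, 2))        // like a non-extension function, prints 3
    println(1.myPlus(2))         // like a normal extension function, prints 3
}
```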
The most important trait of the function type with a receiver is that it changes what this refers to. To see how this trait can be used, think of a class that needs to be defined property by property:
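As an illustration, assume a hypothetical Dialog class with two properties and a show method (the class, its members and the printed format are made up for this sketch). Setting it up property by property looks like this:

```kotlin
// Hypothetical class used only for illustration.
class Dialog {
    var title: String = ""
    var text: String = ""
    fun show() = println("$title: $text")
}

fun main() {
    val dialog = Dialog()
    dialog.title = "My dialog"
    dialog.text = "Some text"
    dialog.show() // prints My dialog: Some text
}
```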
Referencing the dialog repeatedly is not very convenient, but if we were to use a lambda expression with receiver, it would be this, and we would be able to just skip it (because a receiver can be used implicitly):
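A sketch of that idea, again with a hypothetical Dialog class: the properties are set inside a lambda with receiver, so the dialog does not have to be named on every line.

```kotlin
// Hypothetical class used only for illustration.
class Dialog {
    var title: String = ""
    var text: String = ""
    fun show() = println("$title: $text")
}

fun main() {
    val dialog = Dialog()
    // Inside the lambda, `this` is the Dialog, so it can be omitted.
    val init: Dialog.() -> Unit = {
        title = "My dialog"
        text = "Some text"
    }
    init.invoke(dialog)
    dialog.show() // prints My dialog: Some text
}
```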
Following this path, someone might define a function that takes all the common parts of dialog creation and displaying and leaves only the setting of properties to the user:
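Such a function might be sketched as below. The name showDialog and the Dialog class are assumptions made for this example.

```kotlin
// Hypothetical class used only for illustration.
class Dialog {
    var title: String = ""
    var text: String = ""
    fun show() = println("$title: $text")
}

// Creates the dialog, lets the caller configure it, then shows it.
fun showDialog(init: Dialog.() -> Unit) {
    val dialog = Dialog()
    init.invoke(dialog)
    dialog.show()
}

fun main() {
    showDialog {
        title = "My dialog"
        text = "Some text"
    } // prints My dialog: Some text
}
```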
This is our simplest DSL example. Since most of this builder function is repeatable, it has been extracted into an apply function that can be used instead of defining a DSL builder for setting properties.
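With the standard library's apply, the dialog setup could be written as in this sketch (Dialog is still a hypothetical class). apply runs the lambda with the object as its receiver and then returns that same object.

```kotlin
// Hypothetical class used only for illustration.
class Dialog {
    var title: String = ""
    var text: String = ""
    fun show() = println("$title: $text")
}

fun main() {
    Dialog().apply {
        title = "My dialog"
        text = "Some text"
    }.show() // prints My dialog: Some text
}
```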
A function type with a receiver is the most basic building block of Kotlin DSLs. Let’s create a very simple DSL that allows us to make the following HTML table:
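One possible shape of that usage is sketched below. The function createTable and the cell text are illustrative. The builder classes under it are one possible implementation, included only so that the snippet compiles; the rest of this section develops them step by step.

```kotlin
// The DSL usage we are aiming for.
fun createTable(): TableBuilder = table {
    tr {
        for (i in 1..2) {
            td {
                +"This is column $i"
            }
        }
    }
}

// Supporting builders (a compact, assumed implementation).
fun table(init: TableBuilder.() -> Unit): TableBuilder = TableBuilder().apply(init)

class TableBuilder {
    var trs = listOf<TrBuilder>()
    fun tr(init: TrBuilder.() -> Unit) { trs = trs + TrBuilder().apply(init) }
}

class TrBuilder {
    var tds = listOf<TdBuilder>()
    fun td(init: TdBuilder.() -> Unit) { tds = tds + TdBuilder().apply(init) }
}

class TdBuilder {
    var text = ""
    operator fun String.unaryPlus() { text += this }
}

fun main() {
    val cells = createTable().trs.flatMap { row -> row.tds.map { it.text } }
    println(cells) // prints [This is column 1, This is column 2]
}
```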
Starting from the beginning of this DSL, we can see a function table. We are at the top-level without any receivers, so it needs to be a top-level function; however, inside its function argument you can see that we use tr. The tr function should be allowed only inside the table definition. This is why the table function argument should have a receiver with such a function. Similarly, the tr function argument needs to have a receiver that will contain a td function.
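The resulting signatures can be sketched as follows. The builder class names TableBuilder, TrBuilder and TdBuilder follow the naming used in this section; the bodies are deliberate placeholders that do nothing yet, because only the receiver types matter at this point.

```kotlin
// Top-level entry point: its argument has TableBuilder as the receiver.
// Placeholder body: `init` is not executed yet.
fun table(init: TableBuilder.() -> Unit): TableBuilder = TableBuilder()

class TableBuilder {
    // tr is available only where a TableBuilder is the receiver.
    fun tr(init: TrBuilder.() -> Unit) { /* placeholder */ }
}

class TrBuilder {
    // td is available only where a TrBuilder is the receiver.
    fun td(init: TdBuilder.() -> Unit) { /* placeholder */ }
}

class TdBuilder

fun main() {
    // The nesting type-checks because each lambda has the right receiver.
    table {
        tr {
            td { }
        }
    }
}
```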
How about a statement inside td that consists of a plus sign in front of a String? What is that? It is only a unary plus operator on a String, and it needs to be defined inside TdBuilder:
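A sketch of such a definition follows. Storing the content in a text property is an assumption made for this example.

```kotlin
class TdBuilder {
    var text = ""

    // Member extension: +"some string" is valid only where a TdBuilder is in scope.
    operator fun String.unaryPlus() {
        text += this // `this` is the String on which the plus was used
    }
}

fun main() {
    val td = TdBuilder()
    with(td) { +"Hello" }
    println(td.text) // prints Hello
}
```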
Now our DSL is well defined. To make it work properly, at every step we need to create a builder and initialize it using a function from the functional parameter (init in the example below). Then, the builder will contain all the data specified in this init function argument. This is the data we need. Therefore, we can either return this builder, or we can produce another object that holds this data. In this example, we’ll just return the builder. This is how the table function could be defined:
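A minimal sketch of that definition is shown below. The trs property on TableBuilder is an assumption about how rows are stored, and both classes are reduced to the bare minimum needed to compile.

```kotlin
class TrBuilder

class TableBuilder {
    var trs = listOf<TrBuilder>()
}

fun table(init: TableBuilder.() -> Unit): TableBuilder {
    val tableBuilder = TableBuilder() // create the builder
    init.invoke(tableBuilder)         // let the caller fill it in
    return tableBuilder               // return the builder holding the data
}

fun main() {
    val built = table { trs = trs + TrBuilder() }
    println(built.trs.size) // prints 1
}
```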
Notice that we can use the apply function, as shown before, to shorten this function:
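The shortened variant might look like this. The builder classes are the same minimal assumptions as before, repeated so the snippet is self-contained.

```kotlin
class TrBuilder

class TableBuilder {
    var trs = listOf<TrBuilder>()
}

// apply runs init with the new builder as receiver and returns the builder.
fun table(init: TableBuilder.() -> Unit) = TableBuilder().apply(init)

fun main() {
    val built = table { trs = trs + TrBuilder() }
    println(built.trs.size) // prints 1
}
```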
Similarly, we can use it in other parts of this DSL to make them more concise:
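Putting the pieces together, the whole builder could be sketched as follows. The property names trs, tds and text are assumptions about how the collected data is stored.

```kotlin
fun table(init: TableBuilder.() -> Unit) = TableBuilder().apply(init)

class TableBuilder {
    var trs = listOf<TrBuilder>()

    fun tr(init: TrBuilder.() -> Unit) {
        trs = trs + TrBuilder().apply(init)
    }
}

class TrBuilder {
    var tds = listOf<TdBuilder>()

    fun td(init: TdBuilder.() -> Unit) {
        tds = tds + TdBuilder().apply(init)
    }
}

class TdBuilder {
    var text = ""

    operator fun String.unaryPlus() {
        text += this
    }
}

fun main() {
    val built = table {
        tr {
            td { +"A" }
            td { +"B" }
        }
    }
    println(built.trs[0].tds.map { it.text }) // prints [A, B]
}
```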
This is a simple (but functional) DSL builder for HTML table creation. It could be improved using a DslMarker, as explained in Item 14: Consider referencing receivers explicitly.
When should we use DSLs?
DSLs give us a way to express any kind of information we want in a clear and structured way. The problem is that it is never clear to users how this information will be used later. In Anko, TornadoFX, or HTML DSL, we trust that the view will be correctly built based on our definitions, but it is often hard to track exactly how. Some more complicated uses can be hard to discover. The usage of DSLs can also be confusing to those not used to them, not to mention their maintenance. How they are defined can be a cost in terms of both performance and developer confusion. DSLs are overkill when we can use other simpler features instead. However, they are very useful when we need to express:
- complicated data structures,
- hierarchical structures,
- a huge amount of data.
Everything can be expressed without DSL-like structures by using builders or just constructors instead. DSLs are about boilerplate elimination of such structures. You should consider using DSLs when you see repeatable boilerplate code [1] and there are no simpler Kotlin features that can help.
Summary
A DSL is a special language inside a language. It can make it really simple to create complex objects and even whole object hierarchies, like HTML code or complex configuration files. On the other hand, DSL implementations might be confusing or difficult for new developers. They are also hard to define. This is why they should only be used when they offer real value, such as when creating really complex objects or complex object hierarchies. This is also why they are preferably defined in libraries rather than in projects. It is not easy to make a good DSL, but a well-defined one can make a project much better.
[1] Boilerplate code: repeatable code that does not contain any important information for a reader.