Kotlin offers a large number of features for building your own domain-specific language. In this article, we’ll discover the developer’s toolkit and implement a DSL for a real-world domain.
Introduction
SQL, RegExp, Gradle — what do they have in common? All of them represent an example of using domain-specific languages, or DSLs. Languages of this type aim to solve a specific problem, such as database querying, finding matches in the text, or build process description. Kotlin offers a large number of features for building your own domain-specific language. In this article, we’ll discover the developer’s toolkit and implement a DSL for a real-world domain.
I'll try to explain the language syntax as simply as possible, however, the article still appeals to developers who consider Kotlin as a language for custom DSL building. At the end of the article, I'll mention Kotlin drawbacks worth taking into account. The presented code snippets are relevant for Kotlin version 1.2.0 and are available on GitHub.
What is a DSL?
All programming languages can be divided into general-purpose and domain-specific languages. SQL, regular expressions, and build.gradle are often cited as examples of DSLs. These languages are limited in functionality, but they are able to effectively address a certain problem. They allow you to write imperative code (we shouldn't explain how to solve the problem) but in more or less a declarative way (we just declare the task) in order to obtain the solution based on the given data.
Let's say you have the standard process definition that can be eventually changed and enhanced but generally, you want to use it with different data and result formats. By creating a DSL, you create a flexible tool for solving various problems within one subject domain, no matter how the solution is obtained. So, you create a sort of API that, if mastered, can simplify your life and make it easier to keep the system up-to-date in the long-term.
The article deals with building an "embedded" DSL in Kotlin as a language implemented on the general-purpose language syntax. You can read more about it here.
Implementation Area
To my mind, one of the best ways to use and demonstrate Kotlin DSL is testing.
Suppose you've come from the Java world. How often have you been faced with declaring entity instances of an extensive data model? You've been likely using some builders or, even worse, special utility classes to fill the default values under the hood. How many overridden methods have you had? How often do you have to make little changes from default values, and how much effort does this require today?
If these questions stir up nothing but negative feelings, this article is for you.
That's the way we've been doing it for a long time in our project in the area of education: We used builders and utility classes to cover one of our most important modules (school timetable scheduling) with tests. Now this approach has given way to the Kotlin language and DSL, which is used to describe test scenarios and check the results. Throughout the article, you can see how we took advantage of Kotlin so that testing of the scheduling subsystem is not a torture anymore.
In this article, we will dive into details of constructing a DSL that helps to test an algorithm building teachers and students schedules.
Key Tools
Here are the basic language features that allow you to write cleaner code in Kotlin and create your own DSL. The table below demonstrates the main syntax enhancements that are worth using. Take a look at it carefully. If most these tools are unfamiliar to you, you might want to read the whole article. If you don't know one or two of them, feel free to fast forward to corresponding sections. In case there's nothing new for you, just skip to the DSL drawbacks review at the end of the article. Feel free to propose more tools in the comments.
Tool | DSL syntax | General syntax |
Operators overloading | collection += element | collection.add(element) |
Type aliases | typealias Point = Pair | Creating empty inheritor classes and other duct tape |
get /set methods convention | map["key"] = "value" | map.put("key", "value") |
Destructuring declaration | val (x, y) = Point(0, 0) | val p = Point(0, 0)
val x = p.first
val y = p.second |
Lambda out of parentheses | list.forEach { ... } | list.forEach({...}) |
Extension functions | mylist.first(); // there isn’t first() method in mylist collection | Utility functions |
Infix functions | 1 to "one " | 1.to("one") |
Lambda with receiver | Person().apply { name = "John" } | N/A |
Context control | @DslMarker | N/A |
Found anything new? If so, let’s move on.
I omitted delegated properties intentionally, as, in my opinion, they are useless for building DSLs, at least in our case. Using the features above, we can write cleaner code and get rid of voluminous "noisy" syntax, making development even more pleasant (could it be?).
I liked the comparison I met in the "Kotlin in Action" book: In natural languages, like English, sentences are built out of words, and grammar rules define the way of combining these words.
Similarly, in DSLs, one operation can be constructed from several method calls, and the type check guarantees that the construction makes sense. For sure, the order of callings cannot always be obvious, but it is entirely up to the DSL designer.
It is important to stress that this article examines an "embedded DSL", so it is based on a general-purpose language, which is Kotlin.
Final Result Example
Before we begin to design our own domain-specific language, I'd like to show you an example of what you'll be able to create after reading this article. The whole code is available on the GitHub repository at this link.
The DSL-based code below is designed to test the allocation of a teacher for students for the defined disciplines. In this example, we have a fixed timetable, and we check if classes are placed in both teacher's and students' schedules.
schedule {
data {
startFrom("08:00")
subjects("Russian",
"Literature",
"Algebra",
"Geometry")
student {
name = "Ivanov"
subjectIndexes(0, 2)
}
student {
name = "Petrov"
subjectIndexes(1, 3)
}
teacher {
subjectIndexes(0, 1)
availability {
monday("08:00")
wednesday("09:00", "16:00")
}
}
teacher {
subjectIndexes(2, 3)
availability {
thursday("08:00") + sameDay("11:00") + sameDay("14:00")
}
}
} assertions {
for ((day, lesson, student, teacher) in scheduledEvents) {
val teacherSchedule: Schedule = teacher.schedule
teacherSchedule[day, lesson] shouldNotEqual null
teacherSchedule[day, lesson]!!.student shouldEqual student
val studentSchedule = student.schedule
studentSchedule[day, lesson] shouldNotEqual null
studentSchedule[day, lesson]!!.teacher shouldEqual teacher
}
}
}
Toolkit
All the features for building a DSL have been listed above. Each of them is used in the example from the previous section. You can examine how to define such DSL constructs in my project on GitHub.
We will refer to this example again below in order to demonstrate the usage of different tools. Please bear in mind that the described approaches are for illustrative purposes only, and there may be other options to achieve the desired result.
So, let's discover these tools one by one. Some language features are most powerful when combined with the others, and the first in this list is the lambda out of parentheses.
Lambda Out of Parentheses
Documentation
A lambda expression is a code block that can be passed into a function, saved or called. In Kotlin, the lambda type is defined in the following way: (list of param types) -> returned type. Following this rule, the most primitive lambda type is () -> Unit, where Unit is an equivalent of Void with one important exception. At the end of the lambda, we don't have to write the return...
construction. Therefore, we always have a returned type, but in Kotlin, this is done implicitly.
Below is a basic example of assigning a lambda to a variable:
val helloPrint: (String) -> Unit = { println(it) }
Usually, the compiler tries to infer the type from the already known ones. In our case, there is a parameter. This lambda can be invoked as follows:
helloPrint("Hello")
In the example above, the lambda takes one parameter. Inside the lambda, this parameter is called by default, but if there were more, you would have to specify their names explicitly or use the underscore to ignore them. See such a case below:
val helloPrint: (String, Int) -> Unit = { _, _ -> println("Do nothing") }
helloPrint("Does not matter", 42)
The base tool — which you may already know from Groovy — is the lambda out of parentheses. Look again at the example from the very beginning of the article: Almost every use of curly brackets, except the standard constructions, is a lambda. There are at least two ways of making an x { ... }:-like construction:
- The object
x
and its unary operator invoke
(we'll discuss it later) - The function
x
that takes a lambda
In both cases, we use lambdas. Let's suppose there is a function x()
. In Kotlin, if a lambda is the last argument of a function, it can be placed out of parentheses. Furthermore, if a lambda is the only function's parameter, the parentheses can be omitted. As a result, the construction x({...})
can be transformed into x() {}
, and then, by omitting the parentheses, we get x {}
. This is how we declare such functions:
fun x( lambda: () -> Unit ) { lambda() }
In a concise form, a single-line function can be also written like this:
fun x( lambda: () -> Unit ) = lambda()
But what if x
is a class instance or an object instead of a function? Below is another interesting solution based on a fundamental domain-specific concept: operator overloading.
Operator Overloading
Documentation
Kotlin provides a wide, but somewhat limited variety of operators. The operator
modifier enables to define functions by conventions that will be called under certain conditions. As an obvious example, the plus function is executed if you use the "+
" operator between two objects. The complete list of operators can be found in the docs by the link above.
Let's consider a less trivial operator: invoke
. This article's main example starts with the schedule { }
construct that defines the code block responsible for testing the schedule. This construct is built in a slightly different way to the one mentioned above: We use the invoke
operator +
"lambda out of parentheses".
Having defined the invoke
operator, we can now use the schedule(...)
construct, although schedule
is an object. In fact, when you call schedule(...)
, the compiler interprets it as schedule.invoke(...)
. Let's see how schedule
is declared:
object schedule {
operator fun invoke(init: SchedulingContext.() -> Unit) {
SchedulingContext().init()
}
}
The schedule
identifier refers us to the only schedule
class instance (singleton) that is marked by the special keyword object
(you can find more information about such objects here). Thus, we call the invoke
method of the schedule
instance, receiving a lambda as a single parameter and placing it outside of the parentheses. As a result, the schedule {... }
construction matches the following:
schedule.invoke( { code inside lambda } )
However, if you look at the invoke
method carefully, you'll see not a common lambda, but a "lambda with receiver" or "lambda with context", whose type is defined as:
SchedulingContext.() -> Unit
Let's examine it in detail.
Lambda With Receiver
Documentation
Kotlin enables us to set a context for lambda expressions (context and receiver mean same thing here). Context
is just an object. The context type is defined together with the lambda expression type. Such a lambda acquires the properties of a non-static method in the context
class, but it only has access to the public
methods of this class.
While the type of a normal lambda is defined like ()
-> Unit, the type of a lambda with X context is defined as follows: X.()
-> Unit. And, if normal, lambdas can be called in a usual way:
val x : () -> Unit = {}
x()
Meanwhile, a lambda with context requires a context:
class MyContext
val x : MyContext.() -> Unit = {}
val c = MyContext()
c.x()
x(c)
I'd like to remind you that we have defined the invoke
operator in the schedule
object (see the preceding paragraph) that allows us to use the construct:
schedule { }
The lambda we are using has the context of the SchedulingContext
type. This class has a data method in it. As a result, we get the following construct:
schedule {
data {
}
}
As you have probably guessed, the data
method also takes a lambda with context. However, it is a different context. Thus, we get nested structures having several contexts inside simultaneously. To get the idea of how it works, let's remove all syntactic sugar from the example:
schedule.invoke({
this.data({ })
})
As you can see, it's all fairly simple. Let's take a look at the invoke operator implementation.
operator fun invoke(init: SchedulingContext.() -> Unit) {
SchedulingContext().init()
}
We call the constructor for the context SchedulingContext()
, and then, with the created object (context
), we call the lambda with the init
identifier that we passed as a parameter. This resembles a general function call quite a bit.
As a result, in one single line — SchedulingContext().init()
— we create the context and call the lambda passed to the operator. For more examples, consider the apply
and with
methods from Kotlin standard library.
In the last examples, we discovered the invoke
operator and its combination with other tools. Next, we will focus on the tool that is formally an operator and makes the code cleaner — the get/set methods convention.
get/set Method Conventions
Documentation
When creating a DSL, we can implement a way to access maps by one or more keys. Let's look at the example below:
availabilityTable[DayOfWeek.MONDAY, 0] = true
println(availabilityTable[DayOfWeek.MONDAY, 0])
In order to use square brackets, we need to implement the get
or set
methods (depending on what we need — read or update) with an operator
modifier. You can find an example of such an implementation in the Matrix
class on GitHub. It is a simple wrapper for matrix operations. Below, you can see a code snippet on the subject:
class Matrix(...) {
private val content: List>
operator fun get(i: Int, j: Int) = content[i][j]
operator fun set(i: Int, j: Int, value: T) { content[i][j] = value }
}
You can use any get
and set
parameter types, the only limit is your imagination. You are free to use one or more parameters for get
/set
functions to provide a convenient syntax for data access. Operators in Kotlin provide lots of interesting features that are described in the documentation.
Surprisingly, there is a Pair
class in the Kotlin standard library. A larger part of the developer community finds Pair
harmful: When you use Pair
, the logic of linking two objects is lost, and thus it is not transparent why they are paired. The two tools I'll show you next will demonstrate how to keep the pair making sense without creating additional classes.
Type Aliases
Documentation
Suppose we need a wrapper class for a geo point with integer coordinates. Actually, we could use the Pair
class, but having such a variable, we can lose the understanding of why we have paired these values.
A straightforward solution is either to create a custom class or something even worse. Kotlin enriches the developer's toolkit via type aliases with the following notation:
typealias Point = Pair
In fact, it is nothing but renaming a construct. Due to this approach, we don't need to create the Point
class anymore, as it would only duplicate the Pair
. Now we can create a point
in this way:
val point = Point(0, 0)
However, the Pair
class has two attributes, first
and second
, that we need to rename somehow to blur any differences between the needed Point
and the initial Pair
class. For sure, we are not able to rename the attributes themselves (however, you can create extension properties), but there is one more notable feature in our toolkit called destructuring declarations
.
Destructuring Declarations
Documentation
Let's consider a simple case: Suppose we have an object of the Point
type that is, as we already know, just a renamed type Pair. If we look at the Pair
class implementation in the standard library, we'll see that it has a data modifier that directs the compiler to implement componentN
methods within this class. Let's learn more about it.
For any class, we can define the componentN
operator that will be in charge of providing access to one of the object attributes. That means that calling point.component1
will be equal to calling point.first
. Why do we need such a duplication?
A destructuring declaration is a means of "decomposing" an object to variables. This functionality allows us to write constructions of the following kind:
val (x, y) = Point(0, 0)
We can declare several variables at once, but what values will they be assigned? That's why we need the generated componentN
methods: Using the index starting from 1
instead of N
, we can decompose an object to a set of its attributes. So, the above construct is equal to the following:
val pair = Point(0, 0)
val x = pair.component1()
val y = pair.component2()
Which, in turn, is equal to:
val pair = Point(0, 0)
val x = pair.first
val y = pair.second
Where first
and second
are the Point object attributes.
The for
loop in Kotlin looks as follows, where x
takes the values 1
, 2
, and 3
:
for(x in listOf(1, 2, 3)) { ... }
Pay attention to the assertions block in the DSL from the main example. I'll repeat a part of it for convenience:
for ((day, lesson, student, teacher) in scheduledEvents) { ... }
This line should be evident. We iterate through a collection of scheduledEvents
, each element of which is decomposed into four attributes.
Extension Functions
Documentation
Adding new methods to objects from third-party libraries or to the Java Collection Framework is what a lot of developers have been dreaming about. Now we have such opportunity. This is how we declare extension functions:
fun AvailabilityTable.monday(from: String, to: String? = null)
Compared to the standard method, we add the class name as a prefix to define the class we extend. In the example above, AvailabilityTable
is an alias for the Matrix
type and, as aliases in Kotlin are nothing but renaming, this declaration is equal to the one below, which is not always convenient:
fun Matrix.monday(from: String, to: String? = null)
Unfortunately, there's nothing we can do here except not using the tool or adding methods only to a specific context class. In this case, the magic only appears where you need it. Moreover, you can use such functions even for extending interfaces. As a good example, the first
method extends any iterable object
:
fun Iterable.first(): T
In essence, any collection based on the Iterable
interface, despite the element type, gets the first
method. It is worth mentioning that we can place an extension method in the context class and thereby have access to the extension method only in this very context (similarly to a lambda with a context). Furthermore, we can create extension functions for Nullable
types (the explanation of Nullable
types is out of the scope here, but for more details, see this link). For example, that's how we can use the function isNullOrEmpty
from the standard Kotlin library that extends the CharSequence
type:
val s: String? = null
s.isNullOrEmpty()
Below is this function's signature:
fun CharSequence?.isNullOrEmpty(): Boolean
When working with such Kotlin extension functions from Java, they are accessible as static
functions.
Infix Functions
Documentation
One more way to sugarcoat our syntax is to use infix functions. Simply said, this tool helps us to get rid of excessive code in simple cases. The assertions block from the main snippet demonstrates this tool's use case:
teacherSchedule[day, lesson] shouldNotEqual null
This construction is equivalent to the following:
teacherSchedule[day, lesson].shouldNotEqual(null)
In some cases, brackets and dots can be redundant. For such cases, we can use the infix
modifier for functions.
In the code above, the construct teacherSchedule[day, lesson]
returns a schedule
element, and the function shouldNotEqual
checks that this element is not null
.
To declare an infix
function, you need to:
- Use the infix modifier.
- Use only one parameter.
Combining the last two tools, we can get the code below:
infix fun T.shouldNotEqual(expected: T)
Note that the generic type, by default, is an Any
inheritor (not Nullable). However, in such cases, we cannot use null
— that's why you should explicitly define the type Any
.
Context Control
Documentation
When we use a lot of nested contexts, in the lower level, we risk getting a wild mix. Due to the lack of control, the following meaningless construct becomes possible:
schedule {
data {
data { }
}
}
Before Kotlin v.1.1, there had already been a way to avoid such a mess. It lies in creating custom method data in a nested context, DataContext
, and then marking it with the Deprecated annotation with the ERROR
level.
class DataContext {
@Deprecated(level = DeprecationLevel.ERROR, message = "Incorrect context")
fun data(init: DataContext.() -> Unit) {}
}
This approach eliminates the possibility of building an incorrect DSL. Nevertheless, a big number of methods in SchedulingContext
would have made us do a lot of routine work discouraging people from any context control.
Kotlin 1.1 offers a new control tool—- the @DslMarker
annotation. It is applied to your own annotations, which, in turn, are used for marking your contexts. Let's create an annotation and mark it with the new tool from our toolkit:
@DslMarker annotation class MyCustomDslMarker
Now we need to mark up the contexts. In the main example, these are SchedulingContext
and DataContext
. As far as we annotate both classes with the common DSL marker, the following happens:
@MyCustomDslMarker
class SchedulingContext { ... }
@MyCustomDslMarker
class DataContext { ... }
fun demo() {
schedule {
data {
}
}
}
With all the benefits of this cool approach saving so much time and effort, one problem still remains. Take a look at the main example, or, more precisely, to this part of the code:
schedule {
data {
student {
name = "Petrov"
}
...
}
}
On the third nesting level, we get the new context, Student
, which is, in fact, an entity class. So we are expected to annotate part of the data model with @MyCustomDslMarker
, which is incorrect in my opinion. In the Student
context, the data {}
calls are still forbidden, as the external DataContext
is still in its place, but the following constructions remain valid:
schedule {
data {
student {
student { }
}
}
}
Attempts to solve the problem with annotations will lead to mixing business logic and testing code, and that is certainly not the best idea. Three solutions are possible here:
@MyCustomDslMarker
class StudentContext(val owner: Student = Student()): IStudent by owner
- Using an extra context for creating a student — for example,
StudentContext
. This smells like madness and outweighs the benefits of @DslMarker
. - Creating interfaces for all entities — for example,
IStudent
(no matter the name), then creating stub contexts that implement these interfaces, and, finally, delegating the implementation to the student
objects. That verges on madness, too. - Using the
@Deprecated
annotation, as in the examples above. In this case, it looks like the best solution to use: We just add a deprecated extension method for all Identifiable
objects.
@Deprecated("Incorrect context", level = DeprecationLevel.ERROR)
fun Identifiable.student(init: () -> Unit) {}
To sum it up, combining various tools empowers you to build a very convenient DSL for your real-world purposes.
Cons of DSL Use
Let's try to be more objective concerning the use of DSLs in Kotlin and find the drawbacks of using DSLs in your project.
Reuse of DSL Parts
Imagine you have to reuse a part of your DSL. You want to take a piece of your code and make it easy to replicate. Of course, in the simplest cases with a single context, we can hide the repeatable part of a DSL in an extension function, but this will not work in most cases.
Perhaps you could point me toward some better options in the comments because for now, only two solutions come to my mind: adding "named callbacks" as a part of the DSL or spawning lambdas. The second one is easier but can result in a living hell when you try to understand the call sequence. The problem is that the more imperative behavior we have, the fewer benefits remain from a DSL approach.
This?! It?!
Nothing's easier than losing the meaning of the current "this" and "it" while working with your DSL. If you use "it" as a default parameter name where it can be replaced by a meaningful name, you'd better do so. It's better to have a bit of obvious code than non-obvious bugs in it.
The notion of context can confuse someone who has never faced it. Now, as you have "lambdas with receiver" in your toolkit, unexpected methods inside DSLs are less likely to appear. Just remember, in worst case, you can set the context to a variable, for example, val mainContext = this
.
Nesting
This issue relates closely to the first drawback in this list. The use of nested in nested in nested constructions shifts all your meaningful code to the right. Up to a certain limit, this shift may be acceptable, but when it's shifted too much, it would be reasonable to use lambdas. Of course, this will not decrease your DSL's readability, but it can be a compromise in case your DSL implies not only compact structures, but also some logic. When you create tests with a DSL (the case covered by this article), this issue is not acute, as the data is described with compact structures.
Where are the Docs, Lebowski?
When you first try to cope with somebody's DSL, you will almost certainly wonder where the documentation is. At this point, I believe that if your DSL is to be used by others, usage examples will be the best docs. Documentation itself is important as an additional reference, but it is not very friendly to a reader. A domain-specific practitioner will normally start with the question "What do I call to get the result?" So in my experience, the examples of similar cases will better speak for themselves.
Conclusion
We've got an overview of the tools that enable you to design your own custom domain-specific language with ease. I hope you now see how it works. Feel free to suggest more tools in the comments.
It is important to remember that DSL is not a panacea. Of course, when you get such a powerful hammer, everything looks like a nail, but it isn't. Start small, create a DSL for tests, learn from your mistakes, and then, once you're more experienced, consider other usage areas.
About the Author
I'm a software developer at Haulmont. Last year, my area of responsibility was table scheduling in education. I have applied the described above technologies while developing applications based on CUBA Platform Java framework. I spend free time with Spring, with Kotlin DSL for Telegram bot development and of course, with my wife.
History
- 26th December, 2017: Initial version