Collection processing in Kotlin: Basic functions
This is a chapter from the book Functional Kotlin. You can find it on LeanPub or Amazon. It is also available as a course.
One of the most useful applications of functional programming is collection processing: operations on collections of elements. This is generally one of the most common tasks in programming. This should come as no surprise. Just look at any advanced programming project, and you will likely see plenty of collections. An online shop? Products, sellers, delivery methods, payment methods... A bank application? Accounts, transactions, contacts, offers... it goes on and on. Consider internet search results, folder structures, task managers, topics, and answers on forums... Collections are everywhere in nearly all the services we use.
These collections often need to be transformed, either to other collections or to some aggregate results. This is what we need collection processing methods for: to transform collections.
Collection processing is not a small deal. For years, it has been a primary selling point of Functional Programming[0]. Even the name of the Lisp programming language[1] stands for "list processing". Likewise, Haskell is famous for its powerful collection processing methods. These amazing capabilities are also a selling point of Scala, where even Option, a type used for null safety, can be viewed as a collection of zero or one element to be processed as a part of a list comprehension structure. Scala has strongly influenced the Java community and promoted a functional style, especially for processing collections. This is one of the biggest reasons why so many previously Object-Oriented languages introduced support for Functional Programming features: they wanted to support functional-style collection processing. Nowadays, most modern languages support such processing. This includes Kotlin, which has a huge library of collection processing methods that help us make processing effective and efficient.
To see the power of collection processing methods in a practical case, consider a situation in which we need to fetch a list of news items but show only those that are visible, put them in the correct order, and map them to the proper view elements. Without functional-style collection processing, this is what these transformations look like:
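A minimal sketch of such an imperative version. The News and NewsItemAdapter types, their fields, and the "newest first" ordering are hypothetical and serve only as an illustration:

```kotlin
// Hypothetical types, used only for illustration.
data class News(val title: String, val visible: Boolean, val publishedAt: Long)
data class NewsItemAdapter(val news: News)

// Imperative version: filtering, sorting, and mapping are done by hand.
fun toAdaptersImperative(newsList: List<News>): List<NewsItemAdapter> {
    // Step 1: keep only visible news
    val visibleNews = mutableListOf<News>()
    for (n in newsList) {
        if (n.visible) {
            visibleNews.add(n)
        }
    }
    // Step 2: sort in place, newest first (assumed ordering)
    visibleNews.sortByDescending { it.publishedAt }
    // Step 3: wrap each item in a view element
    val adapters = mutableListOf<NewsItemAdapter>()
    for (n in visibleNews) {
        adapters.add(NewsItemAdapter(n))
    }
    return adapters
}

fun main() {
    val news = listOf(
        News("A", visible = true, publishedAt = 1),
        News("B", visible = false, publishedAt = 2),
        News("C", visible = true, publishedAt = 3)
    )
    println(toAdaptersImperative(news).map { it.news.title }) // [C, A]
}
```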
With collection processing[2], this can be replaced with the following code:
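A minimal sketch of the functional-style version, assuming the same hypothetical News and NewsItemAdapter types and an assumed "newest first" ordering:

```kotlin
// Hypothetical types, used only for illustration.
data class News(val title: String, val visible: Boolean, val publishedAt: Long)
data class NewsItemAdapter(val news: News)

// Each step is one concrete transformation on the list.
fun toAdapters(newsList: List<News>): List<NewsItemAdapter> =
    newsList
        .filter { it.visible }                   // keep only visible news
        .sortedByDescending { it.publishedAt }   // newest first (assumed ordering)
        .map { NewsItemAdapter(it) }             // wrap in view elements

fun main() {
    val news = listOf(
        News("A", visible = true, publishedAt = 1),
        News("B", visible = false, publishedAt = 2),
        News("C", visible = true, publishedAt = 3)
    )
    println(toAdapters(news).map { it.news.title }) // [C, A]
}
```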
Such a notation is not only shorter, but also more readable. Every step performs a concrete transformation on the list of elements. Here is a visualization of the above process:
Being proficient in using functional-style collection processing is one of the hallmarks of a good Kotlin developer. It requires knowing useful methods and having experience in using them for a variety of problems. In this chapter, we will learn about the methods I find most useful, and then we will look at how they can be used together to achieve powerful collection processing.
Most collection processing functions are very simple under the hood. For the simplest ones, I will show their simplified implementations before their explanations so that you can enjoy figuring out how these functions work before learning about them.
forEach and onEach
The forEach function is an alternative to a simple for-loop: both invoke an operation on every element. Choosing between the two is often a matter of personal preference. The advantage of forEach over a for-loop is that forEach can be called conditionally with a safe-call (?.) and is better suited to multiline expressions. A for-loop is generally considered more intuitive by less experienced developers.
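The sketch below shows a simplified re-implementation of forEach next to a plain for-loop and a safe-call usage. The names myForEach and collectUppercase are made up for this illustration; the standard library version is additionally declared inline.

```kotlin
// Simplified re-implementation, for illustration only.
fun <T> Iterable<T>.myForEach(action: (T) -> Unit) {
    for (element in this) action(element)
}

// Hypothetical helper: the safe-call skips the iteration when names is null.
fun collectUppercase(names: List<String>?): List<String> {
    val result = mutableListOf<String>()
    names?.forEach { result.add(it.uppercase()) }
    return result
}

fun main() {
    val names = listOf("Ann", "Bob")
    for (name in names) println(name)   // prints Ann, then Bob
    names.forEach { println(it) }       // prints Ann, then Bob
    names.myForEach { println(it) }     // prints Ann, then Bob
    println(collectUppercase(names))    // [ANN, BOB]
    println(collectUppercase(null))     // []
}
```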
Methods like filter or flatMap will be covered later.
forEach returns Unit, so it is a terminal operation. This means no further steps are possible in the pipeline. However, in some situations, we need to invoke an operation on each element in the middle of collection processing. In such cases, we use onEach, which also invokes an operation on each element, but it returns the same collection it is invoked on.
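As a small sketch of onEach in the middle of a pipeline, the hypothetical function below records the elements that passed a filter before mapping them further. The function name and the log parameter are assumptions made for this example.

```kotlin
// Hypothetical function: log is an assumed sink for intermediate values.
fun lengthsOfNamesStartingWithA(names: List<String>, log: MutableList<String>): List<Int> =
    names
        .filter { it.startsWith("A") }
        .onEach { log.add(it) }   // side effect only; the list is passed on unchanged
        .map { it.length }

fun main() {
    val log = mutableListOf<String>()
    val result = lengthsOfNamesStartingWithA(listOf("Alice", "Bob", "Adam"), log)
    println(result) // [5, 4]
    println(log)    // [Alice, Adam]
}
```

With forEach in place of onEach, the chain would end at that step, because forEach returns Unit.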
filter
Very often, we are interested in only certain elements in a collection. For instance, we might have a list of all users but be interested only in those that are active. Alternatively, we might have a list of articles but want to show only those that are public. In such cases, we use the filter method, which returns a collection of only the elements that satisfy its predicate.
The filter method can limit the number of elements; therefore, the new collection might be smaller or even empty, but the elements in it are the same elements as in the original one.
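A simplified re-implementation can make this behavior concrete. The name myFilter is made up for this sketch; the standard library version is inline and more general.

```kotlin
// Simplified re-implementation, for illustration only.
fun <T> Iterable<T>.myFilter(predicate: (T) -> Boolean): List<T> {
    val result = mutableListOf<T>()
    for (element in this) {
        if (predicate(element)) result.add(element)
    }
    return result
}

fun main() {
    val numbers = listOf(1, 2, 5, 10, 12)
    println(numbers.filter { it in 2..10 })    // [2, 5, 10]
    println(numbers.myFilter { it in 2..10 })  // [2, 5, 10]
    println(numbers.filter { it > 100 })       // []
}
```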
The name "filter" is a bit tricky because in English, we often use it to mean "filter out" (like "sediment filter" or "UV filter"). When we use filter in programming, we are interested not in what is filtered out but in what is retained. I understand the filter function as "filter to keep the elements that...". For instance, in the above example, I would read "filter to keep the elements that are in the range from 2 to 10". You can also think of filtering water: when you do that, you want to get clear water as a result.
There is also filterNot, which works similarly but keeps the elements that do not satisfy its predicate. So, filterNot(op) gives the same result as filter { !op(it) }.
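A short sketch of this equivalence, using a hypothetical isEven predicate:

```kotlin
// Hypothetical predicate, used only for illustration.
fun isEven(n: Int): Boolean = n % 2 == 0

fun main() {
    val numbers = listOf(1, 2, 3, 4, 5)
    println(numbers.filter(::isEven))        // [2, 4]
    println(numbers.filterNot(::isEven))     // [1, 3, 5]
    println(numbers.filter { !isEven(it) })  // [1, 3, 5]
}
```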
map
One of the most popular collection processing functions is map, which we use to transform all elements in a collection.
map produces a collection of the same size, but the elements might be transformed and their type might be different from the original collection.
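The following sketch shows a simplified re-implementation of map. The name myMap is made up for this illustration; the standard library version is inline.

```kotlin
// Simplified re-implementation, for illustration only.
fun <T, R> Iterable<T>.myMap(transform: (T) -> R): List<R> {
    val result = mutableListOf<R>()
    for (element in this) result.add(transform(element))
    return result
}

fun main() {
    println(listOf(1, 2, 3).map { it * it })              // [1, 4, 9]
    println(listOf(1, 2, 3).myMap { it * it })            // [1, 4, 9]
    // The element type can change: List<String> becomes List<Int>.
    println(listOf("a", "bb", "ccc").map { it.length })   // [1, 2, 3]
}
```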
This transformation might be a simple modification, but often it is a transformation from one type to another. For instance, let's say that you are implementing an online shop: you have a list of offers to display, but you need to transform these simple data holders into some view elements that you can display.
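As a sketch of the online shop case, assume hypothetical Offer and OfferView classes (neither comes from a real framework):

```kotlin
// Hypothetical data holder and view element, used only for illustration.
data class Offer(val name: String, val price: Int)
data class OfferView(val label: String)

fun toViews(offers: List<Offer>): List<OfferView> =
    offers.map { OfferView("${it.name}: ${it.price} USD") }

fun main() {
    val offers = listOf(Offer("Laptop", 1000), Offer("Mouse", 20))
    println(toViews(offers).map { it.label }) // [Laptop: 1000 USD, Mouse: 20 USD]
}
```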
mapNotNull
Whenever I need to optimize collection processing in performance-critical code, one of my greatest friends is mapNotNull. It is basically the same as map, but it also skips elements for which the transformation returns null. It is useful when you want to implement one processing step that is both transforming and filtering. For instance, you have a list of strings, and you want to transform them into integers, but you want to skip those that cannot be parsed.
Here are a few practical examples of how this function can be used:
flatMap
Among collection processing functions, there is a famous quartet of functions every developer should know: forEach, filter, map and... flatMap. These are as idiomatic to functional collection processing as for and while loops are to imperative programming.
flatMap first maps elements into another collection of elements, then it flattens them. To make it possible to flatten elements, flatMap requires its transformation to return something that is iterable, for instance a list or a set.
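A simplified re-implementation illustrates the "map, then flatten" behavior. The name myFlatMap is made up for this sketch; the standard library version is inline and has further overloads.

```kotlin
// Simplified re-implementation, for illustration only.
fun <T, R> Iterable<T>.myFlatMap(transform: (T) -> Iterable<R>): List<R> {
    val result = mutableListOf<R>()
    // Each transformation result is appended element by element.
    for (element in this) result.addAll(transform(element))
    return result
}

fun main() {
    val numbers = listOf(1, 2, 3)
    println(numbers.flatMap { listOf(it, it * 10) })   // [1, 10, 2, 20, 3, 30]
    println(numbers.myFlatMap { listOf(it, it * 10) }) // [1, 10, 2, 20, 3, 30]
}
```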
In practice, the only difference between flatMap and map is this flattening. So, if map returns List<List<T>>, flatMap returns List<T>. This difference can be eliminated with the flatten method on Iterable<Iterable<T>> (so flatMap(tr) gives the same result as map(tr).flatten()).
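A short sketch comparing the three calls on a list of strings; the helper names nestedLetters and letters are assumptions made for this example:

```kotlin
// Hypothetical helpers, used only for illustration.
fun nestedLetters(words: List<String>): List<List<Char>> = words.map { it.toList() }
fun letters(words: List<String>): List<Char> = words.flatMap { it.toList() }

fun main() {
    val words = listOf("ab", "cd")
    println(nestedLetters(words))           // [[a, b], [c, d]]
    println(letters(words))                 // [a, b, c, d]
    println(nestedLetters(words).flatten()) // [a, b, c, d]
}
```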
String.toList() transforms a string into a list of characters.
We typically use flatMap to extract elements from an object that holds a list of elements. For instance, we have a list of schools, each of which has a list of students, but we are interested in all the students. Another example might be if we have a list of departments, each of which has a list of employees, but we're interested in the employees.
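The schools case can be sketched as follows, assuming hypothetical School and Student classes:

```kotlin
// Hypothetical classes, used only for illustration.
data class Student(val name: String)
data class School(val name: String, val students: List<Student>)

fun allStudents(schools: List<School>): List<Student> =
    schools.flatMap { it.students }

fun main() {
    val schools = listOf(
        School("First", listOf(Student("Ann"), Student("Bob"))),
        School("Second", listOf(Student("Cid"))),
        School("Empty", emptyList())
    )
    println(allStudents(schools).map { it.name }) // [Ann, Bob, Cid]
}
```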
[0] There is an influential 1991 paper, "Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire", that pushed the idea of common recursion schemes (map, fold, etc.) to separate the "what" from the "how" of processing using functional algebra.
[1] Lisp is one of the oldest programming languages still in widespread use today. It is often described as the father of all functional programming languages. Today, the best-known general-purpose Lisp dialects are Clojure, Common Lisp, and Scheme.
[2] In this chapter, I will use the term "collection processing" as shorthand for "functional-style collection processing".