Collection processing in Kotlin: Basic functions

This is a chapter from the book Functional Kotlin. You can find it on LeanPub or Amazon. It is also available as a course.

One of the most useful applications of functional programming is collection processing: operations on collections of elements. This is generally one of the most common tasks in programming. This should come as no surprise. Just look at any advanced programming project, and you will likely see plenty of collections. An online shop? Products, sellers, delivery methods, payment methods... A bank application? Accounts, transactions, contacts, offers... it goes on and on. Consider internet search results, folder structures, task managers, topics, and answers on forums... Collections are everywhere in nearly all the services we use.

These collections often need to be transformed, either to other collections or to some aggregate results. This is what we need collection processing methods for: to transform collections.

Collection processing is no small matter. For years, it has been a primary selling point of functional programming⁰. Even the name of the Lisp programming language¹ stands for "list processing". Likewise, Haskell is famous for its powerful collection processing methods. These capabilities are also a selling point of Scala, where even Option, a type used for null safety, can be viewed as a collection of zero or one element and processed as part of a for-comprehension. Scala has strongly influenced the Java community and promoted a functional style, especially for processing collections. This is one of the biggest reasons why so many previously object-oriented languages introduced support for functional programming features: they wanted to support functional-style collection processing. Nowadays, most modern languages support such processing. This includes Kotlin, which has a huge library of collection processing methods that help us make processing effective and efficient.

To see the power of collection processing methods in a practical case, consider a situation in which we fetch a list of news items but need to show only those that are visible, put them in the correct order, and map them to the proper view elements. Without functional-style collection processing, this is what these transformations look like:

    val visibleNews = mutableListOf<News>()
    for (n in news) {
        if (n.visible) {
            visibleNews.add(n)
        }
    }
    Collections.sort(visibleNews) { n1, n2 ->
        n2.publishedAt - n1.publishedAt
    }
    val newsItemAdapters = mutableListOf<NewsItemAdapter>()
    for (n in visibleNews) {
        newsItemAdapters.add(NewsItemAdapter(n))
    }

With collection processing², this can be replaced with the following code:

    val newsItemAdapters = news
        .filter { it.visible }
        .sortedByDescending { it.publishedAt }
        .map(::NewsItemAdapter)

Such a notation is not only shorter, but also more readable. Every step performs a concrete transformation on the list of elements. Here is a visualization of the above process:
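The steps can also be traced in code by printing the result of each transformation. The sketch below is only an illustration: the News and NewsItemAdapter classes, their fields, and the toAdapters function are hypothetical stand-ins, and publishedAt is assumed to be a plain Int.

```kotlin
// Hypothetical model classes, assumed for illustration only.
data class News(val title: String, val visible: Boolean, val publishedAt: Int)
data class NewsItemAdapter(val news: News)

// The same pipeline as above, wrapped in a function.
fun toAdapters(news: List<News>): List<NewsItemAdapter> = news
    .filter { it.visible }
    .sortedByDescending { it.publishedAt }
    .map(::NewsItemAdapter)

fun main() {
    val news = listOf(
        News("A", visible = true, publishedAt = 1),
        News("B", visible = false, publishedAt = 2),
        News("C", visible = true, publishedAt = 3),
    )

    // Step 1: keep only visible items
    val visible = news.filter { it.visible }
    println(visible.map { it.title }) // [A, C]

    // Step 2: newest first
    val sorted = visible.sortedByDescending { it.publishedAt }
    println(sorted.map { it.title }) // [C, A]

    // Step 3: wrap each item in a view element
    val adapters = sorted.map(::NewsItemAdapter)
    println(adapters == toAdapters(news)) // true
}
```

Each intermediate value is a new list, so the original news list is left untouched.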

Being proficient in using functional-style collection processing is one of the hallmarks of a good Kotlin developer. It requires knowing useful methods and having experience in using them for a variety of problems. In this chapter, we will learn about the methods I find most useful, and then we will look at how they can be used together to achieve powerful collection processing.

Most collection processing functions are very simple under the hood. For the simplest ones, I will show their simplified implementations before their explanations so that you can enjoy figuring out how these functions work before learning about them.

`forEach` and `onEach`

    // forEach implementation from Kotlin stdlib
    inline fun <T> Iterable<T>.forEach(action: (T) -> Unit) {
        for (element in this) action(element)
    }

    // simplified onEach implementation from Kotlin stdlib
    inline fun <T, C : Iterable<T>> C.onEach(
        action: (T) -> Unit
    ): C {
        for (element in this) action(element)
        return this
    }

The forEach function is an alternative to a simple for-loop: both invoke an operation on every element. Choosing between the two is often a matter of personal preference. The advantage of forEach over a for-loop is that forEach can be called conditionally with a safe-call (?.) and is better suited to multiline expressions. A for-loop is generally considered more intuitive for less experienced developers.
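To illustrate the safe-call advantage, here is a minimal sketch. The findTags helper and both collect functions are hypothetical names introduced only for this example; findTags is assumed to return null when there is nothing to process.

```kotlin
// Hypothetical helper: returns null when the text contains no tags.
fun findTags(text: String): List<String>? =
    text.split(" ")
        .filter { it.startsWith("#") }
        .takeIf { it.isNotEmpty() }

// With a for-loop, the nullable list needs an explicit null check first.
fun collectWithFor(text: String): List<String> {
    val collected = mutableListOf<String>()
    val tags = findTags(text)
    if (tags != null) {
        for (tag in tags) collected.add(tag)
    }
    return collected
}

// With forEach, a safe call skips the whole iteration when the list is null.
fun collectWithForEach(text: String): List<String> {
    val collected = mutableListOf<String>()
    findTags(text)?.forEach { collected.add(it) }
    return collected
}

fun main() {
    println(collectWithFor("learn #kotlin and #fp")) // [#kotlin, #fp]
    println(collectWithForEach("learn #kotlin and #fp")) // [#kotlin, #fp]
    println(collectWithForEach("no tags here")) // []
}
```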

    // Without variable, this code would be hard to read
    val messagesToSend = users.filter { it.isActive }
        .flatMap { it.remainingMessages }
        .filter { it.isToBeSent }
    for (message in messagesToSend) {
        sendMessage(message)
    }

    // better
    users.filter { it.isActive }
        .flatMap { it.remainingMessages }
        .filter { it.isToBeSent }
        .forEach { sendMessage(it) }

Methods like filter or flatMap will be covered later.

forEach returns Unit, so it is a terminal operation. This means no further steps are possible in the pipeline. However, in some situations, we need to invoke an operation on each element in the middle of collection processing. In such cases, we use onEach, which also invokes an operation on each element, but it returns the same collection it is invoked on.

    users
        .filter { it.isActive }
        .onEach { log("Sending messages for user $it") }
        .flatMap { it.remainingMessages }
        .filter { it.isToBeSent }
        .forEach { sendMessage(it) }
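The difference in return values can be checked directly. This is a small sketch under the assumption that we only want to record which elements were visited; the visitAll function is a hypothetical name used for illustration.

```kotlin
// Hypothetical helper: records every visited element and
// returns whatever onEach returns.
fun <T> visitAll(items: List<T>, visited: MutableList<T>): List<T> =
    items.onEach { visited.add(it) }

fun main() {
    val numbers = listOf(1, 2, 3)
    val visited = mutableListOf<Int>()

    val returned = visitAll(numbers, visited)
    println(visited) // [1, 2, 3]
    // onEach returns the very collection it was called on, not a copy
    println(returned === numbers) // true

    // forEach returns Unit, so nothing can be chained after it
    val result: Unit = numbers.forEach { }
    println(result) // kotlin.Unit
}
```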

`filter`

    // simplified filter implementation from Kotlin stdlib
    inline fun <T> Iterable<T>.filter(
        predicate: (T) -> Boolean
    ): List<T> {
        val destination = ArrayList<T>()
        for (element in this) {
            if (predicate(element)) {
                destination.add(element)
            }
        }
        return destination
    }

Very often, we are interested in only certain elements in a collection. For instance, we might have a list of all users but be interested only in those that are active. Alternatively, we might have a list of articles but want to show only those that are public. In such cases, we use the filter method, which returns a collection of only the elements that satisfy its predicate.

    val activeUsers = users
        .filter { it.isActive }

    val publicArticles = articles
        .filter { it.visibility == PUBLIC }

The filter method can limit the number of elements; therefore, the new collection might be smaller or even empty, but the elements in it are the same elements as in the original one.

    fun main() {
        val old = listOf(1, 2, 6, 11)
        val new = old.filter { it in 2..10 }
        println(new) // [2, 6]
    }

The name "filter" is a bit tricky because in English, we often use it to mean "filter out" (like "sediment filter" or "UV filter"). When we use a filter in programming, we are interested not in what is filtered out but in what is retained. I understand the filter function as "filter to keep the elements that...". For instance, in the above example, I would read "filter to keep the elements that are in the range from 2 to 10". You can also think of filtering water: when you do that, you want to get clear water as a result.

There is also filterNot, which works similarly but keeps the elements that do not satisfy its predicate. So, filterNot(op) gives the same result as filter { !op(it) }.

    fun main() {
        val old = listOf(1, 2, 6, 11)
        val new = old.filterNot { it in 2..10 }
        println(new) // [1, 11]
    }
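The following sketch shows filter and filterNot side by side on the same predicate, and checks that the retained elements are the original objects rather than copies. The User class and the splitByActivity function are hypothetical and exist only for this illustration.

```kotlin
// Hypothetical model class, assumed for illustration only.
data class User(val name: String, val isActive: Boolean)

// Returns the users that satisfy the predicate and those that do not.
fun splitByActivity(users: List<User>): Pair<List<User>, List<User>> =
    users.filter { it.isActive } to users.filterNot { it.isActive }

fun main() {
    val users = listOf(
        User("Ann", isActive = true),
        User("Bob", isActive = false),
        User("Cale", isActive = true),
    )
    val (active, inactive) = splitByActivity(users)

    println(active.map { it.name }) // [Ann, Cale]
    println(inactive.map { it.name }) // [Bob]

    // Together, the two results cover every original element
    println(active.size + inactive.size == users.size) // true

    // The elements are the same objects as in the original list
    println(active[0] === users[0]) // true

    // filterNot(op) gives the same result as filter { !op(it) }
    println(inactive == users.filter { !it.isActive }) // true
}
```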

`map`

    // simplified map implementation from Kotlin stdlib
    inline fun <T, R> Iterable<T>.map(
        transform: (T) -> R
    ): List<R> {
        val size = if (this is Collection<*>) this.size else 10
        val destination = ArrayList<R>(size)
        for (element in this) {
            destination.add(transform(element))
        }
        return destination
    }

One of the most popular collection processing functions is map, which we use to transform all elements in a collection.

    fun main() {
        val old = listOf(1, 2, 3, 4)
        val new = old.map { it * it }
        println(new) // [1, 4, 9, 16]
    }

map produces a collection of the same size, but the elements might be transformed and their type might be different from the original collection.

    fun main() {
        val names: List<String> = listOf("Alex", "Bob", "Carol")
        val nameSizes: List<Int> = names.map { it.length }
        println(nameSizes) // [4, 3, 5]
    }

This transformation might be a simple modification, but often it is a transformation from one type to another. For instance, let's say that you are implementing an online shop: you have a list of offers to display, but you need to transform these simple data holders into some view elements that you can display.

    // Make users that are 1 year older than before
    val olderUsers = users
        .map { it.copy(age = it.age + 1) }

    // Transform offers into offer views
    val offerViews = offers
        .map { OfferView(it) }
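A runnable version of these two transformations might look as follows. The User, Offer, and OfferView classes, their fields, and the makeOlder and toViews functions are assumptions made for this sketch; the original snippet does not define them.

```kotlin
// Hypothetical model classes, assumed for illustration only.
data class User(val name: String, val age: Int)
data class Offer(val title: String, val price: Int)
data class OfferView(val offer: Offer) {
    val label: String get() = "${offer.title}: ${offer.price}"
}

// Same-type transformation: every user becomes one year older.
fun makeOlder(users: List<User>): List<User> =
    users.map { it.copy(age = it.age + 1) }

// Type-changing transformation: List<Offer> becomes List<OfferView>.
fun toViews(offers: List<Offer>): List<OfferView> =
    offers.map(::OfferView)

fun main() {
    val users = listOf(User("Alex", 30), User("Bob", 41))
    println(makeOlder(users)) // [User(name=Alex, age=31), User(name=Bob, age=42)]
    // map builds a new list; the original one is unchanged
    println(users) // [User(name=Alex, age=30), User(name=Bob, age=41)]

    val views = toViews(listOf(Offer("Book", 20), Offer("Pen", 3)))
    println(views.map { it.label }) // [Book: 20, Pen: 3]
}
```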

`mapNotNull`

    // simplified mapNotNull implementation from Kotlin stdlib
    // R is non-nullable, while transform may return null
    inline fun <T, R : Any> Iterable<T>.mapNotNull(
        transform: (T) -> R?
    ): List<R> {
        val size = if (this is Collection<*>) this.size else 10
        val destination = ArrayList<R>(size)
        for (element in this) {
            val result = transform(element)
            if (result != null) destination.add(result)
        }
        return destination
    }

Whenever I need to optimize collection processing in performance-critical code, one of my greatest friends is mapNotNull. It is basically the same as map, but also skips null values. It is useful when you want to implement one processing step that is both transforming and filtering. For instance, you have a list of strings, and you want to transform them into integers, but you want to skip those that cannot be parsed.

    fun main() {
        val old = listOf("1", "A", "2", "3", "B", "4")
        println(old.mapNotNull { it.toIntOrNull() })
        // [1, 2, 3, 4]

        val numbers = listOf(-1, 2, -3, 4)
        println(numbers.mapNotNull { prod(it) })
        // [2, 24]
        println(numbers.mapNotNull { if (it > 0) it else null })
        // [2, 4]
    }

    fun prod(num: Int): Int? {
        if (num <= 0) return null
        // Can be simplified with fold, that we will learn later
        var res = 1
        for (i in 1..num) {
            res *= i
        }
        return res
    }

Here are a few practical examples of how this function can be used:

    // Transforming a list of products into their categories,
    // but skipping those that do not have a category.
    val categories: Set<Category> = products
        .mapNotNull { productCategories[it] }
        .toSet()

    // Getting exchange urls from a list of exchanges,
    // but skipping those that cannot be found.
    // toMap transforms a list of pairs into a map.
    val exchangeUrls: Map<Exchange, String> = exchanges
        .mapNotNull { exchange ->
            val url = getExchangeUrl(exchange)
                ?: return@mapNotNull null
            exchange to url
        }
        .toMap()
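To see why mapNotNull counts as a single step that both transforms and filters, it can be compared with the two-step alternative of map followed by filterNotNull. The function names below are hypothetical; both functions are expected to give the same result, but the two-step variant builds an intermediate list of nullable values first.

```kotlin
// One step: transform and skip nulls at the same time.
fun parseAll(inputs: List<String>): List<Int> =
    inputs.mapNotNull { it.toIntOrNull() }

// Two steps: map creates an intermediate List<Int?>,
// then filterNotNull creates the final List<Int>.
fun parseAllInTwoSteps(inputs: List<String>): List<Int> =
    inputs
        .map { it.toIntOrNull() }
        .filterNotNull()

fun main() {
    val inputs = listOf("1", "A", "2", "3", "B", "4")
    println(parseAll(inputs)) // [1, 2, 3, 4]
    println(parseAllInTwoSteps(inputs)) // [1, 2, 3, 4]
    println(parseAll(inputs) == parseAllInTwoSteps(inputs)) // true
}
```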

`flatMap`

    // simplified flatMap implementation from Kotlin stdlib
    inline fun <T, R> Iterable<T>.flatMap(
        transform: (T) -> Iterable<R>
    ): List<R> {
        val size = if (this is Collection<*>) this.size else 10
        val destination = ArrayList<R>(size)
        for (element in this) {
            destination.addAll(transform(element))
        }
        return destination
    }

Among collection processing functions, there is a famous quartet of functions every developer should know: forEach, filter, map and... flatMap. These are as idiomatic to functional collection processing as for and while loops are to imperative programming.

flatMap first maps elements into another collection of elements, then it flattens them. To make it possible to flatten elements, flatMap requires its transformation to return something that is iterable, for instance a list or a set.

    fun main() {
        val old = listOf(1, 2, 3)
        val new = old.flatMap { listOf(it, it + 10) }
        println(new) // [1, 11, 2, 12, 3, 13]
    }

In practice, the only difference between flatMap and map is this flattening. So, if map returns List<List<T>>, flatMap returns List<T>. This difference can be eliminated with the flatten method on Iterable<Iterable<T>> (so flatMap(tr) gives the same result as map(tr).flatten()).

    fun main() {
        val names = listOf("Ann", "Bob", "Cale")
        val chars1: List<Char> = names.flatMap { it.toList() }
        println(chars1) // [A, n, n, B, o, b, C, a, l, e]
        val mapRes: List<List<Char>> = names.map { it.toList() }
        println(mapRes) // [[A, n, n], [B, o, b], [C, a, l, e]]
        val chars2 = mapRes.flatten()
        println(chars2) // [A, n, n, B, o, b, C, a, l, e]
        println(chars1 == chars2) // true
    }

String.toList() transforms a string into a list of characters.

We typically use flatMap to extract elements from objects that each hold a list of elements. For instance, we have a list of schools, each of which has a list of students, but we are interested in all the students. Similarly, we might have a list of departments, each of which has a list of employees, but be interested only in the employees.

    val allStudents = schools
        .flatMap { it.students }

    val allEmployees = departments
        .flatMap { it.employees }
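Here is a runnable sketch of the schools example. The Student and School classes and the allStudents function are hypothetical definitions introduced for illustration. It also shows that an element holding an empty list simply contributes nothing to the result.

```kotlin
// Hypothetical model classes, assumed for illustration only.
data class Student(val name: String)
data class School(val name: String, val students: List<Student>)

// Collects the students of all schools into a single flat list.
fun allStudents(schools: List<School>): List<Student> =
    schools.flatMap { it.students }

fun main() {
    val schools = listOf(
        School("North", listOf(Student("Ann"), Student("Bob"))),
        School("South", emptyList()),
        School("East", listOf(Student("Cale"))),
    )

    println(allStudents(schools).map { it.name }) // [Ann, Bob, Cale]

    // Equivalent to mapping first and flattening afterwards
    val viaFlatten = schools.map { it.students }.flatten()
    println(viaFlatten == allStudents(schools)) // true
}
```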

⁰ An influential 1991 paper, "Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire", pushed the idea of common recursion schemes (map, fold, etc.) to separate the "what" from the "how" of processing using functional algebra.

¹ Lisp is one of the oldest programming languages still in widespread use today. It is often described as the father of all functional programming languages. Today, the best-known general-purpose Lisp dialects are Clojure, Common Lisp, and Scheme.

² In this chapter, I will use the term "collection processing" as shorthand for "functional-style collection processing".