I've noticed that when people seek my help with collection processing, what they are often missing is the fact that elements can be grouped. Here are a few tasks that require this operation:
Counting the number of users in a city, based on a list of users.
Finding the number of points received by each team, based on a list of players.
Finding the best option in each category, based on a list of options.
There are two ways to group elements from an iterable. The first one is easier, but the second one is faster. Let's discuss them both.
groupBy
The easiest way to solve this problem is by using the groupBy function, which returns a Map<K, List<V>>, where V is the type of the elements in the collection we started from, and K is the type of the key returned by the selector. So, if we have a User list that we group by an id of type String, then the returned map is Map<String, List<User>>. In other words, groupBy divides our collection into multiple small collections: one for each unique key. This is how this function can be used to solve the above problems:
// Count the number of users in each city
val usersCount: Map<City, Int> = users
    .groupBy { it.city }
    .mapValues { (_, users) -> users.size }

// Find the number of points received by each team
val pointsPerTeam: Map<Team, Int> = players
    .groupBy { it.team }
    .mapValues { (_, players) ->
        players.sumOf { it.points }
    }

// Find the best option in each category
val bestFormatPerQuality: Map<Quality, Resolution> =
    formats.groupBy { it.quality }
        .mapValues { (_, formats) ->
            formats.maxByOrNull { it.resolution }!!
            // it is fine to use !! here, because
            // this collection cannot be empty
        }
These are good solutions. When we use groupBy, we receive a Map as a result, and we can use all the different methods defined on it. This makes groupBy a really nice intermediate step.
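To make the intermediate Map concrete, here is a minimal runnable sketch of the first task. The User class, its fields, and the sample data are assumptions made only for this illustration; the city is modelled as a plain String rather than a City type.

```kotlin
// Hypothetical model, introduced only for this illustration
data class User(val name: String, val city: String)

// Groups users by city, then replaces each group with its size
fun countUsersPerCity(users: List<User>): Map<String, Int> =
    users
        .groupBy { it.city } // Map<String, List<User>>
        .mapValues { (_, group) -> group.size }

fun main() {
    val users = listOf(
        User("Alice", "Warsaw"),
        User("Bob", "Berlin"),
        User("Celine", "Warsaw")
    )
    println(countUsersPerCity(users)) // {Warsaw=2, Berlin=1}
}
```

Note that a full list of users is built for every city before it is reduced to a single number, which is the overhead discussed in the next section.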
groupingBy
On the other hand, if we are dealing with some performance-critical parts of our code, groupBy is not the best choice because it takes some time to create a collection for each category we have, especially since these group sizes are not known in advance. Instead, we could use the groupingBy function, which does not do any additional operations: it just wraps the iterable together with the specified key selector.
public inline fun <T, K> Iterable<T>.groupingBy(
    crossinline keySelector: (T) -> K
): Grouping<T, K> {
    return object : Grouping<T, K> {
        override fun sourceIterator(): Iterator<T> =
            this@groupingBy.iterator()
        override fun keyOf(element: T): K =
            keySelector(element)
    }
}
The returned Grouping can be considered a bit like a map from a key to a list of elements, but it supports far fewer operations. However, since using it might be an important optimization, let's analyze the options.
The first problem (counting users per city) is the simplest one. The Kotlin Standard Library already has the eachCount function, which gives us a map from each city to its number of users.
val usersCount = users.groupingBy { it.city }
    .eachCount()
Finding the number of points received by each team is a bit harder. We can use the fold function, which is like a fold on an iterable, but it has a separate accumulator for each unique key. So, calculating the number of points per team is very similar to calculating the number of points in a collection.
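The per-key fold described above might look like the following sketch. The Player class and the sample data are assumptions made for illustration. The name eachSumBy is the one this chapter uses for such a helper, but the body shown here is only one possible definition.

```kotlin
// Hypothetical model, introduced only for this illustration
data class Player(val name: String, val team: String, val points: Int)

// Sums points per team using one Int accumulator per key,
// without building an intermediate list for each team
fun pointsPerTeam(players: List<Player>): Map<String, Int> =
    players
        .groupingBy { it.team }
        .fold(0) { acc, player -> acc + player.points }

// The same idea extracted into a reusable extension function
fun <T, K> Grouping<T, K>.eachSumBy(selector: (T) -> Int): Map<K, Int> =
    fold(0) { acc, elem -> acc + selector(elem) }

fun main() {
    val players = listOf(
        Player("Ann", "Red", 3),
        Player("Ben", "Blue", 5),
        Player("Cid", "Red", 4)
    )
    println(pointsPerTeam(players)) // {Red=7, Blue=5}
    println(players.groupingBy { it.team }.eachSumBy { it.points }) // {Red=7, Blue=5}
}
```

This does the same job as groupBy followed by mapValues and sumOf, but it is a little harder to read, which is why extracting a named extension can be worthwhile.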
Finally, the last problem: we need to find the biggest element in the group. We might use fold, but this would require a "zero" value, which we don't have. Instead, we can use reduce, which just starts from the first element. Its lambda has one additional parameter: the reference to the key of the group (we don't use it in the example below, so there is _ instead).
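A minimal sketch of this use of reduce could look as follows. The Format class, its fields, and the sample data are assumptions made for illustration, with quality modelled as a String and resolution as an Int.

```kotlin
// Hypothetical model, introduced only for this illustration
data class Format(val name: String, val quality: String, val resolution: Int)

// Keeps, for each quality, the format with the highest resolution
fun bestFormatPerQuality(formats: List<Format>): Map<String, Format> =
    formats
        .groupingBy { it.quality }
        // The first lambda parameter is the group key, unused here
        .reduce { _, best, format ->
            if (best.resolution > format.resolution) best else format
        }

fun main() {
    val formats = listOf(
        Format("a", "HD", 720),
        Format("b", "HD", 1080),
        Format("c", "SD", 480)
    )
    val best = bestFormatPerQuality(formats)
    println(best.mapValues { (_, format) -> format.name }) // {HD=b, SD=c}
}
```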
Now, you might have noticed that we could also have used reduce in the previous problem. If so, you’re right and such a solution would be more efficient. I just wanted to present both options.
Again, we can extract an extension function.
// Could be optimized by keeping the accumulator's selector
// value instead of recomputing it for every element
inline fun <T, K> Grouping<T, K>.eachMaxBy(
    selector: (T) -> Int
): Map<K, T> =
    reduce { _, acc, elem ->
        if (selector(acc) > selector(elem)) acc else elem
    }

val bestFormatPerQuality = formats
    .groupingBy { it.quality }
    .eachMaxBy { it.resolution }
The last important function from the stdlib that is defined on Grouping is aggregate, which is very similar to fold and reduce. It iterates over all the elements and aggregates them separately for each key. Its operation has 4 parameters: the key of the current element; the accumulator for that key, which is null for the first element with this key; a reference to the element; and a boolean, which is true if this element is the first element for this key. This is how our last problem can be solved using aggregate:
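A sketch of such an aggregate-based solution is shown below. The Format class and the sample data are assumptions made for illustration, with quality modelled as a String and resolution as an Int.

```kotlin
// Hypothetical model, introduced only for this illustration
data class Format(val name: String, val quality: String, val resolution: Int)

// Keeps, for each quality, the format with the highest resolution.
// The accumulator is null when the first element of a group arrives.
fun bestFormatPerQuality(formats: List<Format>): Map<String, Format> =
    formats
        .groupingBy { it.quality }
        .aggregate { _, best: Format?, format: Format, _ ->
            if (best == null || best.resolution < format.resolution) format
            else best
        }

fun main() {
    val formats = listOf(
        Format("a", "HD", 720),
        Format("b", "HD", 1080),
        Format("c", "SD", 480)
    )
    val best = bestFormatPerQuality(formats)
    println(best.mapValues { (_, format) -> format.name }) // {HD=b, SD=c}
}
```

The key and the "first" flag are both ignored here (each replaced with _), because the null check on the accumulator already tells us whether a group has been seen before.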
The groupBy function is part of many collection processing operations. It is convenient to use as it returns a Map that has plenty of useful functions. Its alternative is groupingBy, which is better for performance but is generally harder to use. It currently supports the following functions: eachCount, fold, reduce, and aggregate. Using them, we can define other functions we might need, just as we defined eachSumBy and eachMaxBy in this chapter.
Marcin Moskala is a highly experienced developer and Kotlin instructor. He is the founder of Kt. Academy, an official JetBrains partner specializing in Kotlin training, and a Google Developers Expert known for his significant contributions to the Kotlin community. Moskala is the author of several widely recognized books, including "Effective Kotlin," "Kotlin Coroutines," "Functional Kotlin," "Advanced Kotlin," "Kotlin Essentials," and "Android Development with Kotlin."
Beyond his literary achievements, Moskala is the author of the largest Medium publication dedicated to Kotlin. As a respected speaker, he has been invited to share his insights at numerous programming conferences, including events such as Droidcon and the prestigious Kotlin Conf, the premier conference dedicated to the Kotlin programming language.
Nicola Corti is a Google Developer Expert for Kotlin. He has been working with the language since before version 1.0 and he is the maintainer of several open-source libraries and tools.
He's currently working as Android Infrastructure Engineer at Spotify in Stockholm, Sweden.
Furthermore, he is an active member of the developer community.
His involvement goes from speaking at international conferences about Mobile development to leading communities across Europe (GDG Pisa, KUG Hamburg, GDG Sthlm Android).
In his free time, he also loves baking, photography, and running.