Collection processing in Kotlin: Windowing, zipping and chunking

This is a chapter from the book Functional Kotlin. You can find it on LeanPub or Amazon. It is also available as a course.

zip connects two collections into one by pairing the elements that occupy the same positions. So, zipping a List<T1> with a List<T2> returns a List<Pair<T1, T2>>. The resulting list ends when the shorter of the two collections ends.

fun main() {
    val nums = 1..4
    val chars = 'A'..'F'
    println(nums.zip(chars))
    // [(1, A), (2, B), (3, C), (4, D)]

    val winner = listOf("Ashley", "Barbara", "Cyprian", "David")
    val prizes = listOf(5000, 3000, 1000)
    val zipped = winner.zip(prizes)
    println(zipped)
    // [(Ashley, 5000), (Barbara, 3000), (Cyprian, 1000)]
    zipped.forEach { (person, price) ->
        println("$person won $price")
    }
    // Ashley won 5000
    // Barbara won 3000
    // Cyprian won 1000
}
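To see why the result stops at the shorter collection, it can help to write the pairing by hand. The following is only a minimal sketch for lists, not the stdlib source; zipSketch is a hypothetical name introduced here for illustration.

```kotlin
// Hypothetical helper, for illustration only; not the stdlib source.
fun <A, B> zipSketch(first: List<A>, second: List<B>): List<Pair<A, B>> {
    // The shorter input decides how many pairs are produced.
    val size = minOf(first.size, second.size)
    return List(size) { index -> first[index] to second[index] }
}

fun main() {
    val winners = listOf("Ashley", "Barbara", "Cyprian", "David")
    val prizes = listOf(5000, 3000, 1000)
    println(zipSketch(winners, prizes))
    // [(Ashley, 5000), (Barbara, 3000), (Cyprian, 1000)]
    println(zipSketch(winners, prizes) == winners.zip(prizes)) // true
}
```

David has no matching prize, so he does not appear in the result at all.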

The zip function reminds me of the polonaise, a traditional Polish dance. One feature of this dance is that a line of pairs splits down the middle, and the pairs re-form when the dancers meet again.

A still from the movie Pan Tadeusz, directed by Andrzej Wajda, presenting the polonaise dance.

We can reverse the zip operation using unzip, which transforms a list of pairs into a pair of lists.

fun main() {
    // zip can be used with infix notation
    val zipped = (1..4) zip ('a'..'d')
    println(zipped) // [(1, a), (2, b), (3, c), (4, d)]
    val (numbers, letters) = zipped.unzip()
    println(numbers) // [1, 2, 3, 4]
    println(letters) // [a, b, c, d]
}
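One way to think about unzip is as two map calls, one collecting the first components and one collecting the second. The sketch below assumes that mental model; unzipSketch is a hypothetical name, and the actual stdlib implementation may differ. It also shows that unzip cannot restore elements that zip dropped from the longer collection.

```kotlin
// Hypothetical helper, for illustration only; not the stdlib source.
fun <A, B> unzipSketch(pairs: List<Pair<A, B>>): Pair<List<A>, List<B>> =
    pairs.map { it.first } to pairs.map { it.second }

fun main() {
    val zipped = listOf(1 to 'a', 2 to 'b', 3 to 'c')
    println(unzipSketch(zipped)) // ([1, 2, 3], [a, b, c])
    println(unzipSketch(zipped) == zipped.unzip()) // true

    // zip drops the unmatched tail, so unzip cannot bring it back.
    val (numbers, letters) = ((1..4) zip ('a'..'b')).unzip()
    println(numbers) // [1, 2]
    println(letters) // [a, b]
}
```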

When we need to connect adjacent elements of a collection into pairs, there is zipWithNext.

fun main() {
    println((1..4).zipWithNext())
    // [(1, 2), (2, 3), (3, 4)]

    val person = listOf("Ashley", "Barbara", "Cyprian")
    println(person.zipWithNext())
    // [(Ashley, Barbara), (Barbara, Cyprian)]
}

There is also a variant of zipWithNext that produces a list of results from a custom transformation instead of a list of pairs.

fun main() {
    val person = listOf("A", "B", "C", "D", "E")
    println(person.zipWithNext { prev, next -> "$prev$next" })
    // [AB, BC, CD, DE]
}
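A common use of the transforming variant is computing the change between consecutive values. Here is a small sketch of that idea; the deltas function and the sample temperatures are made up for illustration.

```kotlin
// Hypothetical helper: differences between consecutive readings.
fun deltas(readings: List<Int>): List<Int> =
    readings.zipWithNext { prev, next -> next - prev }

fun main() {
    val temperatures = listOf(18, 21, 19, 19, 24)
    println(deltas(temperatures)) // [3, -2, 0, 5]

    // A list with fewer than two elements has no adjacent pairs.
    println(deltas(listOf(42))) // []
}
```

The result always has one element fewer than a non-empty input, because each result describes a pair of neighbors.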

Windowing

To connect adjacent elements into collections, the universal method is windowed, which returns a list of sublists of our list, where each sublist is the next window of a given size. These sublists are made by sliding along the collection with the given step. In simpler words, you might imagine that windowed has a trolley of size size that makes a snapshot (a copy) of the elements below it and then makes a step of size step. By default, the process ends as soon as the end of the trolley falls off the collection. However, if partialWindows is set to true, the trolley can extend past the end of the collection to include any remaining elements, and the process stops only when the trolley has fully fallen off the collection.

fun main() {
    val person = listOf("Ashley", "Barbara", "Cyprian", "David")

    println(person.windowed(size = 1, step = 1))
    // [[Ashley], [Barbara], [Cyprian], [David]]
    // so similar to map { listOf(it) }

    println(person.windowed(size = 2, step = 1))
    // [[Ashley, Barbara], [Barbara, Cyprian],
    // [Cyprian, David]]
    // so similar to zipWithNext().map { it.toList() }

    println(person.windowed(size = 1, step = 2))
    // [[Ashley], [Cyprian]]

    println(person.windowed(size = 2, step = 2))
    // [[Ashley, Barbara], [Cyprian, David]]

    println(person.windowed(size = 3, step = 1))
    // [[Ashley, Barbara, Cyprian], [Barbara, Cyprian, David]]

    println(person.windowed(size = 3, step = 2))
    // [[Ashley, Barbara, Cyprian]]

    println(
        person.windowed(
            size = 3,
            step = 1,
            partialWindows = true
        )
    )
    // [[Ashley, Barbara, Cyprian], [Barbara, Cyprian, David],
    // [Cyprian, David], [David]]

    println(
        person.windowed(
            size = 3,
            step = 2,
            partialWindows = true
        )
    )
    // [[Ashley, Barbara, Cyprian], [Cyprian, David]]
}
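The trolley metaphor can be written down as a simple loop. The following is a minimal sketch for lists only, not the stdlib source; windowedSketch and its local variable names are hypothetical and exist just to make the rules for size, step and partialWindows explicit.

```kotlin
// Hypothetical helper, for illustration only; not the stdlib source.
fun <T> windowedSketch(
    list: List<T>,
    size: Int,
    step: Int = 1,
    partialWindows: Boolean = false
): List<List<T>> {
    require(size > 0 && step > 0) { "size and step must be positive" }
    val result = mutableListOf<List<T>>()
    var start = 0 // where the trolley currently begins
    while (start < list.size) {
        val end = minOf(start + size, list.size)
        val isPartial = end - start < size
        // Without partialWindows, stop once the trolley sticks out past the end.
        if (isPartial && !partialWindows) break
        result.add(list.subList(start, end).toList()) // snapshot (a copy)
        start += step
    }
    return result
}

fun main() {
    val person = listOf("Ashley", "Barbara", "Cyprian", "David")
    println(windowedSketch(person, size = 3, step = 2))
    // [[Ashley, Barbara, Cyprian]]
    println(windowedSketch(person, size = 3, step = 2, partialWindows = true))
    // [[Ashley, Barbara, Cyprian], [Cyprian, David]]
}
```

With partialWindows set to true, the loop runs until start itself passes the last index, which is the moment the trolley has fully fallen off the collection.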

The windowed method is very universal but also complicated. One simpler function that builds on it is chunked.

// chunked implementation from Kotlin stdlib
fun <T> Iterable<T>.chunked(size: Int): List<List<T>> =
    windowed(size, size, partialWindows = true)

chunked divides our collection into chunks that are sub-collections of a certain size. It does not lose elements, so the last chunk might be smaller than the argument value.

fun main() {
    val person = listOf("Ashley", "Barbara", "Cyprian", "David")
    println(person.chunked(1))
    // [[Ashley], [Barbara], [Cyprian], [David]]
    println(person.chunked(2))
    // [[Ashley, Barbara], [Cyprian, David]]
    println(person.chunked(3))
    // [[Ashley, Barbara, Cyprian], [David]]
    println(person.chunked(4))
    // [[Ashley, Barbara, Cyprian, David]]
}
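Because chunked does not lose elements, flattening the chunks gives back the original list. The sketch below checks this and also uses the stdlib overload of chunked that takes a transformation, applied to each chunk. The batchTotals function and the sample values are hypothetical and only serve the illustration.

```kotlin
// Hypothetical helper: sums the values in each batch of at most batchSize elements.
fun batchTotals(values: List<Int>, batchSize: Int): List<Int> =
    values.chunked(batchSize) { batch -> batch.sum() }

fun main() {
    val values = (1..10).toList()
    val chunks = values.chunked(4)
    println(chunks) // [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]

    // No element is lost: flattening the chunks restores the original list.
    println(chunks.flatten() == values) // true

    println(batchTotals(values, 4)) // [10, 26, 19]
}
```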