Effective Kotlin Item 52: Consider associating elements to a map

This is a chapter from the book Effective Kotlin. You can find it on LeanPub or Amazon.

It is not uncommon to have a bigger set of elements, in which we need to find elements by some key many times. It might be:

  • A class storing configuration, loaded from some file or files.
  • A network repository storing downloaded data.
  • An in-memory repository (such repositories are often used for tests).

Those data might represent some list of users, ids, configurations, etc. They are generally fetched as a list, and it is tempting to represent them in our memory the same way:

class ConfigurationsRepository( private val configurations: List<Configuration> ) { fun getByName(name: String) = configurations .firstOrNull { it.name == name } } class NetworkUserRepo( private val userService: UserService ) : UserRepo { private var users: List<User>? = null suspend fun loadUsers() { users = userService.getUsers() } override fun getUser(id: UserId): User? = users ?.firstOrNull { it.id == id } } class InMemoryUserRepo : UserRepo { private val users: MutableList<User> = mutableListOf() override fun getUser(id: UserId): User? = users .firstOrNull { it.id == id } fun addUser(user: User) { user.add(user) } }

Using Map

However, this is rarely the best way to store those elements. Notice how the data we load are used. We most often access an element by some identifier or name (it is typically connected to how we design data to have unique keys in databases). Finding an element in a list has linear complexity (O(n), where n is the size of the list, or more concretely, it takes on average n/2 comparisons to find an element in a list). It is especially problematic for bigger lists because finding each element requires comparing it with many other elements. A better solution to this problem is to use a Map instead of a List. Kotlin by default uses a hash map (concretely LinkedHashMap), and as we described in Item 43: Respect the contract of hashCode, the performance of finding an element when we use the hash map is much better. On the JVM finding an element will take one comparison only. This is thanks to the fact that the size of the used hash map is adjusted to the size of the map itself (given that the hashCode function is implemented properly).

For example, this is the InMemoryRepo from before, but implemented to use a map instead of a list:

class InMemoryUserRepo : UserRepo { private val users: MutableMap<UserId, User> = mutableMapOf() override fun getUser(id: UserId): User? = users[id] fun addUser(user: User) { user.put(user.id, user) } }

Most other operations, like modifying or iterate over those data (likely using collection processing methods like filter, map, flatMap, sorted, sum, etc.) have more or less the same performance for the standard map and list. But finding an element by key is much faster.

Associating elements to keys

The crucial point is how do we transform from a list to a map and vice versa. For that, we use the associate* functions. The most common one is associateBy that builds a map by associating each element to the key specified by the key selector function. This key selector function (typically lambda expression next to associateBy) might point to the property we use to identify the element.

import kotlin.* data class User(val id: Int, val name: String) fun main() { val users = listOf(User(1, "Michal"), User(2, "Marek")) val byId: Map<Int, User> = users.associateBy { it.id } println(byId) // {1=User(id=1, name=Michal), // 2=User(id=2, name=Marek)} val byName: Map<String, User> = users .associateBy { it.name } println(byName) // {Michal=User(id=1, name=Michal), // Marek=User(id=2, name=Marek)} }

Notice, that keys in a map must be unique or duplicates are removed. This is why we should only associate by a unique identifier (to group by something that is not unique, use groupBy function).

import kotlin.* data class User(val id: Int, val name: String) fun main() { val users = listOf(User(1, "Michal"), User(2, "Michal")) val byName = users.associateBy { it.name } println(byName) // {Michal=User(id=2, name=Michal)} val group: Map<String, List<User>> = users.groupBy { it.name } println(group) // {Michal=[User(id=1, name=Michal), // User(id=2, name=Michal)]} }

To transform from the map to list instead, you can use the values property of Map:

import kotlin.* data class User(val id: Int, val name: String) fun main() { val users = listOf(User(1, "Michal"), User(2, "Michal")) val byId = users.associateBy { it.id } println(byId.values) // [User(id=1, name=Michal), User(id=2, name=Michal)] }

Here are some sample usages of the technique:

class ConfigurationsRepository( configurations: List<Configuration> ) { private val configurations: Map<String, Configuration> = configurations.associateBy { it.name } fun getByName(name: String) = configurations[name] } class NetworkUserRepo( private val userService: UserService ) : UserRepo { private var users: Map<UserId, User>? = null suspend fun loadUsers() { users = userService.getUsers() .associateBy { it.id } } override fun getUser(id: UserId): User? = users?.get(id) }

This technique is important, but it is not for all cases. It is more useful when we need to access those elements frequently. This is why it is especially important on the backend, where those collections might be accessed many times per second. It is not so important on the frontend (by that I mean also Android or iOS) where a user will access this repository at most a few times. We also need to remember that mapping from a list to a map takes some time as well, so if we do it often, it might hurt our performance as well.