
Any that we can override is hashCode. First, let’s explain why we need it. The hashCode function is used in a popular data structure called a hash table, which is used under the hood in a variety of different collections or algorithms.- Is fast.
- Returns different values for unequal elements (or at least has enough variation to limit collisions to a minimum).
| --------- | --------- |
| "How much wood would a woodchuck chuck" | 3 |
| "Peter Piper picked a peck of pickled peppers" | 2 |
| "Betty bought a bit of butter" | 1 |
| "She sells seashells by the seashore" | 2 |
| --------- | --------- |
| 0 | [] |
| 1 | ["Betty bought a bit of butter"] |
| 2 | ["Peter Piper picked a peck of pickled peppers", "She sells seashells by the seashore" ] |
| 3 | ["How much wood would a woodchuck chuck"] |
LinkedHashSet) and the default map (LinkedHashMap) use it (they both create more buckets than there are elements, so the complexity of checking if elements is a set or a key in a map is O(1) if hash code is implemented well). To produce a hash code, we use the hashCode function[^43_1].LinkedHashSet and LinkedHashMap will not behave properly when an object mutates after it has been added:data class FullName( var name: String, var surname: String ) val person = FullName("Maja", "Markiewicz") val s = mutableSetOf<FullName>() s.add(person) person.surname = "Moskała" print(person) // FullName(name=Maja, surname=Moskała) print(s.contains(person)) // false print(person in s) // false print(s.first() == person) // true
hashCode for, it should be clear how we expect it to behave. The formal contract is as follows (Kotlin 1.9.0):- Whenever it is invoked on the same object more than once, the
hashCodemethod must consistently return the same integer, provided no information used inequalscomparisons on the object is modified. - If two objects are equal according to the
equals()method, then calling thehashCodemethod on each of the two objects must produce the same integer result.
hashCode to be consistent. The second is the one that developers often forget about and which therefore needs to be highlighted is that hashCode always needs to be consistent with equals, and equal elements must have the same hash code. If they don’t, elements will be lost in collections that use a hash table under the hood:class FullName( var name: String, var surname: String ) { override fun equals(other: Any?): Boolean = other is FullName && other.name == name && other.surname == surname } val s = mutableSetOf<FullName>() s.add(FullName("Marcin", "Moskała")) val p = FullName("Marcin", "Moskała") print(p in s) // false print(p == s.first()) // true
hashCode when you have a custom equals implementation.
hashCode to be useful: hashCode should spread elements as widely and variously as possible as different elements should have the highest possible probability of having different hash values.hashCode always return the same number. Such a function would always place all elements into the same bucket. This fulfills the formal contract, but it is completely useless. There is no advantage to using a hash table when hashCode always returns the same value. Just take a look at the examples below, where you can see a properly implemented hashCode and one that always returns 0. For each equals we add, a counter counts how many times it was used. You can see this when we operate on sets with values of both types: the second class, named Terrible, requires many more comparisons:class Proper(val name: String) { override fun equals(other: Any?): Boolean { equalsCounter++ return other is Proper && name == other.name } override fun hashCode(): Int { return name.hashCode() } companion object { var equalsCounter = 0 } } class Terrible(val name: String) { override fun equals(other: Any?): Boolean { equalsCounter++ return other is Terrible && name == other.name } // Terrible choice, DO NOT DO THAT override fun hashCode() = 0 companion object { var equalsCounter = 0 } } val properSet = List(10000) { Proper("$it") }.toSet() println(Proper.equalsCounter) // 0 val terribleSet = List(10000) { Terrible("$it") }.toSet() println(Terrible.equalsCounter) // 50116683 Proper.equalsCounter = 0 println(Proper("9999") in properSet) // true println(Proper.equalsCounter) // 1 Proper.equalsCounter = 0 println(Proper("A") in properSet) // false println(Proper.equalsCounter) // 0 Terrible.equalsCounter = 0 println(Terrible("9999") in terribleSet) // true println(Terrible.equalsCounter) // 4324 Terrible.equalsCounter = 0 println(Terrible("A") in terribleSet) // false println(Terrible.equalsCounter) // 10001
hashCode in Kotlin only when we define a custom equals. When we use the data modifier, it generates both equals and a consistent hashCode. When you do not have a custom equals method, do not define a custom hashCode unless you are sure you know what you are doing and you have a good reason for it. When you have a custom equals, implement a hashCode that always returns the same value for equal elements.equals that checks the equality of significant properties, then a typical hashCode should be calculated using the hash codes of these properties. How can we make a single hash code out of this many hash codes? A typical way is to accumulate them all in a result, and every time we add the next one we multiply the result by the number 31. It doesn’t need to be exactly 31, but this number’s characteristics make it good for this purpose. It is used this way so often that now we can treat it as a convention. Hash codes generated by the data modifier are consistent with this convention. Here is an example implementation of a typical hashCode, together with its equals:class DateTime( private var millis: Long = 0L, private var timeZone: TimeZone? = null ) { private var asStringCache = "" private var changed = false override fun equals(other: Any?): Boolean = other is DateTime && other.millis == millis && other.timeZone == timeZone override fun hashCode(): Int { var result = millis.hashCode() result = result * 31 + timeZone.hashCode() return result } }
Objects.hash, which calculates the hash of multiple objects using the same algorithm as presented above:override fun hashCode(): Int = Objects.hash(timeZone, millis)
override fun hashCode(): Int = hashCodeFrom(timeZone, millis) inline fun hashCodeOf(vararg values: Any?) = values.fold(0) { acc, value -> (acc * 31) + value.hashCode() }
hashCode ourselves. For instance, in the DateTime class presented above, instead of implementing equals and hashCode ourselves, we can just use the data modifier:data class DateTime2( private var millis: Long = 0L, private var timeZone: TimeZone? = null ) { private var asStringCache = "" private var changed = false }
hashCode, remember that the most important rule is that it always needs to be consistent with equals, and it should always return the same value for elements that are equal.hashCodeis used to calculate a hash code for an object. It is used by hash tables to place elements in buckets, so we can find them faster.hashCodeshould be consistent withequals. This means that if two objects are equal, they should have the same hash code. So whenever you overrideequals, you should also overridehashCode.hashCodeshould be fast and should spread elements as widely as possible.- We rarely need to implement
hashCodeourselves. When we do, we can useObjects.hashfunction on Kotlin/JVM. On other platforms, we can implement a similar function ourselves.
hashCode returns Int, which is a 32-bit signed integer (i.e., 4, 294, 967, 296 buckets), which is too much for a set that might contain only one element. To solve this problem, there is a transformation that makes this number much smaller. When it is needed, the algorithm transforms the previous hash table into a new one with new buckets.