Effective Kotlin Item 42: Respect the contract of equals
In Kotlin, every object extends Any, which has a few methods with well-established contracts. These methods are equals, hashCode, and toString.
Their contracts are described in their comments and elaborated in the official documentation, and as I described in Item 32: Respect abstraction contracts, every subtype of a type with a contract should respect this contract. These methods have an important position in Kotlin because they have been defined since the beginning of Java, and therefore many objects and functions depend on their contracts. Breaking a contract will often lead to some objects or functions not working properly. This is why in this and the next items we will talk about overriding these functions and about their contracts. Let’s start with equals.
In Kotlin, there are two types of equality:
- Structural equality - checked by the == operator (and its negated counterpart !=). a == b translates to a.equals(b) when a is not nullable, or otherwise to a?.equals(b) ?: (b === null).
- Referential equality - checked by the === operator (and its negated counterpart !==). It returns true when both sides point to the same object.
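The difference between the two kinds of equality can be sketched as follows. Point is a hypothetical data class chosen only for illustration, and structurallyEqual is a helper I introduce here to spell out the translation of == for a nullable receiver.

```kotlin
// Point is a hypothetical class; as a data class, its equals compares properties.
data class Point(val x: Int, val y: Int)

// Spells out what `a == b` translates to when `a` is nullable.
fun structurallyEqual(a: Point?, b: Point?): Boolean =
    a?.equals(b) ?: (b === null)

fun main() {
    val p1 = Point(1, 2)
    val p2 = Point(1, 2)

    println(p1 == p2)  // true: same property values
    println(p1 === p2) // false: two separate instances
    println(p1 === p1) // true: the very same instance

    println(structurallyEqual(null, null)) // true
    println(structurallyEqual(null, p1))   // false
    println(structurallyEqual(p1, p2))     // true
}
```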
Since equals is implemented in Any, which is the superclass of every class, we can check the equality of any two objects. However, using the operators to check equality is not allowed when the objects are of unrelated types.
Objects either need to have the same type, or one needs to be a subtype of the other.
This is because it does not make sense to check the equality of two objects of different types. This will become clear when we explain the contract of equals.
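A minimal sketch of this restriction, using the hypothetical classes Animal, Cat, and Book. The comparisons that the compiler rejects are left as comments so that the snippet still compiles.

```kotlin
// Hypothetical classes: Cat is a subtype of Animal, Book is unrelated to both.
open class Animal
class Cat : Animal()
class Book

fun main() {
    val animal = Animal()
    val cat = Cat()
    val book = Book()

    println(animal == cat)    // false; compiles because Cat is a subtype of Animal
    println(animal === cat)   // false; compiles for the same reason
    println(cat.equals(book)) // false; the equals method itself accepts Any?

    // These do not compile, because Animal and Book are unrelated types:
    // println(animal == book)
    // println(animal === book)
}
```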
Why do we need equals?
The default implementation of equals coming from Any checks if another object is exactly the same instance, just like referential equality (===). It means that every object is unique by default.
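As a small illustration, here is a hypothetical Name class that does not override equals, so it uses the default implementation from Any.

```kotlin
// Name is a hypothetical class that relies on the default equals from Any.
class Name(val name: String)

fun main() {
    val name1 = Name("Alex")
    val name2 = Name("Alex")
    val name1Ref = name1

    println(name1 == name1)     // true
    println(name1 == name2)     // false: same content, but a different instance
    println(name1 == name1Ref)  // true: both refer to the same instance
    println(name1 === name1Ref) // true
}
```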
Such behavior is useful for many objects. It is perfect for active elements, like a database connection, a repository, or a thread. However, there are objects where we need to represent equality differently. A popular alternative is data class equality, which checks if all primary constructor properties are equal.
Such behavior is perfect for classes that are represented by the data they hold, and so we often use the data modifier in data model classes or in other data holders.
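Data class equality can be sketched with a hypothetical FullName class. The names and values are only illustrative.

```kotlin
// FullName is a hypothetical data holder; equals is generated from name and surname.
data class FullName(val name: String, val surname: String)

fun main() {
    val name1 = FullName("Alex", "Smith")
    val name2 = FullName("Alex", "Smith")
    val name3 = FullName("Sam", "Smith")

    println(name1 == name2)  // true: all primary constructor properties are equal
    println(name1 === name2) // false: still two instances
    println(name1 == name3)  // false: name differs
    println(setOf(name1, name2).size) // 1: hashCode is generated consistently with equals
}
```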
Notice that data class equality also helps when we need to compare some, but not all, properties, for instance when we want to skip a cache or other redundant properties. Think of an object representing date and time that has properties, such as changed, that should not be compared by the equality check. Such equality can be implemented by hand by overriding equals.
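A hand-written version might look like the sketch below. The property names millis, timeZone, and asStringCache, the String type of the time zone, and the markChanged function are my assumptions for illustration; only changed is named in the text.

```kotlin
// A sketch of hand-written equality that skips redundant properties.
class DateTime(
    private var millis: Long = 0L,
    private var timeZone: String? = null // a String stands in for a real time zone type
) {
    private var asStringCache = "" // assumed cache property, ignored by equals
    private var changed = false    // ignored by equals

    // Hypothetical mutator, here only to show that `changed` does not affect equality.
    fun markChanged() {
        changed = true
    }

    override fun equals(other: Any?): Boolean =
        other is DateTime &&
            other.millis == millis &&
            other.timeZone == timeZone

    // hashCode is overridden together with equals and uses the same properties.
    override fun hashCode(): Int {
        var result = millis.hashCode()
        result = result * 31 + timeZone.hashCode()
        return result
    }
}

fun main() {
    val a = DateTime(1000L, "UTC")
    val b = DateTime(1000L, "UTC")
    b.markChanged()
    println(a == b) // true: `changed` is not compared
}
```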
The same can be achieved by using the data modifier and keeping the redundant properties out of the primary constructor.
Just notice that copy in such a case will not copy those properties that are not declared in the primary constructor. Such behavior is correct only when those additional properties are truly redundant (the object will behave correctly when they are lost).
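Here is a sketch of the data class variant and of the copy behavior just described. As before, millis and timeZone are assumed names, and changed is made public only so that the example can observe it.

```kotlin
data class DateTime(
    private var millis: Long = 0L,
    private var timeZone: String? = null // a String stands in for a real time zone type
) {
    // Declared in the body, so it is ignored by equals, hashCode, toString, and copy.
    var changed = false
}

fun main() {
    val original = DateTime(1000L, "UTC")
    original.changed = true

    val copied = original.copy()
    println(original == copied) // true: only primary constructor properties are compared
    println(copied.changed)     // false: the body property was not copied
}
```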
Thanks to those two alternatives, default and data class equality, we rarely need to implement equality ourselves in Kotlin.
An example in which we might need to implement equality ourselves is when just one concrete property should decide whether objects are equal. For instance, a User class might assume that two users are equal when their id is identical.
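Such a User class could be sketched like this. The name and surname properties are assumptions added to show that they do not take part in the comparison.

```kotlin
// A sketch of identity-based equality: only `id` decides whether two users are equal.
class User(
    val id: Int,
    val name: String,   // assumed property, not compared
    val surname: String // assumed property, not compared
) {
    override fun equals(other: Any?): Boolean =
        other is User && other.id == id

    // Consistent with equals: equal users have the same hash code.
    override fun hashCode(): Int = id
}

fun main() {
    val before = User(1, "Alex", "Smith")
    val after = User(1, "Alex", "Jones")
    println(before == after)                // true: same id
    println(before == User(2, "Alex", "Smith")) // false: different id
    println(setOf(before, after).size)      // 1
}
```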
As you can see, we implement equals ourselves when:
- We need its logic to differ from the default one.
- We need to compare only a subset of properties.
- We do not want our object to be a data class, or the properties we need to compare are not in the primary constructor.
The contract of equals
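This is how equals is described in its comments (Kotlin 1.3.11, formatted):
Indicates whether some other object is "equal to" this one. Implementations must fulfill the following requirements:
- Reflexive: for any non-null value x, x.equals(x) should return true.
- Symmetric: for any non-null values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
- Transitive: for any non-null values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
- Consistent: for any non-null values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
- Never equal to null: for any non-null value x, x.equals(null) should return false.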
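These requirements can be turned into a rough check over a handful of sample values. The helper violatedRequirements below is hypothetical and is not part of any library. It can only reveal violations among the samples it is given, and its consistency check is weak because it just calls equals twice in a row.

```kotlin
// Hypothetical helper: returns the names of the requirements violated by the samples.
fun <T : Any> violatedRequirements(samples: List<T>): Set<String> {
    val violated = mutableSetOf<String>()
    for (x in samples) {
        if (!x.equals(x)) violated += "reflexive"
        if (x.equals(null)) violated += "never equal to null"
        for (y in samples) {
            if (x.equals(y) != y.equals(x)) violated += "symmetric"
            if (x.equals(y) != x.equals(y)) violated += "consistent"
            for (z in samples) {
                if (x.equals(y) && y.equals(z) && !x.equals(z)) violated += "transitive"
            }
        }
    }
    return violated
}

// A deliberately broken class, used only to show the helper at work.
class NeverEqual {
    override fun equals(other: Any?): Boolean = false
    override fun hashCode(): Int = 0
}

fun main() {
    println(violatedRequirements(listOf("a", "b", "a"))) // []
    println(violatedRequirements(listOf(NeverEqual())))  // [reflexive]
}
```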
Additionally, we expect equals and hashCode to be fast. This is not a part of the official contract, but it would be highly unexpected to wait a few seconds to check if two elements are equal.
All those requirements are important. They have been assumed from the beginning, also in Java, and so many objects now depend on them. Don’t worry if they sound confusing right now; we’ll describe them in detail.
- Object equality should be reflexive, meaning that x.equals(x) returns true. This sounds obvious, but it can be violated. For instance, someone might want to make a Time object that can also represent the current time, and that compares milliseconds.
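Such a Time class might look like the sketch below. The property names are assumptions, and the clock parameter is something I add so that the problem can be reproduced deterministically with a fake clock. With the real system clock, the comparison would only fail occasionally.

```kotlin
// DO NOT DO THIS: this equality violates reflexivity and consistency.
class Time(
    val millisArg: Long = -1,
    val isNow: Boolean = false,
    // Assumed parameter: lets the example replace the system clock with a fake one.
    private val clock: () -> Long = System::currentTimeMillis
) {
    val millis: Long
        get() = if (isNow) clock() else millisArg

    override fun equals(other: Any?): Boolean =
        other is Time && millis == other.millis
}

fun main() {
    var tick = 0L
    // A fake clock that advances on every read, like real time between two calls.
    val now = Time(isNow = true, clock = { tick++ })

    println(now.equals(now))           // false: the object is not equal to itself
    println(listOf(now).contains(now)) // false: it cannot be found in a list
    println(Time(millisArg = 5) == Time(millisArg = 5)) // true
}
```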
Notice that the result of comparing such a Time object is also inconsistent, so it violates the consistency requirement as well.
When an object is not equal to itself, it might not be found in most collections even if it is there when we check using the contains method. It will not work correctly in most unit test assertions either.
When the result is not constant, we cannot trust it. We can never be sure if the result is correct or just a result of inconsistency.
How should we improve it? A simple solution is to check separately whether the object represents the current time and, if not, whether it has the same timestamp. However, this is a typical example of a tagged class, and as described in Item 40: Prefer class hierarchies to tagged classes, it would be even better to use a class hierarchy instead.
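One possible hierarchy is sketched below. The names TimePoint and Now are my choice for illustration.

```kotlin
// A sketch of the class hierarchy alternative.
sealed class Time

// A concrete moment: data class equality compares millis.
data class TimePoint(val millis: Long) : Time()

// The current time: a singleton, so it is always equal to itself.
object Now : Time()

fun main() {
    val now: Time = Now
    val point: Time = TimePoint(1_000L)

    println(now == now)                       // true
    println(point == TimePoint(1_000L))       // true
    println(now == point)                     // false
}
```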
- Object equality should be symmetric, meaning that the result of x == y and y == x should always be the same. It can be easily violated when our equality accepts objects of a different type. For instance, let’s say that we implemented a class to represent complex numbers and made its equality accept Double.
The problem is that Double does not accept equality with Complex. Therefore, the result depends on the order of the elements.
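A sketch of such a Complex class and of the asymmetry it causes. The property names real and imaginary are assumptions.

```kotlin
// DO NOT DO THIS: this equality violates symmetry.
class Complex(
    val real: Double,
    val imaginary: Double
) {
    override fun equals(other: Any?): Boolean {
        // Accepting a different type is what breaks symmetry.
        if (other is Double) {
            return imaginary == 0.0 && real == other
        }
        return other is Complex &&
            real == other.real &&
            imaginary == other.imaginary
    }

    override fun hashCode(): Int = 31 * real.hashCode() + imaginary.hashCode()
}

fun main() {
    val complex = Complex(1.0, 0.0)
    println(complex.equals(1.0)) // true
    println(1.0.equals(complex)) // false: Double knows nothing about Complex
}
```

For the same reason, whether a collection finds such an element with contains depends on which of the two objects the collection happens to call equals on.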
Lack of symmetry means, for instance, unexpected results from contains on collections or from unit test assertions.
When equality is not symmetric, and it is used by another object, we cannot trust the result because it depends on whether this object compares x to y or y to x. This fact is not documented, and it is not a part of the contract, as object creators assume that both should work the same (they assume symmetry). It can also change at any moment: during some refactoring, creators might change the order of those values. If your object is not symmetric, it might lead to unexpected and really hard to debug errors in your implementation. This is why when we implement equals, we should always consider symmetry.
The general solution is that we should not accept equality between different classes. I’ve never seen a case where it would be reasonable. Notice that in Kotlin similar classes are not equal to each other. 1 is not equal to 1.0, and 1.0 is not equal to 1.0F. Those are different types, and they are not even comparable. In Kotlin we cannot use the == operator between two different types that do not have a common superclass other than Any.
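The behavior of numeric types can be checked with the equals method, which accepts any object. The operator version is left as a comment because it does not compile.

```kotlin
fun main() {
    println(1.equals(1.0))    // false: Int is never equal to Double
    println(1.0.equals(1.0F)) // false: Double is never equal to Float

    // This does not compile, because Int and Double are not comparable with ==:
    // println(1 == 1.0)

    // An explicit conversion makes both sides the same type.
    println(1.toDouble() == 1.0) // true
}
```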
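- Object equality should be transitive, meaning that for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should also return true. The biggest problem with transitivity is when we implement different kinds of equality that check a different subset of properties. For instance, let’s say that we have Date and DateTime, where DateTime extends Date and each class compares its own set of properties.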
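The two classes might be defined as in the sketch below. The exact properties are assumptions, and hashCode is omitted for brevity. The main function shows three objects for which the relation is symmetric but not transitive.

```kotlin
// DO NOT DO THIS: this equality is symmetric, but not transitive.
open class Date(
    val year: Int,
    val month: Int,
    val day: Int
) {
    override fun equals(other: Any?): Boolean = when (other) {
        is DateTime -> this == other.date
        is Date -> other.day == day && other.month == month && other.year == year
        else -> false
    }
}

class DateTime(
    val date: Date,
    val hour: Int,
    val minute: Int,
    val second: Int
) : Date(date.year, date.month, date.day) {
    override fun equals(other: Any?): Boolean = when (other) {
        is DateTime -> other.date == date && other.hour == hour &&
            other.minute == minute && other.second == second
        is Date -> date == other
        else -> false
    }
}

fun main() {
    val o1 = DateTime(Date(1992, 10, 20), 12, 30, 0)
    val o2 = Date(1992, 10, 20)
    val o3 = DateTime(Date(1992, 10, 20), 14, 45, 30)

    println(o1 == o2) // true
    println(o2 == o3) // true
    println(o1 == o3) // false: so the relation is not transitive
}
```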
The problem with such an implementation is that when we compare two DateTime objects, we check more properties than when we compare a DateTime with a Date. Therefore, two DateTime objects with the same day, but a different time, will not be equal to each other, but they’ll both be equal to the same Date. As a result, their relation is not transitive.
Notice that here the restriction to compare only objects of the same type didn’t help because we’ve used inheritance. Such inheritance violates the Liskov substitution principle and should not be used. In this case, use composition instead of inheritance (Item 36: Prefer composition over inheritance). When you do, do not compare two objects of different types. These classes are perfect examples of objects holding data, and representing them as data classes is a good choice.
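With composition and the data modifier, the same two classes can be sketched as follows. The property names are again assumptions.

```kotlin
data class Date(
    val year: Int,
    val month: Int,
    val day: Int
)

// DateTime holds a Date instead of extending it.
data class DateTime(
    val date: Date,
    val hour: Int,
    val minute: Int,
    val second: Int
)

fun main() {
    val o1 = DateTime(Date(1992, 10, 20), 12, 30, 0)
    val o2 = Date(1992, 10, 20)
    val o3 = DateTime(Date(1992, 10, 20), 14, 45, 30)

    println(o1.equals(o2)) // false: different types are never equal
    println(o3.equals(o2)) // false
    println(o1 == o3)      // false

    // Comparing the date part is now an explicit decision.
    println(o1.date == o2) // true
    println(o1.date == o3.date) // true
}
```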
- Equality should be consistent, meaning that the method invoked on two objects should always return the same result, unless one of those objects was modified. For immutable objects, the result should always be the same. In other words, we expect equals to be a pure function (it should not modify the state of an object) whose result depends only on its input and the state of its receiver. We’ve seen the Time class, which violated this principle. This rule was also famously violated in java.net.URL.
- Never equal to null: for any non-null value x, x.equals(null) must return false. It is important because null should be unique, and no object should be equal to it.
Problem with equals in java.net.URL
One example of a really poorly designed equals is the one from java.net.URL. Equality of two java.net.URL objects depends on a network operation, as two hosts are considered equivalent if both hostnames can be resolved into the same IP addresses. Take, for example, a comparison of two URL objects whose hostnames differ but resolve to the same IP address.
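Such a comparison can be sketched as follows. The two addresses are my choice, under the assumption that both hostnames resolve to the same IP address at the time of running. Because the result depends on the network, no expected output is given.

```kotlin
import java.net.URI
import java.net.URL

fun main() {
    // Assumption: both hostnames resolve to the same IP address.
    val enWiki: URL = URI("https://en.wikipedia.org/").toURL()
    val wiki: URL = URI("https://wikipedia.org/").toURL()

    // URL.equals may perform a DNS lookup here, so the result depends on the network.
    println(enWiki == wiki)
}
```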
Should the comparison return true or false? The contract requires a stable answer, but the result is inconsistent. Under normal conditions, it prints true because both hostnames resolve to the same IP address, but if you have your internet turned off, it will print false. You can check it yourself. This is a big mistake! Equality should not be network dependent.
Here are the most important problems with this solution:
- This behavior is inconsistent. For instance, two URLs could be equal when a network is available and unequal when it is not. Also, the network may change. The IP address for a given hostname varies over time and by the network. Two URLs could be equal on some networks and unequal on others.
- The network may be slow, and we expect equals and hashCode to be fast. A typical problem is when we check if a URL is present in a list. Such an operation would require a network call for each element on the list. Also, on some platforms, like Android, network operations are prohibited on the main thread. As a result, even adding to a set of URLs needs to be started on a separate thread.
- The defined behavior is known to be inconsistent with virtual hosting in HTTP. Equal IP addresses do not imply equal content. Virtual hosting permits unrelated sites to share an IP address. This method could report two otherwise unrelated URLs to be equal because they're hosted on the same server.
In Android, this problem was fixed in Android 4.0 (Ice Cream Sandwich). Since that release, URLs are only equal if their hostnames are equal. When we use Kotlin/JVM on other platforms, it is recommended to use java.net.URI instead of java.net.URL.
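A short sketch of the same kind of comparison with java.net.URI, whose equals compares the components of the identifier and performs no network operation. The addresses are illustrative.

```kotlin
import java.net.URI

fun main() {
    val enWiki = URI("https://en.wikipedia.org/")
    val wiki = URI("https://wikipedia.org/")

    println(enWiki == wiki) // false: the hosts differ, and no DNS lookup is performed
    println(enWiki == URI("https://en.wikipedia.org/")) // true: same components
}
```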
I recommend against implementing equals yourself unless you have a good reason. Instead, use the default one or data class equality. If you do need custom equality, always consider if your implementation is reflexive, symmetric, transitive, and consistent. Make such a class final, or beware that subclasses should not change how equality behaves. It is hard to make custom equality and support inheritance at the same time. Some even say it is impossible [1]. This is one of the reasons why data classes are final.
[1] As Joshua Bloch claims in Effective Java, third edition, Item 10: Obey the general contract when overriding equals: "There is no way to extend an instantiable class and add a value component while preserving the equals contract, unless you’re willing to forgo the benefits of object-oriented abstraction".