article banner (priority)

Effective Kotlin Item 47: Avoid unnecessary object creation

This is a chapter from the book Effective Kotlin. You can find it on LeanPub or Amazon.

Object creation always costs something and can sometimes be expensive. This is why avoiding unnecessary object creation can be an important optimization. It can be done on many levels. For instance, in JVM it is guaranteed that a string object will be reused by other code running in the same virtual machine that happens to contain the same string literal2:

val str1 = "Lorem ipsum dolor sit amet" val str2 = "Lorem ipsum dolor sit amet" print(str1 == str2) // true print(str1 === str2) // true

Boxed primitives (Integer, Long) are also reused in JVM when they are small (by default, the Integer Cache holds numbers in the range from -128 to 127).

val i1: Int? = 1 val i2: Int? = 1 print(i1 == i2) // true print(i1 === i2) // true, because i2 was taken from cache

Reference equality (===) shows that this is the same object. However, if we use a number that is either smaller than -128 or bigger than 127, different objects will be created:

val j1: Int? = 1234 val j2: Int? = 1234 print(j1 == j2) // true print(j1 === j2) // false

Notice that a nullable type is used to force Integer instead of int under the hood. When we use Int, it is generally compiled to the primitive int, but if we make it nullable or when we use it as a type argument, Integer is used instead. This is because a primitive cannot be null and cannot be used as a type argument.

Knowing that such mechanisms are available in Kotlin, you might wonder how significant they are. Is object creation expensive?

Is object creation expensive?

Wrapping something into an object has 3 costs:

  • Objects take additional space. In a modern 64-bit JDK, an object has a 12-byte header that is padded to a multiple of 8 bytes, so the minimum object size is 16 bytes. For 32-bit JVMs, the overhead is 8 bytes. Additionally, object references also take space. Typically, references are 4 bytes on 32-bit or 64-bit platforms up to -Xmx32G, and they are 8 bytes for memory allocation pool set above 32Gb (-Xmx32G). These are relatively small numbers, but they can add up to a significant cost. When we think about small elements like integers, they make a difference. Int as a primitive fit in 4 bytes, but when it is a wrapped type on the 64-bit JDK we mainly use today, it requires 16 bytes (it fits in the 4 bytes after the header), and its reference requires 4 or 8 bytes. In the end, it takes 5 or 6 times more space1. This is why an array of primitive integers (IntArray) takes 5 times less space than an array of wrapped integers (Array<Int>), as explained in the Item 55: Consider Arrays with primitives for performance-critical processing.

  • Access requires an additional function call when elements are encapsulated. Again, this is a small cost as function use is very fast, but it can add up when we need to operate on a huge pool of objects. We will see how this cost can be eliminated in Item 48: Use the inline modifier for functions with parameters of functional types and Item 49: Consider using inline classes.

  • Objects need to be created and allocated in memory, references need to be created, etc. These are small numbers, but they can rapidly accumulate when there are many objects. In the snippet below, you can see the cost of object creation.

class A private val a = A() // Benchmark result: 2.698 ns/op fun accessA(blackhole: Blackhole) { blackhole.consume(a) } // Benchmark result: 3.814 ns/op fun createA(blackhole: Blackhole) { blackhole.consume(A()) } // Benchmark result: 3828.540 ns/op fun createListAccessA(blackhole: Blackhole) { blackhole.consume(List(1000) { a }) } // Benchmark result: 5322.857 ns/op fun createListCreateA(blackhole: Blackhole) { blackhole.consume(List(1000) { A() }) }

By eliminating objects, we can avoid all three of these costs. By reusing objects, we can eliminate the first and the third ones. If we know the costs of objects, we can start considering how we can minimize these costs in our applications by limiting the number of unnecessary objects. Let’s see some ways we can do this.

Object declaration

A very simple way to reuse an object instead of creating it every time is using an object declaration (singleton). To see an example, let’s imagine that you need to implement a linked list. This list can either be empty or it can be a node containing an element and a reference to the rest of the list. This is an example implementation:

sealed class LinkedList<T> class Node<T>( val head: T, val tail: LinkedList<T> ) : LinkedList<T>() class Empty<T> : LinkedList<T>() // Usage val list: LinkedList<Int> = Node(1, Node(2, Node(3, Empty()))) val list2: LinkedList<String> = Node("A", Node("B", Empty()))

One problem with this implementation is that we need to create an instance of Empty every time we create a list. Instead, we should have just one instance and use it universally. The only problem is which generic type we should use. We want this empty list to be a subtype of all lists. We cannot use all types, but we don’t need to. A solution is that we can make Empty a list of Nothing, which is a subtype of every type. If this list is covariant (what we achieve with the out modifier), then LinkedList<Nothing> is a subtype of every LinkedList. Making type arguments covariant truly makes sense here since LinkedList is immutable and its generic type is used only in "out" positions (Item 24: Consider variance for generic types). So, this is the improved code:

sealed class LinkedList<out T> class Node<out T>( val head: T, val tail: LinkedList<T> ) : LinkedList<T>() object Empty : LinkedList<Nothing>() // Usage val list: LinkedList<Int> = Node(1, Node(2, Node(3, Empty))) val list2: LinkedList<String> = Node("A", Node("B", Empty))

This is a useful and popular trick, especially when we define immutable sealed classes. We use it when we define messages or specify a set of options using a sealed class or interface, etc.

sealed class ValueChange<out T> object Keep : ValueChange<Nothing>() object SetDefault : ValueChange<Nothing>() object SetEmpty : ValueChange<Nothing>() class Set<out T>(val value: T) : ValueChange<T>() sealed class ManagerMessage class CodeProduced(val code: String) : ManagerMessage() object ProductionStopped : ManagerMessage() sealed interface AdView object FacebookAd : AdView object GoogleAd : AdView class OwnAd(val text: String, val imgUrl: String) : AdView

Apart from object declaration, there are other ways to reuse objects, one of which is a factory function with a cache.

Factory function with a cache

Every time we use a constructor, we create a new object, but this is not necessarily true when we use factory functions as they can have a cache. The simplest case is when a factory function always returns the same object. This is, for instance, how emptyList from stdlib is implemented:

fun <T> emptyList(): List<T> = EmptyList

Sometimes we have a set of objects, and we return one of them. For instance, Dispatchers.Default in the Kotlin coroutines library has a pool of threads; whenever we start anything using the default dispatcher, it will be started in a thread that is not in use. Similarly, we might have a pool of connections with a database. Having a pool of objects is a good solution when object creation is heavy and we might need to use a few mutable objects at the same time.

Caching can also be done for parameterized factory functions. In such a case, we might keep our objects in a map:

private val connections: MutableMap<String, Connection> = mutableMapOf<String, Connection>() fun getConnection(host: String) = connections.getOrPut(host) { createConnection(host) }

Caching can be used for all pure functions. In such cases, we call it memoization. Here is a function that calculates a number in a given position in the Fibonacci sequence on the basis of the definition:

private val FIB_CACHE: MutableMap<Int, BigInteger> = mutableMapOf<Int, BigInteger>() fun fib(n: Int): BigInteger = FIB_CACHE.getOrPut(n) { if (n <= 1) BigInteger.ONE else fib(n - 1) + fib(n - 2) }

Now, during the first run our function is nearly as efficient as a linear solution, and later it gives a result immediately if it has already been calculated. A comparison between this and a classic linear Fibonacci number implementation on an example machine is presented in the below table. Also, the iterative implementation we compare it to is presented below.

n = 100n = 200n = 300n = 400
fibIter1997 ns5234 ns7008 ns9727 ns
fib (first)4413 ns9815 ns15484 ns22205 ns
fib (later)8 ns8 ns8 ns8 ns
fun fibIter(n: Int): BigInteger { if (n <= 1) return BigInteger.ONE var p = BigInteger.ONE var pp = BigInteger.ONE for (i in 2..n) { val temp = p + pp pp = p p = temp } return p }

You can see that using this function for the first time is slower than using the classic approach as checking if the value is in the cache and adding it there causes additional overhead. However, once the values have been added, the retrieval is nearly instantaneous.

This has a significant drawback though: we are reserving and using more memory since the Map needs to be stored somewhere. Everything would be fine if this were cleared at some point, but take into account the fact that for the Garbage Collector (GC) there is no difference between a cache and any other static field that might be necessary in the future. The cache will hold this data as long as possible, even if we never use the fib function again. One thing that helps is using a soft reference that can be removed by the GC when memory is needed. This should not be confused with weak references. In simple words, the difference is:

  • A weak reference does not prevent the Garbage Collector from cleaning up a value. So, if no other reference (variable) is using this value, it will be cleaned up.
  • A soft reference does not guarantee that a value won’t be cleaned up by the GC either, but in most JVM implementations this value won’t be cleaned up unless memory is needed. Soft references are perfect when we implement a cache.

This is an example of a property delegate (details in Item 21: Use property delegation to extract common property patterns) that creates a map on demand and lets us use it, but this can be cleared when memory is needed (full implementation should include thread synchronization):

private val FIB_CACHE: MutableMap<Int, BigInteger> by SoftReferenceDelegate { mutableMapOf<Int, BigInteger>() } fun fib(n: Int): BigInteger = FIB_CACHE.getOrPut(n) { if (n <= 1) BigInteger.ONE else fib(n - 1) + fib(n - 2) } class SoftReferenceDelegate<T : Any>( val initialization: () -> T ) { private var reference: SoftReference<T>? = null operator fun getValue( thisRef: Any?, property: KProperty<*> ): T { val stored = reference?.get() if (stored != null) return stored val new = initialization() reference = SoftReference(new) return new } }

Designing a cache well is not easy, and there are already some well-implemented libraries providing them. I've used Caffeine and Ehcache. I am happy with both, but I am still waiting for a well-implemented cache with support for suspending functions.

In the end, caching is always a tradeoff: we buy performance at the cost of memory. Remember this and use caches wisely. No one wants to move from performance issues to lack of memory issues.

Heavy object lifting

A very useful trick for performance is lifting a heavy object to an outer scope. Surely, if possible, we should lift all heavy operations from collection processing functions to a general processing scope. For instance, here is a function that counts the number of values equal to the maximum value:

fun <T : Comparable<T>> Iterable<T>.countMax(): Int = count { it == this.maxOrNull() }

A better solution is to extract the biggest element to the level of the countMax function:

fun <T : Comparable<T>> Iterable<T>.countMax(): Int { val max = this.maxOrNull() return count { it == max } }

This solution is better for performance because we don’t need to find the biggest element on the receiver in every iteration. Notice that it also improves readability by making it clear that max is called on the extension receiver, therefore it is the same throughout all iterations.

Extracting a value to an outer scope so as to not recalculate it unnecessarily is an important practice. This might sound obvious, but it is not always very clear. Just take a look at this function, where we use a regex to determine whether a string contains a valid IP address:

fun String.isValidIpAddress(): Boolean { return this.matches( ("\\A(?:(?:25[0-5]|2[0-4][0-9]|" + "[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|" + "[01]?[0-9][0-9]?)\\z").toRegex() ) } // Usage print("5.173.80.254".isValidIpAddress()) // true

The problem with this function is that the Regex object needs to be created every time we use it. This is a serious disadvantage since regex pattern compilation is a complex operation. This is why this function is not suitable for repeated use in performance-constrained parts of our code. However, we can improve it by lifting the regex up to the top level:

private val IS_VALID_IP_REGEX = "\\A(?:(?:25[0-5]|2[0-4]" + "[0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|" + "[01]?[0-9][0-9]?)\\z".toRegex() fun String.isValidIpAddress(): Boolean = matches(IS_VALID_IP_REGEX)

If this function is in a file together with some other functions and we don’t want to create this object unless it is used, we can even initialize the regex lazily:

private val IS_VALID_IP_REGEX by lazy { ("\\A(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}" + "(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\z").toRegex() }

Making properties lazy is also useful when we are dealing with classes.

Lazy initialization

Often, when we need to create a heavy class, it is better to do so lazily. For instance, imagine that class A needs instances of B, C, and D, all of which are heavy. If we just create them during class creation, the creation of A will be very heavy because it will need to create B, C, and D and then the rest of its body. Therefore, the heaviness of object creation will just accumulate.

class A { val b = B() val c = C() val d = D() //... }

There is a cure though. We can just initialize these heavy objects lazily:

class A { val b by lazy { B() } val c by lazy { C() } val d by lazy { D() } //... }

Each object will then be initialized just before its first usage. The cost of creating these objects will be spread instead of accumulated.

Using primitives

In JVM we have a special built-in type to represent basic elements like numbers or characters. They are called primitives and are used by the Kotlin/JVM compiler under the hood wherever possible. However, there are some cases where a wrapped class needs to be used instead. The two main cases are:

  • When we operate on a nullable type (primitives cannot be null).
  • When we use a type as a generic.

So, in short:

Kotlin typeJava type
Intint
Int?Integer
ListList

Now you know that you can optimize your code to have primitives under the hood instead of wrapped types. Such optimization makes sense mainly on Kotlin/JVM and on some flavors of Kotlin/Native, but it doesn’t make any sense on Kotlin/JS. It should also be remembered that it makes sense only when operations on a number are repeated many times. Access to both primitive and wrapped types is relatively fast compared to other operations. The difference manifests itself when we deal with really big collections (we will discuss this in Item 55: Consider Arrays with primitives for performance-critical processing) or when we operate on an object intensively. Also, remember that forced changes might lead to less readable code. This is why I suggest this optimization only for performance-critical parts of code and in libraries. You can find out what is performance-critical using a profiler.

To see an example, imagine that you are implementing a standard library for Kotlin and you want to introduce a function that will return the biggest element or null if this iterable is empty. You don’t want to iterate over the iterable more than once. This is not a trivial problem, but it can be solved with the following function:

fun Iterable<Int>.maxOrNull(): Int? { var max: Int? = null for (i in this) { max = if (i > (max ?: Int.MIN_VALUE)) i else max } return max }

This implementation has serious disadvantages:

  • We need to use an Elvis operator in every step

  • We use a nullable value, so there will be an Integer instead of an int under the hood in JVM.

Resolving these two problems requires us to implement the iteration using a while loop:

fun Iterable<Int>.maxOrNull(): Int? { val iterator = iterator() if (!iterator.hasNext()) return null var max: Int = iterator.next() while (iterator.hasNext()) { val e = iterator.next() if (max < e) max = e } return max }

For a collection of elements from 1 to 10 million, this optimized implementation took 289 ms on my computer, while the previous one took 518 ms. This is nearly 2 times faster, but remember that this is an extreme case designed to show the difference. Such optimization is rarely reasonable in code that is not performance-critical, but everything is performance-critical if you implement a Kotlin standard library. This is why the second approach is chosen here:

/** * Returns the largest element or `null` if there are * no elements. */ public fun <T : Comparable<T>> Iterable<T>.max(): T? { val iterator = iterator() if (!iterator.hasNext()) return null var max = iterator.next() while (iterator.hasNext()) { val e = iterator.next() if (max < e) max = e } return max }

Summary

In this item, you’ve seen different ways to avoid object creation. Some of them are cheap in terms of readability, so these should be used freely. For instance, lifting heavy objects out of a loop or function is generally a good idea. This is good practice for performance, and thanks to this extraction we can name this object, so we make our function easier to read. Optimizations that are tougher or require bigger changes should possibly be skipped. We should avoid premature optimization unless we have such guidelines in our project or we are developing a library that might be used in unknown ways. We’ve also learned some optimizations that might be used to optimize the performance-critical parts of our code.

1:

To measure the size of concrete fields in JVM objects, use Java Object Layout.

2:

Java Language Specification, Java SE 8 edition, 3.10.5