
Why using Kotlin Coroutines?

This is a chapter from the book Kotlin Coroutines. You can find it on LeanPub or Amazon.

Why do we need to learn Kotlin Coroutines? We already have well-established JVM libraries like RxJava or Reactor. Moreover, Java itself has support for multithreading, while many people also choose to just use plain old callbacks instead. Clearly, we already have many options for performing asynchronous operations.

Kotlin Coroutines offer much more than that. They are an implementation of a concept that was first described in 1963 [1] but waited years for a proper industry-ready implementation [2]. Kotlin Coroutines connect the powerful capabilities presented in half-century-old papers to a library that is designed to help in real-life use cases. What is more, Kotlin Coroutines are multiplatform, which means they can be used across all Kotlin platforms (like JVM, JS, iOS, and also in the common modules). Coroutines are also very efficient, significantly more efficient than libraries like RxJava or Reactor, and incomparably more efficient than blocking classic threads. Finally, they do not change the code structure drastically. We can use most Kotlin coroutines' capabilities nearly effortlessly (which we cannot say about RxJava or callbacks). This makes them beginner-friendly [3].

Let's see it in practice. We will explore how different common use cases are solved by coroutines and other well-known approaches. I will show two typical use cases: Android and backend business logic implementation. Let's start with the first one.

Coroutines on Android (and other frontend platforms)

When you implement application logic on the frontend, what you most often need to do is:

  • get some data from one or many sources (API, view element, database, preferences, another application);
  • process this data;
  • do something with this data (display it in the view, store it in a database, send it to an API).

To make our discussion more practical, let's first assume we are developing an Android application. We will start with a situation in which we need to get news from an API, sort it, and display it on the screen. This is a direct representation of what we want our function to do:

    fun onCreate() {
        val news = getNewsFromApi()
        val sortedNews = news
            .sortedByDescending { it.publishedAt }
        view.showNews(sortedNews)
    }

Sadly, this cannot be done so easily. On Android, each application has only one thread that can modify the view. This thread is very important and should never be blocked. That is why the above function cannot be implemented in this way. If it were started on the main thread, getNewsFromApi would block it, and our application would crash. If we started it on another thread, our application would crash when we call showNews because it needs to run on the main thread.

Thread switching

We could solve these problems by switching threads. First to a thread that can be blocked, and then to the main thread.

    fun onCreate() {
        thread {
            val news = getNewsFromApi()
            val sortedNews = news
                .sortedByDescending { it.publishedAt }
            runOnUiThread {
                view.showNews(sortedNews)
            }
        }
    }

Such thread switching can still be found in some applications, but it is known for being problematic for several reasons:

  • There is no mechanism here to cancel these threads, so we often face memory leaks.
  • Making so many threads is costly.
  • Frequently switching threads is confusing and hard to manage.
  • The code will unnecessarily get bigger and more complicated.

To see those problems clearly, imagine the following situation: You open and quickly close a view. While opening, you might have started multiple threads that fetch and process data. Without cancelling them, they will still be doing their job and trying to modify a view that no longer exists. This means unnecessary work for your device, possible exceptions in the background, and who knows what other unexpected results.
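The situation described above can be reproduced in a few lines. This is only a minimal sketch: FakeView is a hypothetical stand-in for an Android view (not a framework class), and the 200 ms sleep stands in for a slow API call.

```kotlin
import java.util.concurrent.CopyOnWriteArrayList
import kotlin.concurrent.thread

// Hypothetical stand-in for an Android view; not a real framework class.
class FakeView {
    @Volatile
    var destroyed = false
    val log = CopyOnWriteArrayList<String>()

    fun showNews(news: List<String>) {
        // Record whether the update arrived after the view was closed.
        log += if (destroyed) "update after destroy" else "update"
    }
}

fun openAndCloseView(): List<String> {
    val view = FakeView()
    val worker = thread {
        Thread.sleep(200L)            // simulates a slow API call
        view.showNews(listOf("news")) // nothing stops this from running
    }
    view.destroyed = true // the user leaves the screen right away
    worker.join()         // only here so the demo can observe the result
    return view.log.toList()
}

fun main() {
    println(openAndCloseView()) // [update after destroy]
}
```

The thread has no idea the view is gone, so it finishes its work and tries to update it anyway.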

Considering all these problems, let's look for a better solution.


Callbacks

Callbacks are another pattern that might be used to solve our problems. The idea is that we make our functions non-blocking, but we pass to them a function that should be executed once the process started by the callback function has finished. This is how our function might look if we use this pattern:

    fun onCreate() {
        getNewsFromApi { news ->
            val sortedNews = news
                .sortedByDescending { it.publishedAt }
            view.showNews(sortedNews)
        }
    }

Notice that this implementation does not support cancellation. We might make cancellable callback functions, but it is not easy. Not only does each callback function need to be specially implemented for cancellation, but to cancel them we need to collect all the objects separately.

    fun onCreate() {
        startedCallbacks += getNewsFromApi { news ->
            val sortedNews = news
                .sortedByDescending { it.publishedAt }
            view.showNews(sortedNews)
        }
    }
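To give a feeling for the extra work involved, here is a rough sketch of what a cancellable callback might look like. The Cancellable interface and the PendingRequest class are hypothetical names invented for this illustration; they are not part of any library, and a real implementation would also need to abort the underlying request.

```kotlin
// Hypothetical handle returned by a cancellable callback function.
fun interface Cancellable {
    fun cancel()
}

// Hypothetical request object: it delivers a result to the callback
// unless it was cancelled first.
class PendingRequest<T>(private val callback: (T) -> Unit) : Cancellable {
    private var cancelled = false

    override fun cancel() {
        cancelled = true
    }

    fun complete(value: T) {
        if (!cancelled) callback(value)
    }
}

fun cancelCollectedCallbacks(): List<String> {
    val shown = mutableListOf<String>()
    val startedCallbacks = mutableListOf<Cancellable>()

    val first = PendingRequest<String> { shown += it }
    val second = PendingRequest<String> { shown += it }
    startedCallbacks += first
    startedCallbacks += second

    first.complete("news A")                 // arrives while the screen is open
    startedCallbacks.forEach { it.cancel() } // screen closed: cancel everything we collected
    second.complete("news B")                // arrives too late and is ignored
    return shown
}

fun main() {
    println(cancelCollectedCallbacks()) // [news A]
}
```

Every callback function has to return such a handle, and every caller has to remember to store it and cancel it at the right moment.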

Callback architecture solves this simple problem, but it has many downsides. To explore them, let's discuss a more complex case in which we need to get data from three endpoints:

    fun showNews() {
        getConfigFromApi { config ->
            getNewsFromApi(config) { news ->
                getUserFromApi { user ->
                    view.showNews(user, news)
                }
            }
        }
    }

This code is far from perfect for several reasons:

  • Getting news and user data might be parallelized, but our current callback architecture doesn't support that (it would be hard to achieve this with callbacks).
  • As mentioned before, supporting cancellation would require a lot of additional effort.
  • The increasing number of indentation levels makes this code hard to read (code with multiple callbacks is often considered highly unreadable). Such a situation is called "callback hell", which can be found especially in some older Node.js projects.

  • When we use callbacks, it is hard to control what happens after what. The following way of showing a progress indicator will not work:
    fun onCreate() {
        showProgressBar()
        showNews()
        hideProgressBar() // Wrong
    }

The progress bar will be hidden just after starting the process of showing news, so practically immediately after it has been shown. To make this work, we would need to make showNews a callback function as well.

    fun onCreate() {
        showProgressBar()
        showNews {
            hideProgressBar()
        }
    }

That's why the callback architecture is far from perfect for non-trivial cases. Let's take a look at another approach: RxJava and other reactive streams.

RxJava and other reactive streams

An alternative approach that is popular in Java (both in Android and backend) is using reactive streams (or Reactive Extensions): RxJava or its successor Reactor. With this approach, all operations happen inside a stream of data that can be started, processed, and observed. These streams support thread-switching and concurrent processing, so they are often used to parallelize processing in applications.

This is how we might solve our problem using RxJava:

    fun onCreate() {
        disposables += getNewsFromApi()
            .subscribeOn(Schedulers.io())
            .observeOn(AndroidSchedulers.mainThread())
            .map { news ->
                news.sortedByDescending { it.publishedAt }
            }
            .subscribe { sortedNews ->
                view.showNews(sortedNews)
            }
    }

The disposables in the above example are needed to cancel this stream if (for example) the user exits the screen.

This is definitely a better solution than callbacks: no memory leaks, cancellation is supported, proper use of threads. The only problem is that it is complicated. If you compare it with the "ideal" code from the beginning (also shown below), you'll see that they have very little in common.

    fun onCreate() {
        val news = getNewsFromApi()
        val sortedNews = news
            .sortedByDescending { it.publishedAt }
        view.showNews(sortedNews)
    }

All these functions, like subscribeOn, observeOn, map, or subscribe, need to be learned. Cancelling needs to be explicit. Functions need to return objects wrapped inside Observable or Single classes. In practice, when we introduce RxJava, we need to reorganize our code a lot.

fun getNewsFromApi(): Single<List<News>>

Think of the second problem, for which we need to call three endpoints before showing the data. This can be solved properly with RxJava, but it is even more complicated.

    fun showNews() {
        disposables += Observable.zip(
            getConfigFromApi()
                .flatMap { getNewsFromApi(it) }
                .subscribeOn(Schedulers.io()),
            getUserFromApi()
                .subscribeOn(Schedulers.io())
        ) { news: News, user: User -> Pair(news, user) }
            .subscribeOn(Schedulers.io())
            .observeOn(AndroidSchedulers.mainThread())
            .subscribe { (news, user) ->
                view.showNews(news, user)
            }
    }

This code is truly concurrent and has no memory leaks, but we need to introduce RxJava functions such as zip and flatMap, pack the values into a Pair and destructure it, and properly set schedulers for each stream. This is a correct implementation, but it's quite complicated. So finally, let's see what coroutines offer us.

Using Kotlin Coroutines

The core functionality that Kotlin Coroutines introduce is the ability to suspend a coroutine at some point and resume it in the future. Thanks to that, we might run our code on the Main thread and suspend it when we request data from an API. When a coroutine is suspended, the thread is not blocked and is free to be used by other processes, so it can be used to change the view or run other coroutines. Once the data is ready, the coroutine waits for the Main thread (this is a rare situation, but there might be a queue of coroutines waiting for it); once it gets the thread, it can continue from the point where it stopped.

This picture shows updateNews and updateProfile functions running on the Main thread in separate coroutines. They can do this interchangeably because they suspend their coroutines instead of blocking the thread. When the updateNews function is waiting for a network response, the Main thread is used by updateProfile. Here, it's assumed that getUserData did not suspend because the user's data was already cached, so updateProfile can run until its completion. This wasn't enough time for the network response to arrive, so the Main thread is not used at that time (it can be used by other functions). Once the data appears, we grab the Main thread and use it to resume the updateNews function, starting from the point straight after getNewsFromApi().

By definition, coroutines are components that can be suspended and resumed. Concepts like async/await and generators, which can be found in languages like JavaScript, Rust or Python, also use coroutines, but their capabilities are very limited.
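To make "suspended and resumed" more tangible, here is a small sketch that uses only the low-level primitives from the Kotlin standard library (kotlin.coroutines), without the kotlinx.coroutines library. The function name and the log messages are made up for this illustration; in everyday code you would rarely touch continuations directly.

```kotlin
import kotlin.coroutines.Continuation
import kotlin.coroutines.EmptyCoroutineContext
import kotlin.coroutines.resume
import kotlin.coroutines.startCoroutine
import kotlin.coroutines.suspendCoroutine

fun suspendAndResume(): List<String> {
    val log = mutableListOf<String>()
    var saved: Continuation<Unit>? = null

    val coroutine: suspend () -> Unit = {
        log += "coroutine started"
        // Suspend here and keep the continuation so we can resume later.
        suspendCoroutine<Unit> { continuation -> saved = continuation }
        log += "coroutine resumed"
    }

    // Runs the coroutine on the current thread until its first suspension point.
    coroutine.startCoroutine(Continuation<Unit>(EmptyCoroutineContext) { })

    // The coroutine is suspended, but the thread is not blocked.
    log += "thread did other work"

    // Resume the coroutine from the point where it stopped.
    saved?.resume(Unit)
    return log
}

fun main() {
    println(suspendAndResume())
    // [coroutine started, thread did other work, coroutine resumed]
}
```

The same thread starts the coroutine, does something else while the coroutine is suspended, and then continues the coroutine from where it left off.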

So, our first problem might be solved by using Kotlin coroutines in the following way:

    fun onCreate() {
        viewModelScope.launch {
            val news = getNewsFromApi()
            val sortedNews = news
                .sortedByDescending { it.publishedAt }
            view.showNews(sortedNews)
        }
    }

In the above code, I used viewModelScope, which is currently quite common on Android. We might instead use a custom scope. We will discuss both options later.

This code is nearly identical to what we've wanted since the beginning! In this solution, the code runs on the Main thread but it never blocks it. Thanks to the suspension mechanism, we are suspending (instead of blocking) the coroutine when we need to wait for data. When the coroutine is suspended, the Main thread can go do other things, like drawing a beautiful progress bar animation. Once the data is ready, our coroutine takes the Main thread again and starts from where it previously stopped.
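The thread-based version had no way to stop work once the user left the screen. As a rough sketch of how this looks with coroutines, the example below cancels a coroutine while it is suspended. It assumes the kotlinx.coroutines library and uses a plain coroutineScope in place of viewModelScope; the function name and messages are invented for this illustration.

```kotlin
import kotlinx.coroutines.*

suspend fun openAndLeaveScreen(): List<String> = coroutineScope {
    val events = mutableListOf<String>()
    val job = launch {
        events += "loading news"
        delay(1000L)             // simulates waiting for the API
        events += "showing news" // not reached: the coroutine is cancelled first
    }
    delay(100L)         // the user leaves the screen after 100 ms
    job.cancelAndJoin() // cancel the coroutine and wait until it has finished
    events
}

fun main() = runBlocking {
    println(openAndLeaveScreen()) // [loading news]
}
```

On Android, viewModelScope is cancelled when its ViewModel is cleared, so coroutines started in it are stopped in a similar way without us collecting anything by hand.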

How about the other problem with three calls? It could be solved similarly:

    fun showNews() {
        viewModelScope.launch {
            val config = getConfigFromApi()
            val news = getNewsFromApi(config)
            val user = getUserFromApi()
            view.showNews(user, news)
        }
    }

This solution looks good, but how it works is not optimal. These calls will happen sequentially (one after another), so if each of them takes 1 second, the whole function will take 3 seconds instead of 2 seconds, which we can achieve if the API calls execute in parallel. This is where the Kotlin coroutines library helps us with functions like async, which can be used to immediately start another coroutine with some request and wait for its result to arrive later (with the await function).

    fun showNews() {
        viewModelScope.launch {
            val config = async { getConfigFromApi() }
            val news = async { getNewsFromApi(config.await()) }
            val user = async { getUserFromApi() }
            view.showNews(user.await(), news.await())
        }
    }

This code is still simple and readable. It uses the async/await pattern that is popular in other languages, such as JavaScript and C#. It is also efficient and does not cause memory leaks. The code is both simple and well implemented.
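The difference between the sequential and the concurrent version can be measured with a small experiment. In this sketch, the three API functions are hypothetical stand-ins that simply wait one second using delay and return strings; it assumes the kotlinx.coroutines library. The exact timings will vary between runs, so no output is shown.

```kotlin
import kotlinx.coroutines.*
import kotlin.system.measureTimeMillis

// Hypothetical stand-ins for the three API calls; each waits about one second.
suspend fun getConfigFromApi(): String {
    delay(1000L)
    return "config"
}

suspend fun getNewsFromApi(config: String): String {
    delay(1000L)
    return "news for $config"
}

suspend fun getUserFromApi(): String {
    delay(1000L)
    return "user"
}

// One call after another: roughly three seconds in total.
suspend fun loadSequentially(): Pair<String, String> {
    val config = getConfigFromApi()
    val news = getNewsFromApi(config)
    val user = getUserFromApi()
    return user to news
}

// The user request runs alongside config + news: roughly two seconds in total.
suspend fun loadConcurrently(): Pair<String, String> = coroutineScope {
    val config = async { getConfigFromApi() }
    val news = async { getNewsFromApi(config.await()) }
    val user = async { getUserFromApi() }
    user.await() to news.await()
}

fun main() = runBlocking {
    val sequentialMillis = measureTimeMillis { loadSequentially() }
    val concurrentMillis = measureTimeMillis { loadConcurrently() }
    println("sequential: $sequentialMillis ms")
    println("concurrent: $concurrentMillis ms")
}
```

Both versions return the same data; only the waiting overlaps in the second one, because news still has to wait for config while the user request does not depend on either.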

With Kotlin coroutines, we can easily implement different use cases and use other Kotlin features. For instance, they do not block us from using for-loops or collection-processing functions. Below, you can see how the next pages might be downloaded in parallel or one after another.

    // all pages will be loaded simultaneously
    fun showAllNews() {
        viewModelScope.launch {
            val allNews = (0 until getNumberOfPages())
                .map { page -> async { getNewsFromApi(page) } }
                .flatMap { it.await() }
            view.showAllNews(allNews)
        }
    }

    // next pages are loaded one after another
    fun showPagesFromFirst() {
        viewModelScope.launch {
            for (page in 0 until getNumberOfPages()) {
                val news = getNewsFromApi(page)
                view.showNextPage(news)
            }
        }
    }

Coroutines on the backend

In my opinion, the biggest advantage of using coroutines on the backend is simplicity. Unlike RxJava, using coroutines barely changes how our code looks. In most cases, migrating from threads to coroutines only involves adding the suspend modifier. When we do this, we can easily introduce concurrency, test concurrent behavior, cancel coroutines, and use all the other powerful features we will explore in this book.

    suspend fun getArticle(
        articleKey: String,
        lang: Language
    ): ArticleJson? {
        return articleRepository.getArticle(articleKey, lang)
            ?.let { toArticleJson(it) }
    }

    suspend fun getAllArticles(
        userUuid: String?,
        lang: Language
    ): List<ArticleJson> = coroutineScope {
        val user = async { userRepo.findUserByUUID(userUuid) }
        val articles = articleRepo.getArticles(lang)
        articles
            .filter { hasAccess(user.await(), it) }
            .map { toArticleJson(it) }
    }

Besides all these features, there is one more important reason to use coroutines: threads are costly. They need to be created and maintained, and they need their memory allocated [4]. If your application is used by millions of users and you are blocking whenever you wait for a response from a database or another service, this adds up to a significant cost in memory and processor use (for the creation, maintenance, and synchronization of these threads).

This problem can be visualized with the following snippets that simulate a backend service with 100,000 users asking for data. The first snippet starts 100,000 threads and makes them sleep for a second (to simulate waiting for a response from a database or other service). If you run it on your computer, you will see it takes a while to print all those dots, or it will break with an OutOfMemoryError. This is the cost of running so many threads. The second snippet uses coroutines instead of threads and suspends them instead of making them sleep. If you run it, the program will wait for a second and then print all the dots. The cost of starting all these coroutines is so low that it is barely noticeable.

    import kotlin.concurrent.thread

    fun main() {
        repeat(100_000) {
            thread {
                Thread.sleep(1000L)
                print(".")
            }
        }
    }

    import kotlinx.coroutines.*

    fun main() = runBlocking {
        repeat(100_000) {
            launch {
                delay(1000L)
                print(".")
            }
        }
    }


I hope you feel convinced to learn more about Kotlin coroutines now. They are much more than just a library, and they make concurrent programming as easy as possible with modern tools. If we have that settled, let's start learning. For the rest of this chapter, we will explore how suspension works: first from the usage point of view, then under the hood.


[1] Conway, Melvin E. (July 1963). "Design of a Separable Transition-diagram Compiler". Communications of the ACM. ACM. 6 (7): 396–408. doi:10.1145/366663.366704. ISSN 0001-0782. S2CID 10559786

[2] I believe that the first industry-ready and universal coroutines were introduced by Go in 2009. However, it is worth mentioning that coroutines were also implemented in some older languages, like Lisp, but they didn't become popular. I believe this is because their implementation wasn't designed to support real-life cases. Lisp (just like Haskell) was mostly treated as a playground for scientists rather than as a language for professionals.

[3] This does not change the fact that we should understand coroutines to use them well.

[4] Most often, the default size of the thread stack is 1 MB. Due to Java optimizations, this does not necessarily mean 1 MB times the number of threads will be used, but a lot of extra memory is spent just because we create threads.