article banner (priority)

Cancellation in Kotlin Coroutines

This is a chapter from the book Kotlin Coroutines. You can find it on LeanPub or Amazon.

A very important functionality of Kotlin Coroutines is cancellation. It is so important that some classes and libraries use suspending functions primarily to support cancellation1. There is a good reason for this: a good cancellation mechanism is worth its weight in gold2. Just killing a thread is a terrible solution as there should be an opportunity to close connections and free resources. Forcing developers to frequently check if some state is still active isn't convenient either. The problem of cancellation waited for a good solution for a very long time, but what Kotlin Coroutines offer is surprisingly simple: they are convenient and safe. This is the best cancellation mechanism I've seen in my career. Let's explore it.

Basic cancellation

The Job interface has a cancel method, which allows its cancellation. Calling it triggers the following effects:

  • Such a coroutine ends the job at the first suspension point (delay in the example below).
  • If a job has some children, they are also cancelled (but its parent is not affected).
  • Once a job is cancelled, it cannot be used as a parent for any new coroutines. It is first in the "Cancelling" and then in the "Cancelled" state.
import kotlinx.coroutines.delay import kotlinx.coroutines.launch import kotlinx.coroutines.coroutineScope //sampleStart suspend fun main(): Unit = coroutineScope { val job = launch { repeat(1_000) { i -> delay(200) println("Printing $i") } } delay(1100) job.cancel() job.join() println("Cancelled successfully") } // Printing 0 // Printing 1 // Printing 2 // Printing 3 // Printing 4 // Cancelled successfully //sampleEnd

We might cancel with a different exception (by passing an exception as an argument to the cancel function) to specify the cause. This cause needs to be a subtype of CancellationException, because only an exception of this type can be used to cancel a coroutine.

After cancel, we often also add join to wait for the cancellation to finish before we can proceed. Without this, we would have some race conditions. The snippet below shows an example in which without join we will see "Printing 4" after "Cancelled successfully".

import kotlinx.coroutines.coroutineScope import kotlinx.coroutines.delay import kotlinx.coroutines.launch //sampleStart suspend fun main() = coroutineScope { val job = launch { repeat(1_000) { i -> delay(100) Thread.sleep(100) // We simulate long operation println("Printing $i") } } delay(1000) job.cancel() println("Cancelled successfully") } // Printing 0 // Printing 1 // Printing 2 // Printing 3 // Cancelled successfully // Printing 4 //sampleEnd

Adding job.join() would change this because it suspends until a coroutine has finished cancellation.

import kotlinx.coroutines.coroutineScope import kotlinx.coroutines.delay import kotlinx.coroutines.launch //sampleStart suspend fun main() = coroutineScope { val job = launch { repeat(1_000) { i -> delay(100) Thread.sleep(100) // We simulate long operation println("Printing $i") } } delay(1000) job.cancel() job.join() println("Cancelled successfully") } // Printing 0 // Printing 1 // Printing 2 // Printing 3 // Printing 4 // Cancelled successfully //sampleEnd

To make it easier to call cancel and join together, the kotlinx.coroutines library offers a convenient extension function with a self-descriptive name, cancelAndJoin.

// The most explicit function name I've ever seen public suspend fun Job.cancelAndJoin() { cancel() return join() }

A job created using the Job() factory function can be cancelled in the same way. This is often used to make it easy to cancel many coroutines at once.

import kotlinx.coroutines.* //sampleStart suspend fun main(): Unit = coroutineScope { val job = Job() launch(job) { repeat(1_000) { i -> delay(200) println("Printing $i") } } delay(1100) job.cancelAndJoin() println("Cancelled successfully") } // Printing 0 // Printing 1 // Printing 2 // Printing 3 // Printing 4 // Cancelled successfully //sampleEnd

This is a crucial capability. On many platforms, we often need to cancel a group of concurrent tasks. For instance, in Android, we cancel all the coroutines started in a view when a user leaves this view.

class ProfileViewModel : ViewModel() { private val scope = CoroutineScope(Dispatchers.Main + SupervisorJob()) fun onCreate() { scope.launch { loadUserData() } } override fun onCleared() { scope.coroutineContext.cancelChildren() } // ... }

How does cancellation work?

When a job is cancelled, it changes its state to "Cancelling". Then, at the first suspension point, a CancellationException is thrown. This exception can be caught using a try-catch, but it is recommended to rethrow it.

import kotlinx.coroutines.* //sampleStart suspend fun main(): Unit = coroutineScope { val job = Job() launch(job) { try { repeat(1_000) { i -> delay(200) println("Printing $i") } } catch (e: CancellationException) { println(e) throw e } } delay(1100) job.cancelAndJoin() println("Cancelled successfully") delay(1000) } // Printing 0 // Printing 1 // Printing 2 // Printing 3 // Printing 4 // JobCancellationException... // Cancelled successfully //sampleEnd

Keep in mind that a cancelled coroutine is not just stopped: it is cancelled internally using an exception. Therefore, we can freely clean up everything inside the finally block. For instance, we can use a finally block to close a file or a database connection. Since most resource-closing mechanisms rely on the finally block (for instance, if we read a file using useLines), we simply do not need to worry about them.

import kotlinx.coroutines.* import kotlin.random.Random //sampleStart suspend fun main(): Unit = coroutineScope { val job = Job() launch(job) { try { delay(Random.nextLong(2000)) println("Done") } finally { print("Will always be printed") } } delay(1000) job.cancelAndJoin() } // Will always be printed // (or) // Done // Will always be printed //sampleEnd

Just one more call

Since we can catch CancellationException and invoke more operations before the coroutine truly ends, you might be wondering where the limit is. The coroutine can run as long as it needs to clean up all the resources. However, suspension is no longer allowed. The Job is already in a "Cancelling" state, in which suspension or starting another coroutine is not possible at all. If we try to start another coroutine, it will just be ignored. If we try to suspend, it will throw CancellationException.

import kotlinx.coroutines.* import kotlin.random.Random suspend fun main(): Unit = coroutineScope { val job = Job() launch(job) { try { delay(2000) println("Job is done") } finally { println("Finally") launch { // will be ignored println("Will not be printed") } delay(1000) // here exception is thrown println("Will not be printed") } } delay(1000) job.cancelAndJoin() println("Cancel done") } // (1 sec) // Finally // Cancel done

Sometimes, we truly need to use a suspending call when a coroutine is already cancelled. For instance, we might need to roll back changes in a database. In this case, the preferred way is to wrap this call with the withContext(NonCancellable) function. We will later explain in detail how withContext works. For now, all we need to know is that it changes the context of a block of code. Inside withContext, we used the NonCancellable object, which is a Job that cannot be cancelled. So, inside the block the job is in the active state, and we can call whatever suspending functions we want.

import kotlinx.coroutines.* import kotlin.random.Random //sampleStart suspend fun main(): Unit = coroutineScope { val job = Job() launch(job) { try { delay(200) println("Coroutine finished") } finally { println("Finally") withContext(NonCancellable) { delay(1000L) println("Cleanup done") } } } delay(100) job.cancelAndJoin() println("Done") } // Finally // Cleanup done // Done //sampleEnd

invokeOnCompletion

Another mechanism that is often used to free resources is the invokeOnCompletion function from Job. It is used to set a handler to be called when the job reaches a terminal state, namely either "Completed" or "Cancelled".

import kotlinx.coroutines.* //sampleStart suspend fun main(): Unit = coroutineScope { val job = launch { delay(1000) } job.invokeOnCompletion { exception: Throwable? -> println("Finished") } delay(400) job.cancelAndJoin() } // Finished //sampleEnd

One of this handler’s parameters is an exception:

  • null if the job finished with no exception;
  • CancellationException if the coroutine was cancelled;
  • the exception that finished a coroutine (more about this in the next chapter).

If a job was completed before invokeOnCompletion was called, the handler will be invoked immediately. The onCancelling3 and invokeImmediately4 parameters allow further customization.

import kotlinx.coroutines.* import kotlin.random.Random //sampleStart suspend fun main(): Unit = coroutineScope { val job = launch { delay(Random.nextLong(2400)) println("Finished") } delay(800) job.invokeOnCompletion { exception: Throwable? -> println("Will always be printed") println("The exception was: $exception") } delay(800) job.cancelAndJoin() } // Will always be printed // The exception was: // kotlinx.coroutines.JobCancellationException // (or) // Finished // Will always be printed // The exception was null //sampleEnd

invokeOnCompletion is called synchronously during cancellation, and we do not control the thread in which it will be running.

Stopping the unstoppable

Because cancellation happens at the suspension points, it will not happen if there is no suspension point. To simulate such a situation, we could use Thread.sleep instead of delay. This is a terrible practice, so please don’t do this in any real-life projects. We are just trying to simulate a case in which we are using our coroutines extensively but not suspending them. In practice, such a situation might happen if we have some complex calculations, like neural network learning (yes, we also use coroutines for such cases in order to simplify processing parallelization), or when we need to do some blocking calls (for instance, reading files).

The example below presents a situation in which a coroutine cannot be cancelled because there is no suspension point inside it (we use Thread.sleep instead of delay). The execution needs over 3 minutes, even though it should be cancelled after 1 second.

import kotlinx.coroutines.* //sampleStart suspend fun main(): Unit = coroutineScope { val job = Job() launch(job) { repeat(1_000) { i -> Thread.sleep(200) // We might have some // complex operations or reading files here println("Printing $i") } } delay(1000) job.cancelAndJoin() println("Cancelled successfully") delay(1000) } // Printing 0 // Printing 1 // Printing 2 // ... (up to 1000) //sampleEnd

There are a few ways to deal with such situations. The first one is to use the yield() function from time to time. This function suspends and immediately resumes a coroutine. This gives an opportunity to do whatever is needed during suspension (or resuming), including cancellation (or changing a thread using a dispatcher).

import kotlinx.coroutines.* //sampleStart suspend fun main(): Unit = coroutineScope { val job = Job() launch(job) { repeat(1_000) { i -> Thread.sleep(200) yield() println("Printing $i") } } delay(1100) job.cancelAndJoin() println("Cancelled successfully") delay(1000) } // Printing 0 // Printing 1 // Printing 2 // Printing 3 // Printing 4 // Cancelled successfully //sampleEnd

Another option is to track the state of the job. Inside a coroutine builder, this (the receiver) references the scope of this builder. CoroutineScope has a context we can reference using the coroutineContext property. Thus, we can access the coroutine job (coroutineContext[Job] or coroutineContext.job) and check what its current state is. Since a job is often used to check if a coroutine is active, the Kotlin Coroutines library provides a function to simplify that:

public val CoroutineScope.isActive: Boolean get() = coroutineContext[Job]?.isActive ?: true

We can use the isActive property to check if a job is still active and stop calculations when it is inactive.

import kotlinx.coroutines.* //sampleStart suspend fun main(): Unit = coroutineScope { val job = Job() launch(job) { do { Thread.sleep(200) println("Printing") } while (isActive) } delay(1100) job.cancelAndJoin() println("Cancelled successfully") } // Printing // Printing // Printing // Printing // Printing // Printing // Cancelled successfully //sampleEnd

Alternatively, we might use the ensureActive() function, which throws CancellationException if Job is not active.

import kotlinx.coroutines.* //sampleStart suspend fun main(): Unit = coroutineScope { val job = Job() launch(job) { repeat(1000) { num -> Thread.sleep(200) ensureActive() println("Printing $num") } } delay(1100) job.cancelAndJoin() println("Cancelled successfully") } // Printing 0 // Printing 1 // Printing 2 // Printing 3 // Printing 4 // Cancelled successfully //sampleEnd

The result of ensureActive() and yield() seem similar, but they are very different. The function ensureActive() needs to be called on a CoroutineScope (or CoroutineContext, or Job). All it does is throw an exception if the job is no longer active. It is lighter, so generally it should be preferred. The function yield is a regular top-level suspension function. It does not need any scope, so it can be used in regular suspending functions. Since it does suspension and resuming, other effects might arise, such as thread changing if we use a dispatcher with a pool of threads (more about this in the Dispatchers chapter). yield is more often used just in suspending functions that are CPU intensive or are blocking threads.

suspendCancellableCoroutine

Here, you might remind yourself of the suspendCancellableCoroutine function introduced in the How does suspension work? chapter. It behaves like suspendCoroutine, but its continuation is wrapped into CancellableContinuation<T>, which provides some additional methods. The most important one is invokeOnCancellation, which we use to define what should happen when a coroutine is cancelled. Most often we use it to cancel processes in a library or to free some resources.

suspend fun someTask() = suspendCancellableCoroutine { cont -> cont.invokeOnCancellation { // do cleanup } // rest of the implementation }

Here is a full example in which we wrap a Retrofit Call with a suspending function.

suspend fun getOrganizationRepos( organization: String ): List<Repo> = suspendCancellableCoroutine { continuation -> val orgReposCall = apiService .getOrganizationRepos(organization) orgReposCall.enqueue(object : Callback<List<Repo>> { override fun onResponse( call: Call<List<Repo>>, response: Response<List<Repo>> ) { if (response.isSuccessful) { val body = response.body() if (body != null) { continuation.resume(body) } else { continuation.resumeWithException( ResponseWithEmptyBody ) } } else { continuation.resumeWithException( ApiException( response.code(), response.message() ) ) } } override fun onFailure( call: Call<List<Repo>>, t: Throwable ) { continuation.resumeWithException(t) } }) continuation.invokeOnCancellation { orgReposCall.cancel() } }

It’s so good that Retrofit now supports suspending functions!

class GithubApi { @GET("orgs/{organization}/repos?per_page=100") suspend fun getOrganizationRepos( @Path("organization") organization: String ): List<Repo> }

The CancellableContinuation<T> also lets us check the job state (using the isActive, isCompleted and isCancelled properties) and cancel this continuation with an optional cancellation cause.

Summary

Cancellation is a powerful feature. It is generally easy to use, but it can sometimes be tricky. So, it is important to understand how it works.

A properly used cancellation means fewer wasted resources and fewer memory leaks. It is important for our application’s performance, and I hope you will use these advantages from now on.

1:

A good example is CoroutineWorker on Android, where according to the presentation Understand Kotlin Coroutines on Android on Google I/O'19 by Sean McQuillan and Yigit Boyar (both working on Android at Google), support for coroutines was added primarily to use the cancellation mechanism.

2:

Actually, it’s worth much more since the code is currently not very heavy (it used to be, when it was stored on punched cards).

3:

If true, the function is called in the "Cancelling" state (i.e., before "Cancelled"). false by default.

4:

This parameter determines whether the handler should be called immediately if the handler is set when a coroutine is already in the desired state. true by default.