Cancellation in Kotlin Coroutines

This is a chapter from the book Kotlin Coroutines. You can find Early Access on LeanPub.

Cancellation is a very important feature of Kotlin Coroutines. It is so important that some classes and libraries use suspending functions primarily to support cancellation [1]. There is a good reason for this: a good cancellation mechanism is worth its weight in gold [2]. Just killing a thread is a terrible solution, because there should be an opportunity to close connections and free resources. Forcing developers to frequently check whether some state is still active isn't convenient either. The problem of cancellation waited for a good answer for a very long time, but what Kotlin Coroutines offers is surprisingly simple, convenient, and safe. This is the best cancellation mechanism I've seen in my career. Let's explore it.

Basic cancellation

The Job interface has a cancel method, which allows the job to be canceled. Calling it triggers the following effects:

  • The coroutine ends at the first suspension point (delay in the example below).
  • If a job has some children, they are canceled too (but its parent is not affected).
  • Once a job is canceled, it cannot be used as a parent for any new coroutines. It is first in the "Cancelling" state and then in the "Cancelled" state.

import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

fun main() = runBlocking {
    val job = launch {
        repeat(1_000) { i ->
            delay(200)
            println("Printing $i")
        }
    }

    delay(1100)
    job.cancel()
    job.join()
    println("Cancelled successfully")
}
// Printing 0
// Printing 1
// Printing 2
// Printing 3
// Printing 4
// Cancelled successfully
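
The effects listed above can also be checked directly on the Job objects. The following is only a minimal sketch: the function name cancelEffects is made up for this illustration, and yield() is used just to let the coroutine start before it is canceled.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper for this sketch (not part of kotlinx.coroutines).
fun cancelEffects(): List<Boolean> = runBlocking {
    val job = launch {
        launch { delay(10_000) } // child of `job`, canceled together with it
        delay(10_000)
    }
    yield() // let `job` start and suspend on delay
    job.cancelAndJoin()

    // A canceled job cannot serve as a parent:
    // a coroutine started on it is canceled at once.
    val late = launch(job) { delay(10_000) }

    listOf(
        job.isCancelled,  // the job itself was canceled
        isActive,         // its parent (runBlocking) is not affected
        late.isCancelled  // the coroutine started on the canceled job
    )
}

fun main() {
    println(cancelEffects())
}
```

All three values are expected to be true: the job is canceled, the runBlocking scope that started it stays active, and the late coroutine never gets a chance to run.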

We might cancel with a different exception (by using an exception as an argument to the cancel function) to specify the cause. This cause needs to be a subtype of CancellationException, because only an exception of this type can be used to cancel a coroutine.
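
As a small sketch of passing a cause, the snippet below cancels a job with a custom CancellationException and reads its message inside the coroutine. The function name cancelWithCause and the message text are assumptions made for this illustration.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper for this sketch: cancels a job with a custom cause
// and returns the message observed inside the canceled coroutine.
fun cancelWithCause(): String? = runBlocking {
    var observed: String? = null
    val job = launch {
        try {
            delay(10_000) // suspension point where cancellation is delivered
        } catch (e: CancellationException) {
            observed = e.message
            throw e // rethrow so that cancellation proceeds normally
        }
    }
    yield() // let the coroutine start and suspend on delay
    job.cancel(CancellationException("User left the screen"))
    job.join()
    observed
}

fun main() {
    println(cancelWithCause()) // User left the screen
}
```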

After cancel, we often also add join, to wait for the cancellation to be done before we can move further. Without it, we would have some race conditions. For instance, let's say that cancel was called straight after delay in the coroutine. It would be possible that "Cancelled successfully" would be printed before the last "Printing $i".

To make it easier to call cancel and join together, the kotlinx.coroutines library offers a convenient extension function with a self-descriptive name: cancelAndJoin.

// The most explicit function name I've seen
public suspend fun Job.cancelAndJoin() {
    cancel()
    return join()
}

A job created using the Job() factory function can be canceled the same way. This is often used to make it easy to cancel many coroutines at once.

import kotlinx.coroutines.*

suspend fun main(): Unit = coroutineScope {
    val job = Job()
    launch(job) {
        repeat(1_000) { i ->
            delay(200)
            println("Printing $i")
        }
    }
    delay(1100)
    job.cancelAndJoin()
    println("Cancelled successfully")
}
// Printing 0
// Printing 1
// Printing 2
// Printing 3
// Printing 4
// Cancelled successfully

This is a crucial capability. On many platforms, we often need to cancel a group of concurrent tasks. For instance, on Android, we cancel all the coroutines started on a view when the user leaves this view.

class ProfileViewModel : ViewModel() {
    private val scope =
        CoroutineScope(Dispatchers.Main + SupervisorJob())

    fun onCreate() {
        scope.launch { loadUserData() }
    }

    fun onDestroy() {
        scope.coroutineContext.cancelChildren()
    }

    // ...
}
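
The same pattern can be tried outside Android. The sketch below is an assumption-laden illustration: it uses Dispatchers.Default instead of Dispatchers.Main so that it runs on a plain JVM, and the function name cancelChildrenDemo is made up. It shows that cancelChildren cancels every coroutine started on the scope while the scope itself stays active and can be reused.

```kotlin
import kotlinx.coroutines.*

// Hypothetical demo; Dispatchers.Default replaces Dispatchers.Main
// so that the sketch does not depend on Android.
fun cancelChildrenDemo(): Triple<Boolean, Boolean, Boolean> = runBlocking {
    val scope = CoroutineScope(Dispatchers.Default + SupervisorJob())
    val first = scope.launch { delay(10_000) }
    val second = scope.launch { delay(10_000) }

    // Cancels both coroutines, but not the scope's own job
    scope.coroutineContext.cancelChildren()
    joinAll(first, second)

    val result = Triple(first.isCancelled, second.isCancelled, scope.isActive)
    scope.cancel() // clean up the scope once it is no longer needed
    result
}

fun main() {
    println(cancelChildrenDemo()) // (true, true, true)
}
```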

How does cancellation work?

When a job is cancelled, it changes its state to "Cancelling". Then, at the first suspension point, a CancellationException is thrown. This exception can be caught using a try-catch, but it is recommended to rethrow it afterwards.

import kotlinx.coroutines.*

suspend fun main(): Unit = coroutineScope {
    val job = Job()
    launch(job) {
        try {
            repeat(1_000) { i ->
                delay(200)
                println("Printing $i")
            }
        } catch (e: CancellationException) {
            println(e)
            throw e
        }
    }
    delay(1100)
    job.cancelAndJoin()
    println("Cancelled successfully")
    delay(1000)
}
// Printing 0
// Printing 1
// Printing 2
// Printing 3
// Printing 4
// JobCancellationException...
// Cancelled successfully
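
To illustrate why rethrowing is recommended, here is a rough sketch comparing a coroutine that swallows the exception with one that rethrows it. The function name swallowVsRethrow and the counters are invented for this example. The swallowing coroutine keeps looping after cancellation, and every further delay fails immediately, while the rethrowing one stops at once.

```kotlin
import kotlinx.coroutines.*

// Hypothetical demo: counts how many times each coroutine
// observes an exception thrown from delay after cancellation.
fun swallowVsRethrow(): Pair<Int, Int> = runBlocking {
    var swallowed = 0
    val bad = launch {
        repeat(5) {
            try {
                delay(1_000)
            } catch (e: Exception) {
                swallowed++ // CancellationException is swallowed, the loop goes on
            }
        }
    }

    var rethrown = 0
    val good = launch {
        repeat(5) {
            try {
                delay(1_000)
            } catch (e: CancellationException) {
                rethrown++
                throw e // the coroutine ends here
            }
        }
    }

    yield() // let both coroutines start and suspend on delay
    bad.cancel()
    good.cancel()
    joinAll(bad, good)
    swallowed to rethrown
}

fun main() {
    println(swallowVsRethrow())
}
```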

It is important to know that a canceled coroutine is not just stopped, but canceled internally using an exception. Thanks to that, we can freely clean up everything inside the finally block. For instance, we can use it to close a file or a database connection. Since most mechanisms for closing resources rely on the finally block (for instance, reading a file using useLines), we do not need to worry about them.

import kotlinx.coroutines.*
import kotlin.random.Random

suspend fun main(): Unit = coroutineScope {
    val job = Job()
    launch(job) {
        try {
            delay(Random.nextLong(2000))
            println("Done")
        } finally {
            print("Will always be printed")
        }
    }
    delay(1000)
    job.cancelAndJoin()
}
// Will always be printed
// (or)
// Done
// Will always be printed
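
The point about resource-closing helpers can be sketched with the standard use function, which closes a Closeable in a finally block. TrackedResource and resourceClosedOnCancel are hypothetical names introduced only for this illustration.

```kotlin
import kotlinx.coroutines.*
import java.io.Closeable

// Hypothetical resource that only remembers whether it was closed.
class TrackedResource : Closeable {
    var closed = false
        private set

    override fun close() {
        closed = true
    }
}

fun resourceClosedOnCancel(): Boolean = runBlocking {
    val resource = TrackedResource()
    val job = launch {
        // `use` closes the resource in a finally block,
        // so it also runs when the coroutine is canceled
        resource.use { delay(10_000) }
    }
    yield() // let the coroutine start and suspend on delay
    job.cancelAndJoin()
    resource.closed
}

fun main() {
    println(resourceClosedOnCancel()) // true
}
```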

Just one more call

Since we can catch CancellationException and invoke more operations before the coroutine truly ends, you might be wondering where the limit is. The coroutine can run as long as it needs to clean up all the resources. However, suspension is not allowed anymore. The Job is already in a "Cancelling" state, where suspension or starting another coroutine is not possible at all. If we try to start another coroutine, it will just be ignored. If we try to suspend, it will throw CancellationException and our finally block will end.

import kotlinx.coroutines.*
import kotlin.random.Random

suspend fun main(): Unit = coroutineScope {
    val job = Job()
    launch(job) {
        try {
            delay(200)
            println("Job is done")
        } finally {
            println("Finally")
            launch { // will be ignored
                println("Will not be printed")
            }
            delay(100) // here exception is thrown
            println("Will not be printed")
        }
    }
    delay(100)
    job.cancelAndJoin()
    println("Cancel done")
}
// Finally
// Cancel done

Sometimes we truly need to use a suspending call when a coroutine is already canceled. For instance, we might need to roll back changes in a database. In such cases, the preferred way is to wrap this call with the withContext(NonCancellable) function. We will later explain in detail how withContext works. For now, all we need to know is that it changes the context for a block of code. Inside, we use the NonCancellable object, which is a Job that cannot be canceled. So inside the block, the job is in the active state, and we can call whatever suspending functions we want.

import kotlinx.coroutines.*
import kotlin.random.Random

suspend fun main(): Unit = coroutineScope {
    val job = Job()
    launch(job) {
        try {
            delay(200)
            println("Coroutine finished")
        } finally {
            println("Finally")
            withContext(NonCancellable) {
                delay(1000L)
                println("Cleanup done")
            }
        }
    }
    delay(100)
    job.cancelAndJoin()
    println("Done")
}
// Finally
// Cleanup done
// Done

invokeOnCompletion

Another mechanism that is often used to free resources is the invokeOnCompletion function from Job. It is used to set a handler to be called when the job reaches a terminal state, so either "Completed" or "Cancelled".

import kotlinx.coroutines.*

suspend fun main(): Unit = coroutineScope {
    val job = launch {
        delay(1000)
    }
    job.invokeOnCompletion { exception: Throwable? ->
        println("Finished")
    }
    delay(400)
    job.cancelAndJoin()
}
// Finished

The parameter of this handler is an exception, which is:

  • null if job finished with no exception,
  • CancellationException if the coroutine was canceled,
  • the exception that finished a coroutine (more about it in the next chapter).

If a job was completed before invokeOnCompletion was called, the handler will be invoked immediately. The parameters onCancelling [3] and invokeImmediately [4] allow further customization.

import kotlinx.coroutines.*
import kotlin.random.Random

suspend fun main(): Unit = coroutineScope {
    val job = launch {
        delay(Random.nextLong(2400))
        println("Finished")
    }
    delay(800)
    job.invokeOnCompletion { exception: Throwable? ->
        println("Will always be printed")
        println("The exception was: $exception")
    }
    delay(800)
    job.cancelAndJoin()
}
// Will always be printed
// The exception was:
// kotlinx.coroutines.JobCancellationException
// (or)
// Finished
// Will always be printed
// The exception was: null
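
The handler behavior described above can be sketched deterministically. In this illustration (the function name completionHandlerDemo and the logged strings are assumptions), one handler is registered on an already completed job, one is disposed before the job ends using the DisposableHandle returned by invokeOnCompletion, and one observes a cancellation.

```kotlin
import kotlinx.coroutines.*

// Hypothetical demo collecting the events reported by completion handlers.
fun completionHandlerDemo(): List<String> = runBlocking {
    val events = mutableListOf<String>()

    // 1. The job is already completed: the handler is invoked immediately.
    val finished = launch { }
    finished.join()
    finished.invokeOnCompletion { cause ->
        events.add("late handler, cause=$cause")
    }

    // 2. A disposed handler is never invoked.
    val cancelled = launch { delay(10_000) }
    val handle = cancelled.invokeOnCompletion {
        events.add("disposed handler")
    }
    handle.dispose()

    // 3. A canceled job passes a CancellationException to the handler.
    cancelled.invokeOnCompletion { cause ->
        events.add("cancelled: ${cause is CancellationException}")
    }
    cancelled.cancelAndJoin()

    events
}

fun main() {
    println(completionHandlerDemo())
    // [late handler, cause=null, cancelled: true]
}
```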

Stopping the unstoppable

Because cancellation happens on suspension points, it will not take place if there is no suspension point. To simulate such a situation, we could use Thread.sleep instead of delay. This is a terrible practice. Please, do not do that in any real-life projects. We are just trying to simulate a case where we are using our coroutines extensively, while not suspending them. In practice, such a situation might happen if we have some complex calculations, like neural network learning (yes, we use coroutines for such cases too, to simplify processing parallelization), or when we need to do some blocking calls (for instance reading files).

import kotlinx.coroutines.*

suspend fun main(): Unit = coroutineScope {
    val job = Job()
    launch(job) {
        repeat(1_000) { i ->
            // We might have some complex operations
            // or reading files here
            Thread.sleep(200)
            println("Printing $i")
        }
    }
    delay(1100)
    job.cancelAndJoin()
    println("Cancelled successfully")
    delay(1000)
}
// Printing 0
// Printing 1
// Printing 2
// ... (up to 999)

The above coroutine couldn't be cancelled because there was no suspension point. The execution needs over 3 minutes, even though it should be cancelled after 1100 ms.

There are a few ways to deal with such situations. The first one is to use the yield() function from time to time. This function suspends and immediately resumes a coroutine. This gives space for whatever needs to happen during suspension (or resuming), including cancellation (or changing thread using dispatcher).

import kotlinx.coroutines.*

suspend fun main(): Unit = coroutineScope {
    val job = Job()
    launch(job) {
        repeat(1_000) { i ->
            Thread.sleep(200)
            yield()
            println("Printing $i")
        }
    }
    delay(1100)
    job.cancelAndJoin()
    println("Cancelled successfully")
    delay(1000)
}
// Printing 0
// Printing 1
// Printing 2
// Printing 3
// Printing 4
// Cancelled successfully

Another option is to track the state of the job. Inside a coroutine builder, this (the receiver) references the scope of this builder. CoroutineScope has a context we can reference using the coroutineContext property. This way, we can access the coroutine job (coroutineContext[Job]) and check its current state. Since this is often used to check if a coroutine is active, the Kotlin Coroutines library provides an extension property to simplify that:

public val CoroutineScope.isActive: Boolean
    get() = coroutineContext[Job]?.isActive ?: true

We can use it to check if a job is still active and stop calculations when it is not anymore.

import kotlinx.coroutines.*

suspend fun main(): Unit = coroutineScope {
    val job = Job()
    launch(job) {
        do {
            Thread.sleep(200)
            println("Printing")
        } while (isActive)
    }
    delay(1100)
    job.cancelAndJoin()
    println("Cancelled successfully")
}
// Printing
// Printing
// Printing
// Printing
// Printing
// Printing
// Cancelled successfully

Alternatively, we might use the ensureActive() function, which throws CancellationException if the Job is not active.

import kotlinx.coroutines.*

suspend fun main(): Unit = coroutineScope {
    val job = Job()
    launch(job) {
        repeat(1000) { num ->
            Thread.sleep(200)
            ensureActive()
            println("Printing $num")
        }
    }
    delay(1100)
    job.cancelAndJoin()
    println("Cancelled successfully")
}
// Printing 0
// Printing 1
// Printing 2
// Printing 3
// Printing 4
// Cancelled successfully

The results of ensureActive() and yield() seem similar, but the two functions are very different. The function ensureActive() needs to be called on a CoroutineScope (or CoroutineContext, or Job). All it does is throw an exception if the job is no longer active. It is lighter, so it should generally be preferred. The function yield is a regular top-level suspending function. It does not need any scope, so it can be used in regular suspending functions. Since it suspends and resumes, other effects might occur, like a thread change if we use a dispatcher with a pool of threads (more about this in the Dispatchers chapter). It is more often used in suspending functions that are CPU intensive or that block threads.
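
Since ensureActive() can also be called on a CoroutineContext, it can be used inside a regular suspending function that has no scope receiver. The sketch below is one possible way to do it, using currentCoroutineContext() from kotlinx.coroutines. The names spin and ensureActiveDemo, as well as the check interval of 1,000 iterations, are arbitrary choices for this illustration.

```kotlin
import kotlinx.coroutines.*

// Hypothetical CPU-bound function with no scope receiver.
// It never suspends, so it checks for cancellation explicitly.
suspend fun spin(started: CompletableDeferred<Unit>) {
    var counter = 0L
    while (true) {
        counter++
        if (counter == 1L) started.complete(Unit) // signal that work has begun
        if (counter % 1_000 == 0L) {
            // Throws CancellationException once the job is no longer active
            currentCoroutineContext().ensureActive()
        }
    }
}

fun ensureActiveDemo(): Boolean = runBlocking {
    val started = CompletableDeferred<Unit>()
    val job = launch(Dispatchers.Default) { spin(started) }
    started.await()     // wait until the loop is running
    job.cancelAndJoin() // returns only because the loop checks its job
    job.isCancelled
}

fun main() {
    println(ensureActiveDemo()) // true
}
```

Without the ensureActive() call, cancelAndJoin in this sketch would never return, because the loop contains no suspension point.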

suspendCancellableCoroutine

Here you might recall the suspendCancellableCoroutine function introduced in the How does suspension work? chapter. It behaves like suspendCoroutine, but its continuation is wrapped into CancellableContinuation<T>, which provides some additional methods. The most important one is invokeOnCancellation, which we use to define what should happen when a coroutine is canceled. Most often we use it to cancel processes in a library or to free some resources.

suspend fun someTask() = suspendCancellableCoroutine { cont ->
    cont.invokeOnCancellation {
        // do cleanup
    }
    // rest of the implementation
}

Here is a full example, where we wrap a Retrofit Call with a suspending function.

suspend fun getOrgRepos(): List<Repo> =
    suspendCancellableCoroutine { continuation ->
        val orgReposCall = apiService.getOrgReposCall()
        orgReposCall.enqueue(object : Callback<List<Repo>> {
            override fun onResponse(
                call: Call<List<Repo>>,
                response: Response<List<Repo>>
            ) {
                if (response.isSuccessful) {
                    val body = response.body()
                    if (body != null) {
                        continuation.resume(body)
                    } else {
                        continuation.resumeWithException(
                            ResponseWithEmptyBody
                        )
                    }
                } else {
                    continuation.resumeWithException(
                        ApiException(
                            response.code(),
                            response.message()
                        )
                    )
                }
            }

            override fun onFailure(
                call: Call<List<Repo>>,
                t: Throwable
            ) {
                continuation.resumeWithException(t)
            }
        })
        continuation.invokeOnCancellation {
            orgReposCall.cancel()
        }
    }

The CancellableContinuation<T> also lets us check the job state (using isActive, isCompleted and isCancelled properties), and cancel this continuation with an optional cancellation cause.
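
The Retrofit example cannot be run without the library, so here is a self-contained sketch of the same idea. FakeCall is a hypothetical callback-based API invented for this illustration, and awaitCall and awaitCancelDemo are made-up names. The sketch shows both a normal completion and a cancellation that is forwarded to the underlying call.

```kotlin
import kotlinx.coroutines.*
import kotlin.coroutines.resume

// Hypothetical callback-based API, invented for this sketch.
class FakeCall {
    var cancelled = false
        private set
    private var callback: ((String) -> Unit)? = null

    fun enqueue(onResult: (String) -> Unit) {
        callback = onResult
    }

    fun complete(value: String) {
        callback?.invoke(value)
    }

    fun cancel() {
        cancelled = true
        callback = null
    }
}

suspend fun awaitCall(call: FakeCall): String =
    suspendCancellableCoroutine { continuation ->
        call.enqueue { value -> continuation.resume(value) }
        // Forward coroutine cancellation to the underlying call
        continuation.invokeOnCancellation { call.cancel() }
    }

fun awaitCancelDemo(): Pair<String, Boolean> = runBlocking {
    // A call that completes normally
    val finishedCall = FakeCall()
    val result = async { awaitCall(finishedCall) }
    yield() // let the coroutine register its callback
    finishedCall.complete("done")
    val value = result.await()

    // A call whose coroutine is canceled while waiting
    val abandonedCall = FakeCall()
    val job = launch { awaitCall(abandonedCall) }
    yield()
    job.cancelAndJoin()

    value to abandonedCall.cancelled
}

fun main() {
    println(awaitCancelDemo()) // (done, true)
}
```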

Summary

Cancellation is a powerful feature. It is generally easy to use, but it can sometimes be tricky. It is important to understand how it works.

We’ve learned a lot about cancellation. For instance, we now know that we can freely release resources and clear the state in a finally block or in the invokeOnCompletion handler. We should also know when ensureActive(), isActive checks, or yield() should be used.

Properly used cancellation means fewer wasted resources and fewer memory leaks. It is important for our application's performance, and I hope that from now on you will make use of these advantages.

[1] A good example is CoroutineWorker on Android, where according to the presentation Understand Kotlin Coroutines on Android at Google I/O'19 by Sean McQuillan and Yigit Boyar (both working on Android at Google), support for coroutines was added primarily to use their cancellation mechanism.

[2] Actually much more, since code is not very heavy these days (it used to be, when it was stored on punched cards).

[3] If true, the function is called in the "Cancelling" state already (that is, before "Cancelled"). false by default.

[4] When true and this job is already in the desired state (depending on onCancelling), the handler is immediately and synchronously invoked and a no-op DisposableHandle is returned. When false, a no-op DisposableHandle is returned, but the handler is not invoked. true by default.