Cancellation in Kotlin Coroutines
This is a chapter from the book Kotlin Coroutines. You can find it on LeanPub or Amazon.
A very important functionality of Kotlin Coroutines is cancellation. It is so important that some classes and libraries use suspending functions primarily to support cancellation [1]. There is a good reason for this: a good cancellation mechanism is worth its weight in gold [2]. Just killing a thread is a terrible solution, as there should be an opportunity to close connections and free resources. Forcing developers to frequently check if some state is still active isn't convenient either. The problem of cancellation waited a very long time for a good solution, but what Kotlin Coroutines offer is surprisingly simple, convenient, and safe. This is the best cancellation mechanism I've seen in my career. Let's explore it.
Basic cancellation
The Job interface has a cancel method, which allows its cancellation. Calling it triggers the following effects:
- Such a coroutine ends the job at the first suspension point (for example, a delay call).
- If a job has some children, they are also cancelled (but its parent is not affected).
- Once a job is cancelled, it cannot be used as a parent for any new coroutines. It is first in the "Cancelling" and then in the "Cancelled" state.
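The effects listed above can be observed in a small sketch. The function name basicCancellationDemo and the timings are illustrative assumptions; only the kotlinx.coroutines calls are real API.

```kotlin
import kotlinx.coroutines.*

// Hypothetical demo function; the name and timings are illustrative.
fun basicCancellationDemo(): List<Boolean> = runBlocking {
    val job = launch {
        repeat(1_000) { i ->
            delay(200) // suspension point: a cancelled coroutine ends here
            println("Printing $i")
        }
    }
    delay(1_100)
    job.cancel()
    val cancelledAfterCancel = job.isCancelled // true as soon as cancel() returns
    val activeAfterCancel = job.isActive       // false: the job is no longer active
    job.join()                                 // wait until the job reaches its final state

    // A cancelled job cannot serve as a parent: the body of this coroutine never runs.
    var childRan = false
    launch(job) { childRan = true }.join()

    listOf(cancelledAfterCancel, activeAfterCancel, job.isCompleted, childRan)
}

fun main() {
    println(basicCancellationDemo())
}
```

The returned flags show that the job is cancelled and inactive right after cancel, completed after join, and that a coroutine launched with it as a parent is never executed.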
We might cancel with a different exception (by passing an exception as an argument to the cancel function) to specify the cause. This cause needs to be a subtype of CancellationException, because only an exception of this type can be used to cancel a coroutine.
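As a minimal sketch of passing a cause, the coroutine below records the message of the exception it is cancelled with. The function name and the message text are assumptions made for illustration.

```kotlin
import kotlinx.coroutines.*

// Hypothetical demo: returns the message of the exception seen inside the coroutine.
fun cancelWithCauseDemo(): String? = runBlocking {
    var seenMessage: String? = null
    val job = launch {
        try {
            delay(10_000)
        } catch (e: CancellationException) {
            seenMessage = e.message // the cause passed to cancel()
            throw e                 // rethrow so that cancellation proceeds normally
        }
    }
    delay(100)
    job.cancel(CancellationException("User left the screen"))
    job.join()
    seenMessage
}

fun main() {
    println(cancelWithCauseDemo())
}
```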
After cancel, we often also add join to wait for the cancellation to finish before we proceed. Without this, we would have race conditions. For example, if a coroutine does some blocking work before printing in each iteration, then without join we might see "Printing 4" after "Cancelled successfully". Adding job.join() would change this because it suspends until the coroutine has finished cancellation.
To make it easier to call cancel and join together, the kotlinx.coroutines library offers a convenient extension function with a self-descriptive name, cancelAndJoin.
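The difference between cancel alone and cancelAndJoin can be made visible by recording the order of events. This is only a sketch: cancellationOrder is a hypothetical helper, and the 200 ms blocking cleanup merely simulates work that is still running when cancel returns.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.CopyOnWriteArrayList

// Hypothetical helper: records what happens first, the end of the worker
// or the return of the cancellation call.
fun cancellationOrder(waitForCompletion: Boolean): List<String> = runBlocking {
    val events = CopyOnWriteArrayList<String>()
    val started = CompletableDeferred<Unit>()
    val job = launch(Dispatchers.Default) {
        try {
            started.complete(Unit)
            delay(10_000)
        } finally {
            Thread.sleep(200) // simulates slow, blocking cleanup work
            events.add("worker finished")
        }
    }
    started.await() // make sure the worker body is running before we cancel
    if (waitForCompletion) job.cancelAndJoin() else job.cancel()
    events.add("cancel call returned")
    job.join() // only so that the demo returns a complete list
    events.toList()
}

fun main() {
    println(cancellationOrder(waitForCompletion = false))
    println(cancellationOrder(waitForCompletion = true))
}
```

With cancel alone, the caller continues while the worker is still cleaning up. With cancelAndJoin, the caller continues only after the worker has finished.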
A job created using the Job() factory function can be cancelled in the same way. This is often used to make it easy to cancel many coroutines at once.
This is a crucial capability. On many platforms, we often need to cancel a group of concurrent tasks. For instance, in Android, we cancel all the coroutines started in a view when a user leaves this view.
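A minimal sketch of this pattern, assuming three illustrative workers that share one parent created with Job(): cancelling the parent cancels all of them.

```kotlin
import kotlinx.coroutines.*

// Hypothetical demo: three coroutines share one parent job and are cancelled together.
fun cancelGroupDemo(): List<Boolean> = runBlocking {
    val groupJob = Job()
    val workers = List(3) {
        launch(groupJob) { // groupJob becomes the parent of each worker
            delay(10_000)
        }
    }
    delay(100)
    groupJob.cancelAndJoin() // cancels every child and waits for all of them
    workers.map { it.isCancelled }
}

fun main() {
    println(cancelGroupDemo())
}
```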
How does cancellation work?
When a job is cancelled, it changes its state to "Cancelling". Then, at the first suspension point, a CancellationException is thrown. This exception can be caught using a try-catch, but it is recommended to rethrow it.
Keep in mind that a cancelled coroutine is not just stopped: it is cancelled internally using an exception. Therefore, we can freely clean up everything inside the finally block. For instance, we can use a finally block to close a file or a database connection. Since most resource-closing mechanisms rely on the finally block (for instance, if we read a file using useLines), we simply do not need to worry about them.
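The sketch below illustrates both points: the exception is caught and rethrown, and the finally block releases a resource. FakeConnection is a hypothetical stand-in for a real file or database connection.

```kotlin
import kotlinx.coroutines.*

// Hypothetical resource used only for this illustration.
class FakeConnection {
    var closed = false
        private set

    fun close() {
        closed = true
    }
}

fun cleanupDemo(): Boolean = runBlocking {
    val connection = FakeConnection()
    val job = launch {
        try {
            delay(10_000)
        } catch (e: CancellationException) {
            println("Cancelled: ${e.message}")
            throw e // rethrow so that the coroutine is still treated as cancelled
        } finally {
            connection.close() // runs on cancellation as well as on normal completion
        }
    }
    delay(100)
    job.cancelAndJoin()
    connection.closed
}

fun main() {
    println(cleanupDemo())
}
```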
Just one more call
Since we can catch CancellationException and invoke more operations before the coroutine truly ends, you might be wondering where the limit is. The coroutine can run as long as it needs to clean up all the resources. However, suspension is no longer allowed. The Job is already in a "Cancelling" state, in which suspension or starting another coroutine is not possible at all. If we try to start another coroutine, it will just be ignored. If we try to suspend, it will throw CancellationException.
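Both limits can be seen in a short sketch. The function name and the event labels are assumptions; the coroutine tries to start a child and to call delay from its finally block while it is in the "Cancelling" state.

```kotlin
import kotlinx.coroutines.*

// Hypothetical demo: records what happens inside finally after cancellation.
fun suspendAfterCancelDemo(): List<String> = runBlocking {
    val events = mutableListOf<String>()
    val job = launch {
        try {
            delay(10_000)
        } finally {
            // Starting a coroutine in the "Cancelling" state: the body is never executed.
            launch { events.add("child ran") }
            try {
                delay(100) // suspending in the "Cancelling" state throws
                events.add("delay finished")
            } catch (e: CancellationException) {
                events.add("delay threw")
                throw e
            }
        }
    }
    delay(100)
    job.cancelAndJoin()
    events
}

fun main() {
    println(suspendAfterCancelDemo())
}
```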
Sometimes, we truly need to use a suspending call when a coroutine is already cancelled. For instance, we might need to roll back changes in a database. In this case, the preferred way is to wrap this call with the withContext(NonCancellable) function. We will later explain in detail how withContext works. For now, all we need to know is that it changes the context of a block of code. Inside withContext, we use the NonCancellable object, which is a Job that cannot be cancelled. So, inside the block the job is in the active state, and we can call whatever suspending functions we want.
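A minimal sketch of this technique, where a delay call stands in for a suspending rollback (the function name and event labels are illustrative assumptions):

```kotlin
import kotlinx.coroutines.*

// Hypothetical demo: a suspending cleanup step runs even though the job is cancelled.
fun nonCancellableDemo(): List<String> = runBlocking {
    val events = mutableListOf<String>()
    val job = launch {
        try {
            delay(10_000)
        } finally {
            withContext(NonCancellable) {
                delay(100) // stands in for a suspending rollback; allowed in this block
                events.add("rollback finished")
            }
        }
    }
    delay(100)
    job.cancelAndJoin()
    events.add("job joined")
    events
}

fun main() {
    println(nonCancellableDemo())
}
```

Because cancelAndJoin waits for the coroutine to finish, the rollback is recorded before the caller continues.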
invokeOnCompletion
Another mechanism that is often used to free resources is the invokeOnCompletion function from Job. It is used to set a handler to be called when the job reaches a terminal state, namely either "Completed" or "Cancelled".
One of this handler's parameters is an exception:
- null if the job finished with no exception;
- CancellationException if the coroutine was cancelled;
- the exception that finished a coroutine (more about this in the next chapter).
If a job was completed before invokeOnCompletion was called, the handler will be invoked immediately. The onCancelling [3] and invokeImmediately [4] parameters allow further customization.
invokeOnCompletion is called synchronously during cancellation, and we do not control the thread in which it will be running.
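The handler behaviour described above can be sketched as follows. The describe helper and the event labels are assumptions introduced for illustration; invokeOnCompletion itself is the real Job function.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper that maps the handler's parameter to a readable label.
fun describe(cause: Throwable?): String = when (cause) {
    null -> "completed normally"
    is CancellationException -> "cancelled"
    else -> "failed"
}

fun completionHandlerDemo(): List<String> = runBlocking {
    val events = mutableListOf<String>()

    val completing = launch { delay(50) }
    completing.invokeOnCompletion { cause -> events += "first: " + describe(cause) }

    val cancelled = launch { delay(10_000) }
    cancelled.invokeOnCompletion { cause -> events += "second: " + describe(cause) }

    delay(100)
    cancelled.cancelAndJoin()

    // The first job is already completed, so this handler is invoked immediately.
    completing.invokeOnCompletion { cause -> events += "late: " + describe(cause) }
    events
}

fun main() {
    completionHandlerDemo().forEach(::println)
}
```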
Stopping the unstoppable
Because cancellation happens at the suspension points, it will not happen if there is no suspension point. To simulate such a situation, we could use Thread.sleep instead of delay. This is a terrible practice, so please don't do this in any real-life projects. We are just trying to simulate a case in which we are using our coroutines extensively but not suspending them. In practice, such a situation might happen if we have some complex calculations, like neural network learning (yes, we also use coroutines for such cases in order to simplify processing parallelization), or when we need to do some blocking calls (for instance, reading files).
Consider a coroutine that cannot be cancelled because there is no suspension point inside it (it uses Thread.sleep instead of delay). Such a coroutine can keep running for over 3 minutes, even though it should have been cancelled after 1 second.
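A scaled-down sketch of this situation, with timings shortened so that it finishes quickly (the function name and numbers are illustrative assumptions): the coroutine is cancelled right after it starts, yet all iterations still run.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical demo: returns how many iterations ran despite cancellation.
fun unstoppableDemo(): Int = runBlocking {
    val iterations = AtomicInteger(0)
    val started = CompletableDeferred<Unit>()
    val job = launch(Dispatchers.Default) {
        started.complete(Unit)
        repeat(5) {
            Thread.sleep(100) // blocks the thread; there is no suspension point here
            iterations.incrementAndGet()
        }
    }
    started.await()     // the body is running now
    job.cancelAndJoin() // cancellation is requested, but nothing can react to it
    iterations.get()
}

fun main() {
    println(unstoppableDemo())
}
```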
There are a few ways to deal with such situations. The first one is to use the yield() function from time to time. This function suspends and immediately resumes a coroutine. This gives an opportunity to do whatever is needed during suspension (or resuming), including cancellation (or changing a thread using a dispatcher).
It is a good practice to use yield in suspend functions, between blocks of non-suspended CPU-intensive or time-intensive operations.
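As a sketch of this approach, adding yield() between blocking steps gives the coroutine a point at which cancellation can take effect. The function name and timings are assumptions.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical demo: returns how many iterations ran before cancellation took effect.
fun yieldDemo(): Int = runBlocking {
    val iterations = AtomicInteger(0)
    val job = launch(Dispatchers.Default) {
        repeat(1_000) {
            Thread.sleep(20) // simulated blocking or CPU-intensive step
            yield()          // suspension point: throws if the job was cancelled
            iterations.incrementAndGet()
        }
    }
    delay(110)
    job.cancelAndJoin()
    iterations.get()
}

fun main() {
    println("Iterations before cancellation: ${yieldDemo()}")
}
```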
Another option is to track the state of the job. Inside a coroutine builder, this (the receiver) references the scope of this builder. CoroutineScope has a context we can reference using the coroutineContext property. Thus, we can access the coroutine job (coroutineContext[Job] or coroutineContext.job) and check what its current state is. Since a job is often used to check if a coroutine is active, the Kotlin Coroutines library provides the isActive extension property to simplify that. We can use the isActive property to check if a job is still active and stop calculations when it is inactive.
Alternatively, we might use the ensureActive() function, which throws CancellationException if Job is not active.
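Both options can be sketched side by side. In this illustration (names and timings are assumptions), one coroutine polls isActive in its loop condition, and the other calls ensureActive() between blocking steps.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical demo: returns the iteration counts of both coroutines.
fun activeCheckDemo(): Pair<Int, Int> = runBlocking {
    val polled = AtomicInteger(0)
    val ensured = AtomicInteger(0)

    val pollingJob = launch(Dispatchers.Default) {
        while (isActive) { // the loop ends quietly once the job is cancelled
            Thread.sleep(20)
            polled.incrementAndGet()
        }
    }
    val ensuringJob = launch(Dispatchers.Default) {
        repeat(1_000) {
            Thread.sleep(20)
            ensureActive() // throws CancellationException once the job is cancelled
            ensured.incrementAndGet()
        }
    }

    delay(110)
    pollingJob.cancelAndJoin()
    ensuringJob.cancelAndJoin()
    Pair(polled.get(), ensured.get())
}

fun main() {
    println(activeCheckDemo())
}
```

With isActive, the code after the loop still runs, so it is the caller's responsibility to stop. With ensureActive(), execution leaves the block through the exception, just like at a suspension point.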
The results of ensureActive() and yield() seem similar, but they are very different. The function ensureActive() needs to be called on a CoroutineScope (or CoroutineContext, or Job). All it does is throw an exception if the job is no longer active. It is lighter, so generally it should be preferred. The function yield is a regular top-level suspension function. It does not need any scope, so it can be used in regular suspending functions. Since it does suspension and resuming, other effects might arise, such as thread changing if we use a dispatcher with a pool of threads (more about this in the Dispatchers chapter). yield is more often used just in suspending functions that are CPU intensive or are blocking threads.
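In a regular suspending function there is no scope receiver, but the context is still reachable through the coroutineContext property, so either check can be used there. The sketch below uses a hypothetical sumOfSquares function and checks for cancellation every 1,000 elements; calling yield() at the same place would be the alternative.

```kotlin
import kotlinx.coroutines.*
import kotlin.coroutines.coroutineContext

// Hypothetical CPU-bound suspending function with a periodic cancellation check.
suspend fun sumOfSquares(numbers: List<Int>): Long {
    var sum = 0L
    for ((index, n) in numbers.withIndex()) {
        // Lightweight check: throws CancellationException if the job is not active.
        if (index % 1_000 == 0) coroutineContext.ensureActive()
        sum += n.toLong() * n
    }
    return sum
}

fun sumDemo(): Long = runBlocking { sumOfSquares(listOf(1, 2, 3)) }

fun main() {
    println(sumDemo())
}
```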
suspendCancellableCoroutine
Here, you might recall the suspendCancellableCoroutine function introduced in the How does suspension work? chapter. It behaves like suspendCoroutine, but its continuation is wrapped into CancellableContinuation<T>, which provides some additional methods. The most important one is invokeOnCancellation, which we use to define what should happen when a coroutine is cancelled. Most often we use it to cancel processes in a library or to free some resources.
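A minimal sketch of this pattern, using a hypothetical callback-based FakeRequest class in place of a real library: when the coroutine is cancelled, invokeOnCancellation forwards the cancellation to the request.

```kotlin
import kotlinx.coroutines.*
import kotlin.coroutines.resume
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical callback-based API, standing in for a real library call.
class FakeRequest {
    val cancelled = AtomicBoolean(false)

    fun enqueue(onSuccess: (String) -> Unit) {
        // In this sketch the request never responds, so onSuccess is never called.
    }

    fun cancel() {
        cancelled.set(true)
    }
}

// Wraps the callback API in a cancellable suspending function.
suspend fun awaitResponse(request: FakeRequest): String =
    suspendCancellableCoroutine { continuation ->
        request.enqueue { response -> continuation.resume(response) }
        // Called when the coroutine is cancelled while waiting for the response.
        continuation.invokeOnCancellation { request.cancel() }
    }

fun cancellableRequestDemo(): Boolean = runBlocking {
    val request = FakeRequest()
    val job = launch { awaitResponse(request) }
    delay(100)
    job.cancelAndJoin()
    request.cancelled.get()
}

fun main() {
    println(cancellableRequestDemo())
}
```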
A typical use of this function is wrapping a Retrofit Call with a suspending function.
It’s so good that Retrofit now supports suspending functions!
The CancellableContinuation<T> also lets us check the job state (using the isActive, isCompleted and isCancelled properties) and cancel this continuation with an optional cancellation cause.
Summary
Cancellation is a powerful feature. It is generally easy to use, but it can sometimes be tricky. So, it is important to understand how it works.
A properly used cancellation means fewer wasted resources and fewer memory leaks. It is important for our application’s performance, and I hope you will use these advantages from now on.
[1] A good example is CoroutineWorker on Android, where according to the presentation Understand Kotlin Coroutines on Android on Google I/O'19 by Sean McQuillan and Yigit Boyar (both working on Android at Google), support for coroutines was added primarily to use the cancellation mechanism.
[2] Actually, it's worth much more since the code is currently not very heavy (it used to be, when it was stored on punched cards).
[3] If true, the function is called in the "Cancelling" state (i.e., before "Cancelled"). false by default.
[4] This parameter determines whether the handler should be called immediately if the handler is set when a coroutine is already in the desired state. true by default.