Cancellation in Kotlin Coroutines
A very important feature of Kotlin Coroutines is cancellation. It is so important that some classes and libraries use suspending functions primarily to support cancellation [1]. There is a good reason for this: a good cancellation mechanism is worth its weight in gold [2]. Just killing a thread is a terrible solution as there should be an opportunity to close connections and free resources. Forcing developers to frequently check if some state is still active isn't convenient either. The problem of cancellation waited for a good solution for a very long time, but what Kotlin Coroutines offer is surprisingly simple: they are convenient and safe. This is the best cancellation mechanism I've seen in my career. Let's explore it.
The Job interface has a cancel method, which allows a job to be cancelled. Calling it triggers the following effects:
- Such a coroutine ends the job at the first suspension point (for example, a delay call).
- If a job has some children, they are also cancelled (but its parent is not affected).
- Once a job is cancelled, it cannot be used as a parent for any new coroutines. It is first in the "Cancelling" and then in the "Cancelled" state.
We might cancel with a different exception (by passing an exception as an argument to the cancel function) to specify the cause. This cause needs to be a subtype of CancellationException, because only an exception of this type can be used to cancel a coroutine.
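The effects listed above can be observed in a minimal sketch. The function name countUntilCancelled, the delays, and the cause message are assumptions made for illustration; only launch, delay, cancel and join come from kotlinx.coroutines.

```kotlin
import kotlinx.coroutines.*

// Starts a counting coroutine, cancels it after roughly `cancelAfterMs`,
// and returns the numbers recorded before cancellation took effect.
suspend fun countUntilCancelled(cancelAfterMs: Long): List<Int> = coroutineScope {
    val recorded = mutableListOf<Int>()
    val job = launch {
        repeat(1_000) { i ->
            delay(20) // suspension point: cancellation takes effect here
            recorded += i
        }
    }
    delay(cancelAfterMs)
    // The cause is optional; when given, it must be a CancellationException.
    job.cancel(CancellationException("Result no longer needed"))
    job.join() // wait until the coroutine has really finished
    recorded
}

fun main() = runBlocking {
    // Prints only the first few numbers, far fewer than 1000.
    println(countUntilCancelled(cancelAfterMs = 110))
}
```

The exact count depends on timing, but the loop stops at a delay call shortly after cancel is invoked instead of running all 1000 iterations.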
After cancel, we often also add join to wait for the cancellation to finish before we can proceed. Without this, we would have some race conditions. For example, if we cancel a coroutine that prints in a loop and do not call join, we might see "Printing 4" after "Cancelled successfully".
Adding job.join() would change this because it suspends until a coroutine has finished cancellation.
To make it easier to call cancel and join together, the kotlinx.coroutines library offers a convenient extension function with a self-descriptive name, cancelAndJoin.
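The race described above, and how join and cancelAndJoin remove it, can be sketched as follows. The helper stopAndReport and its stop parameter are hypothetical names introduced for this illustration, and the blocking Thread.sleep only simulates slow cleanup.

```kotlin
import kotlinx.coroutines.*
import java.util.concurrent.CopyOnWriteArrayList

// `stop` is a hypothetical parameter that lets the same scenario run
// with cancel(), cancel() + join(), or cancelAndJoin().
suspend fun stopAndReport(stop: suspend (Job) -> Unit): List<String> {
    val events = CopyOnWriteArrayList<String>() // written from two threads
    coroutineScope {
        val job = launch(Dispatchers.Default) {
            try {
                delay(10_000)
            } finally {
                Thread.sleep(200) // simulate slow, blocking cleanup
                events.add("Cleanup finished")
            }
        }
        delay(100)
        stop(job)
        events.add("Cancelled successfully")
    }
    return events.toList()
}

fun main() = runBlocking {
    println(stopAndReport { it.cancel() })
    println(stopAndReport { it.cancel(); it.join() })
    println(stopAndReport { it.cancelAndJoin() })
}
```

With cancel alone, "Cancelled successfully" is recorded while the coroutine is still cleaning up. With join or cancelAndJoin, it is recorded only after "Cleanup finished".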
A job created using the Job() factory function can be cancelled in the same way. This is often used to make it easy to cancel many coroutines at once.
This is a crucial capability. On many platforms, we often need to cancel a group of concurrent tasks. For instance, in Android, we cancel all the coroutines started in a view when a user leaves this view.
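Cancelling a group of coroutines through one shared job might look like the sketch below. The function cancelGroup and the variable groupJob are illustrative names, not part of any library.

```kotlin
import kotlinx.coroutines.*

// Launches `count` coroutines under one shared Job and cancels them together.
// Returns the isCancelled flag of every launched coroutine.
suspend fun cancelGroup(count: Int): List<Boolean> = coroutineScope {
    val groupJob = Job() // becomes the parent of every coroutine below
    val children = List(count) {
        launch(groupJob) { delay(10_000) }
    }
    delay(100)
    groupJob.cancelAndJoin() // cancels all children and waits for them
    children.map { it.isCancelled }
}

fun main() = runBlocking {
    println(cancelGroup(3)) // [true, true, true]
}
```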
How does cancellation work?
When a job is cancelled, it changes its state to "Cancelling". Then, at the first suspension point, a CancellationException is thrown. This exception can be caught using a try-catch, but it is recommended to rethrow it.
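A small sketch of catching and rethrowing the exception, assuming a coroutine that is cancelled while suspended in delay. The function name observeCancellation and the recorded messages are made up for this example.

```kotlin
import kotlinx.coroutines.*

suspend fun observeCancellation(): List<String> = coroutineScope {
    val events = mutableListOf<String>()
    val job = launch {
        try {
            delay(10_000)
            events += "Finished normally" // not reached in this scenario
        } catch (e: CancellationException) {
            events += "Caught CancellationException"
            throw e // rethrow so that cancellation is not swallowed
        }
    }
    delay(100)
    job.cancelAndJoin()
    events += "Job cancelled: ${job.isCancelled}"
    events
}

fun main() = runBlocking {
    println(observeCancellation())
    // [Caught CancellationException, Job cancelled: true]
}
```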
Keep in mind that a cancelled coroutine is not just stopped: it is cancelled internally using an exception. Therefore, we can freely clean up everything inside the finally block. For instance, we can use a finally block to close a file or a database connection. Since most resource-closing mechanisms rely on the finally block (for instance, if we read a file using useLines), we simply do not need to worry about them.
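To illustrate that resource-closing helpers keep working under cancellation, here is a sketch with a hypothetical FakeConnection class that only records when it is closed. The standard use function closes it in a finally block.

```kotlin
import kotlinx.coroutines.*
import java.io.Closeable

// A hypothetical resource, used only to record when it gets closed.
class FakeConnection(private val events: MutableList<String>) : Closeable {
    override fun close() {
        events += "Connection closed"
    }
}

suspend fun cancelWhileUsingResource(): List<String> = coroutineScope {
    val events = mutableListOf<String>()
    val job = launch {
        // `use` closes the resource in a finally block, also on cancellation.
        FakeConnection(events).use {
            events += "Working"
            delay(10_000)
        }
    }
    delay(100)
    job.cancelAndJoin()
    events
}

fun main() = runBlocking {
    println(cancelWhileUsingResource()) // [Working, Connection closed]
}
```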
Just one more call
Since we can catch CancellationException and invoke more operations before the coroutine truly ends, you might be wondering where the limit is. The coroutine can run as long as it needs to clean up all the resources. However, suspension is no longer allowed. The Job is already in a "Cancelling" state, in which suspension or starting another coroutine is not possible at all. If we try to start another coroutine, it will just be ignored. If we try to suspend, it will throw CancellationException.
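The limits described above can be checked with a short sketch. The function name and event strings are illustrative; the behavior shown (an ignored launch and a delay that throws) is what the paragraph above describes.

```kotlin
import kotlinx.coroutines.*

suspend fun cleanupWithSuspension(): List<String> = coroutineScope {
    val events = mutableListOf<String>()
    val job = launch {
        try {
            delay(10_000)
        } finally {
            events += "Finally"
            launch { events += "Child started" } // ignored: parent is cancelling
            delay(100) // throws CancellationException immediately
            events += "After delay" // never reached
        }
    }
    delay(100)
    job.cancelAndJoin()
    events += "Cancel done"
    events
}

fun main() = runBlocking {
    println(cleanupWithSuspension()) // [Finally, Cancel done]
}
```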
Sometimes, we truly need to use a suspending call when a coroutine is already cancelled. For instance, we might need to roll back changes in a database. In this case, the preferred way is to wrap this call with the withContext(NonCancellable) function. We will later explain in detail how withContext works. For now, all we need to know is that it changes the context of a block of code. Inside withContext, we use the NonCancellable object, which is a Job that cannot be cancelled. So, inside the block the job is in the active state, and we can call whatever suspending functions we want.
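A minimal sketch of suspending cleanup inside withContext(NonCancellable). The delay stands in for a real suspending operation such as a database rollback, and the function name is an assumption.

```kotlin
import kotlinx.coroutines.*

suspend fun cleanupInNonCancellable(): List<String> = coroutineScope {
    val events = mutableListOf<String>()
    val job = launch {
        try {
            delay(10_000)
        } finally {
            events += "Finally"
            withContext(NonCancellable) {
                delay(100) // allowed: the job in this context is always active
                events += "Cleanup done"
            }
        }
    }
    delay(100)
    job.cancelAndJoin()
    events += "Cancel done"
    events
}

fun main() = runBlocking {
    println(cleanupInNonCancellable()) // [Finally, Cleanup done, Cancel done]
}
```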
Another mechanism that is often used to free resources is the invokeOnCompletion function from Job. It is used to set a handler to be called when the job reaches a terminal state, namely either "Completed" or "Cancelled".
One of this handler's parameters is an exception:
- null if the job finished with no exception;
- CancellationException if the coroutine was cancelled;
- the exception that finished a coroutine (more about this in the next chapter).
invokeOnCompletion is called synchronously during cancellation, and we do not control the thread in which it will be running.
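The handler and its exception parameter might be used as in this sketch. The function describeCompletion and the returned strings are hypothetical and exist only to make the two outcomes visible.

```kotlin
import kotlinx.coroutines.*

// Registers a completion handler and reports which kind of cause it received.
suspend fun describeCompletion(cancelIt: Boolean): String = coroutineScope {
    var description = "handler not called"
    val job = launch { delay(100) }
    job.invokeOnCompletion { cause ->
        description = when (cause) {
            null -> "completed normally"
            is CancellationException -> "cancelled"
            else -> "failed"
        }
    }
    if (cancelIt) job.cancel()
    job.join()
    description
}

fun main() = runBlocking {
    println(describeCompletion(cancelIt = false)) // completed normally
    println(describeCompletion(cancelIt = true))  // cancelled
}
```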
Stopping the unstoppable
Because cancellation happens at the suspension points, it will not happen if there is no suspension point. To simulate such a situation, we could use Thread.sleep instead of delay. This is a terrible practice, so please don't do this in any real-life projects. We are just trying to simulate a case in which we are using our coroutines extensively but not suspending them. In practice, such a situation might happen if we have some complex calculations, like neural network learning (yes, we also use coroutines for such cases in order to simplify processing parallelization), or when we need to do some blocking calls (for instance, reading files).
The example below presents a situation in which a coroutine cannot be cancelled because there is no suspension point inside it (we use Thread.sleep instead of delay). The execution needs over 3 minutes, even though it should be cancelled after 1 second.
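A scaled-down sketch of this situation is shown here. It uses 10 iterations of 50 ms rather than the timings mentioned above, and the function name blockingLoop is an assumption.

```kotlin
import kotlinx.coroutines.*

// Returns how many iterations ran even though the job was cancelled early.
suspend fun blockingLoop(): Int = coroutineScope {
    var iterations = 0
    val job = launch(Dispatchers.Default) {
        repeat(10) {
            Thread.sleep(50) // blocks the thread; there is no suspension point
            iterations++
        }
    }
    delay(120)
    job.cancelAndJoin() // returns only after all 10 iterations have run
    iterations
}

fun main() = runBlocking {
    println(blockingLoop()) // 10
}
```

The job is marked as cancelled after about 120 ms, but the loop never reaches a suspension point, so it runs to the end.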
There are a few ways to deal with such situations. The first one is to use the yield() function from time to time. This function suspends and immediately resumes a coroutine. This gives an opportunity to do whatever is needed during suspension (or resuming), including cancellation (or changing a thread using a dispatcher).
It is a good practice to use yield in suspend functions, between blocks of non-suspended CPU-intensive or time-intensive operations.
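Adding yield between blocking steps could look like the following sketch, which assumes a loop of 10 steps that block for 50 ms each. The function name loopWithYield is hypothetical.

```kotlin
import kotlinx.coroutines.*

// Returns how many iterations ran before cancellation stopped the loop.
suspend fun loopWithYield(): Int = coroutineScope {
    var iterations = 0
    val job = launch(Dispatchers.Default) {
        repeat(10) {
            Thread.sleep(50) // simulated blocking or CPU-heavy step
            iterations++
            yield() // suspension point, so cancellation can take effect here
        }
    }
    delay(120)
    job.cancelAndJoin()
    iterations
}

fun main() = runBlocking {
    println(loopWithYield()) // a small number, well below 10
}
```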
Another option is to track the state of the job. Inside a coroutine builder, this (the receiver) references the scope of this builder. CoroutineScope has a context we can reference using the coroutineContext property. Thus, we can access the coroutine job (coroutineContext.job) and check what its current state is. Since a job is often used to check if a coroutine is active, the Kotlin Coroutines library provides the isActive extension property on CoroutineScope to simplify that.
We can use the isActive property to check if a job is still active and stop calculations when it is inactive.
Alternatively, we might use the ensureActive() function, which throws CancellationException if Job is not active.
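Both checks can be sketched side by side. The function names and the simulated blocking steps are assumptions; isActive and ensureActive are called on the CoroutineScope receiver of launch.

```kotlin
import kotlinx.coroutines.*

// Stops the loop by checking the isActive property before each step.
suspend fun loopWhileActive(): Int = coroutineScope {
    var iterations = 0
    val job = launch(Dispatchers.Default) {
        while (isActive && iterations < 10) {
            Thread.sleep(50) // simulated blocking step
            iterations++
        }
    }
    delay(120)
    job.cancelAndJoin()
    iterations
}

// Stops the loop by throwing CancellationException from ensureActive().
suspend fun loopWithEnsureActive(): Int = coroutineScope {
    var iterations = 0
    val job = launch(Dispatchers.Default) {
        repeat(10) {
            Thread.sleep(50) // simulated blocking step
            ensureActive() // throws if the job is no longer active
            iterations++
        }
    }
    delay(120)
    job.cancelAndJoin()
    iterations
}

fun main() = runBlocking {
    println(loopWhileActive())      // fewer than 10
    println(loopWithEnsureActive()) // fewer than 10
}
```

With isActive, the loop ends normally and the coroutine can still run code after it. With ensureActive, the coroutine leaves the loop through an exception.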
The results of ensureActive() and yield() seem similar, but they are very different. The function ensureActive() needs to be called on a CoroutineScope (or CoroutineContext, or Job). All it does is throw an exception if the job is no longer active. It is lighter, so generally it should be preferred. The function yield is a regular top-level suspension function. It does not need any scope, so it can be used in regular suspending functions. Since it does suspension and resuming, other effects might arise, such as thread changing if we use a dispatcher with a pool of threads (more about this in the Dispatchers chapter). yield is more often used just in suspending functions that are CPU intensive or are blocking threads.
Here, you might recall the suspendCancellableCoroutine function introduced in the How does suspension work? chapter. It behaves like suspendCoroutine, but its continuation is wrapped into CancellableContinuation<T>, which provides some additional methods. The most important one is invokeOnCancellation, which we use to define what should happen when a coroutine is cancelled. Most often we use it to cancel processes in a library or to free some resources.
Here is a full example in which we wrap a Retrofit Call with a suspending function.
It’s so good that Retrofit now supports suspending functions!
CancellableContinuation<T> also lets us check the job state (using the isActive, isCompleted and isCancelled properties) and cancel this continuation with an optional cancellation cause.
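The wrapping pattern described in this section can be sketched without any networking library. FakeCall below is a hypothetical callback-based class that stands in for a type such as a Retrofit Call; it is not Retrofit's API, and it never completes on its own.

```kotlin
import kotlinx.coroutines.*
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException

// Hypothetical callback-based call; it never completes on its own here.
class FakeCall {
    @Volatile
    var cancelled = false
        private set

    fun enqueue(onSuccess: (String) -> Unit, onFailure: (Throwable) -> Unit) {
        // A real implementation would start work and invoke one callback later.
    }

    fun cancel() {
        cancelled = true
    }
}

// Wraps the callback API in a suspending function that supports cancellation.
suspend fun awaitCall(call: FakeCall): String =
    suspendCancellableCoroutine { continuation ->
        call.enqueue(
            onSuccess = { continuation.resume(it) },
            onFailure = { continuation.resumeWithException(it) }
        )
        // Runs when the waiting coroutine is cancelled.
        continuation.invokeOnCancellation { call.cancel() }
    }

// Cancels a coroutine waiting on the call and reports whether the
// underlying call was cancelled as well.
suspend fun cancelWaitingCall(): Boolean = coroutineScope {
    val call = FakeCall()
    val job = launch { awaitCall(call) }
    delay(100)
    job.cancelAndJoin()
    call.cancelled
}

fun main() = runBlocking {
    println(cancelWaitingCall()) // true
}
```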
Cancellation is a powerful feature. It is generally easy to use, but it can sometimes be tricky. So, it is important to understand how it works.
A properly used cancellation means fewer wasted resources and fewer memory leaks. It is important for our application’s performance, and I hope you will use these advantages from now on.
[1] A good example is CoroutineWorker on Android, where according to the presentation Understand Kotlin Coroutines on Android at Google I/O'19 by Sean McQuillan and Yigit Boyar (both working on Android at Google), support for coroutines was added primarily to use the cancellation mechanism.
[2] Actually, it's worth much more since the code is currently not very heavy (it used to be, when it was stored on punched cards).
Notes on the optional parameters of invokeOnCompletion:
- onCancelling: if true, the function is called in the "Cancelling" state (i.e., before "Cancelled"); false by default.
- invokeImmediately: this parameter determines whether the handler should be called immediately if the handler is set when a coroutine is already in the desired state; true by default.