Cancellation in Kotlin Coroutines
Cancellation is one of the most important features of Kotlin Coroutines. It is so important that some classes and libraries use suspending functions primarily to support cancellation [1]. There is a good reason for this: a good cancellation mechanism is worth its weight in gold [2]. Just killing a thread is a terrible solution, because there should be an opportunity to close connections and free resources. Forcing developers to frequently check whether some state is still active isn't convenient either. The problem of cancellation waited a very long time for a good answer, but what Kotlin Coroutines offers is surprisingly simple, convenient, and safe. This is the best cancellation mechanism I've seen in my career. Let's explore it.
The Job interface has a cancel method, which allows the job to be cancelled. Calling it triggers the following effects:
- The coroutine ends the job at the first suspension point (for example, a delay call).
- If a job has some children, they are cancelled too (but its parent is not affected).
- Once a job is cancelled, it cannot be used as a parent for any new coroutines. It is first in the "Cancelling" and then in the "Cancelled" state.
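The effects listed above can be observed in a minimal sketch like the one below. The function name cancelEffects and the event strings are illustrative assumptions; join is used here only to wait until cancellation has finished before the states are read.

```kotlin
import kotlinx.coroutines.*

// Illustrative helper (hypothetical name): records what can be observed
// after cancelling a job that has a child and a parent.
fun cancelEffects(): List<String> = runBlocking {
    val events = mutableListOf<String>()
    val parent = launch {
        val job = launch {
            launch { delay(10_000) } // child of `job`
            delay(10_000)            // suspension point of `job`
        }
        delay(100)                   // let `job` and its child start
        val child = job.children.first()
        job.cancel()
        job.join()                   // wait until cancellation is done
        events += "job cancelled: ${job.isCancelled}"
        events += "child cancelled: ${child.isCancelled}"
        // A cancelled job cannot act as a parent: this body never runs.
        val late = launch(job) { events += "never added" }
        late.join()
        events += "late cancelled: ${late.isCancelled}"
    }
    parent.join()
    events += "parent cancelled: ${parent.isCancelled}"
    events
}

fun main() {
    cancelEffects().forEach(::println)
}
```

The parent finishes normally here, which is why its isCancelled flag stays false.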
We can pass an exception as an argument to the cancel function to specify the cause of cancellation. This cause needs to be a subtype of CancellationException, because only an exception of this type can be used to cancel a coroutine.
After cancel, we often also add join to wait for the cancellation to finish before we move on. Without it, we would have race conditions. For example, if a coroutine prints "Printing" followed by a counter in a loop, then without join we might see "Printing 4" after "Cancelled successfully". Adding job.join() changes that, because it suspends until the coroutine has finished cancellation.
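A minimal sketch of this ordering problem follows. The function cancelJob and its waitForCompletion flag are hypothetical names introduced for illustration. It runs everything on the single thread of runBlocking, so the ordering is deterministic here, whereas on a multithreaded dispatcher it would be a genuine race.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: cancels a job and optionally waits for it with join.
fun cancelJob(waitForCompletion: Boolean): List<String> = runBlocking {
    val events = mutableListOf<String>()
    val job = launch {
        try {
            delay(10_000)
        } finally {
            events += "Cleanup finished"
        }
    }
    delay(100)                          // let the job start and suspend
    job.cancel()
    if (waitForCompletion) job.join()   // suspends until cleanup is done
    events += "Cancelled successfully"
    events
}

fun main() {
    println(cancelJob(waitForCompletion = false))
    println(cancelJob(waitForCompletion = true))
}
```

Without join, the caller reports success before the cancelled coroutine has run its cleanup. With join, the cleanup always comes first.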
To make it easier to call cancel and join together, the kotlinx.coroutines library offers a convenient extension function with a self-descriptive name: cancelAndJoin.
A job created using the Job() factory function can be cancelled in the same way. This is often used to make it easy to cancel many coroutines at once.
This is a crucial capability. On many platforms, we often need to cancel a group of concurrent tasks. For instance, in Android, we cancel all the coroutines started on a view when the user leaves this view.
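As a rough sketch of this idea, the snippet below starts three coroutines under one Job() and cancels them together. The names cancelGroup and groupJob are assumptions made for this example. A real Android view would typically rely on a framework-provided scope instead.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: returns how many coroutines ran their cleanup
// after the shared job was cancelled.
fun cancelGroup(): Int = runBlocking {
    val groupJob = Job()
    var cleanedUp = 0
    repeat(3) {
        launch(groupJob) {          // each coroutine is a child of groupJob
            try {
                delay(10_000)
            } finally {
                cleanedUp++
            }
        }
    }
    delay(100)                      // let all three start and suspend
    groupJob.cancelAndJoin()        // cancels every child and waits for them
    cleanedUp
}

fun main() {
    println("Coroutines cleaned up: ${cancelGroup()}")
}
```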
How does cancellation work?
When a job is cancelled, it changes its state to "Cancelling". Then, at the first suspension point, a CancellationException is thrown. This exception can be caught using try-catch, but it is recommended to rethrow it.
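The catch-and-rethrow pattern might look like the following sketch. The function name catchAndRethrow and the recorded strings are illustrative only.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: shows that cancellation surfaces as an exception
// at the suspension point and can be caught and rethrown.
fun catchAndRethrow(): List<String> = runBlocking {
    val events = mutableListOf<String>()
    val job = launch {
        try {
            delay(10_000)
            events += "not reached"
        } catch (e: CancellationException) {
            events += "caught CancellationException"
            throw e                 // rethrow so the exception keeps propagating
        }
    }
    delay(100)
    job.cancelAndJoin()
    events += "isCancelled=${job.isCancelled}"
    events
}

fun main() {
    catchAndRethrow().forEach(::println)
}
```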
Keep in mind that a cancelled coroutine is not just stopped: it is cancelled internally using an exception. Thanks to that, we can freely clean up everything inside a finally block. For instance, we can use it to close a file or a database connection. Since most resource-closing mechanisms rely on the finally block (for instance, when we read a file using useLines), we simply do not need to worry about them.
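To illustrate, here is a small sketch in which a resource is closed by the standard use function when the coroutine is cancelled. TrackedResource and closeOnCancel are hypothetical names. A real program would use a file, stream, or connection instead.

```kotlin
import kotlinx.coroutines.*
import java.io.Closeable

// Hypothetical resource that only remembers whether it was closed.
class TrackedResource : Closeable {
    var closed = false
        private set

    override fun close() {
        closed = true
    }
}

fun closeOnCancel(): Boolean = runBlocking {
    val resource = TrackedResource()
    val job = launch {
        // `use` closes the resource in a finally block, so it also runs
        // when CancellationException is thrown from delay.
        resource.use {
            delay(10_000)
        }
    }
    delay(100)
    job.cancelAndJoin()
    resource.closed
}

fun main() {
    println("Resource closed: ${closeOnCancel()}")
}
```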
Just one more call
Since we can catch CancellationException and invoke more operations before the coroutine truly ends, you might be wondering where the limit is. The coroutine can run as long as it needs in order to clean up all its resources. However, suspension is no longer allowed. The Job is already in the "Cancelling" state, in which suspending or starting another coroutine is not possible at all. If we try to start another coroutine, it will just be ignored. If we try to suspend, it will throw CancellationException.
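A short sketch of these limits is shown below, with the hypothetical function name cleanupLimits. Inside the finally block of a cancelled coroutine, a new launch is ignored and a suspending call throws immediately.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: records what happens inside `finally`
// once the job is already in the "Cancelling" state.
fun cleanupLimits(): List<String> = runBlocking {
    val events = mutableListOf<String>()
    val job = launch {
        try {
            delay(10_000)
        } finally {
            events += "finally entered"
            launch { events += "ignored coroutine ran" } // body never runs
            try {
                delay(100)                  // throws instead of suspending
                events += "after delay"
            } catch (e: CancellationException) {
                events += "suspension refused"
                throw e
            }
        }
    }
    delay(100)
    job.cancelAndJoin()
    events
}

fun main() {
    cleanupLimits().forEach(::println)
}
```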
Sometimes we truly need to use a suspending call when a coroutine is already cancelled. For instance, we might need to roll back changes in a database. In such a case, the preferred way is to wrap this call with the withContext(NonCancellable) function. We will explain in detail later how withContext works. For now, all we need to know is that it changes the context for a block of code. Inside withContext, we use the NonCancellable object, which is a Job that cannot be cancelled. So, inside the block the job is in the active state, and we can call whatever suspending functions we want.
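A minimal sketch of this technique follows. The function name rollbackOnCancel is an assumption, and delay(100) merely stands in for a suspending rollback call.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: performs a suspending "rollback" during cancellation.
fun rollbackOnCancel(): List<String> = runBlocking {
    val events = mutableListOf<String>()
    val job = launch {
        try {
            delay(10_000)
        } finally {
            // Inside this block the job is NonCancellable, so suspending works.
            withContext(NonCancellable) {
                delay(100)          // stands in for a suspending rollback
                events += "rollback finished"
            }
        }
    }
    delay(100)
    job.cancelAndJoin()             // returns only after the rollback is done
    events += "job finished"
    events
}

fun main() {
    rollbackOnCancel().forEach(::println)
}
```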
Another mechanism that is often used to free resources is the invokeOnCompletion function from Job. It is used to set a handler to be called when the job reaches a terminal state, namely either "Completed" or "Cancelled".
The parameter of this handler is an exception, which is:
- null if the job finished with no exception,
- CancellationException if the coroutine was cancelled,
- the exception that finished the coroutine (more about that in the next chapter).
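The first two cases from the list above can be sketched as follows. The function name completionCauses and the event strings are illustrative assumptions.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: registers completion handlers on a job that finishes
// normally and on a job that gets cancelled.
fun completionCauses(): List<String> = runBlocking {
    val events = mutableListOf<String>()

    val finished = launch { delay(50) }
    finished.invokeOnCompletion { cause ->
        events += "finished: $cause"                    // cause is null
    }

    val cancelled = launch { delay(10_000) }
    cancelled.invokeOnCompletion { cause ->
        events += "cancelled: ${cause is CancellationException}"
    }

    delay(100)                      // `finished` completes in the meantime
    cancelled.cancelAndJoin()
    events
}

fun main() {
    completionCauses().forEach(::println)
}
```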
Stopping the unstoppable
Because cancellation happens at suspension points, it will not take place if there is no suspension point. To simulate such a situation, we could use Thread.sleep instead of delay. This is a terrible practice, so please do not do it in any real-life project. We are just trying to simulate a case in which we use our coroutines extensively without suspending them. In practice, such a situation might happen if we have complex calculations, like neural network training (yes, we use coroutines for such cases too, in order to simplify parallel processing), or when we need to make blocking calls (for instance, reading files).
Such a coroutine cannot be cancelled because it has no suspension point. For instance, a loop of 1,000 iterations that each block for 200 ms needs over 3 minutes to finish, even if it should be cancelled after 1100 ms.
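A scaled-down sketch of this problem is shown below. It uses 5 iterations of 50 ms so that it finishes quickly. The function name blockingLoop and the started signal are assumptions added so that cancel is guaranteed to arrive after the coroutine has begun.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: returns how many iterations ran even though
// the job was cancelled right after it started.
fun blockingLoop(): Int = runBlocking {
    var iterations = 0
    val started = CompletableDeferred<Unit>()
    val job = launch(Dispatchers.Default) {
        started.complete(Unit)
        repeat(5) {
            Thread.sleep(50)        // blocks the thread; no suspension point
            iterations++
        }
    }
    started.await()                 // the coroutine is running now
    job.cancelAndJoin()             // returns only after the whole loop is done
    iterations
}

fun main() {
    println("Iterations despite cancellation: ${blockingLoop()}")
}
```

All iterations run because the loop never reaches a point at which cancellation could be noticed.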
There are a few ways to deal with such situations. The first one is to call the yield() function from time to time. This function suspends and immediately resumes a coroutine. This gives an opportunity for whatever needs to happen during suspension (or resumption), including cancellation (or changing the thread via a dispatcher).
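Here is a sketch of a blocking loop that stays cancellable thanks to yield(). The function name yieldingLoop and the started signal are assumptions. The exact number of completed iterations depends on thread timing, so the sketch only expects it to be far below 1,000.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: returns how many iterations ran before
// cancellation was noticed at a yield() call.
fun yieldingLoop(): Int = runBlocking {
    var iterations = 0
    val started = CompletableDeferred<Unit>()
    val job = launch(Dispatchers.Default) {
        repeat(1_000) {
            Thread.sleep(10)        // stands in for blocking or CPU-heavy work
            iterations++
            started.complete(Unit)  // only the first call has an effect
            yield()                 // suspension point: cancellation is checked
        }
    }
    started.await()
    job.cancelAndJoin()
    iterations
}

fun main() {
    println("Iterations before cancellation: ${yieldingLoop()}")
}
```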
Another option is to track the state of the job. Inside a coroutine builder, this (the receiver) references the scope of this builder. CoroutineScope has a context we can reference using the coroutineContext property. This way, we can access the coroutine job (coroutineContext[Job]) and check what its current state is. Since this is often used to check if a coroutine is active, the Kotlin Coroutines library provides the isActive extension property on CoroutineScope to simplify that.
We can use it to check if a job is still active and stop calculations when it no longer is.
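A sketch of a loop guarded by an isActive check is shown below. The function name isActiveLoop and the started signal are illustrative assumptions, and the number of iterations that run before cancellation is noticed depends on timing.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: the loop stops by itself once the job is cancelled.
fun isActiveLoop(): Int = runBlocking {
    var iterations = 0
    val started = CompletableDeferred<Unit>()
    val job = launch(Dispatchers.Default) {
        // `isActive` refers to the scope of this launch builder.
        while (isActive && iterations < 1_000) {
            Thread.sleep(10)        // stands in for blocking or CPU-heavy work
            iterations++
            started.complete(Unit)
        }
    }
    started.await()
    job.cancelAndJoin()
    iterations
}

fun main() {
    println("Iterations before cancellation: ${isActiveLoop()}")
}
```

Unlike yield(), this approach throws no exception. The loop simply ends, so any code placed after it in the coroutine still runs.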
Alternatively, we might use the ensureActive() function, which throws CancellationException if the Job is not active.
The results of ensureActive() and yield() seem similar, but they are very different. The function ensureActive() needs to be called on a CoroutineScope (or CoroutineContext, or Job). All it does is throw an exception if the job is no longer active. It is lighter, so it should generally be preferred. The function yield is a regular top-level suspending function. It does not need any scope, so it can be used in regular suspending functions. Since it suspends and resumes, other effects might arise, such as a thread change if we use a dispatcher with a pool of threads (more about that in the chapter Dispatchers). It is more often used just in suspending functions that are CPU intensive or block threads.
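The difference can be made visible with a small sketch that runs two coroutines on the single thread of runBlocking. The function name interleaving and the useYield flag are assumptions for this example. Because yield() really suspends, the coroutines take turns, whereas ensureActive() only checks the job and lets each coroutine run through.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: records the order in which two coroutines make progress.
fun interleaving(useYield: Boolean): List<String> = runBlocking {
    val events = mutableListOf<String>()
    val jobs = listOf("A", "B").map { name ->
        launch {
            repeat(2) { i ->
                events += "$name$i"
                if (useYield) yield() else ensureActive()
            }
        }
    }
    jobs.joinAll()
    events
}

fun main() {
    println("yield:        ${interleaving(useYield = true)}")
    println("ensureActive: ${interleaving(useYield = false)}")
}
```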
Here, you might recall the suspendCancellableCoroutine function introduced in the How does suspension work? chapter. It behaves like suspendCoroutine, but its continuation is wrapped into CancellableContinuation<T>, which provides some additional methods. The most important one is invokeOnCancellation, which we use to define what should happen when a coroutine is cancelled. Most often, we use it to cancel processes in a library or to free some resources.
A typical full example is wrapping a Retrofit Call with a suspending function. Fortunately, Retrofit now supports suspending functions itself.
CancellableContinuation<T> also lets us check the job state (using the isActive, isCompleted, and isCancelled properties) and cancel this continuation with an optional cancellation cause.
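The wrapping technique can be sketched without any external dependency. FakeCall below is a hypothetical callback-based class that merely stands in for a library type such as Retrofit's Call and is not a real API. The await extension and both helper functions are likewise assumptions for illustration.

```kotlin
import kotlinx.coroutines.*
import kotlin.coroutines.resume

// Hypothetical callback-based request; not a real library class.
class FakeCall {
    var cancelled = false
        private set
    private var callback: ((String) -> Unit)? = null

    fun enqueue(onResponse: (String) -> Unit) { callback = onResponse }
    fun respond(value: String) { callback?.invoke(value) }
    fun cancel() { cancelled = true; callback = null }
}

// Wraps the callback API in a cancellable suspending function.
suspend fun FakeCall.await(): String =
    suspendCancellableCoroutine { cont ->
        enqueue { response -> cont.resume(response) }
        // When the coroutine is cancelled, cancel the underlying call too.
        cont.invokeOnCancellation { this@await.cancel() }
    }

fun cancelPropagates(): Boolean = runBlocking {
    val call = FakeCall()
    val job = launch { call.await() }
    delay(100)                      // let the coroutine suspend in await
    job.cancelAndJoin()
    call.cancelled
}

fun awaitResponse(): String = runBlocking {
    val call = FakeCall()
    val result = async { call.await() }
    delay(100)                      // let the coroutine register its callback
    call.respond("OK")
    result.await()
}

fun main() {
    println("Call cancelled with the coroutine: ${cancelPropagates()}")
    println("Response: ${awaitResponse()}")
}
```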
Cancellation is a powerful feature. It is generally easy to use, but it can sometimes be tricky. It is important to understand how it works.
We've learned a lot about cancellation. For instance, we know that we can freely release resources and clear state in a finally block or in the invokeOnCompletion handler. We should also know when isActive checks or yield() should be used.
A properly used cancellation means fewer resources wasted and fewer memory leaks. It is important for our application performance, and I hope that from now on, you will be using these advantages.
[1] A good example is CoroutineWorker on Android. According to the presentation Understand Kotlin Coroutines on Android, given at Google I/O'19 by Sean McQuillan and Yigit Boyar (both working on Android at Google), support for coroutines was added primarily to use their cancellation mechanism.
[2] Actually, it is worth much more, since code is currently not very heavy (it used to be, when it was stored on punched cards).
The onCancelling parameter of invokeOnCompletion: if true, the handler is called already in the "Cancelling" state (that is, before "Cancelled"). It is false by default.
The invokeImmediately parameter of invokeOnCompletion: when true and the job is already in the desired state (depending on onCancelling), the handler is immediately and synchronously invoked and a no-op DisposableHandle is returned. When false, a no-op DisposableHandle is returned, but the handler is not invoked. It is true by default.