Job and children awaiting in Kotlin Coroutines
In the chapter Structured Concurrency, we mentioned the following consequences of the parent-child relationship:
- children inherit context from their parent,
- a parent suspends until all the children are finished,
- when the parent is cancelled, its child coroutines are cancelled too,
- when a child fails with an uncaught exception, it cancels its parent as well.
Inheriting a context from parent to child is a basic part of a coroutine builder's behavior.
The other three important consequences of structured concurrency depend fully on the Job context. Furthermore, Job can be used to cancel coroutines, track their state, and much more. It is really important and useful, so this and the next two chapters are dedicated to Job and the essential Kotlin Coroutines mechanisms that are connected to it.
Conceptually, a job represents a cancellable thing with a lifecycle. Formally, Job is an interface, but it has a concrete contract and state, so it can be treated similarly to an abstract class.
A job lifecycle is represented by its state. There are the following states and transitions:
A diagram of job (so also coroutine) states.
In the "Active" state, a job is running and doing its work. If the job was created with a coroutine builder, this is the state in which the body of the coroutine is executed. In this state, we can start child coroutines. Most coroutines start in the "Active" state; only those that are started lazily start in the "New" state and need to be started to move to the "Active" state. When a job finishes its body without interruption, it moves to the "Completing" state, where it waits for its children. Once all the children are done, the job changes its state to "Completed", which is a terminal state.
Alternatively, if a job is cancelled or fails while running (in the "Active" or "Completing" state), its state changes to "Cancelling". In this state, we have the last chance to do some clean-up, like closing connections or freeing resources (we will see how to do this in the next chapter). Once this is done, the job moves to the "Cancelled" state.
The state is printed by a job’s toString [2]. In the example below, we see different jobs as their states change. The last one is started lazily, which means it does not start automatically. All the others become active immediately after they are created.
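A minimal sketch of such an example, assuming kotlinx.coroutines is on the classpath. The exact text printed by toString depends on the library version and debug settings, so the comments only name the state that is expected to appear in it.

```kotlin
import kotlinx.coroutines.*

fun main(): Unit = runBlocking {
    // A job created with the Job() factory function is active...
    val job = Job()
    println(job) // state: Active
    // ...until we complete it explicitly
    job.complete()
    println(job) // state: Completed

    // launch is active by default
    val activeJob = launch { delay(100) }
    println(activeJob) // state: Active
    activeJob.join() // wait until this job is done
    println(activeJob) // state: Completed

    // launch started lazily begins in the New state
    val lazyJob = launch(start = CoroutineStart.LAZY) { delay(100) }
    println(lazyJob) // state: New
    lazyJob.start() // it must be started to become active
    println(lazyJob) // state: Active
    lazyJob.join()
    println(lazyJob) // state: Completed
}
```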
To check the state in code, we use the properties isActive, isCompleted, and isCancelled:
|State|isActive|isCompleted|isCancelled|
|---|---|---|---|
|New (optional initial state)|false|false|false|
|Active (default initial state)|true|false|false|
|Completing (transient state)|true|false|false|
|Cancelling (transient state)|false|false|true|
|Cancelled (final state)|false|true|true|
|Completed (final state)|false|true|false|
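The rows of this table can be observed with a short sketch. The flags extension function is a hypothetical helper introduced only for this illustration; it is not part of the library.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: renders the three state properties of a job.
fun Job.flags(): String =
    "isActive=$isActive, isCompleted=$isCompleted, isCancelled=$isCancelled"

fun main(): Unit = runBlocking {
    val job = launch { delay(100) }
    println(job.flags()) // Active: isActive=true, isCompleted=false, isCancelled=false
    job.join()
    println(job.flags()) // Completed: isActive=false, isCompleted=true, isCancelled=false

    val cancelled = launch { delay(10_000) }
    cancelled.cancel()
    println(cancelled.isCancelled) // true as soon as cancellation is requested
    cancelled.join()
    println(cancelled.flags()) // Cancelled: isActive=false, isCompleted=true, isCancelled=true
}
```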
Coroutine builders create their jobs based on their parent job
The coroutine builders from the Kotlin Coroutines library create their own jobs. Most of them return the job, so it can be used elsewhere. This is clearly visible for launch, where Job is the explicit result type. The type returned by async is Deferred<T>, but it also implements the Job interface, so it can be used in the same way.
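As a small illustration, the sketch below stores the results of both builders in variables with explicit types and treats the Deferred as a plain Job.

```kotlin
import kotlinx.coroutines.*

fun main(): Unit = runBlocking {
    // launch returns a Job
    val job: Job = launch { delay(100) }

    // async returns Deferred<T>, which is also a Job
    val deferred: Deferred<String> = async {
        delay(100)
        "Result"
    }
    val deferredAsJob: Job = deferred

    job.join()
    deferredAsJob.join()
    println(deferred.await()) // Result
}
```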
Since Job is a coroutine context, we can access it using coroutineContext[Job]. There is also an extension property, job, which lets us access the job more easily.
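Both ways of accessing the job can be compared in a short sketch; inside a coroutine they are expected to return the same object.

```kotlin
import kotlinx.coroutines.*

fun main(): Unit = runBlocking {
    // Lookup by key returns a nullable type
    val fromKey: Job? = coroutineContext[Job]
    // The job extension property returns a non-null Job
    // (it throws if the context has no job)
    val fromProperty: Job = coroutineContext.job

    println(fromKey === fromProperty) // true
    println(fromProperty.isActive) // true
}
```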
The way a job is passed from parent to child differs from other contexts. Other coroutine context elements are simply passed on unchanged. A job is not: every coroutine builder creates its own Job, which is linked to the parent's Job. The parent can reference all its children, and the children can refer to the parent. This is how waiting for child coroutines, cancellation, and exception handling are implemented.
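This difference can be sketched as follows. The coroutine name "Parent" is an arbitrary value chosen for the illustration.

```kotlin
import kotlinx.coroutines.*

fun main(): Unit = runBlocking(CoroutineName("Parent")) {
    val parentJob: Job = coroutineContext.job

    val childJob: Job = launch {
        // Other context elements are inherited unchanged
        println(coroutineContext[CoroutineName]?.name) // Parent
        delay(100)
    }

    // The child has its own job...
    println(childJob == parentJob) // false
    // ...which is registered as a child of the parent's job
    println(parentJob.children.first() == childJob) // true
}
```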
Structured concurrency mechanisms will not work if a new Job context replaces the one from the parent. To see this, we can use the Job() function, which creates a Job context (it will be explained later).
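A minimal sketch of this situation follows. The finishedChildrenCount function is a hypothetical helper that reports how many children had finished by the time runBlocking returned.

```kotlin
import kotlinx.coroutines.*

// Hypothetical helper: counts the children that finished
// before runBlocking returned.
fun finishedChildrenCount(): Int {
    var finished = 0
    runBlocking {
        // Each new Job() replaces the job inherited from runBlocking
        launch(Job()) {
            delay(1000)
            finished++
        }
        launch(Job()) {
            delay(2000)
            finished++
        }
        // runBlocking has no children here, so it does not wait
    }
    return finished
}

fun main() {
    println(finishedChildrenCount()) // 0
}
```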
In the above example, the parent does not wait for its children because it has no relation with them. This is because each child uses the job passed as an argument as its parent, which replaces the job it would otherwise inherit from the enclosing coroutine.
When a coroutine is cut off from its parent job, almost no relation remains, apart from the other context elements being passed. We lose structured concurrency, so we need to be very careful when doing that.
The first important advantage of a job is that it can be used to wait until the coroutine is completed. For that, we use the join method. It is a suspending function that suspends until the given job reaches a final state (either Completed or Cancelled).
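A short sketch of waiting with join; the delays and messages are arbitrary.

```kotlin
import kotlinx.coroutines.*

fun main(): Unit = runBlocking {
    val job1 = launch {
        delay(100)
        println("Test1")
    }
    val job2 = launch {
        delay(200)
        println("Test2")
    }

    // Suspend until each job reaches a final state
    job1.join()
    job2.join()
    println("All tests are done")
}
// Test1
// Test2
// All tests are done
```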
The Job interface also exposes a children property that lets us reference all its children. We can also use it to wait until all the children are in a final state.
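The same waiting can be sketched with the children property, without keeping a reference to each job.

```kotlin
import kotlinx.coroutines.*

fun main(): Unit = runBlocking {
    launch {
        delay(100)
        println("Test1")
    }
    launch {
        delay(200)
        println("Test2")
    }

    val children = coroutineContext.job.children
    println("Number of children: ${children.count()}")
    // Wait for every current child of this coroutine
    children.forEach { it.join() }
    println("All tests are done")
}
// Number of children: 2
// Test1
// Test2
// All tests are done
```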
Job factory function
A Job can also be created without a coroutine, using the Job() factory function. This creates a job that isn't associated with any coroutine and can be used as a context. This also means that we can use such a job as a parent of many coroutines.
A common mistake is to create a job using the Job() factory function, start some coroutines on it, and then call join on the job. Such a program will never end because the Job is still in an active state, even when all its children are finished. This is because this context is still ready to be used by other coroutines.
A better approach would be to join on all the current children of the job.
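A minimal sketch of this approach, with the problematic call left as a comment:

```kotlin
import kotlinx.coroutines.*

fun main(): Unit = runBlocking {
    val job = Job()
    launch(job) {
        delay(100)
        println("Text 1")
    }
    launch(job) {
        delay(200)
        println("Text 2")
    }

    // job.join() here would suspend forever:
    // the job itself stays active after its children finish.
    job.children.forEach { it.join() }
    println("Done")
}
// Text 1
// Text 2
// Done
```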
Job() is a great example of a factory function. At first, you might think that you're calling a constructor of the Job class. In reality, it is a fake constructor [1]: a simple function that looks like a constructor. Moreover, the actual type returned by this function is not Job but its subinterface, CompletableJob.
The CompletableJob interface adds two methods to Job:
- complete(): Boolean - used to complete a job. Once it is used, all the child coroutines will keep running until they are all done, but new coroutines cannot be started on this job. The result is true if this job was completed as a result of this invocation and false otherwise (if it was already completed, for instance, because of an exception in a child).
- completeExceptionally(exception: Throwable): Boolean - completes this job exceptionally with the given exception. This means that all children will be cancelled immediately (with CancellationException wrapping the exception provided as an argument). The result, just like in the above function, answers the question: “Was this job finished because of this invocation?”.
complete is often used after we start the last coroutine on a job. Thanks to that, we can just wait for the job's completion using the join function.
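Both methods can be sketched in one short program. The exception type and message are arbitrary choices for the illustration.

```kotlin
import kotlinx.coroutines.*

fun main(): Unit = runBlocking {
    // complete(): running children finish, then the job completes
    val job = Job()
    launch(job) {
        delay(100)
        println("Text 1")
    }
    launch(job) {
        delay(200)
        println("Text 2")
    }
    println(job.complete()) // true: completed by this call
    job.join() // now join returns once the children are done
    println(job.isCompleted) // true
    println(job.complete()) // false: it was already completed

    // completeExceptionally(): children are cancelled
    val failingJob = Job()
    val child = launch(failingJob) {
        delay(10_000)
        println("Will not be printed")
    }
    failingJob.completeExceptionally(IllegalStateException("Some error"))
    failingJob.join()
    println(child.isCancelled) // true
}
```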
You can pass a reference to the parent as an argument of the Job function. Thanks to that, such a job will be cancelled when the parent is.
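A sketch of a job created with an explicit parent. Cancelling the parent is expected to cancel the job and the coroutine started on it.

```kotlin
import kotlinx.coroutines.*

fun main(): Unit = runBlocking {
    val parentJob = Job()
    // This job becomes a child of parentJob
    val job = Job(parentJob)
    val child = launch(job) {
        delay(10_000)
        println("Will not be printed")
    }

    // Cancellation propagates: parentJob -> job -> child
    parentJob.cancel()
    child.join()
    println(job.isCancelled) // true
    println(child.isCancelled) // true
}
```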
The next two chapters describe cancellation and exception handling in Kotlin Coroutines. Those two important mechanisms fully depend on the child-parent relationship created using Job.
[1] A pattern described well in Effective Kotlin, Item 33: Consider factory functions instead of constructors.
[2] I hope I do not need to remind the reader that toString should be used for debugging and logging purposes and should not be parsed in code. Doing so would break this function’s contract, as I’ve described in Effective Kotlin.