What is the difference between cats-effect and ZIO?
It might not be apparent at first but there are many differences and it's handy for you to know what! Check out this article by SoftwareMill Co-Founder Adam Warski and you'll know exactly what you need to pay attention to.
'The 'IO' from cats-effect and 'ZIO' from zio might seem quite similar — but as is often the case, the devil is in the details!
Both libraries define datatypes which allow describing asynchronous processes. The general approach is the same: processes are first described as immutable values. These descriptions are composed using a number of combinators, which allow sequencing processes, running them in parallel, etc. All side-effects are captured as values and evaluated lazily. Nothing happens, until we interpret these descriptions into a running process.
How does cats-effect and zio differ, then? First and foremost, there are notable differences in the API, error handling, available combinators, but there are also differences in the semantics, which are quite subtle and can be surprising.
One such difference is how thread shifting works. As Daniel Spiewak recommends, an application should have at least three thread pools:
- a bounded thread pool (by the number of CPUSs) for CPU-intensive work
- an unbounded thread pool for executing blocking I/O calls
- a bounded thread pool for non-blocking I/O callbacks
Depending on the task at hand, we should pick “the right pool for the job”. Hence when describing a process, we need a way to express the fact that e.g. a blocking effect should run on the blocking pool, and a calculation-intensive effect on the CPU pool.
Let’s define a couple of thread pools that we’ll use later in the examples:
And an effect which, when run (remember that effects are evaluated lazily, each time they are used), will print the name of the current thread:
cats-effect
We start on the 'main' thread, then we jump to one of the threads of 'ec1', and finally to one thread from 'ec2'. The '*>' operator simply sequences two effects; 'a *> b' is equivalent to 'a.flatMap(_ => b)'.
The second is 'ContextShift.evalOn(ExecutionContext)', which evaluates the given effect on the given thread pool, and then shifts back to the thread pool backing the context shift (note that there are two thread pools involved):
Again, we start on 'main', then the middle effect is evaluated on the thread pool passed as a parameter to 'evalOn' ('ec2'), and after that, we jump back to thread pool backing 'cs1', that is 'ec1'.
So far so good. Where’s the caveat then? There are two, in fact.
Non-local reasoning
First, adding an effect to your program can influence the effects that come after it. Consider the following code:
Now we add an additional step in the middle ('someEffect'), which might look innocent:
However, notice that now the last effect is evaluated on a different thread pool! That’s because the middle effect didn’t “clean up” after itself. This point is also raised in John de Goes’s post comparing ZIO and cats effect (section 4, “Stable shifting”).
The consequence of using 'IO.shift' is that we can’t reason about the program in a local way, when it comes to threading — we have to know the implementation details of the effects we use, to determine which thread pool a particular effect is going to run on.
Async
Another way of creating 'IO' values is by providing asynchronous callbacks. This is used e.g. when integrating with async I/O libraries, such as Netty. Netty has its own threadpool, and the way it integrates with concurrency libraries, such as cats-effect, is by calling a callback when a particular action is complete.
For example, here’s an effect which runs asynchronously:
What happens if we sequence another effect after the async one? It’s run on the async thread pool!
This might be dangerous, and even lead to deadlocks (for example, see this sttp issue; in case of Netty, we should never do any work on the non-blocking IO thread pool). That’s why we should always 'shift' to another thread pool after using async operations.
How to fix that? It might seem that shifting immediately after an async operation solves the problem:
Indeed. But! What if there’s an exception during asynchronous execution? Well, the error will be propagated, so no further effects will be executed. Unless there’s some finalizer effect, which always runs:
In the code above, we create an asynchronous effect ('ae'), which is always completed with an exception (the effect as a whole fails). We then specify that the 'printThread' effect should always run, regardless if 'ae' completes successfully or not, using 'guarantee'.
As you can see, the finalizer is also run on the async thread pool. That’s not good — our goal was to avoid doing any work on the async thread pool. Hence, we need to always shift, both in case of success and error of the effect. We can do this using 'guarantee' once more:
ZIO
ZIO offers only one method of changing thread pools: 'ZIO.on(ExecutionContext)'. Similarly to 'ContextShift.evalOn', it evaluates the effect it’s called on, on the given thread pool:
Unlike 'ContextShift.evalOn', where we in fact provide two thread pools (one on which to evaluate the effect, the other on which to shift after the evaluation is done), here we can see that after evaluating the effect, ZIO shifts to its default thread pool:
Because there’s no operator that would correspond to 'shift' in ZIO, it’s not possible to encounter the first problem (non-local reasoning). What about async operations? Let’s test!
After the asynchronous effect completes, we get shifted back to ZIO’s default pool. Hence, there’s no danger of blocking; Netty’s threads are safe! Things also works as expected when we run on a specific thread pool:
Even though the asynchronous effect is run on a dedicated thread pool, after it completes we go back to the one which was specified originally using '.on'.
Also in case of errors, we get the desired behaviour:
Summary
ZIO’s '.on' and cats-effect’s '.evalOn' might seem to do the same, but they don’t. Remember to pay attention to the thread pools you are using!
Thanks to YannMoisan and walmaaoui for prompting research on this issue!
All the source code is available on GitHub.'
This article was written by Adam Warski and posted originally on SoftwareMill Blog.