We hope you enjoyed Part 1 of this blog series written by Stéphane Derosiaux. We now move on to Part 2 on how typeclasses save us, where Stéphane helps us get away from specific implementations and abstract our functions even more.
'As we saw in part 1, it’s always better to enforce the bare minimum types in our functions and algebras by using polymorphism and generic effects 'F[_]'.
Unfortunately, because of Scala/JVM quirks, it’s always possible to find edge cases and do things outside of what the types convey ('null', 'throw', side effects, non-total functions…). A best practice is to write total, deterministic, side-effect-free functions (Pure Functional Programming), to ensure the types convey exactly what the functions can do. Types are documentation.
We’ll see how typeclasses help us get away from specific implementations and abstract our functions even more. We’ll also be careful with the implementations used behind those typeclasses: they may not all behave the same way (performance, stack-safety).
Typeclasses to the rescue
Let’s focus on specific examples to clear our mind.
We’ll take a look at some cats-effect typeclasses, but the reasoning holds for any typeclass.
Be Coherent
Typeclasses should be (locally or globally) coherent. It means we should find only one instance for a given type in a given (local) scope, so that implicit resolution is deterministic. It’s a difficult problem that only has workarounds right now.
Often a typeclass extends another one (like 'Monad > Applicative > Functor'), therefore if you have two typeclasses with the same parent in scope, you don’t have coherence.
In this example, which '.map' should the compiler pick? It exists in 'Monad' and 'Traverse':
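The ambiguity can be reproduced with a minimal sketch. The traits below are simplified stand-ins written for this illustration (they are not the real cats definitions), and 'increment' is a hypothetical function. The ambiguous call is left commented out because it does not compile; the workaround is to name the instance we want.

```scala
object CoherenceSketch {
  // Simplified stand-ins for the cats typeclasses, so the snippet compiles on its own.
  trait Functor[F[_]] {
    def map[A, B](fa: F[A])(f: A => B): F[B]
  }

  trait Monad[F[_]] extends Functor[F] {
    def pure[A](a: A): F[A]
    def flatMap[A, B](fa: F[A])(f: A => F[B]): F[B]
    def map[A, B](fa: F[A])(f: A => B): F[B] = flatMap(fa)(a => pure(f(a)))
  }
  object Monad {
    implicit val listMonad: Monad[List] = new Monad[List] {
      def pure[A](a: A): List[A] = List(a)
      def flatMap[A, B](fa: List[A])(f: A => List[B]): List[B] = fa.flatMap(f)
    }
  }

  trait Traverse[F[_]] extends Functor[F]
  object Traverse {
    implicit val listTraverse: Traverse[List] = new Traverse[List] {
      def map[A, B](fa: List[A])(f: A => B): List[B] = fa.map(f)
    }
  }

  // Both Monad[F] and Traverse[F] are a Functor[F], so asking for "the" Functor
  // is ambiguous and the compiler rejects it:
  // def broken[F[_]: Monad: Traverse](fa: F[Int]): F[Int] =
  //   implicitly[Functor[F]].map(fa)(_ + 1)

  // Workaround: explicitly pick the instance whose 'map' we want.
  def increment[F[_]: Monad: Traverse](fa: F[Int]): F[Int] =
    implicitly[Monad[F]].map(fa)(_ + 1)
}
```

Both instances happen to agree on 'map' for 'List' here, but nothing forces two instances to agree, which is exactly why coherence matters.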
Typeclass coherence is a very hot topic; feel free to explore:
- Allow Typeclasses to Declare Themselves Coherent dotty/#2047
- What kind of typeclass coherence should we support? dotty/#4234
- Coherence Domains in Scala
- Local Coherence from the Scala Typeclass proposal
- Type classes: confluence, coherence and global uniqueness
Be in Sync with Postel’s law
Be conservative in what you send, be liberal in what you accept —Postel’s law
In my previous job, I wanted to replace all our 'Future' or 'Task' with 'F[_]: Sync'.
'Sync' is a typeclass providing the capability of deferring some execution. Anyone can implement this trait for their own type. We can find 'Sync' instances for 'cats IO', 'monix Task', 'ZIO', 'BIO', 'UIO'. It’s not only used for asynchronous execution: we also have instances for monad transformers (if their inner monad has a 'Sync') such as 'EitherT', 'OptionT', 'WriterT', 'StateT', etc.
Here is the definition of 'Sync' (from cats-effect 1.x):
Basically, it just lazifies some execution using a thunk (and provides some cleanup with its parent 'Bracket' which is monadic).
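To make the "lazify with a thunk" idea concrete, here is a minimal sketch. The 'Sync' trait below is a simplified stand-in written for this illustration (the real one also extends 'Bracket'), and 'Lazy' is a hypothetical toy effect, not a library type.

```scala
object SyncSketch {
  // Simplified stand-in for Sync: only the parts needed to show deferral.
  trait Sync[F[_]] {
    def pure[A](a: A): F[A]
    def flatMap[A, B](fa: F[A])(f: A => F[B]): F[B]
    // Defers the construction of an effect.
    def suspend[A](thunk: => F[A]): F[A]
    // Defers a plain computation: nothing is evaluated until F is run.
    def delay[A](thunk: => A): F[A] = suspend(pure(thunk))
  }

  // Hypothetical toy effect: a wrapped thunk that runs only on demand.
  final class Lazy[A](val run: () => A)

  object Sync {
    implicit val lazySync: Sync[Lazy] = new Sync[Lazy] {
      def pure[A](a: A): Lazy[A] = new Lazy(() => a)
      def flatMap[A, B](fa: Lazy[A])(f: A => Lazy[B]): Lazy[B] =
        new Lazy(() => f(fa.run()).run())
      def suspend[A](thunk: => Lazy[A]): Lazy[A] =
        new Lazy(() => thunk.run())
    }
  }

  var counter = 0

  // Building the effect does not touch the counter.
  val effect: Lazy[Int] = implicitly[Sync[Lazy]].delay { counter += 1; counter }
}
```

The side effect happens each time 'effect.run()' is called and never before, which is the property an eager type cannot offer.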
Unfortunately, there is no 'Sync' instance for 'Future' because it wouldn’t be lawful (it would not respect the laws 'Sync' needs to respect): 'Future' is not lazy, so it can’t defer an execution.
This goes against 'Sync'’s laws.
As an alternative, we could build a 'Sync' instance over another thunk '() ⇒ Future[A]' to force the suspension:
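The sketch below is not a full 'Sync' instance; it only illustrates, with the standard library, why the extra thunk is needed: a 'Future' starts as soon as it is constructed and remembers its result, whereas a '() => Future[A]' does nothing until applied and reruns each time. The names 'FutureSketch', 'demo' and 'await' are made up for this example.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureSketch {
  def await[A](fa: Future[A]): A = Await.result(fa, 5.seconds)

  // Returns the number of executions observed at three points in time.
  def demo(): (Int, Int, Int) = {
    val runs = new AtomicInteger(0)

    // Eager: the body is scheduled right here, at construction.
    val eager: Future[Int] = Future(runs.incrementAndGet())
    await(eager)

    // Suspended: defining the thunk runs nothing.
    val suspended: () => Future[Int] = () => Future(runs.incrementAndGet())
    val afterDefinition = runs.get()

    // Each application of the thunk starts a fresh execution.
    await(suspended())
    await(suspended())
    val afterTwoRuns = runs.get()

    // Awaiting the eager Future again does not rerun it.
    await(eager)
    (afterDefinition, afterTwoRuns, runs.get())
  }
}
```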
Yes, that’s ugly. No one should use 'Future': it’s eager and not referentially transparent, which makes it hard to know what’s going on and makes the code bug-prone.
We prefer relying on 'IO' or 'Task' from the Scala ecosystem which are lazy by nature, and referentially-transparent.
But better: we prefer relying on 'Sync' and let the caller use 'IO' or 'Task' as they want!
As you can see, the code is not that much different, but it’s more powerful and follows Postel’s law: we are liberal in what we accept! “Just give me a 'Sync' contract, and I’ll work with you”.
It also drastically reduces what the function can do. 'Sync' has fewer features than 'Task', which is a full-blown class that can run the execution, memoize, add async boundaries, add callbacks on lifecycle changes, etc.
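As a rough illustration of "just give me a 'Sync' contract", here is a sketch of a function that only asks for the ability to defer. The 'Sync' trait is again a simplified stand-in, 'parsePort' is a hypothetical function, and 'Function0' plays the role of the effect chosen by the caller.

```scala
object LeastPowerSketch {
  // Simplified stand-in for Sync, reduced to what this example needs.
  trait Sync[F[_]] {
    def delay[A](thunk: => A): F[A]
    def flatMap[A, B](fa: F[A])(f: A => F[B]): F[B]
  }

  object Sync {
    // One possible effect: a plain thunk from the standard library.
    implicit val function0Sync: Sync[Function0] = new Sync[Function0] {
      def delay[A](thunk: => A): Function0[A] = () => thunk
      def flatMap[A, B](fa: Function0[A])(f: A => Function0[B]): Function0[B] =
        () => f(fa())()
    }
  }

  // The function does not know (or care) which effect it runs in:
  // it can only defer and sequence, nothing more.
  def parsePort[F[_]](raw: String)(implicit F: Sync[F]): F[Int] =
    F.flatMap(F.delay(raw.trim))(s => F.delay(s.toInt))
}
```

The caller decides the effect at the call site, for instance 'LeastPowerSketch.parsePort[Function0](" 8080 ")', and a parsing failure only surfaces when the resulting thunk is actually run.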
If you don’t need those features in your function, why use 'Task' inside? Use the least power you can. If you need more power, use more powerful typeclasses.
Bracket, LiftIO, Async, Effect, Concurrent, ConcurrentEffect
Instead of knowing every implementation that can run a computation, it’s more useful to know which typeclasses to use (here, focusing on cats-effect) and let the callers decide upon the implementation (that decision can be pushed all the way back to the root of the program!).
The typeclasses in cats-effect are particularly useful because a program always needs some form of computation effect, therefore it’s quite ubiquitous to use them.
All of them are used the same way as we saw with 'Sync'. They just provide more or less features to deal with sync/async executions, and all are 'Monad's to 'flatMap' the hell out of them.
Here is a small overview of what’s possible:
- 'Bracket': it’s the loan-pattern in FP (to manage auto-cleaning of resources):
The cleanup logic is “embedded” into the result: no need to think about it anymore. This is a wonderful abstraction: no need for a variable in the outer scope of 'try', for 'finally { if (f != null) f.close(); }' and so on.
- 'LiftIO': transform any cats 'IO' to the desired effect. Necessary for “bridges”.
- 'Async': trigger an async execution. It provides us with a callback to call when our execution is “done”. 'IO.fromFuture' uses this by registering on 'Future.onComplete' and invoking the callback.
- 'Effect': a super 'Async' that can run the effect, and still wraps the result into 'IO[Unit]', referential-transparency abided.
Previously, we could also call 'runSyncStep' to evaluate steps until an async boundary, but that will probably be gone soon, so never mind.
- 'Concurrent' is 'Async' with racing and cancellation of computations:
Usage:
The other usage is to create cancellable computations:
The previous code was just declarative, now we start the computation:
- 'ConcurrentEffect': finally, this one is 'Concurrent' with the possibility to start a cancellable computation (as we did on 'IO').
- 'SyncIO': was committed a week ago, stay tuned!
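Going back to 'Bracket' from the list above, the loan pattern can be sketched in a few lines. This is a simplified version over a hypothetical toy effect 'Lazy', not the cats-effect implementation; only the shape of 'bracket' (acquire, use, release) mirrors the typeclass method.

```scala
object BracketSketch {
  // Hypothetical toy effect: a wrapped thunk that runs only on demand.
  final class Lazy[A](val run: () => A)

  // Acquire a resource, use it, and always release it, even if 'use' fails.
  def bracket[A, B](acquire: Lazy[A])(use: A => Lazy[B])(release: A => Lazy[Unit]): Lazy[B] =
    new Lazy(() => {
      val resource = acquire.run()
      try use(resource).run()
      finally release(resource).run()
    })

  // Records the order in which the steps actually execute.
  val log = scala.collection.mutable.ListBuffer.empty[String]

  // Purely declarative: nothing is opened or closed until 'run' is called.
  val program: Lazy[Int] =
    bracket(new Lazy(() => { log += "open"; "resource" }))(r =>
      new Lazy(() => { log += "use"; r.length })
    )(_ => new Lazy(() => { log += "close"; () }))
}
```

The caller of 'program' only sees a 'Lazy[Int]': the cleanup travels with the value instead of living in a surrounding 'try'/'finally'.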
Capabilities: separation of concerns
Typeclasses represent capabilities.
Here, 'program' expects multiple features/capabilities that 'F' must have in order to be executed. At a glance, you know this function will do IO, async stuff, access the DB, check permissions, and draw something. All are different concerns that can be implemented however the caller wants. It’s like SOLID OOP programming where you refer to interfaces, not to implementations.
The difference is that we’re dealing only with typeclasses: no inheritance, implicitly resolved, applied to functions.
We’ll see how important this is when we are going to stack monads.
To avoid having tons of required “capabilities” in our functions, they should be split apart and deal with the minimal set of features (Single Responsibility Principle). The only “big” function is the main entry, where we need to provide everything, but we should soon call functions with only a few capabilities: a function that uses 'DbAccess' should not rely on 'Drawable' (except to pass it on to nested functions):
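A minimal sketch of this split follows. 'DbAccess' and 'Drawable' are the capability names used in the text, but their methods ('userName', 'draw'), the 'FlatMap' helper, the 'Option' instances and the sample data are all assumptions made up for this illustration.

```scala
object CapabilitiesSketch {
  // Hypothetical capabilities, each with a single made-up operation.
  trait DbAccess[F[_]] { def userName(id: Int): F[String] }
  trait Drawable[F[_]] { def draw(text: String): F[String] }
  // Minimal sequencing capability, standing in for a real Monad.
  trait FlatMap[F[_]] { def flatMap[A, B](fa: F[A])(f: A => F[B]): F[B] }

  object DbAccess {
    implicit val optionDb: DbAccess[Option] = new DbAccess[Option] {
      private val users = Map(1 -> "alice")
      def userName(id: Int): Option[String] = users.get(id)
    }
  }
  object Drawable {
    implicit val optionDraw: Drawable[Option] = new Drawable[Option] {
      def draw(text: String): Option[String] = Some(s"[$text]")
    }
  }
  object FlatMap {
    implicit val optionFlatMap: FlatMap[Option] = new FlatMap[Option] {
      def flatMap[A, B](fa: Option[A])(f: A => Option[B]): Option[B] = fa.flatMap(f)
    }
  }

  // Only needs the database: it cannot draw anything.
  def loadUser[F[_]: DbAccess](id: Int): F[String] =
    implicitly[DbAccess[F]].userName(id)

  // Only needs to draw: it cannot touch the database.
  def render[F[_]: Drawable](name: String): F[String] =
    implicitly[Drawable[F]].draw(name)

  // The entry point is the only place that gathers every capability.
  def program[F[_]: FlatMap: DbAccess: Drawable](id: Int): F[String] =
    implicitly[FlatMap[F]].flatMap(loadUser[F](id))(name => render[F](name))
}
```

Reading the signatures alone is enough to know that 'loadUser' never draws and 'render' never queries.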
It will be clearer for the reader (and the writer) to know what a function is dealing with, what the function can do, what the function has access to. It’s easier to reason about it, because its scope is small and possibilities of actions are not endless.
This is why we have static types: to restrict what we can do and how we can combine things. Using a generic 'A' goes a step further: you can’t do anything with it. Typeclasses exist to let us act on such types, by providing just a few operations.
If you take 'String' for instance, you can do so many things with it that it’s not funny. But if you just provide 'A: Show' to a function, you know it can only stringify the value behind 'A' (call '.show()' on it).
John A De Goes demonstrates this in FP to the Max. Check it out now if you haven’t seen it yet.
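A small sketch makes the restriction visible. 'Show' here is a simplified stand-in for the cats typeclass, and 'describe' is a hypothetical function: inside it, the only thing we can do with an 'A' is turn it into a 'String'.

```scala
object ShowSketch {
  // Simplified stand-in for Show: a single operation.
  trait Show[A] { def show(a: A): String }

  object Show {
    implicit val intShow: Show[Int] = new Show[Int] {
      def show(a: Int): String = s"Int($a)"
    }
  }

  // 'A' is fully generic: no '+', no '.length', no '.toUpperCase'.
  // The Show constraint grants exactly one capability.
  def describe[A: Show](a: A): String =
    implicitly[Show[A]].show(a)
}
```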
Shims: conversions between typeclasses
We talked a lot about cats. It can happen that a project uses cats and scalaz at the same time. In this case, djspiewak/shims provides interoperability (isomorphisms) between their typeclasses.
It’s a bunch of 'implicit defs' between the two ecosystems, to avoid polluting our code.
I never had the need to use it. The last time I had both of them in a codebase, I directly replaced the scalaz parts in favor of cats. I guess in large projects, you don’t have time to change everything, so shims can come in handy.
What if you need to specialize your code?
- Callbacks could be handled by 'Bracket', as 'Jakub Kozlowski' suggested on Twitter
- Parallelisation by 'Parallel'
For the sake of it, let’s say there is no typeclass equivalent for our features:
Here, it’s difficult to generalize 'compute' because it relies on several 'Task'’s features which don’t have their equivalent in typeclasses: callbacks, unordered gathering, fallback to the default 'Scheduler'…
We could create our own specialized typeclasses (while respecting coherence) with the methods we want:
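As a sketch of what such a typeclass could look like, the example below abstracts unordered gathering. The name 'Gather', its method and the 'Function0' instance are all made up for this illustration; a real instance would delegate to the corresponding feature of the chosen effect type.

```scala
object CustomTypeclassSketch {
  // Hypothetical capability: run a batch of effects, result order not guaranteed.
  trait Gather[F[_]] {
    def gatherUnordered[A](fas: List[F[A]]): F[List[A]]
  }

  object Gather {
    // Toy instance over plain thunks, evaluated sequentially when run.
    implicit val function0Gather: Gather[Function0] = new Gather[Function0] {
      def gatherUnordered[A](fas: List[Function0[A]]): Function0[List[A]] =
        () => fas.map(_.apply())
    }
  }

  // 'compute' no longer mentions a concrete effect type.
  def compute[F[_]](jobs: List[F[Int]])(implicit G: Gather[F]): F[List[Int]] =
    G.gatherUnordered(jobs)
}
```

Since the contract says "unordered", callers should not rely on the order of the results, even if a given instance happens to preserve it.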
Typeclasses should have laws (internal & external), tested with ScalaCheck. For instance, 'Functor' has internal laws about identity, and 'Sync' has internal laws about error handling.
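To give an idea of what a law looks like, here is a sketch of the 'Functor' identity law written as a plain Boolean check. The 'Functor' trait is a simplified stand-in, the 'reversing' instance is deliberately unlawful, and the check runs on hand-picked values; with ScalaCheck the inputs would be generated instead.

```scala
object LawSketch {
  // Simplified stand-in for Functor.
  trait Functor[F[_]] {
    def map[A, B](fa: F[A])(f: A => B): F[B]
  }

  // A lawful instance.
  val listFunctor: Functor[List] = new Functor[List] {
    def map[A, B](fa: List[A])(f: A => B): List[B] = fa.map(f)
  }

  // A deliberately unlawful instance: it typechecks but reorders elements.
  val reversing: Functor[List] = new Functor[List] {
    def map[A, B](fa: List[A])(f: A => B): List[B] = fa.reverse.map(f)
  }

  // Identity law: mapping the identity function must change nothing.
  def identityLaw[F[_], A](fa: F[A])(implicit F: Functor[F]): Boolean =
    F.map(fa)(a => a) == fa
}
```

The types alone cannot tell the two instances apart; only the law does.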
Modularization
We saw we always have a “core” that depends upon 'F[_]' to let us use any typeclass and build any stacks on top.
A good practice when writing an application or a library is to write this abstract core in its own module and have distinct modules for implementations.
Example of a library whose core contains only tagless final algebras, type aliases, and similar generic constructions:
A project will depend upon 'lib-core' and 'lib-monix' (which will import monix on its own) for instance, but won’t depend on the other two modules (which would provide the same function as 'lib-monix', just with a different implementation). It could also depend on 'lib-core' and nothing else, and provide its own implementation. This avoids polluting dependencies of the upstream projects.
Same story for HTTP frameworks providing serializers and deserializers for libraries like circe and play-json: you use different modules because you commit to only one of those.
All implementations are different
Performances
We touched on the performance issue when dealing with Monad Transformers, and we successfully removed the stack by providing our own typeclass implementations.
Another factor is the implementations themselves. Their performance can differ widely. Quite some work has been done lately on the performance of synchronous/asynchronous effects, due to some (good) competition between the main actors: scalaz/cats/monix.
A simple benchmark (bits taken from scalaz-zio) can show tremendous differences when executing the same computation (abstracted by 'F[_]'):
Here are the results, with scalaz-zio almost 3x faster:
We are not here to debate about those differences.
Our point is that if you care about performance, you should definitely bench and compare several implementations to fit your need. The needed features can be the same (provided by the typeclass), but how it’s done internally has a large impact on the end result.
The best part: it’s easy to test different implementations without modifying the whole program (the typeclass instances have to behave the same way!), and gain performance.
Stack Safety
Finally, we have to take into account that some Monad implementations may not be stack-safe.
It means that if the computations 'flatMap' the hell out, it’s possible for the program to crash at runtime.
If we implement a recursive method that 'flatMap's before the recursive call, we can test this out:
This method sets the value of each element with its index, recursively.
We can provide any monadic 'F[_]':
Because our last example uses 'Id', aka “nothing”, 'StateT' is not stack-safe. This creates a long encapsulation of '.flatMap(...flatMap(...flatMap(...)))' which explodes.
Not having stack-safety (normally achieved via trampolining internally) is a no-go. Watch out for what you are using. You wouldn’t like to get a production crash because some customer has an overly long array of whatever data.
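To illustrate what trampolining buys us, here is a sketch that uses 'scala.util.control.TailCalls' from the standard library rather than cats or 'StateT'. It is a different example from the array one above, kept small on purpose: 'sumUnsafe' and 'sumSafe' are hypothetical functions with the same recursive shape, one building call frames and the other building a structure on the heap.

```scala
import scala.util.control.TailCalls._

object StackSafetySketch {
  // Naive recursion: every step adds a frame to the call stack, so a large
  // enough 'n' ends in a StackOverflowError.
  def sumUnsafe(n: Int): Long =
    if (n == 0) 0L else n + sumUnsafe(n - 1)

  // Trampolined recursion: 'tailcall' and 'flatMap' only describe the next
  // step; 'result' later interprets the description in a loop.
  def sumSafe(n: Int): TailRec[Long] =
    if (n == 0) done(0L)
    else tailcall(sumSafe(n - 1)).flatMap(acc => done(acc + n))
}
```

Calling 'StackSafetySketch.sumSafe(100000).result' completes normally, whereas the naive version depends on the available stack size for the same input.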
Fortunately, all implementations in cats, scalaz, monix, and projects of similar quality have our back, and everything is stack-safe. (This was not always the case; see make IndexedStateT stack safe for instance.)'
Stay posted for Part 3: Stacking Monad Transformers without stack
This article was written by Stéphane Derosiaux and originally posted on sderosiaux.com