We love the different programming series Mateusz Kubuszok has written. This one is part 1 of 'Implicits, type classes, and extension methods', enjoy reading!
'Implicits. For some people, they are an indispensable feature of Scala, that allows achieving otherwise impossible things. For others a sole reason to avoid using the language. As far as I can tell the majority of the later never really learned how to use them right.
In this mini-series I want to explain different concepts related to implicits: the motivation behind, different mechanisms based on them and a bit on how to work with them. Originally it was supposed to be a single post, but it was way too long for anyone to read.
Implicit val, def, conversion, class…
First of all, implicit is not a thing. There are several distinct ideas, all of which use the same keyword 'implicit', using the same internal mechanism, but in the end, doing different things for different reasons. We could group them like this:
- 'implicit' parameter,
- 'implicit val' and 'implicit def' taking only 'implicit' parameters,
- 'implicit def' taking one parameter - 'implicit' conversion,
- 'implicit class'.
Type-based Dependency Injection
Let’s start with a simple case.
def runServer(config: Config, ec: ExecutionContext): Unit
def runMigrations(config: Config, ec: ExecutionContext): Unit
def runREPL(config: Config, ec: ExecutionContext): Unit
// config.mode is an enum telling which run mode user wanted
def startApp(config: Config, ec: ExecutionContext): Unit =
config.mode match {
case Server => runServer(config, ec)
case Migrations => runMigrations(config, ec)
case REPL => runREPL(config, ec)
}
val config: Config
val ec: ExecutionContext
startApp(config, ec)
As we can see, here we simply pass dependencies around. Additionally, these dependencies have unique types. In this scope we could find only one instance for each type we would ask compiler about. So, we can imagine, we could ask the compiler to do something like:
def startApp(use here the only thing of type Config,
use here the only thing of type: ExecutionContext): Unit =
(that only Config thing).mode match {
case Server => runServer(pass Config for me, pass ExecutionContext for me)
case Migrations => runMigrations(pass Config for me, pass ExecutionContext for me)
case REPL => runREPL(pass Config for me, pass ExecutionContext for me)
}
startApp(pass Config for me, pass ExecutionContext for me)
Of course, this would break pretty easily. Put another 'Config' and 'ExecutionContext' here and the compiler would not be able to guess, which object we want. So, let’s mark them:
mark as special val config: Config
mark as special val ec: ExecutionContext
This too would fail if we marked 2 'Config's or 'ExecutionContext's as special, but at least this would be a thing we could do something about.
At this point, we can notice, that since it’s the compiler that passes objects for us, we don’t have to write them ourselves:
startApp
The compiler knows, that it needs here one 'Config' and one 'ExecutionContext', and in order to find them it searches through the list of objects marked as special. Then all we would see is that these parameters were passed implicitly.
def runServer(implicit config: Config, ec: ExecutionContext): Unit
def runMigrations(implicit config: Config, ec: ExecutionContext): Unit
def runREPL(implicit config: Config, ec: ExecutionContext): Unit
def startApp(implicit config: Config, ec: ExecutionContext): Unit =
config.mode match {
case Server => runServer
case Migrations => runMigrations
case REPL => runREPL
}
implicit val config: Config
implicit val ec: ExecutionContext
startApp
There are some rules regarding implicits:
- implicits cannot be ambiguous. You cannot define 2 implicits with the same type in the same scope,
- however, class and its subclass are slightly different scopes. The rule of thumb is that in such cases, the closer scope would win. However, as always it is a bit more complex, so we will explain that later on,
- companion objects are also checked for implicits. If you have type 'X', then 'X' companion will be checked for 'X' implicits, but also for 'F[X]' or 'X[A]' (this will be explained later, as we reach type classes),
- a method can have one parameter list, which will be marked as 'implicit'. As of now, it will always be the last parameter list. Each of params will become implicit value (so you cannot get params from the implicit scope, but not make them implicits themselves).
For convenience 'scala.Predef' defines utility, that allows us to get implicit by its type:
@inline def implicitly[T](implicit e: T) = e
implicit val s: String = "test"
implicity[String] // "test"
Most of the time, these rules work quite good. There is at least one case though, where they can break your code unexpectedly:
implicitly[X] // gets correct X from some other scope
class Y {
implicit val x: X = implicitly[X]
}
(new Y).x // null!
If you run this particular code, you could be surprised, why 'y.x' would return 'null'. Problem is, that usually when we define variables, they become available only from the moment they are defined. However, class parameters are always accessible from any place in the class - even before they are initialized. Normally, scalac would produce an error if you explicitly tried to access uninitialized val. However, here it is not so obvious - value comes from 'implicitly[X]'. Unfortunately, implicitly takes it from the closest scope it can find - which is 'Y' class - at this point, not initialized completely.
I think I stumbled on this issue only twice. But the first time it was a nightmare, to figure out what was happening. It only shows how complex things can get once several features are used at once.
Instance produced on demand
Notice, that implicit instances so far were defined as values. We had to define a complete type and implementation. But, let’s say we want to always have an instance of a 'List' that is an empty 'List'. The List has a type parameter. What now?
implicit def emptyList[T]: List[T] = List.empty[T]
implicitly[List[String]] // Nil
implicitly[List[Int]] // Nil
That wasn’t very interesting. But let’s say we want a 'Stream', and we want it to be filled with a value of implicit scope. Can we do it?
implicit s: String = "test"
implicit i: Int = 0
implicit def streamOfT[T](implicit value: T): Stream[T] =
Stream.continually(value)
implicitly[Stream[String]] // Stream("test", "test", ...)
implicitly[Stream[Int]] // Stream(0, 0, ...)
We see, that we can use 'implicit def' to define value in scope in a more dynamic way. It can even be fed with other values from implicit scope! But now one would write a code like that in examples: passing around primitives is not a not going to work in a long run. So, what is the actual benefit of this mechanism and what is the reason for implicits’ existence in the first place?
Flavors of polymorphism
- Subtyping aka subtype polymorphism. Within one type, you have a subset of values making another type. OO languages usually implement it with interface inheritance. Because the subtype must fulfill all requirements and assertions of the supertype, the supertype can be treated as a generalization and common denominator of all its subtypes.
(That is the ideal situation - OO programmers need to be actively reminded about Liskov substitution principle. A lot of bugs were born when someone overrode a method and created a class that shared an interface with the superclass but couldn’t be used as a drop-in replacement).
- Parametric polymorphism, 'List' of 'String's should behave the same as 'List' of 'Int's. So why not let them have the same implementation? Of course, they have to have different types, so that we don’t put integers into a list of strings and vice versa. Type parameters let us define a whole implementation where instead of a type we put a placeholder, and when we supplement it with a concrete type, the implementation is materialized.
There is also another way to look at it - mathematical one. Functions, in general, are sets of arguments-values pairs. Types are also sets of values. And set might be both an argument and a value of a function. So it is possible to create a function taking a type as an argument and returning another type as a value. An example of such function (called type constructor) would be 'List[_]', which takes a parameter e.g. 'String' and then returns a concrete type e.g. 'List[String]'.
- Ad-hoc polymorphism aka operator overloading. You have '*' (or '+') operator and you want to be able to use if for multiplying: ints, longs, doubles, floats, complex numbers, vectors, matrices… For each type it would have a slightly different implementation, so you cannot just create one function that rules them all, and simply adjust the type (as is the case with parametric polymorphism), nor extract common behavior (what matrices multiplication has in common with multiplying scalars?).
As a matter of the fact, each overloaded function is actually a set of different functions, sharing the same name, where one that will be used is based on a type of the argument(s) (and their number aka arity).
The distinction between all 3 kinds of polymorphism is very important. At some point, developers of Haskell needed to introduce some kind of polymorphism to handle things like mentioned multiplication or addition, but not by having everything being build into the language. They wanted their polymorphic solution to be extensible. Parametric polymorphism is not a solution to this problem and they didn’t want to introduce subtyping with its shares of problems with inheritance.
Type classes in Haskell
- let us define a named contract. This contract would require, that for a type certain functions have to be defined and meet certain assertions,
- when type declared itself as following the said contract, then somewhere in the scope of program required functions have to be defined and explicitly pointed to (and have to follow assertions),
- operators like '*', '+' etc have assigned contracts, and are used as syntactic sugar for functions mentioned in their contracts,
- one contract can inherit obligations from another contract and add to them.
- we would require ⊕⊕ to be binary operator over A:
- which is also associative:
- and has a neutral element 0:
Examples of such monoids would be: numbers with addition and 0, numbers with multiplication and 1, lists of the same type with list concatenation and an empty list, matrices of the same size with matrix addition with a matrix of zeros, square matrices of the same size with matrix multiplication and an identity matrix.
Haskell would define such contract this way:
class Monoid m where
mempty :: m -- neutral element
mappend :: m -> m -> m -- operator A x A -> A
mconcat :: [m] -> m -- fold list (e.g. calc sum/product)
-- defining mconcat is optional, since it has the following default:
mconcat = foldr mappend mempty
-- this infix synonym for mappend is found in Data.Monoid
x <> y = mappend x y
infixr 6 <>
together with the following laws:
-- Identity laws
x <> mempty = x
mempty <> x = x
-- Associativity
(x <> y) <> z = x <> (y <> z)
An implementation for numbers would look like:
newtype Sum n = Sum n
instance Num n => Monoid (Sum n) where
mempty = Sum 0
mappend (Sum x) (Sum y) = Sum (x + y)
newtype Product n = Product n
instance Num n => Monoid (Product n) where
mempty = Product 1
mappend (Product x) (Product y) = Product (x * y)
(Examples are taken from Haskell wiki).
While this syntax might be alien to quite a lot of people, it should show us a few things:
- Haskell uses names that no other language would consider newcomer friendly ('mempty' - monoid-empty, 'mappend' - monoid-append, 'mconcat' - monoid-concatenate),
- Haskell kind of wraps types in 'Sum' or 'Product', because both addition and multiplication are monoids and it has to distinct them somehow which one we want to use,
- Haskell borrowed 'class' and 'instance' keywords from object-oriented languages,
- if we think about it… it is not about declaring a class at all! That is unless we assume that Haskell only lets us use 'class' to define parametrized trait with an interface and assertions, while 'instance' only allow us to create one global singleton instance of the interface per type used as a parameter.
Type classes in Scala
- JVM basically won’t let you make sure that several functions are defined at once in other way than by putting them into the same interface,
- Haskell would supply the right implementation inside parametrized type and each parametrized function is turned into a specialized version, which might have all occurrences of an overload function replaced with a pointer to a specific implementation,
- also Haskell can look through the whole scope for an implementation for a specific type. With JVM ecosystem where you compile stuff directly to the bytecode and later on you would have trouble re-compiling stuff because new type appeared. '@Specialize' won’t help either, since it only generates special cases for primitives.
To address the issues, he chooses following solutions:
- type classes with be implemented as actual classes. Or trait. This way you will have to guarantee,
- wherever you use a type class, you will call a method on an actual instance which will store all necessary information about types inside,
- the fact that you call this code from type-erased generic implementation that could have been compiler somewhere else won’t be an issue, because you will use dependency injection to get it.
trait Monoid[A] {
def empty: A
def append(a1: A, a2: A): A
def concat(list: List[A]): A = list.foldRight(empty)(append)
}
val intMonoid = new Monoid[Int] {
def empty: Int = 0
def append(a1: Int, a2: Int): Int = a1 + a2
}
def combineValues[T](
map: Map[String, List[T]]
)(monoid: Monoid[T]): Map[String, T] =
map.mapValues(monoid.concat)
combineValues(Map(
"test1" -> List(1, 2, 3, 4, 5),
"test2" -> List(6, 7, 8, 9, 0)
))(intMonoid)
Thing is, usually, you want to have only one type class for each type in the scope. Additionally, such usage would introduce clumsiness the original solution hasn’t. Therefore the idea of implicits was born. Originally, implicits were introduced as a way of automatically passing type classes where needed basing solely on their type.
object Monoid {
implicit val intMonoid: Monoid[Int] = new Monoid[Int] {
def empty: Int = 0
def append(a1: Int, a2: Int): Int = a1 + a2
}
}
def combineValues[T](
map: Map[String, List[T]]
)(implicit monoid: Monoid[T]): Map[String, T] =
map.mapValues(monoid.concat)
combineValues(Map(
"test1" -> List(1, 2, 3, 4, 5),
"test2" -> List(6, 7, 8, 9, 0)
))
Here we can notice one interesting thing about type classes and implicits: when you need an implicit of type 'F[A]' compiler will look for it inside companion objects of F and A if no implicit could be found in the current scope. This way, you don’t have to define/import them everywhere if you just want to rely on default.
But passing type classes could be made simpler even further. Since the most common use case is a type class with one parameter, you can declare them as context bounds.
def needMonoid[T](ts: List[T])(implicit monoid: Monoid[T]): T =
monoid.concat(ts)
// could be written with context bound
def needMonoid[T: Monoid](ts: List[T]): T =
implicitly[Monoid[T]].concat(ts)
def severalBoundArePossible[T : Bound1 : Bound2]: Unit = {
implicitly[Bound1[T]]
implicitly[Bound2[T]]
}
With context bound notation you declare that you need an implicit. It will become available in implicit scope, but it has no name you can use. This is why:
- it won’t cause 'unused parameter' issue if you just want to force some contract, but not made use of if (e.g. implicit evidence),
- you can pass it inside a method to another method that requires it,
- if you want to use it, you need to extract it from implicit scope manually e.g. with 'implicitly[T]'.
object Monoid {
def apply[T: Monoid]: Monoid[T] = implicitly[Monoid[T]]
def append[A: Monoid]:(a1: A, a2: A): A = apply[A].append(a1, a2)
def concat[A: Monoid](list: List[A]) = apply[A].concat(list)
}
Once you define such helpers inside a companion, you can use them inside a code making the context very clear:
def combineValues[T: Monoid](map: Map[String, List[T]]): Map[String, T] =
map.mapValues(Monoid.concat[T])
If you’ll ever start working with really FP-heavy libraries like Cats or Scalaz, you’ll notice that concepts everywhere.
Summary
In this post, we learned a bit about the first usage of implicits - type-based dependency injection as well as the biggest reason behind introducing them - the type classes.'
This article was written by Mateusz Kubuszok and posted originally on kubuszok.com