Connecting...

W1siziisimnvbxbpbgvkx3rozw1lx2fzc2v0cy9zawduawz5lxrly2hub2xvz3kvanbnl2jhbm5lci1kzwzhdwx0lmpwzyjdxq

Alphabet Soup: Type-level transformations by James Phillips

W1siziisijiwmtkvmdqvmjuvmtuvmdmvmjgvntazl3blegvscy1wag90by0ynzg4odggkdeplmpwzwcixsxbinailcj0ahvtyiisijkwmhg5mdbcdtawm2uixv0

Have you used the Alphabet-soup Scala library yet?

It provides faster compiler-time and reduces boilerplate. How do you make use of this library? Senior Scala Developer, James Phillips tells us all!

 

 

The library alphabet-soup can be found here.

Alphabet-soup is a scala library to provide faster compiler-time type-mixing, to reduce boilerplate and the cognitive overhead of reading and writing canonical code.

If there’s only one way to write a piece of code, why do we have to write it? The compiler should do it for us.

 

Caveat

One major difference about alphabet-soup’s approach to the usual type-transformation libraries is that it is entirely type based and not name based. If you map a case class to a case class, it does not care what the field names are, only the types. (As a nice side-effect, it is faster to compile than these other libraries as a result).

For this reason, it works with tuples where other libraries may not. Also for this reason, everything you care about must have a unique type. You can’t have duplication (just like you wouldn’t have duplication in field names). This seems like a lot of overhead, but in reality it’s actually very rare that you combine values from different concept-spaces. When was the last time you added a user’s first name to their username?

Be warned there’s a whole concept of 'Atom' I have omitted below which is important to get the library up and running in your own project. It is detailed in the README of the library linked above.

 

The concept

There were a couple of use-cases which drove the development of alphabet-soup, one to do with working on a very abstract level and one to do with reducing needless boiler-plate scattered around projects.

#1 Reducing needless boilerplate

In several of my projects, there is a distinction between what I call ‘resource’ types that are exposed in an API, and ‘model’ types, which represent database structures or business-related code structures.

This is done for many reasons, chief among which is that if you want to change the database you shouldn’t automatically update your API to reflect this. Doing that should be hard, and a conscious choice. So there exists a mapping between these types, which the user (i.e. me) has to implement.

A really dull example is:

1 case class Payment(id: PaymentId, userId: UserId, amount: Price, suspicious: IsSuspicious) // IsSuspicious is some value class wrapping boolean
2
3 // Revealed in my API
4 case class PaymentResource(id: PaymentId, userId: UserId, amount: Price)
5 object PaymentResource {
6   def from(p: Payment): PaymentResource = PaymentResource(p.id, p.userId, p.amount)
7 }

The 'from' method’s definition is boring. At no point is there a choice to make, so how could I write it incorrectly? Yet I seemingly still need to write it.

Things get even worse when your resource is made up of multiple models, since different datatypes are stored in different tables. Imagine:

You can see how these 'from' methods may grow and grow, and yet still contain nothing interesting. I’m bored just thinking about them.

I want to get rid of them.

#2 Automatic type projections

Imagine for a moment you have an 'aa: ApplicativeAsk[F, (PaymentId, Config, UserId)]', where 'F' can be whatever effect you like that supports the reader, say 'ReaderT[IO, ?]'.

If I wanted to extract an 'F[UserId]' from this, I would have to write 'aa.ask.map(_._3)' (or a more verbose variant).

Not too bad, but if I wanted to extract '(PaymentId, UserId)' the example would become:

Much worse. Why should we need to do that? There’s only one way to extract '(PaymentId, UserId)' from '(PaymentId, Config, UserId)': we shouldn’t have to write that code.

It gets even worse if you imagine the case where we have an 'implicit val aa: ApplicativeAsk[F, (PaymentId, Config, UserId)]' and we need an 'implicit val aa2: ApplicativeAsk[F, (PaymentId, UserId)]'. What do we need to do now? Well:

1 implicit val aa2: ApplicativeAsk[F, (PaymentId, UserId)] = new ApplicativeAsk[F, (PaymentId, UserId)] {
2   val applicative = aa.applicative
3   val ask: F[(PaymentId, UserId)] = aa.ask.map { case (p, _, u) => p -> u }
4   def reader[X](f: (PaymentId, UserId) => X): M[X] = ??? // Something very boring
5 }

That just builds a new 'ApplicativeAsk' for us. A monkey with a compiler could do it, since at no point is there a choice to make.

Why is this stuff not automatic? I want to remove the need for the monkey and just get the compiler to do it.

 

Implementation

Below is a rough-and-ready naïve implementation of our algorithm — the first version I wrote. Several bugs were uncovered caused by limitations in the compiler, and more structure added later as requirements came in from testing. But the underlying algorithm did not change from the below.

Mixer

So, we want something that takes a value of some type, 'A', and produces a value of some type, 'B'. If it’s impossible to produce a 'B' canonically from 'A', we want compilation to fail.

This is a pretty simple interface we want to implement. To keep with the alphabet-soup theme, I’ll call it 'Mixer':

 

Generic

 

Immediately when thinking about how to implement the above, we run into a problem. 'A' is completely generic, and we can do nothing with it. Ideally it would have some structure we could recurse over.

Enter 'Generic', from shapeless. 'Generic' will turn any type 'A' into an 'HList', if it can. If it can’t, our entire 'Mixer' concept is impossible for the type 'A'. 'Generic' works on tuples, hlists and case classes so it seems a pretty good place to start.

An example:

This is very promising! Each of the three types on the left all contain exactly the same information, and 'Generic' turns them into the same generic structure, allowing us to potentially rebuild them as we see fit later on.

One major problem though, 'Generic' is only skin-deep:

Note the inner tuple on the right hand side — only the outer-most layer is transformed.

This won’t do. We need something that is completely de-structured on the right. We need to write our own thing which recursively applies 'Generic' all the way down.

 

Atomiser

I called this new concept 'Atomiser'; it takes types apart into its constituent components.

The structure is very, very similar to 'Generic':

We’re following the 'Aux' pattern here, like most of shapeless does. If you’re not familiar with it, there will be a blog post in the future about why it exists and what problems it solves.

Note that 'Atomiser', like 'Generic', provides a way to go to the generic structure as well as from it, back to our real structure.

So, how do we implement 'Atomiser'? Our structure is now potentially very complicated, as we basically have a tree of types to traverse.

Well the first thing we have to do is use shapeless’ 'Generic' to transform our outer-most layer for us. Once we’ve done that, we search the tree recursively. I leave the implementations out because they’re boring, and canonical.

The first:

The recursive step:

1 // Run Atomiser on the head of our tree `T`, then run Atomiser on the tail of our tree `T`. Combine the results.
2 implicit def headFirstSearchCase[H, T <: HList, HOut, TOut <: HList](
3   implicit atomiserH: Atomiser.Aux[H, HOut],
4   atomiserT: Atomiser.Aux[T, TOut],
5 ): Atomiser.Aux[H :: T, HOut :: TOut] = ???

And the termination condition:

The eagle-eyed amongst you may note that the compiler won’t know which to apply first, 'genericCase' or 'headFirstSearchCase'.

Because of this conflict in the domains of our implicits, 'genericCase' must be made low priority. In pseudo-code, it will end up looking like this:

And we’re done. Those three provide a rough implementation of 'Atomiser'. There’s actually a lot more going on under the hood, and several bugs will arise with the above in testing, but they’re just implementation details. There will be a future blog post on the super technical details and problems that you run into when tightening up the above.

The important thing is that algebraically, the above algorithm is correct.

AtomSelector

So now we have a completely generic tree structure of any applicable type. We can imagine that our first step in implementing 'Mixer[A, B]' will be to apply 'Atomiser' to both type arguments.

What do we then do?

We need to produce a 'B' from 'A'. So it makes sense to iterate through 'B'’s tree structure, and for each element try to find the same element present in 'A'’s tree structure. If there is such an element, take it and use it to build a run-time value of type 'B'. If there is no such element, our compilation should fail.

So, we need a type class that will select a type from a generic structure, or fail compilation. Enter 'AtomSelector':

The actual implementations of the implicits have been left out below, since they’re canonical and can’t be written incorrectly.

Let’s start the implementation with an easy case where we just select the head of our list. Remember, we’re assuming 'L' has been atomised:

That says, if we’re trying to select 'H' and our type has 'H' as the head element, just take it.

Next, we have to do some recursive step. We’re working on a tree so we actually need two recursive steps, one recursing on the head and one on the tail:

1 implicit def recurse[H, T <: HList, U](implicit st: AtomSelector[T, U]): AtomSelector[H :: T, U] = ???
2 implicit def recurseNested[H <: HList, T <: HList, U: Atom](implicit st: AtomSelector[H, U]): AtomSelector[H :: T, U] = ???

The first handles the tail case, the second handles the head. Both just assume that our desired element 'U' is within the structure they recursively call.

And… we’re done. We don’t need a termination case, since you can’t select any type from 'HNil'. If we don’t terminate, i.e. we recurse through the entire structure and cannot find a 'U', then we just fail compilation as desired.

Again, the reality is more complex and the reasons why and what extra layers need to be put where will be detailed in a future post.

Algebraically and naïvely though, the above is correct.

 

Stitching it together

Now we’re nearly ready to stitch it all together and implement our 'Mixer' type class!

We’re actually going to implement something else first, which does the actual algorithm. It’s called 'MixerImplFromAtomised':

It’s exactly the same as 'Mixer', but assumes 'A' and 'B' have been atomised. This is purely for efficiency purposes, so we don’t have to re-atomise 'A' and 'B' in every implicit.

Let’s start with an easy case, the termination:

This says, “we can mix any type into 'HNil'”. Which is true: we can always produce no structure from any structure, just by forgetting about the structure.

Next up is the case where the head of our list is something we can select (ie, not an 'HList'):

 

 

 

This says, if we can select 'BH' from 'A', and mix 'A' into 'BT', then we can mix 'A' into 'BH :: BT'.

And the last case, low priority again, is where we must recurse on the head:

1 implicit def headIsHListRecurse[A, BT <: HList, BHH, BHT <: HList](
2   implicit m1: MixerImplFromAtomised[A, BHH :: BHT],
3   m2: MixerImplFromAtomised[A, BT]
4 ): MixerImplFromAtomised[A, (BHH :: BHT) :: BT] = new MixerImplFromAtomised[A, (BHH :: BHT) :: BT] {
5   def mix(a: A): (BHH :: BHT) :: BT = m1.value.mix(a) :: m2.mix(a)
6 }

This says, if we can mix 'A' into 'BHH::BHT', and we can mix 'A' into 'BT', then we can mix 'A' into '(BHH::BHT) :: BT'. We need to produce the entirety of the RHS, so the mixer algorithm must be invoked twice, once for the head structure and once for the tail structure.

And we have now covered all cases. Now…

 

The final step

We’re finally going to implement our top-level algorithm:

It’s thankfully very simple, since 'MixerImplFromAtomised' did all the heavy lifting for us. We only need to atomise our types, and then invoke our algorithm:

And there’s one very specific case where we can shortcut the entire charade, the equality case:
Again, there are details omitted for clarity. The algorithm is correct, but a myriad of technical details arise when dealing with collections, or defaults, and specific problems with the scala compiler to get around, which will be detailed in the future.
 

Using it

So we’re done!

Now let’s use it in our two boring examples from above.

Example #1, Resources

1 case class Payment(id: PaymentId, userId: UserId, amount: Price, suspicious: IsSuspicious)
2 case class User(id: UserId, name: Username)
3
4 case class PaymentResource(id: PaymentId, username: Username, amount: Price)
5 object PaymentResource {
6   def from(p: Payment, u: User): PaymentResource = Mixer[(Payment, User), PaymentResource].mix(p -> u)
7 }

See the use of 'Mixer' in 'def from'. One really nice thing about this approach is that we can forget about the field names, so if we want to produce some type we just need some structure that can satisfy all the constituent types. So we can just bung 'Payment' and 'User' into a tuple, and let the compiler figure it all out under the hood for us.

We could alternatively use 'Mixer[(Payment, User), PaymentResource]' everywhere we needed it, rather than scoping within 'def from'. That would come with slower compile times, since we’d be recalculating the structures every time, but it’s definitely doable to reduce the boilerplate even further.

Example #2, ApplicativeAsk

Just a note that, of course, the approach below will obviously work for other structures such as 'MonadState', etc.

The problem was we had an 'aa: ApplicativeAsk[F, (PaymentId, Config, UserId)]' and we wanted an 'aa2: ApplicativeAsk[F, (PaymentId, UserId)]'.

Well:

1 implicit def aaFromMixer[F[_], A, B](implicit aa: ApplicativeAsk[F, A], m: Mixer[A, B]): ApplicativeAsk[F, B] = new ApplicativeAsk[F, B] {
2   val applicative = aa.applicative
3   // Use of mixer here:
4   val ask: F[B] = aa.ask.map(m.mix)
5   def reader[X](f: B => X): M[X] = aa.reader[X](a => f(m.mix(a)))
6 }

Now,

1  // We have this already
2  implicit val aa: ApplicativeAsk[F, (PaymentId, Config, UserId)] = ???
3
4  implicitly[ApplicativeAsk[F, PaymentId]] // compiles
5  implicitly[ApplicativeAsk[F, Config]] // compiles
6  implicitly[ApplicativeAsk[F, UserId]] // compiles
7  implicitly[ApplicativeAsk[F, (Config, UserId)]] // compiles
8  implicitly[ApplicativeAsk[F, (PaymentId, UserId)]] // compiles
9  implicitly[ApplicativeAsk[F, (PaymentId, Config)]] // compiles
10 implicitly[ApplicativeAsk[F, (Config, PaymentId)]] // compiles
11 // etc…

Which is pretty cool!'

 

This article was written by James Phillips and originally posted on Medium.