Connecting...

Arts Arts And Crafts Concept 1314543

Counting at compile time by James Phillips

Arts Arts And Crafts Concept 1314543

Want to get more in-depth knowledge of counting at compile time?

Using Scala, Senior Scala Developer James Phillips showed us the approach that can be taken and how you can achieve genuine calculations performed purely at compile time!

 

'Counting at compile time is one of the world’s simple pleasures. We have taken the scala compiler and hacked something in to it to make it do things it really doesn’t look like it should be doing.

The standard approach of ‘counting at compile time’ utilises a type commonly called 'Nat'. It’s available in libraries such as shapeless. We have types available, that look like this: 'type _13 = <something>' — this represents the number 13. '<something>' is yet to be revealed.

Through type-class trickery and a clever definition of 'something' we can make the compiler perform magic for us, and turn two types such as '_13' and '_3' into a third type '_16' by ‘adding’ them together, all at compile time. You can, of course, also make it do any other arithmetic operation you can dream of, if you have the time to implement it.

A use-case might look a bit like this:

Here, 'Sum' is our yet-to-be-defined typeclass which adds two 'Nat' together. 's.Out' is an inner type of this typeclass, representing the sum of 'A' and 'B'. The magic thing is if 'A=3' and 'B=7' then the return type is '10' — not just a normal path-dependent type 's.Out'.

Why do we want to do this? Because we can! So let’s.

 

Nat Definition

Here’s what a bare-bones 'Nat' implementation looks like:

You can go as high as you like, but eventually the compiler starts complaining as the type nesting gets too deep.

You can see that each number is just defined as the successor of the previous number. The full expanded type of '_3' would be 'Succ[Succ[Succ[_0]]]' for example. In any calculation we do, we could peel off a layer of '_3' and then perform our calculation recursively on '_2'. This is how the majority of 'Nat' typeclasses (and type-level programming typeclasses in general) work.

 

ToInt

You may have noticed that these are purely types. Nothing else. No runtime version exists of these things, in this version of 'Nat' I’ve just cobbled together. (Unless you did 'new _3 {}', but there’s really no reason to do that).

But, it’s a perfectly reasonable request to want to take a type 'N <: Nat' and have the runtime integer it represents, 'n'.

This could be done with macro magic, as in shapeless, or by defining companions like the following explicitly:

But I will forego these and derive the integer representations implicitly, as an example of a very simple compile-time calculation we can do on these 'Nat's.

We need a type-level function which takes in one type, a 'Nat', and spits out a run-time value 'Int'. A compile-time (or type-level) function is a trait, and its arguments are its type parameters. The return of the function (either a type or a value) is contained within the trait. So it looks like this:

You can see how this would now work. If, in our code, we found ourselves in possession of a 'val i: ToInt[_3]' instance, we could do 'i.value' to get the run-time representation of '_3'. Which we hope would be '3'.

How do we find ourselves in possession of these objects? We generate them implicitly, for every 'Nat'. And then they’re available automatically for us whenever we need them.

Of course, implementing one implicit for each 'Nat' would be horrendous. But remember above, where I pointed out '_3' is one layer of 'Succ' wrapped around '_2'. If we knew the result of '_2' we might in theory be able to calculate our instance for '_3'. This is induction/recursion, so here’s what 'ToInt' looks like:

Now you can do the following:

and we have turned our completely arbitrary 'Nat' construction into real runtime integers!

 

Comparisons

That’s all well and good, we ‘proved’ our 'Nat' can make sense as real integers. But we didn’t do any real compile time calculation on them.

The next step would be to have a typeclass which proves one 'Nat' is less than or equal to another. Say this typeclass was called 'Lte' (for ‘less than or equal’). Then an instance of type 'Lte[A, B]' would be available implicitly if and only if 'A <= B' (well, if their runtime representations held that relationship).

So it’s a typelevel function which has two arguments this time, and no return value or type: we just make it not available implicitly for invalid arguments. This would be roughly equivalent to an exception in a runtime algorithm.

So it’s very simple and looks like this:

Now we have to generate the implicit instances, again recursively. They are again reasonably simple:

So say we were trying to compute 'Lte[_3, _5]'. Then we would call 'succCase' implicitly with the arguments '_2' and '_4', since we peeled away one layer of 'Succ' on both arguments. We would continue peeling away until we tried to compare '_0' to '_2', where we could call 'zeroCase' and finish our recursion. We’d then zip back up the stack and return a value of type 'Lte[_3, _5]'.

If, on the other hand, we were searching for a value of type 'Lte[_5, _3]', we would as a first step look for 'Lte[_4, _2]' and so on. Eventually we would search for 'Lte[_2, _0]'. However, 'zeroCase' isn’t applicable since the '_0' is on the right instead of the left. And 'succCase' isn’t applicable since '_0' is the 'Succ' of no integer.

So we cannot compile a value of type 'Lte[_5, _3] '— as we would expect since 5 is not less-than-or-equal to 3.

You can try it in a repl:

which is great! A genuine type-level computation. Our 'Nat' are now ordered.

('Lt', 'Gt', 'Gte' can all be defined similarly or derived from one another)

 

Addition

The big one.

This is the first time our typelevel computation returns a type as its result, rather than a value. We could, of course, add two 'Nat' and return their runtime sum instead of their compile time sum if we wanted, but that would be boring.

So, we’re going to write a typelevel function 'Sum' which takes two summands, and returns a type. It looks like this:

Notice that the return type 'Out' is an inner type, not in the type parameter list. This is because if you wanted to summon an implicit 'Sum[A, B, Out]' then you would always have to know the sum of 'A' and 'B' before you could summon it, which would ruin the point. Instead we want to summon a 'Sum[A, B]' and have the compiler work out the innards for us.

We’re going to implement this naïvely at first, as we would want to write it, and then fix it when we realise the compiler won’t let us do that.

So, how do you add two numbers 'a' and 'b' together if all you can do is peel one layer of 'Succ' off, and recurse?

Well, suppose we could write 'a = z + 1', for some 'z'. Which we definitely can do, for every 'a' except '0'. Then we can see that 'a + b = z + (b + 1)'.

That seems like a recursive step to me. We just need to plug in the zero case on top of it.

So, 'Sum' could look like this:

Note here the return types. What we want to say is that the type of the implicit we’re generating is 'Sum[Succ[A], B] { type Out = sum.Out }', where the inner type has been fully realised as a type we calculated (in this case 'sum.Out'). Unfortunately this is not valid scala (though it should be) — but we’ll ignore this for now and examine the algorithm.

If you imagine searching for a 'Sum[_3, _2]', what would happen? We would examine 'succCase', and then recursively search for a 'Sum[_2, _3]'. Then a 'Sum[_1, _4]', and then finally a 'Sum[_0, _5]', which is provided by 'zeroCase'. We see the result of 'Sum[_0, _5]' is just '_5' ( 'type Out = B' )- so we take this value, zip back up the stack and plug in the type '_5' and we get an instance of type

Which works wonderfully… nearly.

Unfortunately due to limitations in scala the above type is not expressible in the position of a return type (or argument type come to that) of a function. We can’t specify inner types at that position. But we have to — we need to tell the compiler what the inner type is as the return type so it’s available at compile time rather than runtime.

Thankfully, someone once upon a time came up with a horrible hack known as the ‘aux pattern’ (aux meaning auxiliary). This is where we recognise such an expression is invalid, and we simply substitute a dummy type which is valid.

So we define this:

Think of 'Aux' simply as an alias for what we actually want to express. Now we can revisit 'Sum' and write it like this:

We’ve simply substituted 'Aux' in. And, amazingly, this now works. You can test it yourself:

and we’re done! Genuine calculations performed purely at compile time. You can of course go further and define multiplication, or lowest common denominator, or exponentiation, or whatever you want.

 

A quick note about the aux pattern. Let’s take 'zeroCase' as an example. If we had written

instead of using the aux pattern, then the 'Out' type 'B' would not be known to the compiler, it would be a runtime thing only. So in this case, in our test 'sum[_0, _6]' above, the return type would be a path-dependent 's.Out' and would not be fully realised as '_6'. Putting the return type in the type of the implicit definition elevates the runtime calculation to compile time, and the 'Aux' pattern allows us to do genuine typelevel calculations.'

 

This article was written by James Phillips and posted originally on Medium.