
Using Applicative Style to Increase Compositionality — Part I by Stephen Lazaro


It's always great to learn the different ways developers use functional programming; it helps you develop your own skills and knowledge. In this article, Stephen Lazaro examines the applicative style of functional programming that his team uses at Earnest to increase compositionality, allowing them to simplify responsibilities and provide greater extensibility.

 

'As developers, especially back-end developers, we often find ourselves writing code that will need to be maintained and revisited for months or even years. We’d like to have clear, easy-to-rework code. It’s not fun to revisit code and find we have no idea how to change it!

Each component should have a clear and single responsibility. It should also be easy to extend and integrate with new functionality. In this blog post and its sequel, we examine functional programming’s applicative style, which we employ here at Earnest to increase compositionality, allowing us to simplify responsibilities and provide greater extensibility.

In this post we’ll work in JavaScript, and in the next we’ll work in Scala (both languages in use at Earnest), to emphasize that applicative style is portable between languages.

 

Study One: Report Aggregation

Let’s suppose that we want to produce one or many summary statistics from a report. The report is a series of record objects with some fixed schema (it’s not particularly important what it is for this purpose). Let’s assume that the report is an iterator or streaming interface for now. It might not all fit in memory at once, and it may not be possible to re-run. We need to consume the report in one go.
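For concreteness, here is a minimal sketch of what a couple of the report's records might look like; an in-memory array stands in for the streaming interface, and the field names ('id', 'balance', 'interest', 'inRepayment') are illustrative assumptions, not a fixed schema:

```javascript
// Hypothetical record shape, used only to make the examples below concrete
const report = [
  { id: 1, balance: 100, interest: 5, inRepayment: true },
  { id: 2, balance: 250, interest: 12, inRepayment: false }
];
```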

The first approach that occurs to most people is to mutate some variables defined outside of the iteration loop, taking a generally imperative approach:

// Obtain aggregates in parallel
const getAggregates = report => {
  // Announce some goals
  let sumOfBalance = 0;
  let count = 0;
  // Update our goal values in the appropriate way
  report.forEach(function (record) {
    sumOfBalance += record.balance;
    count += 1;
  });
  // Give our accomplished goals back to the person who asked us
  return {
    totalBalance: sumOfBalance,
    length: count
  };
};

Mutably produce aggregates out of band

There’s certainly nothing wrong with this code from a strictly technical standpoint. It’s clear, it works, and it only consumes the iterator once to produce these aggregates — so it’s largely efficient.

On the other hand, there are some possible rough edges here. First, 'getAggregates' is responsible for calculating multiple aggregates “in parallel”, but each of these aggregates is calculated in the same loop. If we wanted to reuse just one of those calculations elsewhere, we would pretty much have to write it again. Computing our aggregates happens in the same place as traversing the large report, so we don’t have as clean a separation of responsibilities as we might like.

More importantly, partly because of the conflation of responsibilities, this code is unnecessarily difficult to extend in a maintainable fashion. Imagine that rather than just two aggregates, we needed to calculate 25.

const getAggregates = report => {
  // Announce… uh, a lot of goals
  let sumOfBalance = 0;
  let count = 0;
  let sumOfInterest = 0;
  let numberInRepayment = 0;
  // …20 other declarations here…
  let obscureCalculation = 0;

  report.forEach(function (record) {
    // A lot of updates?
    sumOfBalance += record.balance;
    count += 1;
    sumOfInterest += record.interest;
    numberInRepayment += record.inRepayment ? 1 : 0;
    // … 20 other calculations here …
    obscureCalculation = insaneProcedure(obscureCalculation, record);
  });
  
  // Return a huge object
  return {
    totalBalance: sumOfBalance,
    // … 23 other keys here …
    length: count
   };
};

This is not even to raise the specter of a possibly unbounded number of aggregates that we don’t know beforehand! This function could grow quite large and become pretty intimidating.

How can we improve this? Well, there is a lot in common with each of these aggregates, so maybe we need to pull them apart and understand their parts.

To start, let’s isolate one aggregate in its own function and try to understand its parts:

// Single aggregate, summing the balances of records
const getSumOfBalance = report => {
  // Declare our goal
  let sumOfBalance = 0;

  // Sum each balance into our goal
  report.forEach(function (record) {
    sumOfBalance += record.balance;
  });

  // Return the achieved goal
  return sumOfBalance;
}

Now the ceremony around names and variables sticks out like a sore thumb — we don’t need any of that! We can refactor this to be immutable and declaration free using a 'reduce':

// Single aggregate, summing the balances of records
const getSumOfBalance = report =>
  // Run over the records
  report.reduce(
    // Adding the balance of each record to the total
    function (sum, record) {
      return sum + record.balance;
    },
    // Starting at 0
    0
  );

At the risk of seeming overly fussy, we can separate yet another pair of responsibilities: traversing the report and building the aggregate.

// Sum the balances _object by object_
const sumBalances = (prior, record) =>
  prior + record.balance;

// Sum all the balances one by one by using our object by object definition
const getSumOfBalance = report =>
  report.reduce(sumBalances, 0);

Overall, this code is pretty nice looking! We’ve broken each responsibility into its own component, but it doesn’t solve our initial problem. After all, we wanted to build multiple aggregates in a single pass. Right now it looks like we’re only further from our goal.

We could go back to the loop approach by applying our functions like 'sumBalances', but then we’re stuck mixing iteration with calculation. We still have to know every aggregate beforehand. Not much of an improvement!

Let’s take a step back. It looks like we need two things to build an aggregate:
1. An initial value to start from.
2. A way to combine each record into the prior calculation.

We need different initial values so we can try to calculate things like products, or more sophisticated statistical functions.

You may recognize that we’re talking about monoids, but we don’t need to know anything about that to structure this code.

Let’s encode this abstraction:

// Produce an aggregation from a _way of combining_ and a _starting value_
const makeAggregation = (operation, initial) => ({
  operation: operation,
  initialValue: initial
});

// Sum balances by aggregating through _summation_ and _starting at zero_
const sumBalances = makeAggregation(
  (sum, record) => sum + record.balance,
  0
);

// And to use it…
const getSumOfBalance = report =>
  report.reduce(sumBalances.operation, sumBalances.initialValue)

Creating a notion for aggregates!
 

Great, we have a notion for aggregations, so we have an interface or shape to code against. That’ll make it a lot easier to write code that works on any aggregation.

We want to combine those aggregate objects. How can we do this? Well, it turns out there’s only one way to do it in general:

// Again, producing an aggregation
const makeAggregation = (operation, initial) => ({
  operation: operation,
  initialValue: initial
});

// Combine them by running them on the same inputs in sequence
// If JavaScript weren't a single-threaded runtime, we could even do this in parallel!
const combineAggregation = (aggregationA, aggregationB) => ({
  operation: ([priorA, priorB], record) =>
    [
      aggregationA.operation(priorA, record),
      aggregationB.operation(priorB, record)
    ],
  initialValue: [aggregationA.initialValue, aggregationB.initialValue]
});

// Sum balances by aggregating through _summation_ and _starting at zero_
const sumBalances = makeAggregation(
  (sum, record) => sum + record.balance,
  0
);

// Count via just adding 1 every time we see another record
const count = makeAggregation(
  (sum, record) => sum + 1,
  0
);

// Both take the count and sum the balances
const countAndSumBalances =
  combineAggregation(sumBalances, count);

// Now actually run those aggregations!
const getSumOfBalancesAndCount = report =>
  report.reduce(
    countAndSumBalances.operation,
    countAndSumBalances.initialValue
  );

Combining two aggregates and running them in one pass.

Alright, cool! We can combine two aggregates and have our clean distribution of responsibilities. If you’re versed in functional programming lingo, you might notice that we’re talking about something a lot like semigroupal functors here.

We’re still not quite there, though. Two is good, but can we do unbounded numbers of aggregates? It turns out we can; we just have to lean on arrays, using the exact same machinery!

// Again, producing an aggregation
const makeAggregation = (operation, initial) => ({
  operation: operation,
  initialValue: initial
});

// Adjust the combination a bit to be more general
// We take _all the aggregations_ and apply them to the prior values _in their index_ and the next value _in their index_
const combineAggregationsReduction = (combined, next) =>
  makeAggregation(
    // A bit finicky, admittedly
    (prior, nextValue) => [
      ...combined.operation(
        prior.slice(0, -1),
        nextValue
      ),
      next.operation(
        prior[prior.length - 1],
        nextValue
      )
    ],
    [...combined.initialValue, next.initialValue]
  );

// We need to wrap the first aggregate's reduction and initial value in arrays
// so that we can combine it correctly with the rest.
const liftToArray = aggregate =>
  makeAggregation(
    ([a], b) => [aggregate.operation(a, b)],
    [aggregate.initialValue]
  );

// Reduce our list of aggregates into one aggregate
const combineAggregation = aggregates =>
  aggregates.slice(1).reduce(
    combineAggregationsReduction,
    liftToArray(aggregates[0])
  );

// Which we can use to combine multiple aggregates via…
// Again, sum balances by aggregating through _summation_ and _starting at zero_
const sumBalances = makeAggregation(
  (sum, record) => sum + record.balance,
  0
);

// Again, count via just adding 1 every time we see another record
const count = makeAggregation(
  (sum, record) => sum + 1,
  0
);

// Take the product over "variation" whatever that is
const productOfVariation =
  makeAggregation(
    (product, record) => product * record.variation,
    1
);

// Record the _last_ id in the report via just saving the latest seen
const lastIdSeen =
  makeAggregation(
    (prior, next) => next.id,
    null 
  );

// And we now run all of the aggregates at once
const getAggregates = reports => {
  const aggregates = combineAggregation([
    count,
    sumBalances,
    productOfVariation,
    lastIdSeen
  ]);
  
  return reports.reduce(
    aggregates.operation,
    aggregates.initialValue
  );
};

Combining arbitrary aggregates in an aggregation!

This trick is known as using a traversable functor (the array in this case) in common functional programming parlance, but again we don’t need to know that to use the design.

Notice that it’s now trivial to extend this with more aggregates: we just write a new aggregation and add it to the array of aggregates to apply. Even better, in principle we can accept any array of aggregations and apply them. If we wanted, we could now determine dynamically at runtime which aggregations to apply and just pass them in.

The final version of our (now relatively abstract and generic) machinery then looks like:

// Again, producing an aggregation
const makeAggregation = (operation, initial) => ({
  operation: operation,
  initialValue: initial
});

// Again, combining aggregations
const combineAggregationsReduction = (combined, next) =>
  makeAggregation(
    // A bit finicky, admittedly
    (prior, nextValue) => [
      ...combined.operation(
        prior.slice(0, -1),
        nextValue
      ),
      next.operation(
        prior[prior.length - 1],
        nextValue
      )
    ],
    [...combined.initialValue, next.initialValue]
  );

// Again, just setting up the first aggregation
const liftToArray = aggregate => 
  makeAggregation(
    ([a], b) => [aggregate.operation(a, b)],
    [aggregate.initialValue]
  );

// Again, reduce our list of aggregates into one aggregate
const combineAggregation = aggregates =>
  aggregates.slice(1).reduce(
    combineAggregationsReduction,
    liftToArray(aggregates[0])
  );

// Now, we can combine arbitrarily many dynamically
// determined aggregations to later run in a single pass
const getAggregates = aggregates => {
  const combined = combineAggregation(aggregates);

  return reports =>
    reports.reduce(
      combined.operation,
      combined.initialValue
    );
};

Final result. Not bad!

Pretty easy to use! This is fairly clear and much more extensible. Let’s pat ourselves on the back.
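To see it end to end, here is a self-contained sketch of the final machinery applied to a small in-memory report; the 'balance' field is an illustrative assumption:

```javascript
// The final machinery, repeated so this sketch runs on its own
const makeAggregation = (operation, initial) => ({
  operation: operation,
  initialValue: initial
});

const liftToArray = aggregate =>
  makeAggregation(
    ([a], b) => [aggregate.operation(a, b)],
    [aggregate.initialValue]
  );

const combineAggregationsReduction = (combined, next) =>
  makeAggregation(
    (prior, nextValue) => [
      ...combined.operation(prior.slice(0, -1), nextValue),
      next.operation(prior[prior.length - 1], nextValue)
    ],
    [...combined.initialValue, next.initialValue]
  );

const combineAggregation = aggregates =>
  aggregates.slice(1).reduce(
    combineAggregationsReduction,
    liftToArray(aggregates[0])
  );

const getAggregates = aggregates => {
  const combined = combineAggregation(aggregates);
  return reports =>
    reports.reduce(combined.operation, combined.initialValue);
};

// Two simple aggregations from earlier
const count = makeAggregation((sum, record) => sum + 1, 0);
const sumBalances = makeAggregation((sum, record) => sum + record.balance, 0);

// Build the runner once, then apply it to any report
const run = getAggregates([count, sumBalances]);
const report = [{ balance: 100 }, { balance: 250 }, { balance: 50 }];
run(report); // → [3, 400]
```

Note that 'getAggregates' returns a function, so the combined aggregation can be built once and reused across reports.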

On the other hand, we should be upfront about the costs, since there is no free lunch. This version of 'combineAggregationsReduction' builds quite a few intermediate arrays due to its use of the spread operator. At scale, this could induce some non-trivial memory pressure! A possibly more efficient approach would be to allow local mutation and use 'Array.prototype.concat' to combine results. Another approach would be to lean on a utility library like 'ramda' that has an efficient 'zipWith' operator (or implement it yourself) and rework the approach a bit. There’s a lot of wiggle room in the design space here, but it is something to be aware of with regards to the relatively naive version presented here.
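As one illustration of that wiggle room, here is a sketch of the 'zipWith'-style variant; 'zipWith' is hand-rolled below rather than imported from 'ramda', and the example aggregations reuse the 'balance' field assumed earlier:

```javascript
// Hand-rolled zipWith: combine two equal-length arrays element by element with f
const zipWith = (f, xs, ys) =>
  xs.map((x, i) => f(x, ys[i]));

const makeAggregation = (operation, initial) => ({
  operation: operation,
  initialValue: initial
});

// Combine a whole list of aggregations in one step: each slot of the
// accumulator array is advanced by its own aggregation's operation
const combineAggregation = aggregates =>
  makeAggregation(
    (priors, record) =>
      zipWith((agg, prior) => agg.operation(prior, record), aggregates, priors),
    aggregates.map(agg => agg.initialValue)
  );

// Example: the same count + sumBalances pair as before
const count = makeAggregation((sum, record) => sum + 1, 0);
const sumBalances = makeAggregation((sum, record) => sum + record.balance, 0);
const combined = combineAggregation([count, sumBalances]);
const totals = [{ balance: 100 }, { balance: 250 }].reduce(
  combined.operation,
  combined.initialValue
); // → [2, 350]
```

Because the accumulator array is rebuilt once per record rather than once per aggregate per record, this version avoids the nested spreads entirely.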

From a readability perspective, the arrays might not be the ideal return type, but they do have some advantages. While we lack the context of keys, and the general readability that follows, it’s impossible to assume anything about the schema now (it’s just some array), which is useful when dynamically determining the aggregates to be run. That’s not ideal for all use cases! That said, it is absolutely possible to adapt this machinery to return objects, by enriching our 'aggregation' objects with a 'key' that each aggregate saves its state under. I encourage the interested reader to give it a try (it’s much easier if you allow yourself to use a utilities library to manage merging objects)!
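As a sketch of one possible shape for that keyed variant (the 'key' field and the use of 'Object.fromEntries' are my own assumptions, not part of the machinery above):

```javascript
// An aggregation enriched with the key its result is saved under
const makeKeyedAggregation = (key, operation, initial) => ({
  key: key,
  operation: operation,
  initialValue: initial
});

// Combine keyed aggregations into one that accumulates into an object
const combineKeyedAggregations = aggregates => ({
  operation: (prior, record) =>
    Object.fromEntries(
      aggregates.map(agg => [agg.key, agg.operation(prior[agg.key], record)])
    ),
  initialValue: Object.fromEntries(
    aggregates.map(agg => [agg.key, agg.initialValue])
  )
});

// Example: the same two aggregates, now with named results
const combined = combineKeyedAggregations([
  makeKeyedAggregation('length', (sum, record) => sum + 1, 0),
  makeKeyedAggregation('totalBalance', (sum, record) => sum + record.balance, 0)
]);
const summary = [{ balance: 100 }, { balance: 250 }].reduce(
  combined.operation,
  combined.initialValue
); // → { length: 2, totalBalance: 350 }
```

Here 'Object.fromEntries' stands in for the utility-library merge mentioned above.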

 

Coda

Hopefully, this study made the meaning of applicative style a little clearer and served as an interesting study of some strategies for writing composable, functionally styled code. If you’re hungry for more, keep an eye out for the sequel!

If this kind of stuff interests you, Earnest is hiring and we’d love to talk!

Here are some classic resources on applicatives and especially their use in building aggregates and reports:

And, of course, you can always reach out to me @Slazasaurus on Twitter. Thanks for reading!'

 

This article was written by Stephen Lazaro and posted originally on Medium