Scala Object Serialization for MapR-DB by Nicolas A Perez


Let's take a look at things from a Scala point of view!

If you're using MapR-DB, it is common to serialize and deserialize objects to and from JSON, because MapR-DB stores its data in JSON format. Have a read of this article by software engineer Nicolas A Perez on the steps of this process.



Previously, we discussed some of the advantages and features of MapR-DB. This time, we are going to get our hands dirty using this enterprise-grade database.

When using MapR-DB, it is common practice to serialize our business objects (commonly known as POJOs) to JSON and deserialize them back, since MapR-DB stores its data in JSON format. These operations are frequent, so we will take a look at them from Scala’s point of view.


MapR-DB Document API

There is a series of steps to follow to create new objects and insert them into MapR-DB. Let’s look at the typical workflow.

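import org.ojai.store.{DocumentStore, DriverManager}

// Connect through the OJAI driver and open the target table.
val connection = DriverManager.getConnection("ojai:mapr:")
val documentStore: DocumentStore = connection.getStore("/user/mapr/tables/view_counts")
// Build the document field by field; someId is the document key, defined elsewhere.
val document = connection.newDocument().set("_id", someId).set("count", 10)
documentStore.insertOrReplace(document)
documentStore.close()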

These are the basic steps to insert into MapR-DB. The snippet can be extended for more complex use cases, but they will all look very similar to this one.

There is one issue that can be easily recognized here. The way we create the 'document' object through the fluent API is far from convenient. Normally, we would like to pass a POJO instead of building the document manually.

In Java, we could do the following.

import lombok.Builder;
import lombok.Data;
import org.ojai.Document;
import org.ojai.store.DocumentStore;

// Lombok generates the getters, setters, and builder for this bean.
@Data
@Builder(toBuilder = true)
class Link {
    String _id;
    int count;
}

Link getLink() {
    return Link
      .builder()
      ._id("someId")
      .count(10)
      .build();
}

DocumentStore store = connection.getStore("/user/mapr/tables/view_counts");

Link link = getLink();
// newDocument(Object) builds the document from the bean's properties.
Document document = connection.newDocument(link);

store.insertOrReplace(document);
store.close();

In the code above, we can see that our class 'Link' is used to create the document that will be saved to the database. MapR-DB uses the Java Bean to create the document object.

Now, the problem becomes a little more tedious when using Scala, which should be your language of choice anyway.


The Scala Issue

Using Scala, we could also use Java Beans to create the desired objects, as in Java, yet other problems quickly arise. Let’s see the same example we used before, but this time in Scala.

import scala.beans.BeanProperty

// @BeanProperty asks the compiler to generate Java-style getters for each field.
case class Link(@BeanProperty _id: String, @BeanProperty count: Int)

val link = Link("100", 10)
val store = connection.getStore("/user/mapr/tables/view_counts")
val document = connection.newDocument(link)

store.insertOrReplace(document)
store.close()

If you try this out, you will discover that the object 'link' cannot be converted to a Java Bean because the field '_id' starts with '_'. This might look like a small problem, but every document inserted into MapR-DB must have an '_id' field, which turns this small issue into a deal breaker.

We can always go back to building the document manually for each POJO we have, but we should walk away from this idea as soon as it comes to us, since it means writing and maintaining conversion code by hand for every class.

Another alternative is to look for a mechanism that converts Scala objects to Document. A type class is a natural fit here: it does the heavy lifting and brings flexibility to the conversion system.

Let’s define a type class for this work and call it 'MySerializer', for lack of a better name.

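import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import org.ojai.Document
import org.ojai.store.Connection
import simulacrum.typeclass

@typeclass
trait MySerializer[A] {
  def toDocument(a: A): Document
}

object MySerializer {
  // Jackson mapper with Scala support, shared by the default serializer.
  private val objectMapper = new ObjectMapper()
  objectMapper.registerModule(DefaultScalaModule)

  // Default instance: serialize any A to JSON and let OJAI parse it into a Document.
  implicit def default[A](implicit conn: Connection): MySerializer[A] = (a: A) => {
    val json = objectMapper.writeValueAsString(a)
    conn.newDocument(json)
  }
}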

As we can see, 'MySerializer' provides a default way to convert objects to documents using Jackson serialization. Having a default serializer is a good option since the majority of objects will use it, yet not every object is built the same, so we need specializations as well.

Now, our code looks as follows.

import MySerializer.ops._

// An implicit Connection must be in scope for the default serializer.
val link = Link("LINK_ID", 10)
val linkDocument = link.toDocument

store.insertOrReplace(linkDocument)
store.close()
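
To see what that import does for us, here is a hand-written approximation of the syntax Simulacrum generates. It is only a sketch under a few assumptions: 'Doc' is a plain Map standing in for the OJAI Document, the default instance copies case class fields through 'productElementNames' (which needs Scala 2.13 or later) instead of calling Jackson, and no Connection is involved, so it runs without MapR-DB.

```scala
object OpsSketch {
  // Stand-in for org.ojai.Document so the sketch runs without MapR-DB.
  type Doc = Map[String, Any]

  trait MySerializer[A] {
    def toDocument(a: A): Doc
  }

  object MySerializer {
    // Stand-in default: copy every case class field into the document.
    // The article's version uses Jackson and an OJAI Connection instead.
    implicit def default[A <: Product]: MySerializer[A] =
      (a: A) => a.productElementNames.zip(a.productIterator).toMap

    // Roughly what `import MySerializer.ops._` brings into scope:
    // an extension method that looks up the instance for A implicitly.
    object ops {
      implicit class MySerializerOps[A](private val a: A) extends AnyVal {
        def toDocument(implicit serializer: MySerializer[A]): Doc =
          serializer.toDocument(a)
      }
    }
  }

  final case class Link(_id: String, count: Int)

  def main(args: Array[String]): Unit = {
    import MySerializer.ops._
    val linkDocument = Link("LINK_ID", 10).toDocument
    println(linkDocument)
  }
}
```

The call 'link.toDocument' is therefore just sugar: the compiler wraps the value, finds a 'MySerializer[Link]' in implicit scope, and delegates to it.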

As mentioned before, sometimes the default document conversion won’t work. For instance, let’s look at the following example.

case class Person(name: String)

Using the default converter with 'Person' will cause an error when trying to save the generated document to the database, because MapR-DB needs an '_id' as the document key, as stated before. In this case, we need a custom converter for the class 'Person'.

This is where the type class mechanism shines. We can specify the exact way to create documents from Person. Let’s see how.

object Person {
  // Custom instance: use the person's name as the document key.
  implicit def personSerializer(implicit conn: Connection): MySerializer[Person] = (a: Person) => {
    conn.newDocument().set("_id", a.name)
  }
}

val p = Person("lolo")

val personDocument = p.toDocument
println(personDocument) // {"_id": "lolo"}

Notice that we have both options: one is to use the default serializer, and the other is to use a custom serializer for the specific object in question. This allows a fine-grained serialization mechanism that ultimately yields genericity without giving up specialization.

At the same time, the serialization system lives outside of the object itself. We should be able to modify how the serialization works without affecting the objects at all. Ultimately, we could override how serialization is done based on a specific context, with different serialization mechanics for different situations as they are needed. This is almost impossible to do in Java, but Scala is a beast in the world of ad-hoc polymorphism.
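
A small, self-contained sketch shows how the compiler picks between these options. As assumptions for the example, 'Doc' is a plain Map standing in for the OJAI Document, the default instance copies case class fields instead of calling Jackson (this needs Scala 2.13 or later), and 'serialize' and 'auditView' are hypothetical helpers that are not part of any MapR-DB API.

```scala
object ResolutionSketch {
  // Stand-in for org.ojai.Document so the sketch runs without MapR-DB.
  type Doc = Map[String, Any]

  trait MySerializer[A] {
    def toDocument(a: A): Doc
  }

  object MySerializer {
    // Generic fallback (stand-in for the Jackson-based default).
    implicit def default[A <: Product]: MySerializer[A] =
      (a: A) => a.productElementNames.zip(a.productIterator).toMap
  }

  // Hypothetical helper: serializes with whichever instance the compiler resolves.
  def serialize[A](a: A)(implicit s: MySerializer[A]): Doc = s.toDocument(a)

  final case class Link(_id: String, count: Int)
  final case class Person(name: String)

  object Person {
    // Specialization: use the name as the document key.
    implicit val personSerializer: MySerializer[Person] =
      (p: Person) => Map("_id" -> p.name)
  }

  // Context-specific override: a local implicit is found before the
  // instances that live in companion objects.
  def auditView(p: Person): Doc = {
    implicit val auditSerializer: MySerializer[Person] =
      (x: Person) => Map("_id" -> x.name, "source" -> "audit")
    serialize(p)
  }

  def main(args: Array[String]): Unit = {
    println(serialize(Link("LINK_ID", 10))) // resolved to the generic default
    println(serialize(Person("lolo")))      // resolved to Person's own instance
    println(auditView(Person("lolo")))      // resolved to the local override
  }
}
```

'Link' falls back to the default, 'Person' picks up the instance from its companion object, and inside 'auditView' the local implicit takes precedence over both, all without touching the 'Person' class itself.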



The MapR-DB OJAI API is nice, but it does not play well with Scala objects, especially those that do not comply with the Java Bean specification. On the other hand, Scala offers advanced constructs like type classes that allow us to work around many interoperability issues while keeping type safety and enabling ad-hoc polymorphism.
This article was written by Nicolas A Perez and posted originally on Medium