New JSON Array Extraction for Lift

Posted by Matt Farmer on May 03, 2017 · 6 mins read

JSON is everywhere. That means that properly supporting all of JSON’s flexibility is an important priority for any web framework in 2017. For a statically typed language like Scala, this can get interesting because JSON can do a few things that aren’t as intuitive to do in an actual Scala application. One of those is supporting collections that mix several different types.

In JSON, as in many other data formats, arrays store collections of information, such as a list of my favorite dishes to eat:

["spaghetti", "pizza", "pb&j", "pimento cheese sandwich"]

As you can tell, I make sure that all of the major food groups are covered in my diet. I could also create a tabular data structure and describe a health rating of 0-10 of each of these foods:

[
  ["spaghetti", 6],
  ["pizza", 4],
  ["pb&j", 7],
  ["pimento cheese sandwich", 8]
]

This is a totally valid JSON data structure. The inner data structure here is called a heterogeneous array because its positions hold values of different types. However, it’s a bit challenging to represent in Scala without additional libraries, unless you use a Tuple.

The data structure above can easily be modeled in Scala using a Seq[(String, Int)], but until recently Lift-Json couldn’t interpret this data structure without a custom serializer and deserializer defined by the developer.
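To make the mapping concrete, here is a minimal sketch in plain Scala with no Lift involved. The object and value names (`FavoriteDishes`, `ratings`, `healthiest`) are made up for illustration.

```scala
object FavoriteDishes {
  // Each inner JSON array ["name", rating] maps to one (String, Int) tuple.
  val ratings: Seq[(String, Int)] = Seq(
    ("spaghetti", 6),
    ("pizza", 4),
    ("pb&j", 7),
    ("pimento cheese sandwich", 8)
  )

  // Each tuple position keeps its own static type, so no casting is needed
  // to compare ratings (unlike with a Seq[Seq[Any]]).
  val healthiest: (String, Int) = ratings.maxBy { case (_, rating) => rating }

  def main(args: Array[String]): Unit = {
    val (name, rating) = healthiest
    println(s"$name scores $rating")
  }
}
```

The point of the tuple is that the compiler knows position one is a `String` and position two is an `Int`, which is exactly the shape of each inner JSON array.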

We recently merged a pull request that changes that by implementing a feature we’re calling Array Tuple Extraction. This feature will ship with Lift 3.1.0-M3 and will be disabled by default for backwards-compatibility reasons. This blog post will walk through what has changed, the limitations of our new Tuple extraction, and how to use it.

Current Tuple Support

Lift-Json has always supported tuples in one form or another, but it has not, until now, supported heterogeneous arrays. In Lift 3.1.0-M2 and earlier, a Seq[(String, Int)] would be decomposed and serialized into the following JSON data structure:

[
  {"_1": "spaghetti", "_2": 6},
  {"_1": "pizza", "_2": 4},
  {"_1": "pb&j", "_2": 7},
  {"_1": "pimento cheese sandwich", "_2": 8}
]

If those _1 and _2 members in the object look familiar to you, it’s because they come from the constructor for the Tuple2 class that Scala would use to model this at runtime.

One of Lift-Json’s strengths is that it inspects the declaration of a class to figure out how to serialize it, so it can do a lot of things automatically. The above data structure will correctly serialize and deserialize through Lift-Json with no extra work from the developer using the library.
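As a rough sketch of that default behavior, the snippet below decomposes a list of tuples with the stock `DefaultFormats`. It assumes lift-json 3.x is on the classpath; the object and value names are invented for the example.

```scala
import net.liftweb.json._

object LegacyTupleFormat {
  // DefaultFormats leaves tuple handling in its original, object-based mode.
  implicit val formats: Formats = DefaultFormats

  val ratings: List[(String, Int)] = List(("spaghetti", 6), ("pizza", 4))

  // Decompose the Scala value into Lift-Json's AST, then render it compactly.
  val ast: JValue = Extraction.decompose(ratings)
  val rendered: String = compactRender(ast)

  def main(args: Array[String]): Unit =
    // Per the description above, each tuple should appear as an object
    // with "_1" and "_2" members.
    println(rendered)
}
```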

Unfortunately, this isn’t semantically correct with regard to how JSON thinks about arrays. Many REST JSON APIs require you to submit a heterogeneous array as part of the payload for certain requests. Until recently, Lift-Json users who needed this had to manually craft the AST representation of the data structure or write a custom serializer/deserializer for it.
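For a sense of what the manual route looks like, here is a small sketch that builds the array-of-arrays AST by hand. It assumes lift-json 3.x; `toJsonRow` is a hypothetical helper written for this example, not part of Lift.

```scala
import net.liftweb.json._

object HandBuiltArray {
  // Hypothetical helper: turns one (name, rating) pair into a
  // two-element JSON array such as ["spaghetti", 6].
  def toJsonRow(row: (String, Int)): JValue =
    JArray(List(JString(row._1), JInt(row._2)))

  val ratings: List[(String, Int)] = List(("spaghetti", 6), ("pizza", 4))

  // The outer array is assembled manually, bypassing Extraction entirely.
  val ast: JValue = JArray(ratings.map(toJsonRow))
  val rendered: String = compactRender(ast)

  def main(args: Array[String]): Unit =
    println(rendered)
}
```

This works, but every tuple shape needs its own helper, and reading the data back requires matching code in the other direction.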

Array Tuple Extraction

The new Array Tuple Extraction functionality allows developers using Lift-Json to correctly represent heterogeneous arrays as mixed-type Tuples in Scala without any additional serializer/deserializer work.

As with many changes you might want to make to Lift-Json’s extractor, all you need to do is provide a Formats instance as an implicit val that turns the feature on. You can do this, for example, by declaring the following at the top of a class:

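import net.liftweb.json._

// Must be implicit so that Lift-Json's extraction and decomposition pick it up.
implicit val formats: Formats = new DefaultFormats {
  override val tuplesAsArrays = true
}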

This will activate the new tuple extractor and decomposer that will represent a Tuple of (Thing1, Thing2) as a JSON array of [Thing1, Thing2].
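A minimal sketch of the decomposition direction with the flag turned on might look like the following. It assumes a lift-json build that includes `tuplesAsArrays` (3.1.0-M3 or later); the object and value names are invented for the example.

```scala
import net.liftweb.json._

object ArrayTupleDecompose {
  // Opt in to the array-based tuple handling.
  implicit val formats: Formats = new DefaultFormats {
    override val tuplesAsArrays = true
  }

  val ratings: List[(String, Int)] = List(("spaghetti", 6), ("pizza", 4))

  // With the flag on, each tuple is expected to become a JSON array
  // rather than an object with "_1" and "_2" members.
  val ast: JValue = Extraction.decompose(ratings)
  val rendered: String = compactRender(ast)

  def main(args: Array[String]): Unit =
    println(rendered)
}
```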

There are, however, a few caveats with this functionality:

  • Scala primitives don’t yet work reliably. If you’re going to serialize/deserialize Tuples in your application using the array extractor, you should ensure you’re using Java’s boxed types (java.lang.Integer, java.lang.Boolean, etc.). This has something to do with the limitations of JVM reflection on Scala primitive types, but we haven’t determined the best solution.
  • Back-compat support is provided, but not two-way. The array extractor can correctly interpret Tuples that were serialized in the old format, but if you want to write Tuples in that format again, you’ll need to change your Formats to disable the feature first.
  • The implementation is a bit… dodgy. The Lift-Json extraction code is the code that converts between actual, factual Scala datatypes and the AST that gets translated to and from JSON. This is the point at which we start handling Tuples differently. Unfortunately, the extractor is quite old at this point and challenging to reason about. I’ve started a speculative rewrite of the extractor using Scala runtime reflection, but don’t expect to see feature parity with the main extractor for a while.
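The first two caveats can be sketched together. The example below extracts tuples using boxed `java.lang.Integer` values, once from the new array format and once from the old object format. It assumes a lift-json build with `tuplesAsArrays` and relies on the behavior described in the list above; all names are invented for the example.

```scala
import net.liftweb.json._

object ArrayTupleExtract {
  implicit val formats: Formats = new DefaultFormats {
    override val tuplesAsArrays = true
  }

  // Boxed java.lang.Integer instead of Int, following the first caveat above.
  val fromArrays: List[(String, java.lang.Integer)] =
    parse("""[["spaghetti", 6], ["pizza", 4]]""")
      .extract[List[(String, java.lang.Integer)]]

  // Second caveat: input written in the old "_1"/"_2" object format
  // should still be readable while the flag is on.
  val fromLegacy: List[(String, java.lang.Integer)] =
    parse("""[{"_1": "spaghetti", "_2": 6}]""")
      .extract[List[(String, java.lang.Integer)]]

  def main(args: Array[String]): Unit = {
    println(fromArrays)
    println(fromLegacy)
  }
}
```

Writing the old format back out is the one thing this setup cannot do. That requires a `Formats` with the flag disabled.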

Final Notes

My hope is that this feature turns out to be really useful to folks using the Lift Framework and Lift-Json to push JSON over a wire. If you have any questions, find some bugs, or just want to tell us what you think, drop us a line on the Lift Mailing List.