Custom JSON encoding for structs in Elixir with Jason
A guide for Elixir developers (and why you might want to)
Jason?
Jason is an actively maintained JSON parser and generator for Elixir. If you need to send JSON structures to your clients, or just shepherd data around your Elixir application, you’re likely to reach for it or for Poison, and the contents of this post are relevant to Poison too.
Purpose & Use-case
At Multiverse we’re working on developing a new version of an existing third-party integration and taking it in-house as part of our core platform. The feature in question isn’t important, it’s just important to know that there’s a lot of data - think millions of records.
One problem is that we don’t have direct access to the database of the existing system - what we do have is the ability to export the data into a CSV! Thankfully there are only 8 or so fields for each record meaning we can resort to a good, old-fashioned CSV import to bring the data over.
Simple right? Wrong.
Importing 1m+ rows of data in PSQL isn’t too bad, but it will slow down and potentially lock up the rest of our application while the import runs - and that’s bad! We also need to transform the data into a more reasonable structure for our new implementation. The old version had some inconsistencies and type issues we’d like to do away with, for example dates represented in epoch time and data fields we no longer need.
So now we need to transform and import each record, something that’ll take time. To mitigate this, we’re going to use RabbitMQ, an open source message broker: we package each row from our CSV into a JSON message, fire the messages off asynchronously, and let our RabbitMQ consumer deal with them concurrently as they come in. If you’re interested in learning more about Rabbit, let me know!
So where does Jason encoding come in?
Remember that we want to transform our data. The issue with exporting everything into a CSV file is that when we read it into our application, every value arrives as a string. Converting our date fields now involves calling String.to_integer/1 on each one, which is something we don’t want to do by hand - imagine we had 100 fields to import!
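As a small illustration of the problem, here is a sketch using only the standard library. The sample values are made up; String.to_integer/1 and Integer.parse/1 are the standard functions for this kind of conversion.

```elixir
# CSV fields always arrive as strings, even when they hold numbers.
raw_time = "1609459200"

# String.to_integer/1 raises an ArgumentError if the string is not a clean integer.
time = String.to_integer(raw_time)

# Integer.parse/1 returns a tuple or :error instead of raising,
# which can be handy when the legacy data is not trustworthy.
parsed_ok = Integer.parse("42")
parsed_bad = Integer.parse("n/a")

IO.inspect({time, parsed_ok, parsed_bad})
```

Doing this by hand for every numeric field in every record is exactly the repetition we want to avoid.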
A good programmer is a lazy programmer.
Defining our struct
defmodule LegacyLog do
  defstruct [:id, :user_id, :legacy_log_id, :time, :notes, :date, :inserted_at, :updated_at]
end
All we have here is a simple Elixir struct with a field for each value we want to parse and keep from every record in the CSV import.
Using it in our CSV import:
defp serialise(data) do
  data
  |> Enum.map(fn row ->
    [id, user_id, logtype_id, time, _target, notes, date, timecreated, timemodified, _, _] = row

    %LegacyLog{
      id: id,
      user_id: user_id,
      legacy_log_id: logtype_id,
      time: time,
      notes: notes,
      date: date,
      inserted_at: timecreated,
      updated_at: timemodified
    }
  end)
end
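To make the row shape concrete, here is a minimal sketch that runs one made-up CSV line through the same pattern match. The CsvRowExample module, its function names and the sample line are hypothetical, and the naive comma split assumes no field contains a quoted comma; a real import would more likely use a CSV library.

```elixir
defmodule LegacyLog do
  defstruct [:id, :user_id, :legacy_log_id, :time, :notes, :date, :inserted_at, :updated_at]
end

defmodule CsvRowExample do
  # Hypothetical helper: naive split that ignores quoting rules.
  def parse_line(line) do
    line
    |> String.trim()
    |> String.split(",")
  end

  # Same 11-column pattern as serialise/1, for a single row.
  def to_log([id, user_id, logtype_id, time, _target, notes, date, timecreated, timemodified, _, _]) do
    %LegacyLog{
      id: id,
      user_id: user_id,
      legacy_log_id: logtype_id,
      time: time,
      notes: notes,
      date: date,
      inserted_at: timecreated,
      updated_at: timemodified
    }
  end
end

# Made-up sample line with 11 columns, three of which we discard.
line = "1,42,7,3600,,Pair programming,1609459200,1609459300,1609459400,,"

log = line |> CsvRowExample.parse_line() |> CsvRowExample.to_log()
IO.inspect(log)
```

Note that every field in the resulting struct is still a string at this point, including the ids and timestamps.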
Yes, we could transform the data here as we import it, but good RabbitMQ practice involves pretending that our messages are coming from somewhere external we don’t have direct access to - even though we’re producing and consuming them ourselves in this instance.
Custom encoding the JSON
The bit you’re really here for.
We need to take advantage of the way Jason has defined Protocols to define a custom implementation of how to handle our encoding.
Following on from our previously defined struct, we’re going to add a defimpl for the struct itself and specify what we want to do:
defmodule LegacyLog do
  defstruct [:id, :user_id, :legacy_log_id, :time, :notes, :date, :inserted_at, :updated_at]

  defimpl Jason.Encoder, for: LegacyLog do
    @impl Jason.Encoder
    def encode(value, opts) do
      # Keep :notes aside so it stays a string.
      {notes, remaining_log} = Map.pop(value, :notes)

      remaining_log
      |> Map.from_struct()
      # Every remaining value is expected to be a numeric string;
      # String.to_integer/1 raises on anything else (including nil).
      |> Map.new(fn {k, v} -> {k, String.to_integer(v)} end)
      |> Map.put(:notes, notes)
      |> Jason.Encode.map(opts)
    end
  end
end
In the above code I’m defining an implementation for the LegacyLog struct, meaning that when a Jason.encode/2 call comes its way, Jason knows to call my custom code. Every other type keeps using Jason’s built-in implementations (which is fine for most use cases).
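If protocol dispatch is new to you, here is a tiny sketch of the same mechanism using only the standard library. The Summary protocol and the Ticket struct are hypothetical stand-ins for Jason.Encoder and LegacyLog; the point is only that the type of the value decides which implementation runs.

```elixir
# Hypothetical protocol, standing in for Jason.Encoder.
defprotocol Summary do
  @doc "Returns a short description of the value."
  def describe(value)
end

defmodule Ticket do
  defstruct [:id]

  # Inside the struct's own module, `for:` can be omitted.
  defimpl Summary do
    def describe(%{id: id}), do: "ticket ##{id}"
  end
end

# A separate implementation for a built-in type.
defimpl Summary, for: Integer do
  def describe(n), do: "integer #{n}"
end

# The same function call is routed by the type of its argument.
ticket_text = Summary.describe(%Ticket{id: 7})
integer_text = Summary.describe(42)

IO.inspect({ticket_text, integer_text})
```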
The custom code itself is:
1. Popping the one field I want to remain a string out of the struct and keeping the rest
2. Converting the remaining struct into a Map
3. Iterating through each field and changing its value into an integer
4. Putting the notes back
5. Doing the iodata generation using Jason.Encode, which guarantees valid JSON.
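To see the result end to end, here is a sketch written as a standalone script. It assumes Jason is fetched with Mix.install/2 (in a Mix project you would list :jason in your deps instead), the version requirement and the sample values are made up, and consolidate_protocols: false is there so the script can still add its own Jason.Encoder implementation after the dependency is compiled.

```elixir
# Assumption: run as a standalone script, e.g. `elixir encode_example.exs`.
Mix.install([{:jason, "~> 1.4"}], consolidate_protocols: false)

defmodule LegacyLog do
  defstruct [:id, :user_id, :legacy_log_id, :time, :notes, :date, :inserted_at, :updated_at]

  defimpl Jason.Encoder do
    def encode(value, opts) do
      {notes, remaining_log} = Map.pop(value, :notes)

      remaining_log
      |> Map.from_struct()
      |> Map.new(fn {k, v} -> {k, String.to_integer(v)} end)
      |> Map.put(:notes, notes)
      |> Jason.Encode.map(opts)
    end
  end
end

# Made-up sample record: everything is a string, as it would be after a CSV read.
log = %LegacyLog{
  id: "1",
  user_id: "42",
  legacy_log_id: "7",
  time: "3600",
  notes: "Pair programming",
  date: "1609459200",
  inserted_at: "1609459300",
  updated_at: "1609459400"
}

# Jason.encode!/1 dispatches to the custom implementation above.
json = Jason.encode!(log)

# Decoding again shows the numeric fields came out as JSON numbers.
decoded = Jason.decode!(json)
IO.inspect(decoded)
```

After the round trip, every field except notes should come back as an integer, while notes stays a string.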
We’re done!
Follow me on Twitter for more Elixir (and general programming) tips.