Recently I found a way of storing state in an Elixir Process that I hadn’t seen before. I never read about it in a book, never saw it in a talk and haven’t seen blog posts mentioning it. Did other people already know about this and it was frowned on? Did others just not know about it? But that couldn’t be, because I found it being used internally in the Elixir Logger module!
It’s called the “Process Dictionary”.
Before I dig into the Dictionary and how easy it is to use and abuse, if you aren’t already familiar with the recommended ways to manage state with processes, check out my earlier post (or many others online).
I set up a simple GitHub project that makes it easy to play with Elixir processes and state management. https://github.com/brainlid/meetup_process_state
Logger has a powerful feature of tracking and including metadata on log entries. This is helpful for a Phoenix application when you want to tie all related log entries together using a unique request_id for the user’s request. In fact, the Plug.RequestId module from Plug, which Phoenix applications use, does this.
To see it in action, here is a simple example. Note that this is using this GitHub project, which configures Logger to output specified metadata.
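As a rough sketch of the kind of module being described: the module name MetadataDemo and the bodies of the step functions below are assumptions made for illustration, not the project's actual code. The metadata only appears in the output if Logger's formatter is configured to print those keys.

```elixir
defmodule MetadataDemo do
  require Logger

  # Hypothetical module for illustration; all names here are assumptions.
  def run do
    setup_metadata()
    step_1()
    step_2()
    step_3()
    :ok
  end

  # Stores metadata for the current process; later log calls pick it up.
  def setup_metadata do
    Logger.metadata(state_1: "always_here")
  end

  def step_1, do: Logger.info("Step 1")
  def step_2, do: Logger.info("Step 2")

  # Metadata passed as the second argument applies to this one entry only.
  def step_3, do: Logger.info("Step 3", custom: "one_time")
end
```

Calling MetadataDemo.run() from IEx would log three entries from the IEx process itself.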
Given the code above, an IEx terminal session might look like this…
Did you notice that the “state_1=always_here” text in the log entry is on every line? That message was set once in the setup_metadata function with a call to Logger.metadata. Setting metadata on Logger this way is actually writing some state to the process running the code, which is why it shows up with every step. In step 3 some custom log metadata is written out as well. However, this will only show up for that one line.
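One way to check that this metadata really belongs to the running process is to read it back with Logger.metadata/0 and compare it with what a different process sees. This is a minimal sketch; the variable names are only for illustration.

```elixir
# Set metadata in the current process.
Logger.metadata(state_1: "always_here")

# Logger.metadata/0 reads it back from the current process.
here = Logger.metadata()

# A different process starts with its own metadata and does not see ours.
elsewhere = Task.async(fn -> Logger.metadata() end) |> Task.await()

IO.inspect(Keyword.get(here, :state_1), label: "this process")
IO.inspect(Keyword.get(elsewhere, :state_1), label: "task process")
```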
How is this accomplished? The running process isn’t a GenServer, and it isn’t using tail recursion to maintain state the way the recommended approaches do.
Looking at the Logger source, we find this…
An Elixir (and Erlang) process has a “Dictionary”. This is a local key-value pair storage area. Local here means it is internal to the process.
In Elixir, you can interface with it using these functions.
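A short tour of the Process module functions that operate on the dictionary, written as pattern matches so each line doubles as a check. The :color key is an arbitrary choice for illustration.

```elixir
# Process.put/2 stores a value and returns the previous one (nil if none).
nil = Process.put(:color, "red")
"red" = Process.put(:color, "blue")

# Process.get/1 and Process.get/2 read a key, with an optional default.
"blue" = Process.get(:color)
:none = Process.get(:missing, :none)

# Process.get_keys/0 lists every key; Process.get/0 returns all pairs.
true = :color in Process.get_keys()
true = {:color, "blue"} in Process.get()

# Process.delete/1 removes a key and returns the value it held.
"blue" = Process.delete(:color)
nil = Process.get(:color)
```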
The only function not exposed through Elixir is the “erase/0” function, which “erases the entire process dictionary. Returns the entire process dictionary before it was erased.” So perhaps we can see why that isn’t being made conveniently accessible.
However, if you need to call it, you can do so this way…
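The Erlang function can be reached directly as :erlang.erase/0. The sketch below runs it inside a throwaway Task so that it does not wipe the dictionary of the calling process (for example, your IEx session); the keys :a and :b are made up for the example.

```elixir
# Run in a throwaway process so we do not wipe the caller's dictionary.
erased =
  Task.async(fn ->
    Process.put(:a, 1)
    Process.put(:b, 2)

    # :erlang.erase/0 returns the dictionary as a list of {key, value} pairs...
    previous = :erlang.erase()

    # ...and leaves the dictionary empty.
    {previous, Process.get()}
  end)
  |> Task.await()

{previous, remaining} = erased
IO.inspect(Keyword.take(previous, [:a, :b]) |> Enum.sort(), label: "erased")
IO.inspect(remaining, label: "remaining")
```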
Each process has a local store called the “Process Dictionary”.
Note that using the Process Dictionary:
- Destroys referential transparency
- Makes debugging difficult
- Survives Catch/Throw
- Use with care
- Do not overuse - try the clean version first
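The “Survives Catch/Throw” point can be illustrated with a small sketch: a write made before a throw is not rolled back when the throw is caught. The :counter key is an arbitrary name chosen for the example.

```elixir
Process.put(:counter, 0)

result =
  try do
    Process.put(:counter, 1)
    throw(:bail_out)
  catch
    :bail_out -> :caught
  end

# The write made before the throw is still there: nothing was rolled back.
IO.inspect({result, Process.get(:counter)})
```

With immutable data passed through function arguments, the value bound before the try block would be unchanged after the catch; here the side effect persists.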
Now that we’ve introduced the Process Dictionary, let’s abuse it!
In IEx we can interact with it like this…
We might see something like this…
Now imagine writing a unit test for the “run/0” function. It isn’t a “pure” function. It isn’t obviously predictable. Imagine running across similar code in a project. It takes more effort to understand it.
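The testing problem can be sketched with a hypothetical module. HiddenState and its function bodies are assumptions for illustration, not the project's code; only the names run/0, state_x, and :state come from the discussion here.

```elixir
defmodule HiddenState do
  # Hypothetical module for illustration; the names are assumptions.
  def state_1, do: Process.put(:state, :one)
  def state_2, do: Process.put(:state, :two)

  # The result depends on whatever the calling process has stored under
  # :state, not on any argument.
  def run do
    case Process.get(:state) do
      :one -> "running in state one"
      :two -> "running in state two"
      other -> "unexpected state: #{inspect(other)}"
    end
  end
end

HiddenState.state_1()
first = HiddenState.run()

HiddenState.state_2()
second = HiddenState.run()

# Any code in the same process can change what run/0 does.
Process.put(:state, :something_else)
third = HiddenState.run()
```

Three calls to run/0 with identical (empty) arguments give three different results, so a test has to set up and clean up the dictionary of the test process to be reliable.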
What may not be immediately apparent above is that we are modifying the state of the IEx process running our commands.
So we can use Process.get/1 from IEx to examine the current :state value that the state_x functions set, because those functions ran in the IEx process and wrote to its dictionary.
Let’s try changing it to something that isn’t supported by the code in the module.
Using Process.get/0, we can get the whole Process Dictionary returned to see what it looks like.
What’s this? The IEx.Evaluator module that runs our commands in our beloved IEx is storing the session’s command history in the Process Dictionary! It’s even storing the results of the functions.
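Also we see that “$ancestors” and “$initial_call” are stored in the dictionary. OTP adds these to processes it starts through proc_lib (GenServer, Task, and so on). The $ancestors entry lists the processes up the parent chain, and $initial_call records what was used to start this process.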
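A quick way to look at these two keys outside of IEx is to read them from inside a Task, which is started through proc_lib, and compare with a process started by plain spawn/1. This is a sketch; the exact values printed depend on your session.

```elixir
# Task is built on :proc_lib, so the new process gets both keys.
{ancestors, initial_call} =
  Task.async(fn ->
    {Process.get(:"$ancestors"), Process.get(:"$initial_call")}
  end)
  |> Task.await()

IO.inspect(ancestors, label: "$ancestors")
IO.inspect(initial_call, label: "$initial_call")

# A process started with plain spawn/1 begins with an empty dictionary.
parent = self()
spawn(fn -> send(parent, {:dictionary, Process.get()}) end)

plain =
  receive do
    {:dictionary, dictionary} -> dictionary
  after
    1_000 -> :timeout
  end

IO.inspect(plain, label: "plain spawn")
```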
In fact, we can see the Process Dictionary in Observer when we view the IEx process. There is a “Dictionary” tab.
Curious! The Observer Dictionary page doesn’t include the :iex_history that we see when calling Process.get/0. Mysteries remain!
However, it’s time to close out this long post looking at the Elixir/Erlang Process Dictionary.
Logger and IEx both use the Process Dictionary. They both seem to be good and appropriate uses. We should be aware of this tool so we can call on it when it is appropriate. I hope it is clear, though, that it can be easily abused, making code harder to test and reason about. If you abuse it, be prepared to take some heat from your team.
The goal here is to become aware of another, less talked about way of storing state in a Process.
The process dictionary is not inherently evil, but it has many drawbacks that have been mentioned countless times already: it is harder to debug and reason about, it has different semantics than most of the language, it breaks when you try to send state to other processes without knowing it’s tied to the process’ very existence, it is not garbage collected until the process dies, it is hard to replace by a different key-value store and it angers people (I do consider this to be a negative). On the positive side, you have better update speed. On the neutral side, you have global access to some values, within the scope of the process: this can both be useful (static config) or dangerous (global scope!)
Look for the tradeoffs you’re ready to make, what your application actually needs. Use the right tool for the right job and make sure the process-global aspect of it and the speed do warrant all the downsides of it. I hope this helps.