Abusing Elixir Processes and State!

Danger! Photo by Joey Banks on Unsplash

Recently I found a way of storing state in an Elixir Process that I hadn’t seen before. I never read about it in a book, never saw it in a talk and haven’t seen blog posts mentioning it. Did other people already know about this and it was frowned on? Did others just not know about it? But that couldn’t be, because I found it being used internally in the Elixir Logger module!

It’s called the “Process Dictionary”.

Before I dig into the Dictionary and how easy it is to use and abuse: if you aren’t already familiar with the recommended ways to manage state with processes, check out my earlier post (or many others online).

I set up a simple GitHub project that makes it easy to play with Elixir processes and state management. https://github.com/brainlid/meetup_process_state

Logger’s Metadata

Logger has a powerful feature for tracking metadata and including it on log entries. This is helpful in a Phoenix application when you want to tie all related log entries together using a unique request_id for the user’s request. In fact, the Plug.RequestId module (part of Plug, which Phoenix builds on) does this.

To see it in action, here is a simple example. Note that it uses the GitHub project linked above, which configures Logger to output the specified metadata.

    defmodule ProcessState.LoggerMeta do
      require Logger

      defp setup_metadata do
        Logger.metadata(state_1: "always_here")
      end

      defp step_1 do
        Logger.info("Step 1")
      end

      defp step_2 do
        Logger.info("Step 2")
      end

      defp step_3 do
        Logger.info("Step 3", [custom_1: "set_in_step_3"])
      end

      def run_steps do
        setup_metadata()
        step_1()
        step_2()
        step_3()
        {:ok, "Finished"}
      end
    end

Given the code above, an IEx terminal session might look like this…

    iex(1)> ProcessState.LoggerMeta.run_steps

    22:11:31.841 pid=<0.114.0> state_1=always_here [info]  Step 1

    22:11:31.841 pid=<0.114.0> state_1=always_here [info]  Step 2

    22:11:31.842 pid=<0.114.0> state_1=always_here custom_1=set_in_step_3 [info]  Step 3
    {:ok, "Finished"}

Did you notice that “state_1=always_here” appears on every log line? That metadata was set once, in the setup_metadata function, with a call to Logger.metadata/1. Setting metadata on Logger this way actually writes some state to the process running the code, which is why it shows up at every step. Step 3 also passes some custom metadata directly to Logger.info/2, but that only shows up on that one line.
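To make the "state lives in the process" point concrete, here is a small sketch that reads the metadata back with Logger.metadata/0 and compares it with what a second process sees. The use of a Task as the second process is just an assumption for illustration; any other process would do.

```elixir
# Set metadata once; Logger keeps it for the calling process only.
Logger.metadata(state_1: "always_here")

# Logger.metadata/0 returns the metadata stored for the calling process.
own_meta = Logger.metadata()

# A different process (a Task here) has no metadata of its own yet.
task_meta = Task.await(Task.async(fn -> Logger.metadata() end))

IO.inspect(Keyword.get(own_meta, :state_1), label: "this process")
IO.inspect(Keyword.get(task_meta, :state_1), label: "task process")
```

The value is visible in the process that set it and absent in the other one, which is what you would expect if the metadata is tied to the process itself.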

How is this accomplished? The running process isn’t a GenServer, and it isn’t using a tail-recursive loop to hold state as the recommended approaches do.

Looking at the Logger source, we find this…

    def metadata(keyword) do
      # [...]
      Process.put(@metadata, {enabled?, metadata})
      :ok
    end
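The pattern in that snippet can be imitated in a few lines. This is only a sketch of the idea, not Logger’s actual implementation: the module name MiniMeta and the key :mini_meta are made up for illustration.

```elixir
defmodule MiniMeta do
  # Hypothetical key for this sketch; Logger uses its own private key.
  @key :mini_meta

  # Merge new pairs into whatever this process has stored so far.
  def put(keyword) when is_list(keyword) do
    Process.put(@key, Keyword.merge(get(), keyword))
    :ok
  end

  # Read the stored pairs, defaulting to an empty list.
  def get, do: Process.get(@key, [])
end

MiniMeta.put(request_id: "abc123")
MiniMeta.put(user_id: 42)
IO.inspect(MiniMeta.get())
```

No process is started and no state is passed around, yet the second call still sees what the first one stored.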

Process Dictionary

An Elixir (and Erlang) process has a “Dictionary”. This is a local key-value pair storage area. Local here means it is internal to the process.
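A quick way to see what “local” means is to write a value in one process and read the same key from another. The key :color is arbitrary and only used for this sketch.

```elixir
Process.put(:color, :blue)

parent = self()

# A freshly spawned process has its own, separate dictionary.
spawn(fn -> send(parent, {:child_sees, Process.get(:color)}) end)

child_value =
  receive do
    {:child_sees, value} -> value
  after
    1_000 -> :timeout
  end

IO.inspect(Process.get(:color), label: "parent")
IO.inspect(child_value, label: "child")
```

The parent reads back :blue while the child gets nil, because the child never wrote that key into its own dictionary.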

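In Elixir, you can interface with it using these functions from the Process module: Process.get/0, Process.get/1, Process.get/2, Process.put/2, Process.delete/1, Process.get_keys/0 and Process.get_keys/1.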
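Here is a short tour of the Process functions for the dictionary and what they return. The key :visits is just an example and is assumed not to be set beforehand.

```elixir
# put/2 returns the previous value for the key (nil if there was none).
first = Process.put(:visits, 1)
second = Process.put(:visits, 2)

# get/1 returns the current value; get/2 takes a default for missing keys.
current = Process.get(:visits)
missing = Process.get(:no_such_key, :default)

# get_keys/1 lists the keys that currently hold the given value.
keys = Process.get_keys(2)

# delete/1 removes the key and returns the value it held.
deleted = Process.delete(:visits)

IO.inspect({first, second, current, missing, keys, deleted})
```

Note that put/2 hands back the old value rather than the new one, which is easy to trip over in IEx.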

The only function not exposed through Elixir is “erase/0”, which “erases the entire process dictionary. Returns the entire process dictionary before it was erased.” So perhaps we can see why that isn’t being made conveniently accessible.

However, if you need to call it, you can do so this way…

    :erlang.erase()
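If you want to try it without wiping your own session, one option is to run it in a throwaway process. The sketch below assumes a bare spawn/1 process, which starts with an empty dictionary; the keys :a and :b are arbitrary.

```elixir
parent = self()

spawn(fn ->
  Process.put(:a, 1)
  Process.put(:b, 2)

  # erase/0 returns everything that was stored and leaves nothing behind.
  erased = :erlang.erase()
  send(parent, {:result, Enum.sort(erased), Process.get()})
end)

{erased, remaining} =
  receive do
    {:result, erased, remaining} -> {erased, remaining}
  after
    1_000 -> {:timeout, :timeout}
  end

IO.inspect(erased, label: "erased")
IO.inspect(remaining, label: "remaining")
```

The result is sorted before sending because the order of entries in a process dictionary should not be relied on.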

The Erlang Advanced Course has this to say about the Process Dictionary:

Each process has a local store called the “Process Dictionary”.

[…]

Note that using the Process Dictionary:

Destroys referential transparency

Makes debugging difficult

Survives Catch/Throw

So:

Use with care

Do not over use - try the clean version first
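The “Survives Catch/Throw” point in the list above is worth a small sketch. A value written inside a try block stays written even when the block is abandoned by a throw. The key :progress is arbitrary.

```elixir
Process.put(:progress, :not_started)

result =
  try do
    # This write is a side effect on the process, not part of the
    # expression's value, so it is not rolled back by the throw.
    Process.put(:progress, :halfway)
    throw(:bail_out)
  catch
    :bail_out -> :caught
  end

IO.inspect(result, label: "result")
IO.inspect(Process.get(:progress), label: "progress")
```

After the throw is caught, :progress still reads :halfway. Nothing in the result of the try expression hints that the process was changed along the way, which is part of what makes debugging harder.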

Now that we’ve introduced the Process Dictionary, let’s abuse it!

    defmodule ProcessState.DictionaryAbuse do
      require Logger

      def state_1 do
        Process.put(:state, :greeting)
        :ok
      end

      def state_2 do
        Process.put(:state, :count_to_10)
        :ok
      end

      def state_3 do
        Process.put(:state, :random)
        :ok
      end

      def run do
        case Process.get(:state) do
          :greeting ->
            "Hello there!"

          :count_to_10 ->
            Enum.each(1..10, fn num ->
              IO.puts("#{inspect(num)}...")
              Process.sleep(500)
            end)

          :random ->
            sayings = ["Well, that's just great.", "Sorry?", "Eh?", "Go on then!"]
            Enum.random(sayings)

          _ ->
            "Try again!"
        end
      end
    end

In IEx we can interact with it like this…

      alias ProcessState.DictionaryAbuse
      DictionaryAbuse.state_1
      DictionaryAbuse.run

      DictionaryAbuse.state_2
      DictionaryAbuse.run

      DictionaryAbuse.state_3
      DictionaryAbuse.run

      Process.put(:state, "other")
      DictionaryAbuse.run

We might see something like this…

    iex(1)> alias ProcessState.DictionaryAbuse
    ProcessState.DictionaryAbuse
    iex(2)> DictionaryAbuse.state_1
    :ok
    iex(3)> DictionaryAbuse.run
    "Hello there!"

    iex(4)> DictionaryAbuse.state_2
    :ok
    iex(5)> DictionaryAbuse.run
    1...
    2...
    3...
    4...
    5...
    6...
    7...
    8...
    9...
    10...
    :ok

    iex(6)> DictionaryAbuse.state_3
    :ok
    iex(7)> DictionaryAbuse.run
    "Go on then!"
    iex(8)> DictionaryAbuse.run
    "Sorry?"
    iex(9)> DictionaryAbuse.run
    "Eh?"

Now imagine writing a unit test for the “run/0” function. It isn’t a “pure” function. It isn’t obviously predictable. Imagine running across similar code in a project. It takes more effort to understand it.
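For contrast, here is a sketch of the “clean version”, where the state is passed in as an argument. The module name PureSteps is hypothetical and only two of the branches are reproduced.

```elixir
defmodule PureSteps do
  # The state arrives as an argument instead of via Process.get/1,
  # so the same input always gives the same output.
  def run(:greeting), do: "Hello there!"
  def run(_other), do: "Try again!"
end

IO.inspect(PureSteps.run(:greeting))
IO.inspect(PureSteps.run("other"))
```

A test for this version needs no setup or cleanup of the test process; it just calls the function with a value and checks the result.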

What may not be immediately apparent above is that we are modifying the state of the IEx process running our commands.

That means we can call Process.get/1 directly from IEx to examine the current :state value that the state_x functions set.

    iex(10)> Process.get(:state)
    :random

Let’s try changing it to something that isn’t supported by the code in the module.

    iex(11)> Process.put(:state, "other")
    :random
    iex(12)> DictionaryAbuse.run
    "Try again!"

Using Process.get/0, we can get the whole Process Dictionary returned to see what it looks like.

    iex(13)> Process.get()
    [iex_history: %IEx.History.State{queue: {[{12, 'Process.get(:state)\n',
         :random}, {11, 'DictionaryAbuse.run\n', "Go on then!"},
        {10, 'DictionaryAbuse.run\n', "Eh?"},
        {9, 'DictionaryAbuse.run\n', "Eh?"},
        {8, 'DictionaryAbuse.state_3\n', :ok}, {7, '\n', nil},
        {6, 'DictionaryAbuse.run\n', :ok},
        {5, 'DictionaryAbuse.state_2\n', :ok}, {4, '\n', nil},
        {3, 'DictionaryAbuse.run\n', "Hello there!"},
        {2, 'DictionaryAbuse.state_1\n', :ok}],
       [{1, 'alias ProcessState.DictionaryAbuse\n',
         ProcessState.DictionaryAbuse}]}, size: 12, start: 1},
     # [...],
     state: :random,
     "$ancestors": [#PID<0.58.0>],
     "$initial_call": {IEx.Evaluator, :init, 4}]

What’s this? The IEx.Evaluator module that runs our commands in our beloved IEx is storing the session’s command history in the Process Dictionary! It’s even storing the results of the functions.

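We also see “$ancestors” and “$initial_call” stored here. These are written by OTP’s proc_lib when it starts a process, so they appear in OTP-style processes such as this one. $ancestors lists the processes that this one descends from, starting with its parent, and $initial_call is what was used to start this process.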
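Worth noting: these two keys come from OTP’s proc_lib, so a process started with a bare spawn/1 does not have them. The sketch below compares a bare process with a Task, which is assumed here as a convenient example of a proc_lib-started process.

```elixir
parent = self()

# A bare spawn/1 process starts with an empty dictionary.
spawn(fn -> send(parent, {:bare, Process.get()}) end)

bare =
  receive do
    {:bare, dict} -> dict
  after
    1_000 -> :timeout
  end

# A Task is started through proc_lib, which records these keys.
task_keys = Task.await(Task.async(fn -> Process.get_keys() end))

IO.inspect(bare, label: "bare spawn")
IO.inspect(:"$ancestors" in task_keys, label: "task has $ancestors")
IO.inspect(:"$initial_call" in task_keys, label: "task has $initial_call")
```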

In fact, we can see the Process Dictionary in Observer when we view the IEx process. There is a “Dictionary” tab.

    # which process is the IEx.Evaluator?
    iex> self()
    #PID<0.114.0>

    iex> :observer.start

observer inspecting IEx dictionary

Curious! The Observer Dictionary page doesn’t include the :iex_history that we see when calling Process.get/0. Mysteries remain!
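You can also look at a process’s dictionary from code, without Observer, using Process.info/2 with the :dictionary item. A minimal sketch, inspecting the current process and using the :state key from earlier:

```elixir
Process.put(:state, :random)

# Process.info/2 works on any local pid; self() is used here for simplicity.
{:dictionary, dict} = Process.info(self(), :dictionary)

IO.inspect(List.keyfind(dict, :state, 0), label: "entry")
```

Because this takes a pid, it is also a way to peek at another process’s dictionary from the outside, which Process.get/0 cannot do.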

However, it’s time to close out this long post looking at the Elixir/Erlang Process Dictionary.

Conclusion

Logger and IEx both use the Process Dictionary, and both seem to be good and appropriate uses. We should be aware of this tool so we can call on it when it is appropriate. I hope it is clear, though, that it can easily be abused, making code harder to test and reason about. If you abuse it, be prepared to take some heat from your team.

The goal here is to become aware of another, less talked about way of storing state in a Process.

For the final words, I’ll share what Fred Hebert, the author of Learn You Some Erlang for Great Good and Erlang in Anger, had to say on the topic. [source]

The process dictionary is not inherently evil, but it has many drawbacks that have been mentioned countless times already: it is harder to debug and reason about, it has different semantics than most of the language, it breaks when you try to send state to other processes without knowing it’s tied to the process’ very existence, it is not garbage collected until the process dies, it is hard to replace by a different key-value store and it angers people (I do consider this to be a negative). On the positive side, you have better update speed. On the neutral side, you have global access to some values, within the scope of the process: this can both be useful (static config) or dangerous (global scope!)

Look for the tradeoffs you’re ready to make, what your application actually needs. Use the right tool for the right job and make sure the process-global aspect of it and the speed do warrant all the downsides of it. I hope this helps.
