Thursday, July 25, 2013

Erlang Actors

Following my post on how multithreading is hard, I spent some time learning about the alternatives, one of the most promising of which is the actor model. I have read a lot about how Erlang is a good language for building concurrent systems like message queues (RabbitMQ is the prime example), and actors are a core part of Erlang, so I decided to check it out more extensively (I found this tutorial to be useful). Erlang is functional, so you have to think carefully about how to manage "state" if you are coming from a traditional imperative programming background like I am. To guide my learning, I wrote a very primitive blocking queue in Erlang that is bounded and supports only blocking puts and takes. As it turns out, this looks very different from how you might write one in Java using synchronization primitives like locks and condition variables. It's probably easiest to start out with the code in order to guide the discussion:
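
A minimal sketch of such a queue, consistent with the walkthrough that follows, might look like this. The function names new(), put(), take(), listen(), handle_put() and handle_take() come from the discussion; the module name bqueue, the arities, the shape of the message tuples, and the use of a plain list for storage are assumptions made for illustration, not the original listing.

```erlang
-module(bqueue).
-export([new/1, put/2, take/1]).
%% put/2 is also an auto-imported BIF (process dictionary); make it
%% explicit that this module defines its own put/2.
-compile({no_auto_import, [put/2]}).

%% Spawn the listener process and return its pid. The pid is the only
%% handle the caller gets; the queue contents live inside the process.
new(Max) when is_integer(Max), Max > 0 ->
    spawn(fun() -> listen(Max, []) end).

%% Send a put request, then block until the listener acknowledges it.
put(Queue, Value) ->
    Queue ! {put, self(), Value},
    receive
        {Queue, ok} -> ok
    end.

%% Send a take request, then block until the listener replies with a value.
take(Queue) ->
    Queue ! {take, self()},
    receive
        {Queue, {value, Value}} -> Value
    end.

%% The queue state exists only as the arguments of listen/2.
listen(Max, []) ->
    %% Empty: match puts only; take requests wait in the mailbox.
    receive
        {put, From, Value} -> handle_put(Max, [], From, Value)
    end;
listen(Max, Items) when length(Items) >= Max ->
    %% Full: match takes only; put requests wait in the mailbox.
    receive
        {take, From} -> handle_take(Max, Items, From)
    end;
listen(Max, Items) ->
    %% Neither empty nor full: accept whichever request comes first.
    receive
        {put, From, Value} -> handle_put(Max, Items, From, Value);
        {take, From} -> handle_take(Max, Items, From)
    end.

%% Acknowledge the put, then loop with the value appended.
handle_put(Max, Items, From, Value) ->
    From ! {self(), ok},
    listen(Max, Items ++ [Value]).

%% Reply with the oldest value, then loop with the rest.
handle_take(Max, [Head | Tail], From) ->
    From ! {self(), {value, Head}},
    listen(Max, Tail).
```

With this sketch compiled, something like `Q = bqueue:new(2), ok = bqueue:put(Q, a), a = bqueue:take(Q).` in the Erlang shell should exercise the basic path. The listener here never terminates, which is fine for an experiment but would need a stop message in anything longer-lived.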

To start off, we have the new() function which returns a new "instance" of the blocking queue. As it turns out, the function actually just returns an Erlang process ID, which refers to the process that is created by the spawn() call (I like to think of it as a "listener"). This newly-created process is running the listen() function, which we will get to a bit later, but the important thing to understand is that the process is the only place where the "state" of the queue is managed. No state is returned to the caller of new() because everything is immutable in Erlang, so it does not make sense for different actors to actually be carrying around the "state" of the queue. You may already notice that functional programming and the actor model make you think quite differently.

So now we have the setup: whoever creates the queue holds the process ID of the queue's listener, and meanwhile the listener is running as its own process. Next are the put() and take() functions, which follow the same pattern. They each send a message to the listener (the exclamation mark syntax) and then wait for a response (the receive ... end syntax). At a high level, that's all there is to the actor model: processes run independently of each other, except that they can send messages to and receive messages from other processes. Using receive gives us the blocking semantics because the listener will not send the reply message until it has successfully put the new value or taken a value off the queue. Thus we have built-in synchronization without having to mess around with low-level synchronization primitives.

Finally, we get to the meat of the program, the listen() function. The first thing it does is decide what it should listen for, and this is based on what is currently in the queue. Depending on whether the queue is empty, full, or neither, we can receive puts, takes, or both, respectively. If the queue is empty, for example, the receive block only matches puts, so any take messages simply stay in the process's mailbox until the queue gets into a state in which it is able to process them (causing all callers of take() to wait in their respective receive blocks). Lastly, you'll notice that both handle_put() and handle_take() end with a recursive call to listen(). This is, again, because Erlang is functional and everything is immutable. We are not able to use something like a while loop around the receives because that would require that the queue is modified in each iteration, so instead we just call listen() again with the new state of the queue. Because that call is the last thing each handler does, it is a tail call, so the loop can run indefinitely without growing the stack.
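
The "only recognizes puts" behavior described above relies on Erlang's selective receive: a receive block removes the first mailbox message that matches one of its patterns and leaves the rest queued. A small self-contained sketch of just that mechanism follows; the module name selective_demo and the message tuples are hypothetical and chosen only for this illustration.

```erlang
-module(selective_demo).
-export([run/0]).

%% Queue a take and then a put in our own mailbox, then receive with a
%% put-only pattern first. The take arrived earlier but is skipped and
%% stays in the mailbox until a later receive has a pattern for it.
run() ->
    Self = self(),
    Self ! {take, Self},
    Self ! {put, Self, 42},
    First = receive
                {put, _, Value} -> {put, Value}
            end,
    Second = receive
                 {take, _} -> take
             end,
    [First, Second].
```

Calling `selective_demo:run()` returns `[{put,42},take]`: the put is handled first even though the take was sent first, which is the same effect that keeps take() callers waiting while the queue is empty.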

So there it is, a simple blocking queue implemented in a functional style with actors. It required a lot more thinking than I expected to come up with a solution using these unfamiliar tools, but in the end I started realizing their benefits. By taking shared, mutable state out of the picture you can avoid worrying about most of the issues that plague multithreaded programming. Questions that are still open in my mind include whether the actor model is rich enough to capture the complexity of real applications and what the performance implications are of writing code in this way. It has been worthwhile to explore actors through Erlang, though, and I definitely plan on checking out Akka, which brings actors to Scala.
