Continuing the exploration of Clojure's concurrency utilities leads us to agents, a built-in way of managing independent, asynchronous change to a single value. An agent resembles a single actor that controls some private state, where the state can only be changed by sending messages to the actor. Agents differ in that they invert control over how the state is modified: instead of the actor accepting a predefined set of messages, each with its own effect on the state, an agent lets the sender supply the function that transforms the state. The agent itself is only responsible for serializing those updates, and anyone can read its current value at any time. When asynchronous updates to independent state are acceptable, agents can be preferable to software transactional memory (STM), since they avoid the overhead of transactions and retries. In this post, I will walk through an example that uses an agent to track metrics over a stream of samples.
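As a warm-up before the metrics example, here is a minimal sketch of the idea that the sender, not the agent, decides how the state changes. The name counter is purely illustrative and not part of the example discussed in this post.

```clojure
;; Hypothetical agent holding a single number; `counter` is an
;; illustrative name, not part of the post's metrics code.
(def counter (agent 0))

;; The sender supplies the function that transforms the state.
;; Each send returns immediately; the agent applies actions in order.
(send counter inc)     ; state becomes (inc 0)   => 1
(send counter + 10)    ; state becomes (+ 1 10)  => 11

;; Block until the actions sent from this thread have run, then read.
(await counter)
(println @counter)     ; prints 11
```

Reading with @counter never blocks; it simply returns whatever value the agent holds at that moment, which is why the await comes first here.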
The first line shows how to initialize an agent. In our case, its state is a map with three values: the sum of all samples, the sum of the squares of all samples, and the number of samples. By updating these three values for each sample, we can compute the mean and variance at any point in time. The two "public" functions here are record-sample and get-metrics, whose purposes should be clear from their names. Because we want recording to be thread-safe, we route every update through the agent with the send function: (send global-metrics update-metrics value). The send call returns immediately and queues the action for the agent to run at some point in the future; the agent then calls update-metrics with its current state and the sample value, and the returned map becomes the new state. To read the metrics in get-metrics, we first call (await global-metrics), which blocks until all actions sent so far from the current thread have finished. Without it, we could read the state before a sample we just recorded has been applied, since agents process actions asynchronously. Finally, we read the current value of global-metrics and return the computed mean and variance.
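The description above can be sketched roughly as follows. This is a reconstruction from the prose, not the original listing, so the map keys (:sum, :sum-squares, :count), the docstrings, and the line positions are my assumptions; only global-metrics, update-metrics, record-sample, and get-metrics are names taken from the text. The variance is computed as the population variance, and get-metrics assumes at least one sample has been recorded.

```clojure
;; Agent state: running sum, sum of squares, and sample count.
;; The key names are assumptions made for this sketch.
(def global-metrics (agent {:sum 0 :sum-squares 0 :count 0}))

(defn update-metrics
  "Pure function: takes the old metrics map and a sample, returns the new map."
  [metrics value]
  ;; (Thread/sleep 1) ; uncomment to slow each update down artificially
  (-> metrics
      (update :sum + value)
      (update :sum-squares + (* value value))
      (update :count inc)))

(defn record-sample
  "Queues an asynchronous update on the agent and returns immediately."
  [value]
  (send global-metrics update-metrics value))

(defn get-metrics
  "Waits for pending actions, then returns the mean and population variance.
  Assumes at least one sample has been recorded."
  []
  (await global-metrics)
  (let [{:keys [sum sum-squares] n :count} @global-metrics
        mean (/ sum n)]
    {:mean mean
     :variance (- (/ sum-squares n) (* mean mean))}))

;; Usage: record 0..9999 from several threads, then read the metrics.
(dorun (pmap record-sample (range 10000)))
(println (get-metrics)) ; prints {:mean 9999/2, :variance 33333333/4}
```

Because the samples are integers, Clojure keeps the results as exact ratios. When running this as a standalone script, calling (shutdown-agents) at the very end lets the JVM exit promptly.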
The final two lines show these functions in use: we record the samples 0 through 9999 in parallel and then get the mean and variance. Running this, we see the expected result: {:mean 9999/2, :variance 33333333/4}. To understand agents a bit more, I included the commented-out line 4 to demonstrate why the await call is necessary. If we sleep on that line, we still obtain the correct result, but it takes about 10 seconds because the agent runs its actions one at a time. If we sleep on that line and also remove the await on line 18, we get something arbitrary like {:mean 221/15, :variance 18404/225}, computed from whatever subset of samples happened to be recorded at that moment (the result is nondeterministic). Thus it is essential to let the pending actions finish before attempting to read the metrics. Other than that, there is no explicit synchronization visible in this code; it is hidden away by the agent abstraction.
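The effect of await can also be seen in isolation. The following is a small, self-contained sketch with deliberately slow actions; slow-counter and slow-inc are hypothetical names used only for this illustration, and the 10 ms sleep is an arbitrary choice.

```clojure
;; Hypothetical agent whose actions are artificially slow.
(def slow-counter (agent 0))

(defn slow-inc
  "Increments n after a short pause to simulate a slow update."
  [n]
  (Thread/sleep 10)
  (inc n))

;; Queue 20 actions; each send returns immediately.
(dotimes [_ 20] (send slow-counter slow-inc))

;; Reading right away sees only the actions that have finished so far,
;; so this value is nondeterministic and most likely below 20.
(def before-await @slow-counter)

;; After await, every action sent from this thread has been applied.
(await slow-counter)
(def after-await @slow-counter)

(println "before await:" before-await "after await:" after-await)
```

The first number varies from run to run, while the second is always 20, which mirrors the difference between reading the metrics with and without the await call.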
Clojure offers a wide array of tools for dealing with shared, mutable state. Agents are one of the core tools the language provides for isolating updates to a piece of state, so that the programmer does not have to manage that coordination by hand. They also play well with the STM: actions sent to agents within a transaction are held until the transaction commits, so each action is dispatched exactly once regardless of how many times the transaction has to retry. In the actor model with Erlang, the immutability constraint forced us to dedicate an actor to managing any shared, mutable state; Clojure agents are a specialization of that concept designed for simplicity of code. Overall, Clojure seems to have taken a very open-minded approach to the concurrency problem, presenting a handful of distinct solutions that combine well.
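To make the interaction with the STM concrete, here is a hedged sketch of sending to an agent from inside a transaction. The names balance, audit-log, and withdraw! are hypothetical and chosen only for illustration; they do not come from the metrics example.

```clojure
;; Hypothetical ref and agent used only for this illustration.
(def balance (ref 100))
(def audit-log (agent []))

(defn withdraw!
  "Subtracts amount from balance and logs the withdrawal via an agent."
  [amount]
  (dosync
    (alter balance - amount)
    ;; This send is held until the transaction commits, so it is
    ;; dispatched once even if the transaction body is retried.
    (send audit-log conj [:withdraw amount])))

(withdraw! 30)

;; The transaction has committed, so the action is now queued;
;; wait for it to run before reading the log.
(await audit-log)
(println @balance @audit-log) ; prints 70 [[:withdraw 30]]
```

This makes agents a convenient place for side effects such as logging that should happen only when a transaction actually succeeds.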