Anyone who has worked with MapReduce knows that, while it suits many distributed computations well, it is very difficult to implement matrix computations using the framework. It is possible to represent matrix-based algorithms such as PageRank using MapReduce (e.g. here), but it definitely isn't the most natural or efficient way. Moreover, PageRank is pretty simple as far as matrix computations go, and once you want to do things like Cholesky decompositions for solving systems of linear equations or singular value decompositions, it gets very tricky. MapReduce was simply not built for these use cases, so there is nothing to fault it for, but since it is the standard go-to distributed computing framework, people have tried to do just about everything with it. Different problems require different solutions, though, and the creators of MadLINQ, a new system out of Microsoft Research, are trying to overcome the difficulty of implementing complex, distributed matrix algorithms.
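To see why PageRank fits MapReduce but only awkwardly, here is a minimal sketch of a single PageRank iteration phrased as a map step and a reduce step. This is my own illustration in plain Python; the names and structure are not taken from any particular MapReduce implementation or from the MadLINQ paper.

```python
from collections import defaultdict

def map_phase(graph, ranks):
    # Map step: each page emits an equal share of its rank to every page it links to.
    for src, out_links in graph.items():
        share = ranks[src] / len(out_links)
        for dst in out_links:
            yield dst, share

def reduce_phase(contributions, num_pages, damping=0.85):
    # Reduce step: sum the incoming shares per page and apply the damping factor.
    totals = defaultdict(float)
    for dst, share in contributions:
        totals[dst] += share
    base = (1.0 - damping) / num_pages
    return {page: base + damping * total for page, total in totals.items()}

# Toy graph: page -> pages it links to (every page here has in- and out-links).
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = {page: 1.0 / len(graph) for page in graph}
ranks = reduce_phase(map_phase(graph, ranks), len(graph))  # one iteration
```

The awkward part is that a real PageRank run has to repeat this map/reduce pair until convergence, re-reading the graph on every pass, which is exactly the kind of iterative, matrix-flavored workload MapReduce handles poorly.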
The MadLINQ system has a lot in common with its cousin DryadLINQ, which it integrates with. Both use the directed acyclic graph (DAG) model of computation, where each node represents some unit of computation that depends on the outputs of some other nodes. For example, MapReduce itself can be represented in this model by a set of "map" nodes and a set of "reduce" nodes, where an edge exists from each of the map nodes to each of the reduce nodes. The key then lies in how MadLINQ maps a matrix computation into a DAG that can be efficiently executed in parallel. To do this, the authors use the familiar tiling abstraction applied in many parallel matrix operations (e.g. tiled matrix multiplication), where the matrix is broken into disjoint, square sub-matrices to operate on. Each node in the DAG is then an optimized, parallel matrix operation on a set of input tiles from the original matrix or intermediate results. One nice thing is that they abstract away the underlying matrix library used for the individual node computations, so that improvements in that space (which many people work on) will translate into improvements in MadLINQ.
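To make the tiling idea concrete, here is a rough sketch in Python/NumPy (my own illustration; MadLINQ itself is programmed through a .NET/LINQ-style interface) of how a tiled matrix multiply breaks into per-tile tasks, each of which would become a node in the DAG.

```python
import numpy as np

def split_into_tiles(matrix, tile):
    # Break a square matrix into a grid of tile x tile sub-matrices.
    # For simplicity this assumes the matrix size is a multiple of `tile`.
    n = matrix.shape[0]
    return [[matrix[i:i + tile, j:j + tile] for j in range(0, n, tile)]
            for i in range(0, n, tile)]

def tiled_matmul(a, b, tile=2):
    # Each inner "a_tiles[i][k] @ b_tiles[k][j]" product is an independent unit
    # of work; in a MadLINQ-style DAG it would be a node whose inputs are two
    # tiles and whose output feeds the accumulation for the C[i][j] tile. Here
    # NumPy stands in for the optimized single-node matrix kernel.
    a_tiles, b_tiles = split_into_tiles(a, tile), split_into_tiles(b, tile)
    grid = len(a_tiles)
    c = np.zeros_like(a, dtype=float)
    for i in range(grid):
        for j in range(grid):
            acc = np.zeros((tile, tile))
            for k in range(grid):
                acc += a_tiles[i][k] @ b_tiles[k][j]
            c[i * tile:(i + 1) * tile, j * tile:(j + 1) * tile] = acc
    return c

a = np.random.rand(4, 4)
b = np.random.rand(4, 4)
assert np.allclose(tiled_matmul(a, b), a @ b)
```

The appeal of the abstraction is that the scheduler only has to reason about tiles and the dependencies between them, while the per-tile kernel can be swapped for whatever single-machine matrix library is fastest.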
One interesting optimization implemented in MadLINQ is what the authors call "fine-grained pipelining." In both MapReduce and DryadLINQ, a node in the DAG must wait for its predecessors to completely finish processing before reading all of their output and executing. This has two negative consequences: first, during periods where the DAG offers little parallelism (you can only break down the units of execution so far before the overhead exceeds the benefits), much of the cluster sits idle; second, network utilization becomes bursty because each node reads the entirety of its inputs at once. MadLINQ instead allows individual nodes to produce partial output and consume partial input, enabling pipelining at the sub-node level (hence the term "fine-grained"). This is particularly useful in the context of matrix computations because many of the individual nodes also execute tile-based algorithms, which lend themselves to this granular pipelining. The authors went to quite a bit of trouble to optimize the system as much as possible and really "solve" the use case of distributed matrix computations.
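As a toy way to picture the difference, here is a Python-generator sketch (my own analogy, not MadLINQ's actual channel mechanism) in which a downstream node starts consuming output tiles as soon as the upstream node emits them, rather than waiting for the entire result to be materialized.

```python
import numpy as np

def upstream_node(a, b, tile=2):
    # Produce output tiles of C = A * B one at a time (square matrices whose
    # size is a multiple of `tile`), emitting each tile as soon as it is done.
    n = a.shape[0]
    for i in range(0, n, tile):
        for j in range(0, n, tile):
            block = sum(a[i:i + tile, k:k + tile] @ b[k:k + tile, j:j + tile]
                        for k in range(0, n, tile))
            yield (i, j), block

def downstream_node(tile_stream):
    # Consume tiles as they arrive, here accumulating a Frobenius norm of C.
    # The consumer's work overlaps with the producer's instead of starting
    # only after the full matrix exists.
    total = 0.0
    for (_i, _j), block in tile_stream:
        total += float(np.sum(block ** 2))
    return np.sqrt(total)

a = np.random.rand(4, 4)
b = np.random.rand(4, 4)
assert np.isclose(downstream_node(upstream_node(a, b)), np.linalg.norm(a @ b))
```

In a single process the generator buys nothing, but spread across machines this same producer/consumer structure is what lets MadLINQ keep the cluster busy when the DAG has little node-level parallelism and smooth out the bursts of network traffic.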
MadLINQ really speaks to how hard problems in distributed systems are. MapReduce has been a huge step forward over the last 10 years, but it has its limitations, and as more and more people and companies turn to large-scale computation and processing, new solutions will be needed for the wide array of problems out there. The fundamental design decisions that must be made when building a new distributed system inherently limit its applications in a way that, for example, modern general-purpose processors are not limited. People (generally) do not design application-specific processors anymore, and one would expect distributed systems to eventually reach that point as well, but there is certainly a long road ahead.