## Monday, January 21, 2013

### Steiner Trees

I don't usually pay much attention to theory and algorithms results, but I recently came across a paper that proves a new approximation bound for the Steiner tree problem, which is a simple enough problem in graph theory to interest me. The Steiner tree problem is as follows: given a undirected, weighted graph $G(V, E)$ and a subset $R \subseteq V$ of its vertices, find the minimum-weight connected subgraph of $G$ that contains all the vertices in $R$. When $R = V$, it reduces to the classical minimum spanning tree problem, which has a variety of efficient solutions. The Steiner tree problem, however, is known to be NP-complete, so most of the work on it has gone into finding the best approximation algorithms. Although the best results are complex and require the appropriate theory background to understand, there are a couple of nice facts about the Steiner tree problem that can be summarized easily.

The first is that the minimum-weight Steiner tree for $R$ in $G$ has the same cost as the minimum-weight Steiner tree for $R$ in $G^c$, where $G^c$ is the metric closure of $G$. $G^c$ is obtained by creating a complete graph on $V$ where the weight of the edge between $u$ and $v$ is the weight of the shortest path between $u$ and $v$ in $G$. To show this, let $OPT$ and $OPT^c$ be the costs of the minimum-weight Steiner trees for $R$ in $G$ and $G^c$, respectively. We know that $OPT^c \leq OPT$ because $G^c$ has all of the edges of $G$ with less or equal weights, so any Steiner tree in $G$ can be directly translated to a Steiner tree in $G^c$ that is no more expensive. But we can also show that $OPT \leq OPT^c$ by observing that for any tree in $G^c$, we can replace the edges by the shortest paths in $G$. The resulting subgraph has equal cost and contains all the vertices in $R$, so the minimum-weight Steiner tree has cost at most that. Since $OPT = OPT^c$, we can assume without loss of generality that we are working with a complete graph whose distances are a metric (in particular, they satisfy the triangle inequality).

Secondly, we'll provide a simple 2-approximation algorithm for the Steiner tree problem. Consider the minimum spanning tree of $G_R$ (the subgraph induced by $R$), which exists because of the assumption that we have a complete graph. If we let $MST$ be its weight, we claim that $MST < 2 \cdot OPT$. To show this, start with a depth-first traversal of the minimum-weight Steiner tree $T$. This traversal visits all of the vertices in $R$ and uses each edge of $T$ exactly twice (once on the way "down" and once coming back "up"), so it has weight $2 \cdot OPT$. For two consecutive vertices $u, v \in R$ in the traversal, replace the path in the traversal by a direct edge from $u$ to $v$, which from the triangle inequality is no more expensive than the existing path. After doing this for all such pairs, the result is a cycle visiting all vertices in $R$, but this is strictly more expensive than the minimum spanning tree of $G_R$, so we have $MST < 2 \cdot OPT$. Thus outputting the minimum spanning tree of $G_R$ is a 2-approximation for the Steiner tree problem.

The above two results are classical and taught in algorithms classes; once you get beyond the 2-approximation, the required machinery becomes much more advanced. Although I can't claim to fully understand the techniques described in the paper, their approach is based on a formulation of the Steiner tree problem as a linear program (LP), which is essentially a set of linear constraints with a linear cost function to optimize. In combinatorial optimization problems, however, the variables usually have to take on discrete/integer values (e.g. choose this edge or not), whereas linear programs allow them to take on any real values.  The technique of LP relaxation takes the original problem and "relaxes" the discrete/integer constraint to formulate a proper linear program; the cost of doing so is that the solution may no longer be optimal, which is why this leads to approximation algorithms rather than exact ones. But linear programs have polynomial-time algorithms, so the approximations can be computed efficiently.

The core of the algorithm proposed in the paper seems to be an iterative use of LP relaxation. In each iteration, they solve the current instance of their LP (called the "directed-component cut relaxation"). Using the solution, they sample a "component" randomly and reduce it to a single node, so the graph gets smaller with each step. They repeat this process some number of times and then connect the components by computing the minimum spanning tree on the remaining vertices in $R$. This algorithm results in a remarkable $\ln{4} + \epsilon < 1.39$ approximation for the Steiner tree problem, improving on the previous best result of  $1.55$.