Minimum average cost cycle and TSP
After some time, I did again Code Jam – well, not again, this is the first time I do Code Jam, but there is a while I don’t do Programming Competitions. Back in my undergrad I remember all the fun I had with my ACM-ICPC team solving problems and discussing algorithms problems. Actually, ICPC was what made me interested in Algorithms and Theory of Computing for the first time. I was remembering that not only because Code Jam because I came across a nice problem whose solution I learned in programming competitions, specifically a technique I learned to solve this problem.
Let’s formulate the problem in a more abstract way: Given a graph and two functions: a cost function
and a benefit function
, we define the cost-benefit of a set of edges as
. Now, consider those two questions:
Question I: Find the spanning tree of maximum (minimum) cost-benefit.
Question II: Find the cycle of maximum (minimum) cost-benefit.
The solution of those uses binary search. If we can answer the following query: given , is there a cycle (spanning tree) of cost-benefit smaller (larger) than
? We either state there is no such tree (cycle) or exhibit that. How can we solve this? It is simple: consider the graph
with edge weights given by
. Then there is a cycle (spanning tree) of cost benefit
if and only if there is a cycle (spanning tree) in this graph with transformed weights with negative total weight. Finding a cycle with negative weight is easy and can be done, for example, using Bellman Ford’s algorithm. Finding a spanning tree with negative weights can be done using any minimal spanning tree algorithm, as Kruskal, Prim or Boruvka.
Taking for all
we can find using binary search, the cycle with smallest average length, i.e., smallest
where
is the number of edges in the cycle.
Asymmetric Travelling Salesman Problem
We can use this trick just described to design an -approximation to the asymmetric TSP problem. Consider we have
nodes in
and a function
, not necessarily symmetric, such that the triangular inequality holds, i.e.,
. A TSP tour is an ordering
and has total cost:
where . Let OPT be the cost of the optimal tour. It is NP-complete to calculate the optimal, but consider the following approximation algorithm: find the cycle with smallest average cost. Then remove all the nodes in that cycle except one, in the remaining graph find again the cycle of smallest average cost and remove all nodes except one. Continue doing that until there is just one node left. Taking all those cycles together, we have a strongly connected Eulerian graph (in-degrees are equal to out-degrees) for each node). I claim that the total weight of edges
in this Eulerian graph is:
where is the harmonic number. Now, since we have this graph we can find an Eulerian tour and transform it into a TSP tour shortcutting when necessary (triangle inequality guarantees that shortcutting doesn’t decrease the cost of the tour). So, we just need to prove the claim.
In fact, it is not hard to see that after removing some nodes, the optimal tour is still , where
is the tour of smallest cost for all nodes. To see this, just take the original tour and shortcut it, for example, if the original tour passed through a sequence of nodes
but nodes
then by triangle inequality:
so we can just substitute the edges by
. Now, suppose we do
iterations and in the beginning of the
iteration there are
nodes left. So, clearly the average length of the cycle we picked in the algorithm is
and therefore, if
are the cycles chosen, we have:
since:
we plug those two expressions together and we get the claim.