Archive for the ‘theory’ Category

Competitive Auctions

September 17th, 2009

This week I will present at the Theory Discussion Group about Competitive Auctions. It is mainly a series of results from papers by Jason Hartline, Andrew Goldberg, Anna Karlin, Amos Fiat, and others. The first paper is Competitive Auctions and Digital Goods and the second is Competitive Generalized Auctions. My plan is to begin with a short introduction to Mechanism Design, the concept of truthfulness, and the characterization of truthful mechanisms for single-parameter agents. Then we describe the Random Sampling Auction for Digital Goods, and in the end we discuss open questions. I thought writing a blog post would be a good way of organizing my ideas for the talk.

1. Mechanism Design and Truthfulness

A mechanism is an algorithm augmented with economic incentives. Mechanisms are usually applied in the following context: there is an algorithmic problem whose input is distributed among several agents, and these agents have some interest in the final outcome and therefore may try to manipulate the algorithm. Today we restrict our attention to a specific setting called single-parameter agents. In this setting, there is a set {N} consisting of {n} agents and a service. Each agent {i \in N} has a value {v_i} for receiving the service and value {0} otherwise. We can think of {v_i} as the maximum amount player {i} is willing to pay for that service. The environment {\mathcal{X} \subseteq 2^N} is the collection of subsets of bidders that can be served simultaneously. For example:

  1. Single item auction: {\mathcal{X} =\{S; \vert S \vert \leq 1\}}
  2. Multi item auction: {\mathcal{X} =\{S; \vert S \vert \leq k\}}
  3. Digital goods auction: {\mathcal{X} =2^N}
  4. Matroid auctions: {\mathcal{X}} is a matroid on {N}
  5. Path auctions: {N} is the set of edges in a graph and {\mathcal{X}} is the set of {s-t}-paths in the graph
  6. Knapsack auctions: there is a size {s_i} for each {i \in N} and {S \in \mathcal{X}} iff {\sum_{i \in S} s_i \leq C} for a fixed {C}

Most mechanism design problems focus on maximizing (or approximating) the social welfare, i.e., finding {S \in \mathcal{X}} maximizing {\sum_{i \in S} v_i}. Our focus here will be maximizing the revenue of the auctioneer. Before we start searching for such a mechanism, we should first decide which properties it is supposed to have, and maybe even before that, define what we mean by a mechanism. First, the agents report their valuations {v_i} (which can be their true valuations or lies); then the mechanism decides on an allocation {S \subseteq N} (possibly in a randomized way) and charges a payment {P_i} to each allocated agent. The profit of the auctioneer is {\sum_{i \in S} P_i} and the utility of a bidder is:

\displaystyle u_i = \left\{ \begin{aligned} v_i - P_i &, i \in S \\ 0 &, i \notin S \end{aligned} \right.

The agents will report valuations so as to maximize their final utility. We could either consider a general mechanism and analyze the profit/social welfare in the game it induces, or we could design a mechanism that gives the bidders an incentive to report their true valuations. The revelation principle says there is no loss of generality in considering only mechanisms of the second type; the intuition is that mechanisms of the first type can be simulated by mechanisms of the second type. So we restrict our attention to mechanisms of the second type, which we call truthful mechanisms. This definition is clear for deterministic mechanisms but not so clear for randomized ones. There are two such definitions:

  • Universal Truthful mechanisms: distribution over deterministic truthful mechanisms, i.e., some coins are tossed and based on those coins, we choose a deterministic mechanism and run it. Even if the players knew the random coins, the mechanism would still be truthful.
  • Truthful in Expectation mechanisms: let {u_i(b_i)} be the utility of agent {i} if he bids {b_i}. Since the mechanism is randomized, this is a random variable. Truthful in expectation means that {\mathop{\mathbb E}[u_i(v_i)] \geq \mathop{\mathbb E}[u_i(b_i)], \forall b_i}.

Clearly all Universal Truthful mechanisms are Truthful in Expectation but the converse is not true. Now, before we proceed, we will redefine a mechanism in a more formal way so that it will be easier to reason about:

Definition 1 A mechanism {\mathcal{M}} is a function that associates with each {v \in {\mathbb R}^N} a distribution over elements of {\mathcal{X}}.

Theorem 2 Let {x_i(v) = \sum_{S \in \mathcal{X} : i \in S} \Pr_{\mathcal{M}(v)}[S]} be the probability that {i} is allocated by the mechanism when {v} is reported. The mechanism is truthful iff {x_i(v_i, v_{-i})} is monotone non-decreasing in {v_i} and each allocated bidder is charged payment:

\displaystyle P_i = v_i - \frac{1}{x_i(v_i, v_{-i})} \int_0^{v_i} x_i(w, v_{-i}) dw

This is a classical theorem by Myerson characterizing truthful auctions. It is not hard to see that the auction defined above is truthful: we just need to check that {x_i(v_i, v_{-i}) (v_i - P(v_i, v_{-i})) \geq x_i(v'_i, v_{-i}) (v_i - P(v'_i, v_{-i}))} for all {v'_i}. The converse is trickier, but also not hard to see.
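
To make the payment formula concrete, here is a minimal numerical sketch (my own illustration, not from the papers): it evaluates Myerson's payment for a monotone allocation rule by approximating the integral with a Riemann sum, and sanity-checks it on a single-item second-price auction.

```python
def myerson_payment(x_i, v_i, v_minus_i, steps=100_000):
    r"""Evaluate P_i = v_i - (1 / x_i(v_i, v_-i)) \int_0^{v_i} x_i(w, v_-i) dw
    by a midpoint Riemann sum, for a monotone allocation rule x_i."""
    dw = v_i / steps
    integral = sum(x_i((j + 0.5) * dw, v_minus_i) for j in range(steps)) * dw
    top = x_i(v_i, v_minus_i)
    assert top > 0, "payments are only charged to allocated bidders"
    return v_i - integral / top

# Sanity check: in a single-item second-price auction, bidder i wins iff
# his bid beats the best competing bid, and the formula recovers the
# second-price payment max(v_minus_i).
x = lambda w, others: 1.0 if w >= max(others) else 0.0
print(myerson_payment(x, v_i=10.0, v_minus_i=[3.0, 7.0]))  # ~7.0
```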

Note that this characterization implies the following characterization of deterministic truthful auctions, i.e., auctions mapping each {v \in {\mathbb R}^N} to a single set {S \in \mathcal{X}} (the probability distribution is concentrated on one set).

Theorem 3 A mechanism is a deterministic truthful auction iff there are functions {f_i(b_{-i})} such that we allocate to bidder {i} iff {b_i \geq f_i(b_{-i})}, and in that case we charge him payment {f_i(b_{-i})}.

It is actually easy to construct this function: given a mechanism, {b_i \mapsto x_i(b_i, b_{-i})} is a monotone {\{0,1\}}-valued function. Let {f_i(b_{-i})} be the point where it transitions from {0} to {1}. Now, we can give a similar characterization for Universal Truthful Mechanisms:

Theorem 4 A mechanism is a universal truthful randomized auction if there are functions {f_i(r, b_{-i})} such that we allocate to bidder {i} iff {b_i \geq f_i(r, b_{-i})}, and in that case we charge him payment {f_i(r, b_{-i})}, where {r} are the random bits of the mechanism.

2. Profit benchmarks

Let's consider a Digital Goods auction, where {\mathcal{X} = 2^N}. Two natural profit benchmarks are {\mathcal{T}(v) = \sum_i v_i} and {\mathcal{F}(v) = \max_i i \cdot v_i}, where we assume {v_1 \geq v_2 \geq \hdots \geq v_n}: the first is the best profit we can extract charging different prices, and the second is the best profit we can hope to extract by charging a single fixed price. Unfortunately, it is impossible to design a truthful mechanism that even {O(1)}-approximates either of these benchmarks on every input. The intuition is that {v_1} can be much larger than the rest, so there is no way of setting {f_1(b_{-1})} properly. Under the assumption that the highest value is not much larger than the second highest, we can get a good profit approximation, though. This motivates us to find a universal truthful mechanism that approximates the following profit benchmark:

\displaystyle \mathcal{F}^{(2)}(v) = \max_{i \geq 2} i v_i

which is the highest single-price profit we can get by selling to at least {2} agents. We will show a truthful mechanism that {4}-approximates this benchmark.
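
Both benchmarks are easy to compute from the sorted bid vector; here is a small sketch (my own toy example, not from the papers):

```python
def single_price_profit(bids, min_winners=1):
    """F(v) = max_i i * v_i over bids sorted in decreasing order;
    passing min_winners=2 gives the benchmark F^(2)(v)."""
    v = sorted(bids, reverse=True)
    return max(i * v[i - 1] for i in range(min_winners, len(v) + 1))

bids = [100, 3, 2]
print(single_price_profit(bids))                 # F = 100 (sell only to the top bidder)
print(single_price_profit(bids, min_winners=2))  # F^(2) = 6 (price 3 or 2)
```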

3. Profit Extractors

Profit extractors are building blocks of many mechanisms. The goal of a profit extractor is, given a target profit {C}, to extract that profit from a set of agents whenever possible. For now, think of {C} as an exogenous constant. Consider the following mechanism, called CostShare{_C (v)}: find the largest {k} s.t. {k \cdot v_k \geq C} (with values sorted as {v_1 \geq \hdots \geq v_n}); allocate to those {k} highest bidders, charging each of them {C/k}. If no such {k} exists, allocate to no one.
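
A minimal sketch of CostShare, following the description above (the code and the example bids are my own):

```python
def cost_share(bids, C):
    """CostShare_C: find the largest k such that the k-th highest bid
    v_k satisfies k * v_k >= C; sell to those k bidders at price C / k.
    Returns (winner indices, price per winner), or (set(), None) if the
    target profit C cannot be extracted."""
    order = sorted(range(len(bids)), key=lambda i: -bids[i])
    best_k = max((k for k in range(1, len(bids) + 1)
                  if k * bids[order[k - 1]] >= C), default=0)
    if best_k == 0:
        return set(), None
    return set(order[:best_k]), C / best_k

print(cost_share([5, 4, 3], C=9))   # ({0, 1, 2}, 3.0): three winners pay 3 each
print(cost_share([5, 4, 3], C=20))  # (set(), None): F(v) = 9 < 20, no sale
```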

Lemma 5 CostShare{_C} is a truthful profit-extractor that can extract profit {C} whenever {\mathcal{F}(v) = \max_i i v_i \geq C}.

Proof: It is clear that it extracts profit exactly {C} whenever {\mathcal{F}(v) \geq C}. We just need to prove it is a truthful mechanism, and this can be done by checking the characterization of truthful mechanisms. Suppose that under CostShare{_C (v)} exactly {k} bidders get the item, and let's look at a bidder {i}. If bidder {i} is not getting the item, then his value is smaller than {C/k}: otherwise we could include all bidders up to {i} and sell at price {C/k_1} for some {k_1 > k}. It is easy to see that bidder {i} gets the item precisely when he raises his value {v_i} to some value greater than or equal to {C/k}.

On the other hand, if {i} is currently getting the item under {v}, then increasing his value won't change that: for any value {v_i \geq C/k}, he still gets the item. For {v_i < C/k} he doesn't get it. Suppose he did; then at least {k+1} people would get the item, because the price charged to {i} would have to be less than {v_i < C/k}. Therefore, increasing {v_i} back to its original value, we could still sell to {k+1} players, which is a contradiction, since we assumed we were selling to {k} players.

We checked monotonicity; we also need to check the payments, but it is straightforward to verify that they satisfy the payment formula, since {x_i(v_i) = 1} for {v_i \geq C/k} and zero otherwise. \Box

4. Random Sampling Auctions

Now, using this profit extractor as a building block, the main idea is to estimate a target {C} smaller than {\mathcal{F}(v)} using one subset of the agents and extract that profit from the other subset using the profit extractor. First we partition {N} into two sets {N'} and {N''}, tossing a fair coin for each agent to decide where to place it; then we calculate {\mathcal{F}' = \mathcal{F}(v_{N'})} and {\mathcal{F}'' = \mathcal{F}(v_{N''})}. Now we run CostShare{_{\mathcal{F}'} (v'')} and CostShare{_{\mathcal{F}''} (v')}. This is called the Random Cost Sharing Auction.
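
Putting the pieces together, here is a sketch of the auction (my own code, reusing single_price_profit and cost_share from the sketches above):

```python
import random

def random_sampling_auction(bids, rng=random):
    """Split the bidders into two halves with fair coin flips, compute
    F on each half, and run CostShare on each half with the other
    half's F as the target profit. Returns {bidder index: price}."""
    left, right = [], []
    for i in range(len(bids)):
        (left if rng.random() < 0.5 else right).append(i)
    payments = {}
    for side, other in ((left, right), (right, left)):
        other_bids = [bids[i] for i in other]
        target = single_price_profit(other_bids) if other_bids else 0
        winners, price = cost_share([bids[i] for i in side], target)
        for j in winners:
            payments[side[j]] = price
    return payments

# Example (output varies with the coin flips): the revenue is
# min{F', F''}, which is positive whenever both sides are nonempty.
print(random_sampling_auction([10, 10, 10, 1]))
```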

Theorem 6 The Random Cost Sharing Auction is a truthful auction whose revenue {4}-approximates the benchmark {\mathcal{F}^{(2)}(v)}.

Proof: Let {\mathcal{R}} be the random variable giving the revenue of the sampling auction. It is clear that {\mathcal{R} = \min \{ \mathcal{F}', \mathcal{F}'' \}}: the CostShare run whose target is the smaller benchmark succeeds and extracts exactly that target, while the other run extracts nothing. Let's write {\mathcal{F}^{(2)}(v) = kp}, meaning that we sell {k} items at price {p}, and let {k = k' + k''}, where {k'} and {k''} count how many of those {k} winners fell in {N'} and {N''}, respectively. Then clearly {\mathcal{F}' \geq p k'} and {\mathcal{F}'' \geq p k''}, which gives us:

\displaystyle \frac{\mathcal{R}}{\mathcal{F}^{(2)}} = \frac{\min\{\mathcal{F}', \mathcal{F}''\}}{\mathcal{F}^{(2)}} \geq \frac{\min\{k'p, k''p\}}{kp} = \frac{\min\{k', k''\}}{k}

and from there, it is a straightforward probability exercise:

\displaystyle \frac{\mathop{\mathbb E}[{\mathcal{R}}]}{\mathcal{F}^{(2)}} = \mathop{\mathbb E}\left[{ \frac{\min\{k', k''\}}{k} }\right] = \frac{1}{k} \sum_{i=1}^{k-1} \min\{ i, k-i \} \binom{k}{i} 2^{-k}

since:

\displaystyle \frac{k}{2} = \sum_{i = 0}^k i \binom{k}{i} 2^{-k} \leq \frac{k}{4} + 2 \sum_{i = 0}^{k/2} i \binom{k}{i} 2^{-k}

and therefore:

\displaystyle  \frac{\mathop{\mathbb E}[{\mathcal{R}}]}{\mathcal{F}^{(2)}} \geq \frac{1}{4} \Box

Similar approximations extend to more general environments with very little change. For example, for multi-unit auctions, where {\mathcal{X} = \{ S; \vert S \vert \leq k \}}, we use the benchmark {\mathcal{F}^{(2,k)} = \max_{2 \leq i \leq k} i v_i} and we can be {O(1)}-competitive against it by random sampling, evaluating {\mathcal{F}^{(1,k)} = \max_{1 \leq i \leq k} i v_i} on both sets and running a profit extractor on each. The profit extractor is a simple generalization of the previous one.

Minimum average cost cycle and TSP

September 2nd, 2009

After a long time, I did a programming competition again: Code Jam. Well, not exactly "again", since this is the first time I've done Code Jam, but it has been a while since my last programming competition. Back in my undergrad I remember all the fun I had with my ACM-ICPC team solving problems and discussing algorithms. Actually, ICPC was what got me interested in Algorithms and Theory of Computing in the first place. I was reminded of that not only because of Code Jam, but also because I came across a nice problem whose solution uses a technique I learned in programming competitions.

Let's formulate the problem in a more abstract way: given a graph {G = (V,E)} and two functions, a cost function {c:E \rightarrow {\mathbb R}_+} and a benefit function {b:E \rightarrow {\mathbb R}_+}, we define the cost-benefit of a set of edges {S \subseteq E} as {\frac{b(S)}{c(S)}}. Now consider these two questions:

Question I: Find the spanning tree of maximum (minimum) cost-benefit.

Question II: Find the cycle of maximum (minimum) cost-benefit.

The solution to both uses binary search. The key is to answer the following query: given {\beta > 0}, is there a cycle (spanning tree) of cost-benefit smaller (larger) than {\beta}? We either state there is no such cycle (tree) or exhibit one. How can we solve this? It is simple: consider the graph {G} with edge weights given by {b_e - \beta c_e}. Then there is a cycle (spanning tree) of cost-benefit {\leq \beta} if and only if the graph with transformed weights has a cycle (spanning tree) of negative total weight. Finding a cycle of negative weight is easy and can be done, for example, using the Bellman-Ford algorithm. Finding a spanning tree of negative total weight can be done using any minimum spanning tree algorithm, such as Kruskal's, Prim's, or Borůvka's.

Taking {c(e) = 1} for all {e \in E}, we can find, using binary search, the cycle of smallest average weight, i.e., smallest {b(C) / \vert C \vert}, where {\vert C \vert} is the number of edges in the cycle.
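
Here is a minimal sketch of this procedure (my own code): the graph is a list of directed edges {(u, v, b_e)} with unit costs, has_negative_cycle is plain Bellman-Ford, and the binary search narrows down the minimum average weight.

```python
def has_negative_cycle(n, edges):
    """Bellman-Ford with all distances started at 0 (equivalent to a
    virtual source linked to every node): if relaxations still succeed
    after n rounds, some cycle has negative total weight."""
    dist = [0.0] * n
    for _ in range(n):
        updated = False
        for u, v, w in edges:
            if dist[u] + w < dist[v] - 1e-12:
                dist[v] = dist[u] + w
                updated = True
        if not updated:
            return False
    return True

def min_average_cycle(n, edges, tol=1e-7):
    """Binary search on beta: a cycle of average weight < beta exists
    iff the graph reweighted by b_e - beta has a negative cycle."""
    lo, hi = 0.0, max(w for _, _, w in edges)
    while hi - lo > tol:
        beta = (lo + hi) / 2
        if has_negative_cycle(n, [(u, v, w - beta) for u, v, w in edges]):
            hi = beta
        else:
            lo = beta
    return (lo + hi) / 2

edges = [(0, 1, 2.0), (1, 0, 1.0), (1, 2, 3.0), (2, 0, 1.0)]
print(min_average_cycle(3, edges))  # ~1.5, the cycle 0 -> 1 -> 0
```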

Asymmetric Travelling Salesman Problem

We can use the trick just described to design an {O(\log n)}-approximation for the asymmetric TSP. Suppose we have {n} nodes in {V} and a function {c: V \times V \rightarrow {\mathbb R}_+}, not necessarily symmetric, such that the triangle inequality holds, i.e., {c(i,j) \leq c(i,k) + c(k,j)}. A TSP tour is an ordering {\pi: \{1, \hdots, n\} \rightarrow V} and has total cost:

\displaystyle c(\pi) = \sum_{j=1}^n c(\pi_j, \pi_{j+1})

where {\pi_{n+1} = \pi_1}. Let OPT be the cost of the optimal tour. Computing the optimal tour is NP-hard, but consider the following approximation algorithm: find the cycle of smallest average cost, then remove all the nodes in that cycle except one; in the remaining graph, again find the cycle of smallest average cost and remove all its nodes except one; continue until just one node is left. Taking all those cycles together, we get a strongly connected Eulerian graph (in-degree equals out-degree at every node). I claim that the total weight of the edges {E'} of this Eulerian graph satisfies:

\displaystyle c(E') \leq 2 \mathcal{H}_n \cdot OPT

where {\mathcal{H}_n = \sum_{j=1}^n \frac{1}{j} = O(\log n)} is the harmonic number. Given this graph, we can find an Eulerian tour and transform it into a TSP tour, shortcutting when necessary (the triangle inequality guarantees that shortcutting doesn't increase the cost of the tour). So we just need to prove the claim.
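
Here is a sketch of the whole algorithm (my own code; min_mean_cycle(nodes, c) is an assumed helper returning the minimum average cost cycle as a list of nodes, e.g. the binary-search routine above extended to recover the negative cycle found by Bellman-Ford):

```python
from collections import defaultdict

def atsp_log_tour(nodes, c, min_mean_cycle):
    """O(log n)-approximation sketch: repeatedly extract a minimum
    average cost cycle, keep one representative node per cycle, then
    shortcut an Eulerian tour of the union of all cycles.
    `min_mean_cycle(nodes, c)` is an assumed helper returning the cycle
    of smallest average cost (as a list of nodes) on the given nodes."""
    remaining, arcs = list(nodes), []
    while len(remaining) > 1:
        cycle = min_mean_cycle(remaining, c)
        arcs += [(cycle[i], cycle[(i + 1) % len(cycle)])
                 for i in range(len(cycle))]
        # all cycle nodes but one representative are removed
        remaining = [v for v in remaining
                     if v not in cycle or v == cycle[0]]
    # Hierholzer's algorithm: the union of the cycles is connected and
    # in-degree = out-degree everywhere, so an Eulerian tour exists.
    out = defaultdict(list)
    for u, v in arcs:
        out[u].append(v)
    stack, walk = [remaining[0]], []
    while stack:
        if out[stack[-1]]:
            stack.append(out[stack[-1]].pop())
        else:
            walk.append(stack.pop())
    # Shortcut repeated visits; by the triangle inequality this does
    # not increase the cost.
    seen, tour = set(), []
    for v in reversed(walk):
        if v not in seen:
            seen.add(v)
            tour.append(v)
    return tour
```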

In fact, it is not hard to see that after removing some nodes, the optimal tour on the remaining nodes still costs {\leq OPT}, where {OPT} is the cost of the optimal tour on all nodes. To see this, just take the original tour and shortcut it: for example, if the original tour passed through a sequence of nodes {i_1, \hdots, i_p} but the nodes {i_2, \hdots, i_{p-1}} were removed, then by the triangle inequality:

\displaystyle c(i_1, i_p) \leq \sum_{j=1}^{p-1} c(i_j, i_{j+1})

so we can just substitute the edges {(i_1, i_2), \hdots, (i_{p-1}, i_p)} by {(i_1, i_p)}. Now, suppose we do {k} iterations and at the beginning of the {j^{th}} iteration there are {n_j} nodes left. The optimal tour on those {n_j} nodes is itself a cycle with {n_j} edges and cost {\leq OPT}, so the average cost of the cycle picked by the algorithm is {\leq \frac{OPT}{n_j}}; moreover, the cycle {C_j} has {n_j - n_{j+1} + 1} edges, since it removes {n_j - n_{j+1}} nodes and keeps one representative. Therefore, if {C_1, \hdots, C_k} are the cycles chosen, we have:

\displaystyle c(E') = \sum_{j=1}^k c(C_j) \leq \sum_{j=1}^k \frac{OPT}{n_j} (n_j - n_{j+1} + 1)

since:

\displaystyle \frac{n_j - n_{j+1} + 1}{n_j} \leq \frac{1}{n_j} + \frac{1}{n_j - 1} + \hdots + \frac{1}{n_{j+1}}

we plug those two expressions together and get the claim: each term {\frac{1}{i}} appears in at most two of the ranges {\{n_{j+1}, \hdots, n_j\}} (consecutive ranges share an endpoint), so {c(E') \leq 2 \mathcal{H}_n \cdot OPT}.

Entropy

August 27th, 2009

Today was the first day of classes here at Cornell and, as usual, I attended a lot of different classes to try to decide which ones to take. I usually feel like I want to take them all, but there is this constant struggle: if I take too many classes, I have no time to do research and to read the random things that happen to catch my attention at the moment; if I don't take many classes, I feel like I am missing a lot of interesting stuff I wanted to be learning. The middle-of-the-road solution is to audit a lot of classes and start dropping them as I need more time, which usually happens quickly. This particular fall I decided that I need to build a stronger background in probability, since I keep finding probabilistic material in my way and I have nothing more than my undergrad course and things I learned on demand. I attended at least three probability classes with different flavors today, and I decided to blog about a simple yet very impressive result I saw in one of them.

Since I took a class on "Principles of Telecommunications" in my undergrad, I have been impressed by Shannon's Information Theory and the concept of entropy. There was one theorem that I always heard about but never saw proved. I thought it had a somewhat complicated proof, but it turned out not to be that bad.

Consider an alphabet {\Omega} and a probability distribution over it. We want to associate with each {\omega \in \Omega} a string {c(\omega)} of {k(\omega)} {\{0,1\}}-digits to represent each symbol of the alphabet. One way of making the code decodable is to make it a proper code: a code such that given any {\omega_1} and {\omega_2}, {c(\omega_1)} is not a prefix of {c(\omega_2)}. There are several codes like this, but some are more efficient than others. Since the letters have different frequencies, it makes sense to encode a frequent letter (say 'e' in English) with few bits and a letter that doesn't appear much (say 'q') with more bits. We want to find a proper code minimizing:

\displaystyle \mathop{\mathbb E}[k(\omega)] = \sum_{\omega \in \Omega} k(\omega) p(\omega)

The celebrated theorem by Shannon shows that for any proper code (actually it holds more generally for any decodable code), we have {\mathop{\mathbb E}[k(\omega)] \geq H} where {H} is the entropy of the alphabet, defined as:

\displaystyle H = - \sum_{\omega} p(\omega) \log_2 p(\omega)

Even more impressive is that we can achieve something very close to it:

Theorem 1 There is a code such that {\mathop{\mathbb E}[k(\omega)] \leq H + 1}.

With an additional trick we can get {H + \epsilon} for any {\epsilon > 0}. The lower bound {\mathop{\mathbb E}[k(\omega)] \geq H} is the trickier part and I won't prove it here (but again, it is not as hard as I thought it would be). To prove that there is a code with average length {\leq H + 1} we use the following lemma:

Lemma 2 There is a proper code for {\Omega} with code-lengths {k(\omega)} if and only if {\sum_\omega 2^{-k(\omega)} \leq 1}

Proof: Let {N = \max_\omega k(\omega)} and picture all possible codewords of length {\leq N} as a complete binary tree. Since the code is proper, no two codewords {c(\omega_1)} and {c(\omega_2)} lie on the same root-to-leaf path. So picking a node as a codeword means we can't pick any node in the subtree below it. Also, each leaf has at most one codeword on its path to the root, so we can assign each leaf of the tree to a single codeword or to no codeword at all. A codeword of length {k(\omega)} has {2^{N - k(\omega)}} leaves below it. Since there are {2^N} leaves in total, we have:

\displaystyle \sum_\omega 2^{N-k(\omega)} \leq 2^N

which proves one direction of the result. For the converse direction, we use a greedy algorithm: given {\Omega} and {k(\omega)} such that {\sum_\omega 2^{-k(\omega)} \leq 1}, let {N = \max_\omega k(\omega)} and suppose {k(\omega_1) \leq k(\omega_2) \leq k(\omega_3) \leq \hdots}. Start with all {2^N} leaves in a single block. Divide them into {2^{k(\omega_1)}} blocks and assign one to {\omega_1}. Now we define the recursive step: when we analyze {\omega_j}, the leaves are divided into {2^{k(\omega_{j-1})}} blocks, some occupied, some not. Divide each free block into {2^{k(\omega_j) - k(\omega_{j-1})}} blocks and assign one of them to {\omega_j}. It is not hard to see that each block corresponds to one node of the tree (the common ancestor of all the leaves in that block) and that this yields a proper code. \Box
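
The greedy construction can be implemented compactly in canonical form (my own sketch): process symbols by increasing length and hand out codewords as a binary counter, shifting left to subdivide the free blocks exactly as in the proof.

```python
def prefix_code_from_lengths(lengths):
    """Given lengths with sum(2^-k) <= 1 (the condition of the lemma),
    build a proper code: process symbols by increasing length, keeping a
    counter `nxt` of the first unused block at the current length."""
    assert sum(2.0 ** -k for k in lengths) <= 1 + 1e-12, "condition fails"
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    codes, nxt, prev = [None] * len(lengths), 0, 0
    for i in order:
        nxt <<= lengths[i] - prev          # split free blocks into sub-blocks
        codes[i] = format(nxt, "0%db" % lengths[i])
        nxt += 1                           # occupy one block
        prev = lengths[i]
    return codes

print(prefix_code_from_lengths([1, 2, 3, 3]))  # ['0', '10', '110', '111']
```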

Now, using this, we show how to find a code with {\mathop{\mathbb E}[k(\omega)] \leq H + 1}. For each {\omega}, since {p(\omega) \in (0,1]}, we can always find {k(\omega)} such that {\frac{1}{2} p(\omega) \leq 2^{-k(\omega)} \leq p(\omega)}, namely {k(\omega) = \lceil \log_2 (1/p(\omega)) \rceil}. Now, clearly:

\displaystyle \sum_\omega 2^{-k(\omega)} \leq \sum_\omega p(\omega) = 1

and:

\displaystyle \mathop{\mathbb E}[k(\omega)] = \sum_\omega k(\omega) p(\omega) \leq \sum_\omega [1 - \log_2 p(\omega)] p(\omega) = H + 1
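
A sketch of the whole construction (my own illustration, reusing prefix_code_from_lengths from above), choosing {k(\omega) = \lceil \log_2 (1/p(\omega)) \rceil}:

```python
import math

def shannon_code(p):
    """Choose k(w) = ceil(log2(1/p(w))), so p(w)/2 <= 2^-k(w) <= p(w);
    the lemma's condition holds and the expected length is within one
    bit of the entropy."""
    symbols = list(p)
    lengths = [math.ceil(-math.log2(p[w])) for w in symbols]
    return dict(zip(symbols, prefix_code_from_lengths(lengths)))

p = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = shannon_code(p)
H = -sum(q * math.log2(q) for q in p.values())
avg = sum(p[w] * len(c) for w, c in code.items())
print(code)    # {'a': '0', 'b': '10', 'c': '110', 'd': '111'}
print(H, avg)  # 1.75 1.75 -- dyadic probabilities meet the entropy exactly
```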

Cool, but now how do we bring it down to {H + \epsilon}? The idea is to encode multiple symbols at a time (even when they are independent, so we are not even taking advantage of correlation between them). Consider {\Omega^k} with the induced product probability, i.e.:

\displaystyle p_k (\omega_1, \hdots, \omega_k) = \prod_{i=1}^k p(\omega_i)

It is not hard to see that {\Omega^k} with {p_k} has entropy {kH}, because:

\displaystyle \begin{aligned} -\sum_{\omega_1, \hdots, \omega_k} p_k(\omega_1, \hdots, \omega_k) \log_2 p_k(\omega_1, \hdots, \omega_k) &= -\sum_{\omega_1, \hdots, \omega_k} \prod_i p(\omega_i) \sum_i \log_2 p(\omega_i) \\ &= -\sum_i \sum_\omega p(\omega) \log_2 p(\omega) = kH \end{aligned}

and then we can just apply the previous theorem to it: we can find a code that encodes blocks of {k} symbols {\omega = (\omega_1, \hdots, \omega_k)} using {l(\omega)} bits, with:

\displaystyle kH \leq \mathop{\mathbb E}[l(\omega)] \leq kH + 1

Since {l(\omega)} encodes {k} symbols, the quantity of interest is the per-symbol length {\mathop{\mathbb E}[l(\omega)/k]}, and therefore we get:

\displaystyle H \leq \mathop{\mathbb E}\left[\frac{l(\omega)}{k}\right] \leq H + \frac{1}{k}