WWW 2008, April 21-25, 2008, Beijing, China. 2008 978-1-60558-085-2/08/04
2
Sponsored search auction is an effective way of monetizing search activities where advertisers pay to place their ads on search results pages for specific user keyword queries. In this work we focus on the bidding optimization problem for an advertiser with budget constraints. Formally, we address the following problem: for each keyword and each time period, how much should the advertiser bid (i.e., which position to obtain), so as to maximize ROI of the ads given a fixed budget and a fixed time horizon?
We can view each ad position as an item with associated weight (cost)
and value (either revenue or profit). The advertiser has a budget constraint, and it
naturally corresponds to the knapsack capacity. Furthermore each advertiser can have
at most one ad appear on each keyword results page. This
corresponds to the constraint that at most one item from each item-set can be taken
in the multiple-choice knapsack problem (MCKP), a well-known variation
of the knapsack problem (KP).
The following model of the bidding problem is proposed: given multiple keywords , multiple time periods when the advertiser places bids
, and multiple positions
, the item-set consists of items
for all ad positions s. Formally and are defined as follows:
The bidding optimization problem has a strong stochastic flavor as bidding prices and click traffic change all the time. The online and stochastic variant of MCKP (S-MCKP) where item-sets are received one at a time, allows us to explicitly incorporate this uncertainty in the distribution of future item-sets. Although, decisions are made assuming that the item-sets are independent and identically distributed (iid), the algorithm performs well even when items in different time periods do not follow the same distribution as evidenced by experimental results for the ``auto insurance" dataset.
In the last few years, a number of papers addressed the problem of revenue maximization or bidding optimization in sponsored search auctions [1,4,2,3,7]. None of the previous work proposed a solution that employs distributional information about prices and solves the bidding problem with multiple ad position, keywords, and time periods. Zhou et al. [8] attack a very similar problem and model it as Online-MCKP; however their work focuses on competitive algorithms with worst-case performance guarantees while this work focuses on average-case performance with stochastic input. The threshold function we develop for S-MCKP is based on the threshold function for the stochastic knapsack problem by Lueker [6].
We first apply a technique for converting items from a MCKP item-set into incremental items with the following property: taking multiple incremental items with decreasing efficiency (value/weight) is equivalent to taking exactly one original item. The technique consists of two steps: (i) removing dominated and LP-dominated items from the item-set and (ii) creating incremental items from undominated items. For details, see Kellerer et al. [5], p.320.
After obtaining incremental items, we use a threshold function to decide which incremental items to take. Lueker [6] proposed solving Online-KP using an adaptive threshold: only items that meet the threshold efficiency are put in the knapsack. The threshold function (which we call )
maps the average remaining capacity per time period to the threshold efficiency . The threshold efficiency is such that the expected weight of the remaining items (all items are iid) with efficiency at
least is equal to the remaining capacity:
(2) |
The algorithm for S-MCKP is in Figure 1. It consists of two phases: the first phase (optional) is to generate the threshold function if training item-sets are available. In the second phase, at each time period the algorithm converts the current item-set to incremental items, checks which incremental items meet the threshold, and takes the corresponding item from the item-set. The threshold function can be updated in step 2 by incorporating information from the current item-set. In the case when no training item-sets are available in step 1, the first real item-set is used to generate the initial threshold function.
We evaluate the performance of the algorithm based on the ratio of the
value obtained by the algorithm and an upper bound on the
optimal solution to offline MCKP. Results below are for the
experiment with no training item-sets for generating the threshold
function in step 1 of Alg 1. The performance of the algorithm is almost
always within 10% of the optimal when and approaches the
optimal as goes to . A graph for the performance of Alg 1 with the
Uniform distribution is shown below. Graphs with the other
distributions look very similar, and are omitted due to space
constraints.
The second set of experiments uses a real dataset for the ``auto
insurance" keyword described in [8]. The performance of our
algorithm is around 99% while the algorithm in [8] achieves
around 90%-95%.