Tutorial T7-A - Learning to Rank for Information Retrieval

Tie-Yan Liu, Microsoft Research Asia

Abstract

This tutorial introduces learning to rank, a new research area. Taking document retrieval as an example, the process of learning to rank is as follows. In learning, a training set of queries and their associated documents (with relevance judgments) is provided, and the ranking model is trained on this data in a supervised fashion by minimizing a loss function. In ranking, the model is applied to new queries and sorts their associated documents. Three major approaches have been proposed in the literature: the pointwise, pairwise, and listwise approaches to learning to rank. The pointwise approach solves the ranking problem by means of regression or classification on single documents. The pairwise approach transforms ranking into classification on document pairs. The listwise approach tackles the ranking problem directly, by adopting listwise loss functions or by optimizing evaluation measures (e.g., NDCG and MAP). We will introduce the frameworks and representative algorithms of these approaches, and then discuss their advantages, weaknesses, application scopes, and underlying theories. We will also cover other work on learning to rank, including training data creation, semi-supervised ranking, and advanced ranking models.
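To make the distinction between the three approaches concrete, the following is a minimal sketch of one possible loss for each, together with NDCG, for the documents of a single query. It is an illustration only, not taken from the tutorial: the function names, the squared-error and logistic forms, the softmax-based listwise loss, and the gain 2^y - 1 with a log2 position discount are all assumptions chosen for brevity, and other formulations exist.

```python
import math

def pointwise_loss(scores, labels):
    """Pointwise (assumed form): squared-error regression on single documents."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)

def pairwise_loss(scores, labels):
    """Pairwise (assumed form): logistic loss on every pair (i, j) with
    label_i > label_j, penalizing score_i <= score_j."""
    losses = [
        math.log(1.0 + math.exp(-(si - sj)))
        for si, yi in zip(scores, labels)
        for sj, yj in zip(scores, labels)
        if yi > yj
    ]
    return sum(losses) / len(losses) if losses else 0.0

def _softmax(values):
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def listwise_loss(scores, labels):
    """Listwise (assumed form): cross entropy between the softmax of the
    labels and the softmax of the scores over the whole document list."""
    target = _softmax([float(y) for y in labels])
    predicted = _softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))

def dcg(labels_in_rank_order, k):
    """Discounted cumulative gain with gain 2^y - 1 and log2 discount."""
    return sum(
        (2 ** y - 1) / math.log2(pos + 2)
        for pos, y in enumerate(labels_in_rank_order[:k])
    )

def ndcg(scores, labels, k=None):
    """NDCG@k of the ranking induced by sorting documents by score."""
    k = len(labels) if k is None else k
    ranked = [y for _, y in sorted(zip(scores, labels), key=lambda p: -p[0])]
    ideal_dcg = dcg(sorted(labels, reverse=True), k)
    return dcg(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0

if __name__ == "__main__":
    labels = [2, 1, 0]             # hypothetical graded relevance judgments
    good = [3.0, 2.0, 1.0]         # scores that agree with the labels
    bad = [1.0, 2.0, 3.0]          # scores that reverse the ideal order
    for name, scores in (("good", good), ("bad", bad)):
        print(name,
              pointwise_loss(scores, labels),
              pairwise_loss(scores, labels),
              listwise_loss(scores, labels),
              ndcg(scores, labels))
```

With the hypothetical data above, the scores that agree with the labels receive a lower pairwise and listwise loss and a higher NDCG than the reversed scores. The pointwise loss depends on how close each score is to its label, not on the order of the documents.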

Presenter

Tie-Yan Liu is a lead researcher at Microsoft Research Asia. He leads a team working on learning to rank for information retrieval and on graph-based machine learning. So far, he has published more than 70 quality papers in refereed conferences and journals, including SIGIR (9), WWW (3), and ICML (3). He has about 40 filed US / international patents or pending applications on learning to rank, general Web search, and multimedia signal processing. He is a co-author of the Best Student Paper at SIGIR 2008 and of the Most Cited Paper of the Journal of Visual Communication and Image Representation (2004-2006). He is an Area Chair of SIGIR 2009, was a Senior Program Committee member of SIGIR 2008, and has served as a Program Committee member for many other international conferences, such as WWW 2009, WWW 2008, ICML 2008, and ACL 2008. He co-chaired the SIGIR workshop on learning to rank for information retrieval (LR4IR) in 2007 and 2008. He has been on the Editorial Board of the Information Retrieval Journal (IRJ) since 2008, and is a guest editor of the IRJ special issue on learning to rank. He has given tutorials on learning to rank for information retrieval at WWW 2008, SIGIR 2008, and AIRS 2008.