Tutorial T2-F - Query Log Mining


Fabrizio Silvestri, ISTI - CNR

Ricardo Baeza-Yates, Yahoo! Research

Abstract

Web search engines have stored information about their users in their logs since they started to operate. This information serves many purposes. The primary focus of this tutorial is to introduce the discipline of query mining by showing its foundations and by analyzing the basic algorithms and techniques that can be used to extract and exploit useful knowledge from this (potentially) infinite source of information. Furthermore, participants in this tutorial will be given a unified view of the literature on query log analysis.

Presenters

Ricardo Baeza-Yates is VP of Yahoo! Research for Europe and Latin America, leading the labs in Barcelona, Spain, and Santiago, Chile. Until 2005 he was the director of the Center for Web Research at the Department of Computer Science of the Engineering School of the University of Chile, and ICREA Professor at the Dept. of Technology of Univ. Pompeu Fabra in Barcelona, Spain.

He is co-author of the book Modern Information Retrieval, published in 1999 by Addison-Wesley, as well as co-author of the 2nd edition of the Handbook of Algorithms and Data Structures, Addison-Wesley, 1991, and co-editor of Information Retrieval: Algorithms and Data Structures, Prentice-Hall, 1992, among more than 150 other publications. He has been PC Chair of the most important conferences in the field of Web Search and Web Mining. He is one of the co-chairs of this year's WWW2009 Web Search Track.

He received the Organization of American States award for young researchers in exact sciences (1993) and, with two Brazilian colleagues, the COMPAQ prize for the best Brazilian computer science research article (1997). In 2003 he was the first computer scientist to be elected to the Chilean Academy of Sciences. In 2007 he was awarded the Graham Medal for innovation in computing, given by the University of Waterloo to distinguished alumni.

Fabrizio Silvestri is currently a Researcher at ISTI - CNR in Pisa. He received his Ph.D. from the Computer Science Department of the University of Pisa in 2004. His research interests are mainly focused on Web Information Retrieval, with particular attention to efficiency-related problems such as caching, collection partitioning, and distributed IR in general.

In his professional activities, Fabrizio Silvestri is a member of the program committees of many of the most important conferences in IR, as well as an organizer and, currently, a member of the steering committee of the workshop on Large Scale and Distributed Systems for Information Retrieval (LSDS-IR). He has more than 40 publications in the field of efficiency in IR. In recent years his main research focus has been query log analysis for performance enhancement of web search engines. On the topic of this tutorial, Fabrizio Silvestri has recently written a survey paper for the journal Foundations and Trends in Information Retrieval, and has given a keynote speech at the LA-Web 2008 conference entitled “Past Searches Teach Everything: Including the Future!”