Analysis of web search engine query session and clicked documents

David Nettleton, Liliana Calderón-Benavides, Ricardo Baeza-Yates

Research output: Book / Book Chapter / Report; Research Books; peer-review

7 Scopus citations

Abstract

The identification of a user's intention or interest through the analysis of the queries submitted to a search engine, and of the documents selected as answers to these queries, can be very useful for offering more adequate results to that user. In this chapter we present the analysis of a Web search engine query log from two different perspectives: the query session and the clicked document. In the first perspective, that of the query session, we process and analyze Web search engine query and click data for the query session (query + clicked results) conducted by the user. We initially state some hypotheses for possible user types and quality profiles for the user session, based on descriptive variables of the session. In the second perspective, that of the clicked document, we repeat the process from the perspective of the documents (URLs) selected. We also initially define possible document categories and select descriptive variables to define the documents. We apply a systematic data mining process to click data, contrasting unsupervised (Kohonen) and supervised (C4.5) methods to cluster and model the data. From the user session perspective, the aim is to identify profiles and rules that relate to theoretical user behavior and user session "quality"; from the document perspective, the aim is to identify document profiles that relate to theoretical user behavior and document (URL) organization.

Original language: English
Title of host publication: Advances in Web Mining and Web Usage Analysis - 8th International Workshop on Knowledge Discovery on the Web, WebKDD 2006, Revised Papers
Publisher: Springer Verlag
Pages: 207-226
Number of pages: 20
ISBN (Print): 354077484X, 9783540774846
State: Published - 2007
Externally published: Yes
Event: 8th International Workshop on Knowledge Discovery on the Web, WebKDD 2006 - Philadelphia, PA, United States
Duration: 20 Aug 2006 - 20 Aug 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4811 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 8th International Workshop on Knowledge Discovery on the Web, WebKDD 2006
Country/Territory: United States
City: Philadelphia, PA
Period: 20/08/06 - 20/08/06
