Analysis of Web search engine clicked documents

David F. Nettleton, Liliana Calderón-Benavides, Ricardo Baeza-Yates

Research output: Book / Book chapter / Report; Research books; peer-reviewed

1 Citation (Scopus)

Abstract

In this paper we process and analyze web search engine query and click data from the perspective of the documents (URLs) selected. We first define possible document categories and select descriptive variables to characterize the documents. The URL dataset is preprocessed and analyzed using traditional statistical methods, and then processed with the Kohonen SOM clustering technique [5], which we use to produce a two-level clustering. The clusters are interpreted in terms of the document categories and variables defined initially. We then apply the C4.5 [9] rule induction algorithm to produce a decision tree for the document category. The objective of the work is to apply a systematic data mining process to click data, contrasting unsupervised (Kohonen) and supervised (C4.5) methods to cluster and model the data, in order to identify document profiles that relate to theoretical user behavior and to document (URL) organization.
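The unsupervised step mentioned in the abstract can be illustrated with a minimal self-organizing map sketch. This is not the authors' implementation: the feature names (`click_share`, `mean_rank`), the toy values, the grid size, and the helper names `train_som` and `nearest_unit` are all assumptions made for illustration, and the sketch covers only a single-level clustering.

```python
import math

def nearest_unit(weights, x):
    """Return the index of the unit whose weight vector is closest to x."""
    return min(range(len(weights)),
               key=lambda k: sum((w - v) ** 2 for w, v in zip(weights[k], x)))

def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=1.0):
    """Train a small rectangular SOM; returns one weight vector per unit (row-major)."""
    dim = len(data[0])
    n_units = rows * cols
    lo = [min(x[i] for x in data) for i in range(dim)]
    hi = [max(x[i] for x in data) for i in range(dim)]
    # Deterministic initialisation: spread units between per-feature min and max.
    weights = [[lo[i] + (hi[i] - lo[i]) * (k / max(n_units - 1, 1))
                for i in range(dim)]
               for k in range(n_units)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                   # learning rate decays to 0
            sigma = max(sigma0 * (1.0 - frac), 0.05)  # neighbourhood radius shrinks
            bmu = nearest_unit(weights, x)
            br, bc = divmod(bmu, cols)
            for k, w in enumerate(weights):
                r, c = divmod(k, cols)
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                # Gaussian neighbourhood: units near the winner move more.
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights

# Hypothetical, made-up document descriptors scaled to [0, 1]:
# (click_share, mean_rank). These are NOT values from the paper.
docs = {
    "url_a": (0.10, 0.15), "url_b": (0.05, 0.10), "url_c": (0.15, 0.05),
    "url_d": (0.90, 0.85), "url_e": (0.95, 0.90), "url_f": (0.85, 0.95),
}
som = train_som(list(docs.values()), rows=1, cols=2)
units = {url: nearest_unit(som, vec) for url, vec in docs.items()}
```

Inspecting `units` shows which map unit each toy URL lands on. A supervised counterpart would then fit a decision tree on labeled document categories, which this sketch does not attempt.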

Original language: English
Title of host publication: Proceedings - LA-Web 06
Subtitle of host publication: Fourth Latin American Web Congress
Pages: 209-219
Number of pages: 11
DOI
Status: Published - 2006
Externally published
Event: LA-Web 06: 4th Latin American Web Congress - Cholula, Mexico
Duration: 25 Oct 2006 to 27 Oct 2006

Publication series

Name: Proceedings - LA-Web 06: Fourth Latin American Web Congress

Conference

Conference: LA-Web 06: 4th Latin American Web Congress
Country/Territory: Mexico
City: Cholula
Period: 25/10/06 to 27/10/06
