TY - GEN
T1 - Scalable multi-dimensional user intent identification using tree structured distributions
AU - Jethava, Vinay
AU - Calderón-Benavides, Liliana
AU - Baeza-Yates, Ricardo
AU - Bhattacharyya, Chiranjib
AU - Dubhashi, Devdatt
PY - 2011
Y1 - 2011
N2 - The problem of identifying user intent has received considerable attention in recent years, particularly in the context of improving the search experience via query contextualization. Intent can be characterized by multiple dimensions, which are often not observable from query words alone. Accurate identification of intent from query words remains a challenging problem, primarily because it is extremely difficult to discover these dimensions. The problem is often significantly compounded by the lack of representative training samples. We present a generic, extensible framework for learning the multi-dimensional representation of user intent from the query words. The approach models the latent relationships between facets using a tree-structured distribution, which leads to an efficient and convergent algorithm, FastQ, for identifying the multi-faceted intent of users based on just the query words. We also incorporate WordNet to extend the system's capabilities to queries containing words that do not appear in the training data. Empirical results show that FastQ yields accurate identification of intent when compared to a gold standard.
AB - The problem of identifying user intent has received considerable attention in recent years, particularly in the context of improving the search experience via query contextualization. Intent can be characterized by multiple dimensions, which are often not observable from query words alone. Accurate identification of intent from query words remains a challenging problem, primarily because it is extremely difficult to discover these dimensions. The problem is often significantly compounded by the lack of representative training samples. We present a generic, extensible framework for learning the multi-dimensional representation of user intent from the query words. The approach models the latent relationships between facets using a tree-structured distribution, which leads to an efficient and convergent algorithm, FastQ, for identifying the multi-faceted intent of users based on just the query words. We also incorporate WordNet to extend the system's capabilities to queries containing words that do not appear in the training data. Empirical results show that FastQ yields accurate identification of intent when compared to a gold standard.
KW - Chow-Liu
KW - Facets
KW - FastQ
KW - Query intent
KW - Web search
KW - WordNet
UR - http://www.scopus.com/inward/record.url?scp=80052126241&partnerID=8YFLogxK
U2 - 10.1145/2009916.2009971
DO - 10.1145/2009916.2009971
M3 - Research Books
AN - SCOPUS:80052126241
SN - 9781450309349
T3 - SIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval
SP - 395
EP - 404
BT - SIGIR'11 - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval
PB - Association for Computing Machinery
T2 - 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2011
Y2 - 24 July 2011 through 28 July 2011
ER -