Search result diversification in short text streams

Shangsong Liang, Emine Yilmaz, Hong Shen, Maarten De Rijke, W. Bruce Croft

Research output: Article, peer-reviewed

21 Citations (Scopus)

Abstract

We consider the problem of search result diversification for streams of short texts. Diversifying search results in short text streams is more challenging than in the case of long documents, as it is difficult to capture the latent topics of short documents. To capture the changes of topics and the probabilities of documents for a given query at a specific time in a short text stream, we propose a dynamic Dirichlet multinomial mixture topic model, called D2M3, as well as a Gibbs sampling algorithm for the inference. We also propose a streaming diversification algorithm, SDA, that integrates the information captured by D2M3 with our proposed modified version of the PM-2 (Proportionality-based diversification Method - second version) diversification algorithm. We conduct experiments on a Twitter dataset and find that SDA statistically significantly outperforms state-of-the-art non-streaming retrieval methods, plain streaming retrieval methods, as well as streaming diversification methods that use other dynamic topic models.
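The abstract says SDA builds on a modified version of PM-2 but does not describe the modification. As a rough illustration of the base technique only, the following is a minimal sketch of the standard PM-2 re-ranking loop, not the authors' SDA. The function name `pm2`, its parameters, and the input layout are hypothetical. The per-document topic probabilities are assumed to come from some topic model (in the paper, that role is played by D2M3), and the topic weights are assumed to be given.

```python
from typing import Dict, List


def pm2(doc_topic: Dict[str, List[float]], topic_weights: List[float],
        k: int, lam: float = 0.5) -> List[str]:
    """Rank up to k documents with the standard PM-2 procedure.

    doc_topic maps a (hypothetical) document id to P(d | topic_i) for each topic.
    topic_weights holds the share of the ranking each topic should receive.
    lam trades off the selected topic against all other topics.
    """
    n_topics = len(topic_weights)
    seats = [0.0] * n_topics  # portion of the ranking each topic already occupies
    remaining = dict(doc_topic)
    ranking: List[str] = []

    while remaining and len(ranking) < k:
        # Sainte-Lague quotient: topics that are under-represented score higher.
        quotient = [topic_weights[i] / (2.0 * seats[i] + 1.0)
                    for i in range(n_topics)]
        best_topic = max(range(n_topics), key=lambda i: quotient[i])

        def score(doc_id: str) -> float:
            probs = remaining[doc_id]
            on_topic = quotient[best_topic] * probs[best_topic]
            off_topic = sum(quotient[i] * probs[i]
                            for i in range(n_topics) if i != best_topic)
            return lam * on_topic + (1.0 - lam) * off_topic

        best_doc = max(remaining, key=score)
        probs = remaining.pop(best_doc)
        ranking.append(best_doc)

        # Credit each topic in proportion to how strongly the document covers it.
        total = sum(probs)
        if total > 0:
            for i in range(n_topics):
                seats[i] += probs[i] / total

    return ranking
```

With two equally weighted topics and three toy documents, `d1 = [0.9, 0.1]`, `d2 = [0.8, 0.2]`, `d3 = [0.1, 0.9]`, calling `pm2(docs, [0.5, 0.5], k=3, lam=0.8)` places `d3` ahead of `d2`: once `d1` has covered the first topic, the quotient of the second topic becomes the larger one, so the loop prefers a document about that topic.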

Original language: English
Article number: 8
Journal: ACM Transactions on Information Systems
Volume: 36
Issue number: 1
DOIs
Publication status: Published - Apr 2017
Externally published
