Sha, C. and Wang, K. and Zhang, Dell and Wang, X. and Zhou, A. (2014) Optimizing Top-k retrieval: submodularity analysis and search strategies. In: Li, F. and Li, G. and Hwang, S.-w. and Yao, B. and Zhang, Z. (eds.) Web-Age Information Management. Lecture Notes in Computer Science 8485. New York, U.S.: Springer, pp. 18-29. ISBN 9783319080093.
Text
FCS-15222.pdf - Published Version of Record (240kB). Restricted to repository staff only.
Abstract
The key issue in top-k retrieval, that is, finding a set of k documents (from a large document collection) that can best answer a user’s query, is to strike the optimal balance between relevance and diversity. In this paper, we study the top-k retrieval problem in the framework of facility location analysis and prove the submodularity of its objective function, which provides a theoretical approximation guarantee of factor 1 − 1/e for the (best-first) greedy search algorithm. Furthermore, we propose a two-stage hybrid search strategy which first obtains a high-quality initial set of top-k documents via greedy search, and then refines that result set iteratively via local search. Experiments on two large TREC benchmark datasets show that our two-stage hybrid search strategy outperforms existing approaches in both effectiveness and efficiency.
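The two-stage strategy described in the abstract can be illustrated with a minimal sketch. The abstract does not state the paper's exact objective, so this example assumes the classic facility-location function F(S) = Σ_i max_{j∈S} sim[i][j] over a non-negative similarity matrix, which is monotone submodular. All function names (`facility_location`, `greedy_top_k`, `local_search`, `hybrid_top_k`) and the single-swap refinement rule are illustrative assumptions, not the authors' implementation.

```python
from itertools import product


def facility_location(sim, selected):
    """Assumed objective: each item is credited with its best similarity to the selected set."""
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in sim)


def greedy_top_k(sim, k):
    """Stage 1 (best-first greedy): repeatedly add the item that yields the highest objective."""
    selected = []
    candidates = set(range(len(sim)))
    for _ in range(min(k, len(sim))):
        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(candidates),
                   key=lambda j: facility_location(sim, selected + [j]))
        selected.append(best)
        candidates.remove(best)
    return selected


def local_search(sim, selected, tol=1e-12):
    """Stage 2 (assumed rule): apply single swaps while one strictly improves the objective."""
    selected = list(selected)
    current = facility_location(sim, selected)
    improved = True
    while improved:
        improved = False
        outside = [j for j in range(len(sim)) if j not in selected]
        for pos, cand in product(range(len(selected)), outside):
            trial = selected[:pos] + [cand] + selected[pos + 1:]
            value = facility_location(sim, trial)
            if value > current + tol:
                selected, current, improved = trial, value, True
                break  # restart the scan from the improved set
    return selected


def hybrid_top_k(sim, k):
    """Greedy initialisation followed by local-search refinement."""
    return local_search(sim, greedy_top_k(sim, k))
```

Because the objective strictly increases with every accepted swap and there are finitely many size-k sets, the refinement loop terminates. The greedy stage alone carries the 1 − 1/e guarantee for monotone submodular objectives; the local-search stage can only keep or improve that value.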
Metadata
Item Type: | Book Section |
---|---|
Additional Information: | 15th International Conference, WAIM 2014, Macau, China, June 16-18, 2014. Proceedings |
Keyword(s) / Subject(s): | top-k retrieval, diversification, submodular function maximization |
School: | Birkbeck Faculties and Schools > Faculty of Science > School of Computing and Mathematical Sciences |
Research Centres and Institutes: | Birkbeck Knowledge Lab |
Depositing User: | Dr Dell Zhang |
Date Deposited: | 22 Dec 2015 08:57 |
Last Modified: | 09 Aug 2023 12:37 |
URI: | https://eprints.bbk.ac.uk/id/eprint/13636 |