Generating Natural Language Queries for More Effective Systematic Review Screening Prioritisation

Published in SIGIR-AP-2023, 2023

Recommended citation: Shuai Wang, Harrisen Scells, Martin Potthast, Bevan Koopman and Guido Zuccon. 2023. Generating Natural Language Queries for More Effective Systematic Review Screening Prioritisation. In Proceedings of the International ACM SIGIR Conference on Information Retrieval in the Asia Pacific, November 26-29, 2023 (SIGIR-AP 2023). https://arxiv.org/abs/2309.05238

Abstract

Screening prioritisation in medical systematic reviews aims to rank the set of documents retrieved by complex Boolean queries. The goal is to prioritise the most important documents so that subsequent review steps can be carried out more efficiently and effectively. The current state of the art uses the final title of the review as a query to rank documents using BERT-based neural rankers. However, the final title is only formulated at the end of the review process, which makes this approach impractical as it relies on ex post facto information. At the time of screening, only a rough working title is available, with which the BERT-based ranker performs significantly worse than with the final title. In this paper, we explore alternative sources of queries for screening prioritisation, such as the Boolean query used to retrieve the set of documents to be screened, and queries generated by instruction-based generative large language models such as ChatGPT and Alpaca. Our best approach is not only practical, relying solely on information available at screening time, but is also similar in effectiveness to the final title.