Shuai Wang (Dylan) UQ

Howdy! I’m Shuai Wang, a postdoc and finishing PhD student at IeLab, UQ. I’m under the eagle eyes of Professor Guido Zuccon, Associate Professor Bevan Koopman, and Dr. Harrisen Scells.

I received my Bachelor of Science degree from the University of Western Australia in 2019, then obtained a Master of Engineering Science degree from the University of Queensland in 2021.

My research is in information retrieval (IR) and natural language processing (NLP). My PhD topic focuses on a domain-specific application, automation for medical systematic reviews, where I currently work on automatic MeSH term suggestion, screening prioritisation, seed-driven methods and Boolean query formulation. I also do side projects on general IR and NLP tasks, such as federated RAG and fusion of rankers.

I’m also a tutor at UQ, teaching courses like INFS7410 (information retrieval and web search). I’m not in it for the cash but for the chance to scout for other brainiacs to collaborate with. So, if you’re interested, let’s chat and see if we can cook up some research magic together!

I completed an internship at Naver Labs Europe from February to July 2024, with a research focus on context compression for retrieval-augmented generation (RAG).

Job Opportunities

I’m currently working as a postdoc at UQ, a position I started in February 2025. I’m also open to further job opportunities in academia and industry. If you think I’m a good fit for your team, please feel free to contact me.

News

April 05, 2025

Three Papers Accepted in SIGIR 2025

[Reproducibility Paper]: 2D Matryoshka Training for Information Retrieval; Reassessing Large Language Model Boolean Query Generation for Systematic Reviews; Pre-training vs. Fine-tuning: A Reproducibility Study on Dense Retrieval Knowledge Acquisition.

February 01, 2025

Started Postdoc at The University of Queensland

Started a postdoc position in information retrieval and natural language processing at The University of Queensland.

January 20, 2025

Paper Accepted in WWW 2025

[Short Paper]: ReSLLM: Large Language Models are Strong Resource Selectors for Federated Search

Services

I serve as a reviewer/PC member for the following journals and conferences:

TOIS: ACM Transactions on Information Systems
ACM ICTIR 2023, SIGIR 2024, SIGIR 2025
ECIR 2024
Journal of Data and Information Quality

Publications

Pre-training vs. Fine-tuning: A Reproducibility Study on Dense Retrieval Knowledge Acquisition Reproduce

Zheng Yao, Shuai Wang and Guido Zuccon. 2025. Pre-training vs. Fine-tuning: A Reproducibility Study on Dense Retrieval Knowledge Acquisition. (Accepted at SIGIR 2025).

Reassessing Large Language Model Boolean Query Generation for Systematic Reviews Reproduce

Shuai Wang, Harrisen Scells, Bevan Koopman and Guido Zuccon. 2025. Reassessing Large Language Model Boolean Query Generation for Systematic Reviews. (Accepted at SIGIR 2025).

2D Matryoshka Training for Information Retrieval Reproduce

Shuai Wang, Shengyao Zhuang, Bevan Koopman and Guido Zuccon. 2025. 2D Matryoshka Training for Information Retrieval. (Accepted at SIGIR 2025).

Corpus Subsampling: Estimating the Effectiveness of Neural Retrieval Models on Large Corpora Long

Maik Fröbe, Andrew Parry, Harrisen Scells, Shuai Wang, Shengyao Zhuang, Guido Zuccon, Martin Potthast and Matthias Hagen. 2025. Corpus Subsampling: Estimating the Effectiveness of Neural Retrieval Models on Large Corpora. In: Hauff, C., et al. Advances in Information Retrieval. ECIR 2025. Lecture Notes in Computer Science, vol 15572. Springer, Cham. https://doi.org/10.1007/978-3-031-88708-6_29.

Starbucks: Improved Training for 2D Matryoshka Embeddings Long

Shengyao Zhuang*, Shuai Wang*, Bevan Koopman and Guido Zuccon. 2024. Starbucks: Improved Training for 2D Matryoshka Embeddings. (arXiv preprint).

Context Embeddings for Efficient Answer Generation in RAG Long

David Rau*, Shuai Wang*, Hervé Déjean and Stéphane Clinchant. 2024. Context Embeddings for Efficient Answer Generation in RAG. (Accepted at WSDM 2025).

BERGEN: A Benchmarking Library for Retrieval-Augmented Generation Resource

David Rau, Hervé Déjean, Nadezhda Chirkova, Thibault Formal, Shuai Wang, Vassilina Nikoulina and Stéphane Clinchant. 2024. BERGEN: A Benchmarking Library for Retrieval-Augmented Generation. (Accepted at EMNLP 2024 Findings).

Zero-shot Generative Large Language Models for Systematic Review Screening Automation Long

Shuoqi Sun, Shengyao Zhuang, Shuai Wang and Guido Zuccon. 2024. Zero-shot Generative Large Language Models for Systematic Review Screening Automation. (Accepted in ECIR 2025).

Large Language Models for Stemming: Promises, Pitfalls and Failures Short

Shuai Wang, Shengyao Zhuang and Guido Zuccon. 2024. Large Language Models for Stemming: Promises, Pitfalls and Failures. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2024).

Evaluating Generative Ad Hoc Information Retrieval Long

Lukas Gienapp, Harrisen Scells, Niklas Deckers, Janek Bevendorff, Shuai Wang, Johannes Kiesel, Shahbaz Syed, Maik Fröbe, Guido Zuccon, Benno Stein, Matthias Hagen and Martin Potthast. 2024. Evaluating Generative Ad Hoc Information Retrieval. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2024).

FeB4RAG: Evaluating Federated Search in the Context of Retrieval Augmented Generation Resource

Shuai Wang, Ekaterina Khramtsova, Shengyao Zhuang and Guido Zuccon. 2024. FeB4RAG: Evaluating Federated Search in the Context of Retrieval Augmented Generation. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2024).

ReSLLM: Large Language Models are Strong Resource Selectors for Federated Search Short

Shuai Wang, Shengyao Zhuang, Bevan Koopman and Guido Zuccon. 2024. ReSLLM: Large Language Models are Strong Resource Selectors for Federated Search. (Accepted at WWW 2025).

Zero-shot Generative Large Language Models for Systematic Review Screening Automation Long

Shuai Wang, Harrisen Scells, Shengyao Zhuang, Martin Potthast, Bevan Koopman and Guido Zuccon. 2023. Zero-shot Generative Large Language Models for Systematic Review Screening Automation. In Proceedings of the 46th European Conference on Information Retrieval (ECIR 2024).

Generating Natural Language Queries for More Effective Systematic Review Screening Prioritisation Long

Shuai Wang, Harrisen Scells, Martin Potthast, Bevan Koopman and Guido Zuccon. 2023. Generating Natural Language Queries for More Effective Systematic Review Screening Prioritisation. In Proceedings of the International ACM SIGIR Conference on Information Retrieval in the Asia Pacific, November 26-29, 2023 (SIGIR-AP 2023).

Can ChatGPT Write a Good Boolean Query for Systematic Review Literature Search? Long

Shuai Wang, Harrisen Scells, Bevan Koopman and Guido Zuccon. 2023. Can ChatGPT Write a Good Boolean Query for Systematic Review Literature Search? In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023).

Balanced Topic Aware Sampling for Effective Dense Retriever: A Reproducibility Study Reproduce

Shuai Wang and Guido Zuccon. 2023. Balanced Topic Aware Sampling for Effective Dense Retriever: A Reproducibility Study. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023).

MeSH Suggester: A Library and System for MeSH Term Suggestion for Systematic Review Boolean Query Construction Short

Shuai Wang, Hang Li and Guido Zuccon. 2023. MeSH Suggester: A Library and System for MeSH Term Suggestion for Systematic Review Boolean Query Construction. In Proceedings of the 16th ACM International Conference on Web Search and Data Mining (WSDM 2023).

Neural Rankers for Effective Screening Prioritization in Medical Systematic Review Literature Search Long

Shuai Wang, Harry Scells, Bevan Koopman and Guido Zuccon. 2022. Neural Rankers for Effective Screening Prioritization in Medical Systematic Review Literature Search. In Australasian Document Computing Symposium (ADCS 2022).

Automated MeSH Term Suggestion for Effective Query Formulation in Systematic Reviews Literature Search Journal

Shuai Wang, Harry Scells, Bevan Koopman and Guido Zuccon. 2022. Automated MeSH Term Suggestion for Effective Query Formulation in Systematic Reviews Literature Search. In Intelligent Systems with Applications (ISWA), Technology-Assisted Review Systems Special Issue.

To Interpolate or not to Interpolate: PRF, Dense and Sparse Retrievers Short

Hang Li*, Shuai Wang*, Shengyao Zhuang, Ahmed Mourad, Xueguang Ma, Jimmy Lin and Guido Zuccon. 2022. To Interpolate or not to Interpolate: PRF, Dense and Sparse Retrievers. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022).

From Little Things Big Things Grow: A Collection with Seed Studies for Medical Systematic Review Literature Search Resource

Shuai Wang, Harry Scells, Justin Clark, Guido Zuccon and Bevan Koopman. 2022. From Little Things Big Things Grow: A Collection with Seed Studies for Medical Systematic Review Literature Search. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022).

SDR for Systematic Reviews: A Reproducibility Study Reproduce

Shuai Wang, Harry Scells, Ahmed Mourad and Guido Zuccon. 2022. SDR for Systematic Reviews: A Reproducibility Study. In Proceedings of the 44th European Conference on Information Retrieval (ECIR 2022).

MeSH Term Suggestion for Systematic Review Literature Search Long

Shuai Wang, Hang Li, Harry Scells, Daniel Locke and Guido Zuccon. 2021. MeSH Term Suggestion for Systematic Review Literature Search. In Australasian Document Computing Symposium (ADCS 2021).

IELAB at TREC Deep Learning Track 2021 Notebook

Shengyao Zhuang, Hang Li, Shuai Wang and Guido Zuccon. 2021. IELAB at TREC Deep Learning Track 2021. In TREC 2021 Deep Learning Track.

BERT-based Dense Retrievers Require Interpolation with BM25 for Effective Passage Retrieval Short

Shuai Wang, Shengyao Zhuang and Guido Zuccon. 2021. BERT-based Dense Retrievers Require Interpolation with BM25 for Effective Passage Retrieval. In Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval (ICTIR 2021).