👋 Welcome!
Howdy! I’m Shuai Wang, a Postdoc and finishing PhD student at IeLab, UQ. I work under the guidance of Professor Guido Zuccon, Associate Professor Bevan Koopman, and Dr. Harrisen Scells.
🎓 Academic Background
- PhD (finishing) - University of Queensland (2021-2025)
- Master of Engineering Science - University of Queensland (2021)
- Bachelor of Science - University of Western Australia (2019)
🔬 Research Focus
My research centers on information retrieval and natural language processing (NLP), with a particular focus on domain-specific applications. My PhD work concentrates on automation for medical systematic reviews, including:
- Automatic MeSH Term Suggestion
- Screening Prioritisation
- Seed-driven Methods
- Boolean Query Formulation
I also explore general IR and NLP challenges, including Federated RAG and Fusion of Rankers.
👨‍🏫 Teaching & Mentoring
Currently serving as Course Coordinator for INFS7410 (Information Retrieval and Web Search) at UQ. Previously tutored multiple courses (2021-2024) including INFS7410, INFS7205, and DATA7901/7902/7903.
I’m passionate about discovering brilliant minds to collaborate with—if you’re interested in research, let’s connect and create something amazing together!
🌍 Industry Experience
Research Intern at NAVER LABS Europe (February-July 2024), focusing on Context Compression for Retrieval-Augmented Generation (RAG).
💼 Job Opportunities
Since February 2025, I have been working as a Postdoc at UQ. I’m actively seeking exciting opportunities in both academia and industry. If you think I’m a great fit for your team, please feel free to reach out!
📰 Latest News
Three Papers Accepted at SIGIR 2025
[Reproducibility Papers]: 2D Matryoshka Training for Information Retrieval; Reassessing Large Language Model Boolean Query Generation for Systematic Reviews; Pre-training vs. Fine-tuning: A Reproducibility Study on Dense Retrieval Knowledge Acquisition.
Started Postdoc at The University of Queensland
Started a postdoc position in information retrieval and natural language processing at The University of Queensland.
Paper Accepted at WWW 2025
[Short Paper]: ReSLLM: Large Language Models are Strong Resource Selectors for Federated Search
🤝 Professional Services
I contribute to the academic community by serving as a reviewer/PC member for:
📚 Journals
- TOIS: ACM Transactions on Information Systems
- Journal of Data and Information Quality
🏛️ Conferences
- ACM ICTIR 2023, SIGIR 2024, SIGIR 2025
- ECIR 2024
📝 Publications
Pre-training vs. Fine-tuning: A Reproducibility Study on Dense Retrieval Knowledge Acquisition Reproduce
Zheng Yao, Shuai Wang and Guido Zuccon. 2025. Pre-training vs. Fine-tuning: A Reproducibility Study on Dense Retrieval Knowledge Acquisition. (Accepted at SIGIR 2025).
Reassessing Large Language Model Boolean Query Generation for Systematic Reviews Reproduce
Shuai Wang, Harrisen Scells, Bevan Koopman and Guido Zuccon. 2025. Reassessing Large Language Model Boolean Query Generation for Systematic Reviews. (Accepted at SIGIR 2025).
2D Matryoshka Training for Information Retrieval Reproduce
Shuai Wang, Shengyao Zhuang, Bevan Koopman and Guido Zuccon. 2025. 2D Matryoshka Training for Information Retrieval. (Accepted at SIGIR 2025).
Corpus Subsampling: Estimating the Effectiveness of Neural Retrieval Models on Large Corpora Long
Maik Fröbe, Andrew Parry, Harrisen Scells, Shuai Wang, Shengyao Zhuang, Guido Zuccon, Martin Potthast and Matthias Hagen. 2025. Corpus Subsampling: Estimating the Effectiveness of Neural Retrieval Models on Large Corpora. In: Hauff, C., et al. Advances in Information Retrieval. ECIR 2025. Lecture Notes in Computer Science, vol 15572. Springer, Cham. https://doi.org/10.1007/978-3-031-88708-6_29.
Starbucks: Improved Training for 2D Matryoshka Embeddings Long
Shengyao Zhuang*, Shuai Wang*, Bevan Koopman and Guido Zuccon. 2024. Starbucks: Improved Training for 2D Matryoshka Embeddings. (arXiv preprint).
Context Embeddings for Efficient Answer Generation in RAG Long
David Rau*, Shuai Wang*, Hervé Déjean and Stéphane Clinchant. 2024. Context Embeddings for Efficient Answer Generation in RAG. (Accepted at WSDM 2025).
BERGEN: A Benchmarking Library for Retrieval-Augmented Generation Resource
David Rau, Hervé Déjean, Nadezhda Chirkova, Thibault Formal, Shuai Wang, Vassilina Nikoulina and Stéphane Clinchant. 2024. BERGEN: A Benchmarking Library for Retrieval-Augmented Generation. (Accepted at EMNLP 2024 Findings).
An Investigation of Prompt Variations for Zero-shot LLM-based Rankers Long
Shuoqi Sun, Shengyao Zhuang, Shuai Wang and Guido Zuccon. 2024. An Investigation of Prompt Variations for Zero-shot LLM-based Rankers. (Accepted at ECIR 2025).
Large Language Models for Stemming: Promises, Pitfalls and Failures Short
Shuai Wang, Shengyao Zhuang and Guido Zuccon. 2024. Large Language Models for Stemming: Promises, Pitfalls and Failures. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2024).
Evaluating Generative Ad Hoc Information Retrieval Long
Lukas Gienapp, Harrisen Scells, Niklas Deckers, Janek Bevendorff, Shuai Wang, Johannes Kiesel, Shahbaz Syed, Maik Fröbe, Guido Zuccon, Benno Stein, Matthias Hagen and Martin Potthast. 2024. Evaluating Generative Ad Hoc Information Retrieval. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2024).
FeB4RAG: Evaluating Federated Search in the Context of Retrieval Augmented Generation Resource
Shuai Wang, Ekaterina Khramtsova, Shengyao Zhuang and Guido Zuccon. 2024. FeB4RAG: Evaluating Federated Search in the Context of Retrieval Augmented Generation. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2024).
ReSLLM: Large Language Models are Strong Resource Selectors for Federated Search Short
Shuai Wang, Shengyao Zhuang, Bevan Koopman and Guido Zuccon. 2024. ReSLLM: Large Language Models are Strong Resource Selectors for Federated Search. (Accepted at WWW 2025).
Zero-shot Generative Large Language Models for Systematic Review Screening Automation Long
Shuai Wang, Harrisen Scells, Shengyao Zhuang, Martin Potthast, Bevan Koopman and Guido Zuccon. 2023. Zero-shot Generative Large Language Models for Systematic Review Screening Automation. In Proceedings of the 46th European Conference on Information Retrieval (ECIR 2024).
Generating Natural Language Queries for More Effective Systematic Review Screening Prioritisation Long
Shuai Wang, Harrisen Scells, Martin Potthast, Bevan Koopman and Guido Zuccon. 2023. Generating Natural Language Queries for More Effective Systematic Review Screening Prioritisation. In Proceedings of the International ACM SIGIR Conference on Information Retrieval in the Asia Pacific, November 26-29, 2023 (SIGIR-AP 2023).
Can ChatGPT Write a Good Boolean Query for Systematic Review Literature Search? Long
Shuai Wang, Harrisen Scells, Bevan Koopman and Guido Zuccon. 2023. Can ChatGPT Write a Good Boolean Query for Systematic Review Literature Search? In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023).
Balanced Topic Aware Sampling for Effective Dense Retriever: A Reproducibility Study Reproduce
Shuai Wang and Guido Zuccon. 2023. Balanced Topic Aware Sampling for Effective Dense Retriever: A Reproducibility Study. In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023).
MeSH Suggester: A Library and System for MeSH Term Suggestion for Systematic Review Boolean Query Construction Short
Shuai Wang, Hang Li and Guido Zuccon. 2023. MeSH Suggester: A Library and System for MeSH Term Suggestion for Systematic Review Boolean Query Construction. In Proceedings of the 16th ACM International Conference on Web Search and Data Mining (WSDM 2023).
Neural Rankers for Effective Screening Prioritization in Medical Systematic Review Literature Search Long
Shuai Wang, Harry Scells, Bevan Koopman and Guido Zuccon. 2022. Neural Rankers for Effective Screening Prioritization in Medical Systematic Review Literature Search. In Australasian Document Computing Symposium (ADCS 2022).
Automated MeSH Term Suggestion for Effective Query Formulation in Systematic Reviews Literature Search Journal
Shuai Wang, Harry Scells, Bevan Koopman and Guido Zuccon. 2022. Automated MeSH Term Suggestion for Effective Query Formulation in Systematic Reviews Literature Search. In Intelligent Systems with Applications (ISWA), Technology-Assisted Review Systems Special Issue.
To Interpolate or not to Interpolate: PRF, Dense and Sparse Retrievers Short
Hang Li*, Shuai Wang*, Shengyao Zhuang, Ahmed Mourad, Xueguang Ma, Jimmy Lin and Guido Zuccon. 2022. To Interpolate or not to Interpolate: PRF, Dense and Sparse Retrievers. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022).
From Little Things Big Things Grow: A Collection with Seed Studies for Medical Systematic Review Literature Search Resource
Shuai Wang, Harry Scells, Justin Clark, Guido Zuccon and Bevan Koopman. 2022. From Little Things Big Things Grow: A Collection with Seed Studies for Medical Systematic Review Literature Search. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022).
SDR for Systematic Reviews: A Reproducibility Study Reproduce
Shuai Wang, Harry Scells, Ahmed Mourad and Guido Zuccon. 2022. SDR for Systematic Reviews: A Reproducibility Study. In Proceedings of the 44th European Conference on Information Retrieval (ECIR 2022).
MeSH Term Suggestion for Systematic Review Literature Search Long
Shuai Wang, Hang Li, Harry Scells, Daniel Locke and Guido Zuccon. 2021. MeSH Term Suggestion for Systematic Review Literature Search. In Australasian Document Computing Symposium (ADCS 2021).
IELAB at TREC Deep Learning Track 2021 Notebook
Shengyao Zhuang, Hang Li, Shuai Wang and Guido Zuccon. 2021. IELAB at TREC Deep Learning Track 2021. In TREC 2021 Deep Learning Track.
BERT-based Dense Retrievers Require Interpolation with BM25 for Effective Passage Retrieval Short
Shuai Wang, Shengyao Zhuang and Guido Zuccon. 2021. BERT-based Dense Retrievers Require Interpolation with BM25 for Effective Passage Retrieval. In Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval (ICTIR 2021).