Text mining for searching and screening the literature

This guide gives an overview of what text mining is and how it applies to search strategy development and study selection. It includes a list of tools and resources that librarians or other motivated searchers may wish to try.

Tools for study selection

Most knowledge synthesis search strategies aim for high sensitivity/recall and therefore return large sets of records. Researchers are investigating whether text mining and machine learning can be used in the record screening phase to reduce the burden on reviewers while still capturing the relevant studies in the search set. I recommend that librarians know what software is available for this purpose so that they can advise users on their options.

The approaches that have been explored fall broadly into the following categories:

  1. Improving workflow through screening prioritisation: ranking likely relevant records first so that reviewers can work on tasks in parallel (e.g., retrieving full texts, data extraction, and synthesis can begin before screening is finished)
  2. Using software as a second reviewer
  3. Speeding up the screening process
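To make the first approach concrete, here is a minimal sketch of screening prioritisation in Python. It is an illustration only and does not reproduce how any of the tools named in this guide work. The function names, the simple log-odds term weighting, and the sample titles are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a title/abstract and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_weights(labelled):
    """Learn a log-odds weight per term from (text, is_relevant) pairs.

    Add-one smoothing keeps terms seen in only one class from
    producing infinite weights.
    """
    relevant, irrelevant = Counter(), Counter()
    for text, is_relevant in labelled:
        target = relevant if is_relevant else irrelevant
        target.update(set(tokenize(text)))
    n_rel = sum(1 for _, is_relevant in labelled if is_relevant)
    n_irr = len(labelled) - n_rel
    vocab = set(relevant) | set(irrelevant)
    return {
        term: math.log((relevant[term] + 1) / (n_rel + 2))
        - math.log((irrelevant[term] + 1) / (n_irr + 2))
        for term in vocab
    }


def prioritise(unscreened, weights):
    """Return unscreened records sorted from most to least likely relevant."""
    def score(text):
        return sum(weights.get(term, 0.0) for term in set(tokenize(text)))
    return sorted(unscreened, key=score, reverse=True)


# Hypothetical records that a reviewer has already screened.
labelled = [
    ("Randomised trial of exercise for depression in adults", True),
    ("Exercise therapy and depression: a controlled trial", True),
    ("Soil composition in alpine meadows", False),
    ("Alpine plant diversity survey", False),
]

# Hypothetical records still waiting to be screened.
unscreened = [
    "Survey of alpine soil microbes",
    "Trial of yoga exercise for depression",
    "Hospital parking policy review",
]

ranked = prioritise(unscreened, term_weights(labelled))
for title in ranked:
    print(title)
```

Note that nothing is excluded automatically here: every record stays in the queue and is only reordered, which matches the prioritisation use case rather than automatic elimination.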

O'Mara-Eves, Thomas, McNaught, Miwa, and Ananiadou (2015) systematically reviewed the literature on text mining in the record screening (i.e., study selection or study identification) phase and concluded that prioritising records for screening could be considered a safe method in live reviews, but that using screening software as a second reviewer should be done with caution and that using text mining to automatically eliminate studies needs more investigation. Below are some of the tools they explored as well as others that have appeared on the scene since they published their review. Librarians and other advanced searchers are well situated to advise research groups on the use of these tools in the knowledge synthesis workflow.

Study selection text mining tools for non-programmers

  • See: van de Schoot, R., de Bruin, J., Schram, R., Zahedi, P., de Boer, J., Weijdema, F., Kramer, B., Huijts, M., Hoogerwerf, M., Ferdinands, G., Harkema, A., Willemsen, J., Ma, Y., Fang, Q., Hindriks, S., Tummers, L., & Oberski, D. L. (2021). An open source machine learning framework for efficient and transparent systematic reviews. Nature Machine Intelligence, 3(2), 125-133. https://doi.org/10.1038/s42256-020-00287-7 

 

  • See: Gartlehner, G., Wagner, G., Lux, L., Affengruber, L., Dobrescu, A., Kaminski-Hartenthaler, A., & Viswanathan, M. (2019). Assessing the accuracy of machine-assisted abstract screening with DistillerAI: A user study. Systematic Reviews, 8(1), 277. https://doi.org/10.1186/s13643-019-1221-3
  • See: Olofsson, H., Brolund, A., Hellberg, C., Silverstein, R., Stenström, K., Österberg, M., & Dagerhamn, J. (2017). Can abstract screening workload be reduced using text mining? User experiences of the tool Rayyan. Research Synthesis Methods, 8(3), 275-280. https://doi.org/10.1002/jrsm.1237

References on text mining in study selection

Text mining in study selection/screening

In addition to the references below, you can use the following search strategy in Google Scholar to identify more literature on these and other tools (this is not a comprehensive search): 

intitle:"abstract screening"|asreview|abstrackr|"citation screening"|colandr|distillerai|distillersr|"eppi-reviewer"|rayyan|robotanalyst screening|eligibility|"study selection"|prioritization
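If you want to share or bookmark this search, the sketch below shows one way to turn the strategy into a link using Python's standard library. It assumes that Google Scholar accepts the search string in the `q` parameter of `https://scholar.google.com/scholar`; the variable names are only for illustration.

```python
from urllib.parse import urlencode

# The search strategy from this guide, copied as-is.
query = (
    'intitle:"abstract screening"|asreview|abstrackr|"citation screening"'
    '|colandr|distillerai|distillersr|"eppi-reviewer"|rayyan|robotanalyst '
    'screening|eligibility|"study selection"|prioritization'
)

# urlencode percent-encodes the quotes, colon, and pipes, and turns spaces into "+".
url = "https://scholar.google.com/scholar?" + urlencode({"q": query})
print(url)
```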

Suggested references

  • Gartlehner, G., Wagner, G., Lux, L., Affengruber, L., Dobrescu, A., Kaminski-Hartenthaler, A., & Viswanathan, M. (2019). Assessing the Accuracy of Machine-Assisted Abstract Screening with DistillerAI: A User Study. Systematic Reviews, 8(1), 277. https://doi.org/10.1186/s13643-019-1221-3

  • Gates, A., Guitard, S., Pillay, J., Elliott, S. A., Dyson, M. P., Newton, A. S., & Hartling, L. (2019). Performance and Usability of Machine Learning for Screening in Systematic Reviews: A Comparative Evaluation of Three Tools. Systematic Reviews, 8(1), 278. https://doi.org/10.1186/s13643-019-1222-2

  • Hamel, C., Hersi, M., Kelly, S. E., Tricco, A. C., Straus, S., Wells, G., Pham, B., & Hutton, B. (2021). Guidance for Using Artificial Intelligence for Title and Abstract Screening While Conducting Knowledge Syntheses. BMC Medical Research Methodology, 21(1), 285. https://doi.org/10.1186/s12874-021-01451-2

  • O'Mara-Eves, A., Thomas, J., McNaught, J., Miwa, M., & Ananiadou, S. (2015). Using Text Mining for Study Identification in Systematic Reviews: A Systematic Review of Current Approaches. Systematic Reviews, 4, 5. https://doi.org/10.1186/2046-4053-4-5

  • Olorisade, B. K., de Quincey, E., Brereton, P., & Andras, P. (2016). A Critical Analysis of Studies That Address the Use of Text Mining for Citation Screening in Systematic Reviews. In Proceedings of the 20th International Conference on Evaluation and Assessment in Software Engineering, Limerick, Ireland. https://doi.org/10.1145/2915970.2915982

  • Przybyla, P., Brockmeier, A. J., Kontonatsios, G., Le Pogam, M. A., McNaught, J., von Elm, E., . . . Ananiadou, S. (2018). Prioritising References for Systematic Reviews with RobotAnalyst: A User Study. Research Synthesis Methods. https://doi.org/10.1002/jrsm.1311

  • van de Schoot, R., de Bruin, J., Schram, R., Zahedi, P., de Boer, J., Weijdema, F., Kramer, B., Huijts, M., Hoogerwerf, M., Ferdinands, G., Harkema, A., Willemsen, J., Ma, Y., Fang, Q., Hindriks, S., Tummers, L., & Oberski, D. L. (2021). An Open Source Machine Learning Framework for Efficient and Transparent Systematic Reviews. Nature Machine Intelligence, 3(2), 125-133. https://doi.org/10.1038/s42256-020-00287-7 

  • Wang, Z., Nayfeh, T., Tetzlaff, J., O'Blenis, P., & Murad, M. H. (2020). Error Rates of Human Reviewers During Abstract Screening in Systematic Reviews. PLoS One, 15(1), e0227742. https://doi.org/10.1371/journal.pone.0227742

Liaison Librarian

Genevieve Gore
Liaison Librarian, Schulich Library of Physical Sciences, Life Sciences, and Engineering