Text mining for searching and screening the literature

This guide defines text mining and gives an overview of its application in search strategy development and study selection. It includes a list of tools and resources that librarians or other motivated searchers may wish to try.

Text mining training in R

If you want to take your exploration of text mining further and are new to programming, I would suggest starting with R, an open source programming language originally developed for statistical analysis. That said, Python, another open source language, is a powerful alternative to R and is more popular in data science in general.

R has a strong community that develops many easy-to-use packages with preprogrammed functions and decent documentation. The main R package repository, CRAN (Comprehensive R Archive Network), is "strictly maintained" and includes "elaborate checks" (Welbers 2016). Many packages overlap in what they do, so I would also suggest picking one (such as the tm package or quanteda) and sticking to it, at least at first.

Liaison Librarian

Genevieve Gore
Liaison Librarian, Schulich Library of Physical Sciences, Life Sciences, and Engineering

McGill Library