Guides: AI and Text Mining for Searching and Screening the Literature: TerMine

Using TerMine

Using TerMine to find multi-word terms

Available from http://www.nactem.ac.uk/software/termine/

TerMine is a useful tool for extracting high-frequency multi-word terms from a corpus. However, it treats the corpus as a single document and does not take into account patterns across the individual documents (i.e., bibliographic records). As a result, it is not possible to tell whether a term is highly frequent in only one record or common across many records in the corpus.
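
To illustrate the difference between overall frequency and frequency across records, here is a minimal Python sketch. It is not part of TerMine; the function name `term_counts` and the sample records are hypothetical, and the matching is a simple case-insensitive substring count.

```python
def term_counts(records, term):
    """Return (total occurrences, number of records containing the term)."""
    needle = term.lower()
    # Count occurrences of the term in each record separately
    per_record = [record.lower().count(needle) for record in records]
    total = sum(per_record)
    records_with_term = sum(1 for n in per_record if n > 0)
    return total, records_with_term


# Hypothetical bibliographic records, for illustration only
records = [
    "Systematic review of text mining. Text mining speeds screening; text mining helps.",
    "Machine learning for citation screening.",
    "A systematic review of screening tools.",
]

print(term_counts(records, "text mining"))        # (3, 1)
print(term_counts(records, "systematic review"))  # (2, 2)
```

Here "text mining" has the higher overall count but appears in only one record, which is the kind of pattern that cannot be seen when the corpus is analyzed as a single file.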

TerMine integrates an automatic term recognition algorithm based on C-values (a method combining linguistic and statistical analyses) and AcroMine acronym recognition (using an acronym dictionary generated from MEDLINE)
- Based on natural language processing techniques
Includes an option to select a part-of-speech (POS) tagger: GENIA Tagger for biomedical texts or Tree Tagger for generic texts
TerMine is available through a free web demonstration or for download upon request; it is also built into the EPPI-Reviewer software, which is a paid product
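
For readers curious about how a C-value is derived, the following Python sketch follows the commonly published form of the C-value formula: the score is log2 of the term length (in words) multiplied by the term's frequency, reduced by the average frequency of the longer candidate terms that contain it. This is a simplified illustration and not TerMine's actual implementation, which may differ in detail; the function names and the frequencies are hypothetical.

```python
import math


def contains(longer, shorter):
    """True if `shorter` occurs as a contiguous word sequence inside `longer`."""
    n = len(shorter)
    return any(longer[i:i + n] == shorter for i in range(len(longer) - n + 1))


def c_value(term, freq):
    """Simplified C-value of `term`.

    `freq` maps candidate terms (tuples of words) to their frequencies.
    """
    # Longer candidate terms in which this term is nested
    nesting = [t for t in freq if len(t) > len(term) and contains(t, term)]
    length_weight = math.log2(len(term))
    if not nesting:
        return length_weight * freq[term]
    # Subtract the average frequency of the longer terms that contain it
    nested_mean = sum(freq[t] for t in nesting) / len(nesting)
    return length_weight * (freq[term] - nested_mean)


# Hypothetical candidate terms and frequencies, for illustration only
freq = {
    ("soft", "contact", "lens"): 5,
    ("contact", "lens"): 12,
}

for term in freq:
    print(" ".join(term), round(c_value(term, freq), 2))
```

In this sketch, "contact lens" is discounted for the 5 occurrences that belong to the longer term "soft contact lens", while the longer term gains weight from its length.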

Suggestions for using TerMine to identify multi-word terms

Collect the bibliographic records you would like to analyze in an EndNote library (or other citation management software)
Using EndNote: Create an output style that includes the record fields you would like to analyze for high-frequency terms (e.g., title, abstract, keywords), then export the records to a text file using that output style
From the TerMine Web Demonstration interface:
- Choose the text file you saved using the Local text file option
- Select GENIA Tagger version 2.1 for biomedical records
- Click Analyze
- On the results page, if the number of terms is too high, you can change the C-value threshold to highlight only terms with a given C-value or higher
- Select the in table option to display the list of terms by C-value, in descending order
- Copy the terms
- In Excel: Use Paste Special and select Text to preserve the table format
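
As an alternative to Excel, the copied terms could be filtered with a short script. The Python sketch below assumes the copied table has been saved as tab-separated text with three columns (rank, term, score); that layout, the function name `filter_terms`, and the sample data are assumptions for illustration and should be checked against your own output.

```python
import csv
import io


def filter_terms(pasted_text, min_c_value):
    """Return (term, score) pairs with score >= min_c_value, highest first."""
    reader = csv.reader(io.StringIO(pasted_text), delimiter="\t")
    kept = []
    for row in reader:
        if len(row) < 3:
            continue  # skip blank or incomplete lines
        term, score = row[1].strip(), row[2].strip()
        try:
            value = float(score)
        except ValueError:
            continue  # skip the header row or any non-numeric score
        if value >= min_c_value:
            kept.append((term, value))
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical pasted table, for illustration only
sample = (
    "Rank\tTerm\tScore\n"
    "1\tsystematic review\t12.5\n"
    "2\ttext mining\t7.0\n"
    "3\tcitation screening\t2.0\n"
)

print(filter_terms(sample, 5.0))
# [('systematic review', 12.5), ('text mining', 7.0)]
```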

Liaison Librarian

Genevieve Gore

she/her


Contact:

Macdonald Stewart Library Building
809 Sherbrooke St W
Montreal QC H3A 0C1

Subjects: Biomedical ethics, Health sciences, Medicine, Public health

Librarian

Sabine Calleja

she/her/elle


Contact:

Schulich Library of Physical Sciences, Life Sciences, and Engineering

Subjects: Health sciences, Medicine, Nursing