Stansfield, O'Mara-Eves, and Thomas (2017) report five ways in which text mining tools can assist in search strategy development:
Using text mining techniques to increase the objectivity of search strategies requires a more sophisticated use of tools that librarians or other searchers may or may not be prepared to implement. Setting cutoffs for high-frequency terms, for example, and calculating those frequencies require a reasonably large set of relevant references (which can be derived, for instance, from the included studies of related systematic reviews) as well as a population set of random records. Testing terms against the population set reveals whether a term is high-frequency across documents in general (for example, words that are frequent because of common check tags such as 'human') or only in the relevant documents.
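As a minimal sketch of this comparison, the following base R code flags terms that are frequent in a relevant set but not in a random population set. The sample records, the doc_freq helper, and the 0.5 cutoff are all illustrative assumptions rather than recommended values.

```r
# Hypothetical titles standing in for exported bibliographic records
relevant <- c("randomised trial of exercise therapy in adults",
              "exercise intervention for chronic low back pain")
population <- c("human genome sequencing methods",
                "principles of judicial discretion in appellate courts")

# Proportion of documents in which each lowercased token appears
doc_freq <- function(docs) {
  tokens <- lapply(tolower(docs), function(d) unique(strsplit(d, "\\W+")[[1]]))
  table(unlist(tokens)) / length(docs)
}

rel_df <- doc_freq(relevant)
pop_df <- doc_freq(population)

# Terms above an arbitrary cutoff in the relevant set...
cutoff <- 0.5
candidates <- names(rel_df[rel_df >= cutoff])

# ...minus terms that are also common across documents in general
baseline <- pop_df[candidates]
baseline[is.na(baseline)] <- 0
candidates[baseline < cutoff]
```

In a real project, both sets would be far larger than two records each, and the cutoff would be tuned by inspecting the ranked frequency lists rather than fixed in advance.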
Text mining, like data science in general, also involves a great deal of preprocessing, which tools may or may not handle. Preprocessing includes data cleaning and normalization techniques such as:

- tokenization (splitting text into individual words or phrases)
- case normalization (e.g., lowercasing all text)
- removal of punctuation and numbers
- removal of stop words (very common words such as 'the' or 'of')
- stemming or lemmatization (reducing words to a root form)
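As a minimal sketch of how such steps can be chained, the following uses the tm package (discussed in the next paragraph); the sample record title is hypothetical, and the choice and order of steps would depend on the task.

```r
library(tm)  # stemDocument() additionally requires the SnowballC package

# One hypothetical record title standing in for a set of exported records
docs <- VCorpus(VectorSource(
  "Effects of Exercise Therapy on Chronic Low-Back Pain: A Randomised Trial"
))

docs <- tm_map(docs, content_transformer(tolower))  # case normalization
docs <- tm_map(docs, removePunctuation)             # strip punctuation
docs <- tm_map(docs, removeNumbers)                 # strip digits
docs <- tm_map(docs, removeWords, stopwords("en"))  # drop stop words
docs <- tm_map(docs, stemDocument)                  # crude stemming
docs <- tm_map(docs, stripWhitespace)               # collapse extra spaces

content(docs[[1]])
# roughly: "effect exercis therapi chronic lowback pain randomis trial"
```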
Some of the tools listed allow for customization of these procedures, while others are preconfigured. Programming tools such as the tm package in R or quanteda allow much more flexibility than some of the tools covered here, but they are also much harder to use for searchers who are not accustomed to programming.
These tools should be used with caution: they may apply the correct syntax when translating a strategy to another database/interface, making it seem that the subject headings have also been mapped correctly, when in fact they merely change the syntax without adjusting the subject headings to the corresponding vocabulary (e.g., when translating from PubMed or Ovid MEDLINE to Embase, they will continue to use MeSH terms instead of EMTREE terms). They are useful if you understand the fundamentals of searching in the applicable databases and how the databases/platforms work, but the translated searches will still require review and editing.
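As a hypothetical illustration of this pitfall, assuming Ovid syntax for both databases and taking 'heart infarction' as the Emtree counterpart of the MeSH heading 'Myocardial Infarction':

```
Ovid MEDLINE line:      exp Myocardial Infarction/
Automated translation:  exp Myocardial Infarction/   <- syntax carried over, but still a MeSH heading
Manually corrected:     exp heart infarction/        <- remapped to the corresponding Emtree term
```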