Guides: AI and Text Mining for Searching and Screening the Literature: ChatGPT, Copilot, and other AI chatbots

ChatGPT, Copilot, and other AI chatbots

It is tempting to ask AI chatbots based on large language models (LLMs), such as Copilot (officially supported when signed in through McGill), ChatGPT, Gemini, or DeepSeek, to develop database search strategies, using your review question and some contextual information (e.g., that you are conducting a systematic review, that you are searching PubMed) as a prompt.

ChatGPT can admittedly be very useful for identifying synonyms for a given concept: it may suggest terms that authors use and that you had not thought of including in your search, although it typically also hallucinates irrelevant terms. For an example, see this prompt produced by the University of Toronto’s Health Sciences Library.

We do not recommend that searchers use LLMs to develop a database search strategy. Do not use them unless you are capable of reviewing, editing, and enhancing what they have produced. That requires an understanding of the basics of structuring searches properly with, for example, parentheses and Boolean operators, as well as knowledge of how to search specific databases, including:

How to break down your question into the concepts that should reasonably be included in your search
- AI chatbots are often too restrictive or fail to break down the question properly

They may generate a search approach that seems perfectly reasonable to non-expert searchers but that does something like include a concept that is too limiting
- E.g., we have seen them include the concept of placebos AND randomized controlled trials (RCTs) when breaking down a question into concepts and search terms to identify RCTs

How to identify/find actual subject headings (e.g., MeSH terms in MEDLINE)
- AI chatbots are prone to making up subject headings; you need to know what subject headings are and how they work before trusting a chatbot with this task

How to use truncation
- AI chatbots may fail to apply truncation where it is appropriate, or may truncate words incorrectly

How to choose which fields to search
- AI chatbots may use overly restrictive or overly broad field searching, or invent their own fields

How to use proximity operators correctly
- AI chatbots may be incapable of applying these appropriately
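
The structural basics mentioned above (parentheses, Boolean operators, truncation, field tags) can be sketched in a few lines of Python. This is only an illustration, not a validated search strategy: the helper names `build_query` and `parentheses_balanced` are hypothetical, the terms are examples, and the `[mh]` and `[tiab]` tags assume PubMed syntax. Any subject heading would still need to be confirmed in the MeSH database, and the resulting string reviewed by someone who knows the database.

```python
def build_query(concepts):
    """Join synonyms with OR inside parentheses, then join concepts with AND."""
    blocks = []
    for terms in concepts:
        if not terms:
            raise ValueError("each concept needs at least one term")
        blocks.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(blocks)


def parentheses_balanced(query):
    """Return True if every '(' is closed by a later ')'.

    Simplification: parentheses inside quoted phrases are counted too.
    """
    depth = 0
    for ch in query:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared before its '('
                return False
    return depth == 0


# Illustrative terms only (assumed PubMed syntax); verify headings in MeSH.
concepts = [
    ['"Hypertension"[mh]', "hypertens*[tiab]", '"high blood pressure"[tiab]'],
    ['"Exercise"[mh]', "exercis*[tiab]", '"physical activity"[tiab]'],
]
query = build_query(concepts)
print(query)
print(parentheses_balanced(query))
```

A check like this catches only unbalanced parentheses. It cannot tell whether the concepts are the right ones, whether a subject heading actually exists, or whether truncation retrieves what you intend; those judgments still require a knowledgeable searcher.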

For some examples of AI chatbot search suggestions gone awry, see:

Using ChatGPT for PubMed Searches: Be Smart!

More AI Tools: Using Gemini for PubMed Searches

An additional, major caveat is that these systems are creative: they may generate lists of suggested references that appear correct based on the authors or journals cited, but they are highly prone to hallucinations and often generate references that do not exist at all! Databases like MEDLINE, CINAHL, Embase, PubMed, Cochrane, and others listed here should still be the first place to search for peer-reviewed literature. Large language models do not include many high-quality journals in their data corpora (i.e., the data "behind the scenes" from which the model derives its information) because these journals are proprietary products that can only be accessed with subscriptions (many of which McGill students, faculty, and staff have access to because we pay for them!). Therefore, tempting as it may seem, AI chatbots DO NOT and CANNOT replace good old-fashioned database searching.