Text Data Mining (TDM)

A research guide to help you identify public and licensed text sources for text data mining, as well as tools for text analysis.

Text Corpora with Data Mining Rights Licensed to McGill

McGill Library works with vendors whenever possible to include text and data mining rights in future agreements and can help negotiate access for specific projects. Some of the licensed databases for which McGill has negotiated text data mining rights are listed below.

If you would like access to any of these collections for text mining purposes, please contact your Liaison Librarian for assistance. 

Free Digital Text Corpora

Digital archives and publishers are increasingly making large corpora of text available for researchers to mine. Here is a select list of sources that make text corpora freely available.

McGill's .txtLAB has metadata and full-text data sets available for download.

Tools for Mining Text Corpora

Tools and Tutorials for Mining Social Media

Tools and Tutorials for Cleaning, Visualizing and Analyzing Textual Data
