As with journal articles, books, or any other source of information that contributed to your research, it is important to cite datasets so that the data creator, producer or distributor is credited. This includes datasets you produce yourself, even if they have not been published. Citing data also helps other researchers discover and access relevant datasets, which in turn promotes data sharing, reproducibility and the verification of research results.
Citing data and statistical sources is not as standardized as citing other sources, and it has often been practiced inconsistently. A number of data providers and distributors, as well as major citation style guides, include some data citation guidelines. In reality, though, most of the major style guides (APA, MLA, the Chicago Manual of Style) do not directly address data citation, and data is not recognized as a format in many citation management tools and tutorials. This guide offers advice and points to the guidelines that do exist.
Please note: If you are asked to follow a specific citation format from a particular style guide (e.g. APA), follow that format as consistently as possible. However, not all style guides give recommendations for citing datasets, and those that do may not explain in enough detail how to cite the dataset you are using. In those cases it is better to provide more information rather than less, so that the dataset can be located more easily. Keep in mind that datasets and statistical tables contain unique elements not specifically addressed by most citation styles.
If the recommended style guide does not include examples of how to cite data, you may consider some of these tips: