Use citations to give credit for datasets and to promote reproducibility and reuse.
Why Cite Data?
As funding agencies and researchers increasingly move to make science more open, many organizations are pushing for increased data citation. Authors should cite datasets used in published papers for the same reasons they would cite any work:
- To give credit to the authors/creators of the dataset
- To increase the transparency and reproducibility of the paper
- To enable reuse of the dataset by interested readers
Moreover, the move toward data citation reflects an increasing recognition of datasets as stand-alone pieces of academic work. A dataset citation can go on researchers’ CVs alongside peer-reviewed articles, and citation metrics for datasets can be tracked alongside those for papers.
What Should You Include in Your Citation?
In general, a dataset citation should contain:
- Author: Names of each organization or individual responsible for creating the dataset.
- Title: The complete title of the dataset.
- Date: The date the dataset was published or disseminated.
- Version/Edition Number: The version or edition of the dataset, if applicable.
- Publisher/Distributor: For many datasets, this will be the archive or repository where the dataset is housed.
- Identifier or Location: Many datasets are assigned a DOI or accession number. A link might also be included in addition to, or in the absence of, these identifiers.
Some citation styles have specific formats for dataset citation (for example, APA and NLM), but many do not. In the absence of specific guidance, make sure to include the elements listed above.
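As a rough illustration of how the elements above fit together, the following Python sketch assembles them into a single citation string. The class name `DatasetCitation`, its field names, and the ordering and punctuation of the output are assumptions made for this example; they do not reproduce APA, NLM, or any other official style. The sample values are taken from the General Social Survey example in this guide.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetCitation:
    """Hypothetical container for the citation elements listed above."""
    authors: List[str]                # organizations or individuals who created the dataset
    title: str                        # complete title of the dataset
    year: str                         # date published or disseminated
    publisher: str                    # often the archive or repository
    version: Optional[str] = None     # version/edition, if applicable
    identifier: Optional[str] = None  # DOI, accession number, or link

    def format(self) -> str:
        # Generic ordering chosen for illustration only; not an official style.
        parts = ["; ".join(self.authors), f"({self.year})", self.title]
        if self.version:
            parts.append(self.version)
        parts.append(self.publisher)
        if self.identifier:
            parts.append(self.identifier)
        # Strip trailing periods so joining does not produce "..".
        return ". ".join(part.rstrip(".") for part in parts) + "."


example = DatasetCitation(
    authors=["Smith, T.W.", "Marsden, P.V.", "Hout, M."],
    title="General social survey, 1972-2010 cumulative file",
    year="2011",
    publisher="Inter-university Consortium for Political and Social Research",
    version="ICPSR31521-v1",
    identifier="doi:10.3886/ICPSR31521.v1",
)
print(example.format())
```

Keeping the version and identifier optional mirrors the list above, where a version number applies only to some datasets and a link may stand in when no DOI or accession number exists.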
Examples
APA Style: Smith, T.W., Marsden, P.V., & Hout, M. (2011). General social survey, 1972-2010 cumulative file (ICPSR31521-v1) [data file and codebook]. Chicago, IL: National Opinion Research Center [producer]. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor]. doi: 10.3886/ICPSR31521.v1
Chicago: Milberger, Sharon. Evaluation of Violence Against Women With Physical Disabilities in Michigan, 2000-2001. ICPSR version. Detroit: Wayne State University, 2002. Distributed by Ann Arbor, MI: Inter-University Consortium for Political and Social Research, 2002. doi: 10.3886/ICPSR03414.
Citation Managers and Data Citation
EndNote and Zotero both offer a dataset item type. To cite a dataset with either citation manager, select the dataset item type when creating the entry.
Sources
Ball, Alex, and Monica Duke. “How to Cite Datasets and Link to Publications.” Digital Curation Centre, July 30, 2015. https://www.dcc.ac.uk/guidance/how-guides/cite-datasets#sec:elements.
IASSIST. “Quick Guide to Data Citation.” ICPSR, 2012. https://www.icpsr.umich.edu/files/ICPSR/enewsletters/iassist2.html.
Oberdick, Benjamin. “How to Cite Data: General Info.” MSU Libraries. Accessed February 12, 2024. https://libguides.lib.msu.edu/c.php?g=96245&p=626236.