Open Data is the movement to make research datasets open, thereby enabling the sharing, reuse, and transparency of research findings. Similar to the Open Access movement for articles, Open Data is critical for advancing science and humanities research. While there are many reasons to make your research data accessible and reusable, here we focus on two groups of motivations: policy requirements, and individual and disciplinary benefits.
Benefits of Open Data
A growing number of funders, publishers (including PLOS, Elsevier, and Springer Nature), institutions, and other stakeholders in the research enterprise require that researchers make their data publicly accessible. See, for example, the data management plan requirements from the National Science Foundation and the National Institutes of Health. In 2022, the Office of Science and Technology Policy updated the public access policy to increase the number of federal agencies with data-sharing requirements and make publicly funded publications and research freely accessible without an embargo or cost.
Beyond meeting policy requirements, open data offers individual and disciplinary benefits:
- Credit – Data citation and altmetrics give proper recognition to researchers.
- Efficiency – Open data prevents duplication of effort, accelerating discovery.
- Discoverability – Open datasets are more easily found and cited.
- Quality – Open data promotes high data curation standards.
- Integrity – Open data ensures complete and verifiable research.
- Collaboration – Open data enables collaboration between researchers.
Planning for and Managing Research Data
Making your data open – and meeting other funder and institutional requirements – will go more smoothly if you plan ahead. You can create data management and sharing plans that meet institutional and funder requirements with the DMPTool.
The DMPTool is free to use and provides support on a variety of issues, including:
- File formats
- Data documentation
- Sample plans
- Generation of a persistent identifier for plans
Researchers increasingly face new expectations and obligations with regard to data management. To assess and improve your data management practices over the course of a research project, see the RDM Guide for Researchers.
How to Open Up Your Data
There are several options for making data openly available and citable:
- A domain-specific or general repository. Many domains have their own standards for making data publicly accessible. If you are unsure whether your domain has a standard, or if you would like to use a general repository, consult publisher repository guides or re3data. Some campus libraries also provide data consultation services (see Campus Resources).
- UC Data Publishing Platform: Dryad is an open-source, research data curation and publication platform. All UC campuses are proud members of Dryad and, because of this, offer Dryad as a free service for all UC researchers to publish and archive their data. Datasets published in Dryad receive a permanent citation and can be versioned at any time. Dryad is integrated with hundreds of journals and is an easy way to both publish data and comply with funder and publisher mandates. Check out published datasets or submit yours at: https://datadryad.org.
Important factors to consider when sharing data:
- Document – Provide clear documentation (e.g., a README file) explaining the context and methodology.
- Format – Use standardized formats and naming conventions for accessibility.
- License – Apply an open license (e.g., CC0) to define permissible uses of the data.
- Publish – Deposit data in a repository that meets the requirements of the Desirable Characteristics of Data Repositories.
- Cite – Get a DOI for the data and use ORCID IDs to properly cite, track, and get credit for research.
- Anonymize – Remove or de-identify information that could cause harm to communities, so that sensitive information is protected and the data remain reusable.
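Some of these factors, particularly documentation and file naming, can be checked mechanically before deposit. The following is a minimal Python sketch rather than any repository's actual requirement: the function name `check_dataset`, the naming rule (lowercase letters, digits, underscores, and hyphens), and the short list of open formats are all assumptions chosen for illustration and should be adapted to your discipline's conventions.

```python
import re
from pathlib import Path

# Assumed convention for illustration: lowercase letters, digits, underscores,
# and hyphens, followed by a single extension. Adapt to your field's norms.
NAME_PATTERN = re.compile(r"^[a-z0-9][a-z0-9_-]*\.[a-z0-9]+$")

# Assumed examples of open, non-proprietary formats; not an official list.
OPEN_EXTENSIONS = {".csv", ".tsv", ".txt", ".json", ".md"}


def is_readme(path):
    """True if the file name starts with 'readme' (case-insensitive)."""
    return path.name.lower().startswith("readme")


def check_dataset(folder):
    """Return a list of human-readable problems found in a dataset folder."""
    folder = Path(folder)
    problems = []

    # Document: expect a README at the top level of the dataset.
    top_level_files = [p for p in folder.iterdir() if p.is_file()]
    if not any(is_readme(p) for p in top_level_files):
        problems.append("missing README file")

    # Format: inspect every data file, including those in subfolders.
    data_files = [p for p in folder.rglob("*") if p.is_file() and not is_readme(p)]
    for path in sorted(data_files):
        if not NAME_PATTERN.match(path.name):
            problems.append(f"non-standard file name: {path.name}")
        if path.suffix.lower() not in OPEN_EXTENSIONS:
            problems.append(f"possibly proprietary format: {path.name}")

    return problems


if __name__ == "__main__":
    # Hypothetical usage: check the current directory and print any problems.
    for problem in check_dataset("."):
        print(problem)
```

A check like this only covers what can be detected from file names. Licensing, repository choice, citation, and de-identification still require human judgment.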