Research data & code sharing

Policy on sharing research data, analytical code and related materials for articles published in IUMS journals, supporting transparency, reproducibility and responsible reuse in line with FAIR principles.

Applies to all research articles across IUMS journals
Policy v1.0 – last updated April 2025

Scope & purpose

Why IUMS encourages responsible data and code sharing.

IUMS journals support the responsible sharing of research data and analytical code as a core element of transparent and reproducible science. Making data and code findable and reusable:

  • helps others to verify results and build on published work;
  • reduces unnecessary duplication of studies, especially where participants are exposed to risk;
  • enables secondary analyses, meta-analyses and new insights; and
  • increases the visibility and potential impact of the original research.

This policy applies to all articles in IUMS journals that report results from empirical research, including clinical, laboratory, epidemiological, qualitative and mixed-methods studies. Some journals may impose stricter requirements; where this is the case, the journal-specific policy prevails.

What counts as “research data”?

Types of data and related materials covered by this policy.

For IUMS journals, “research data” is understood broadly as any information that underpins the findings and conclusions of an article. This may include, for example:

  • clinical and epidemiological datasets (e.g. participant-level data, registry data, survey responses);
  • laboratory measurements and assay results;
  • imaging data (radiology, pathology, microscopy) and derived metrics;
  • omics data (genomics, transcriptomics, proteomics, metabolomics);
  • qualitative data (interview transcripts, focus group notes, observational field notes), subject to appropriate de-identification;
  • simulation outputs, model parameters and calibration data; and
  • processed datasets used for statistical analyses and figures.

Related materials such as analytical code, software scripts, study protocols, data dictionaries and case report forms (CRFs) are also considered part of the broader research output and are covered by this policy.

FAIR data principles

Supporting data that are Findable, Accessible, Interoperable & Reusable.

Where feasible, IUMS journals encourage authors to manage and share data according to the FAIR principles:

  • Findable: data and metadata are assigned persistent identifiers (such as DOIs) and indexed in searchable repositories.
  • Accessible: data can be accessed using clear, standard protocols, subject to ethical and legal restrictions.
  • Interoperable: data use well-defined formats, vocabularies and standards that facilitate integration with other datasets.
  • Reusable: data are accompanied by rich metadata and clear licensing terms that enable appropriate reuse and citation.
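As a rough illustration, the four principles can be turned into a simple completeness check on a dataset's metadata record before deposit. The sketch below is not an IUMS requirement: the field names (`identifier`, `access_protocol`, `file_format`, `licence`, `description`) are hypothetical and would need to be mapped to whatever schema the chosen repository uses.

```python
# Hypothetical minimal metadata fields, one or more per FAIR principle.
# The field names are illustrative assumptions, not a repository schema.
REQUIRED_FIELDS = {
    "identifier": "Findable: persistent identifier such as a DOI",
    "access_protocol": "Accessible: how the data are retrieved",
    "file_format": "Interoperable: open, well-defined format",
    "licence": "Reusable: clear licensing terms",
    "description": "Reusable: rich descriptive metadata",
}


def missing_fair_fields(record: dict) -> list:
    """Return the required fields that are absent or empty in the record."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Placeholder record: the licence has been left empty on purpose.
example = {
    "identifier": "[DOI]",
    "access_protocol": "HTTPS",
    "file_format": "CSV",
    "licence": "",
    "description": "De-identified survey responses",
}

for field in missing_fair_fields(example):
    print(f"Missing '{field}' ({REQUIRED_FIELDS[field]})")
```

A check like this only confirms that the fields are filled in; whether the identifier resolves and the licence is appropriate still has to be verified by the authors.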

Not all data can be shared openly (for example, highly sensitive clinical datasets or proprietary information). In such cases, authors should still aim for transparency by describing what can be shared, in what form and under which conditions.

Data Availability Statements (DAS)

A clear, mandatory statement on where and how data can be accessed.

All research articles in IUMS journals are expected to include a Data Availability Statement (DAS). The DAS should summarise, in a concise paragraph:

  • whether the data underlying the article are openly available, available on reasonable request, or not shareable;
  • where data are stored (repository name) and under what identifier (DOI or accession number); and
  • any conditions or restrictions on access (for example, controlled access due to participant privacy or legal/contractual constraints).

Example templates (to be adapted to the specific study):

  • “De-identified participant data and analysis scripts are available in the [Name] repository at [DOI/URL], under a CC BY 4.0 licence.”
  • “Data underlying this article cannot be shared publicly due to [brief reason, e.g. local legal restrictions or consent wording] but are available from the corresponding author upon reasonable request, subject to institutional approval.”
  • “This study uses data obtained under a data use agreement that does not permit redistribution. Aggregate outputs supporting the conclusions are included in the article and its supplementary files.”

The DAS must be consistent with ethical approvals, consent forms and any statements about data sharing in clinical trial registries or funder policies.
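Research groups that prepare many submissions sometimes generate the DAS from a small structured record, so that the repository, identifier and access conditions are never omitted. The following is a minimal sketch under that assumption; the class `DataAvailability`, the function `render_das` and the three access labels are hypothetical names, and the wording simply reuses the example templates above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataAvailability:
    """Hypothetical record of the facts a DAS needs to state."""
    access: str                       # "open", "on_request" or "restricted"
    repository: Optional[str] = None  # repository name
    identifier: Optional[str] = None  # DOI, URL or accession number
    licence: Optional[str] = None     # e.g. "CC BY 4.0"
    reason: Optional[str] = None      # why data cannot be shared publicly


def render_das(info: DataAvailability) -> str:
    """Render a draft DAS; the text must still be adapted to the study."""
    if info.access == "open":
        if not (info.repository and info.identifier):
            raise ValueError("Open data need a repository name and an identifier.")
        text = (
            "De-identified participant data and analysis scripts are available "
            f"in the {info.repository} repository at {info.identifier}"
        )
        if info.licence:
            text += f", under a {info.licence} licence"
        return text + "."
    if info.access == "on_request":
        if not info.reason:
            raise ValueError("State briefly why the data cannot be shared publicly.")
        return (
            "Data underlying this article cannot be shared publicly due to "
            f"{info.reason} but are available from the corresponding author "
            "upon reasonable request, subject to institutional approval."
        )
    if info.access == "restricted":
        return (
            "This study uses data obtained under a data use agreement that does "
            "not permit redistribution. Aggregate outputs supporting the "
            "conclusions are included in the article and its supplementary files."
        )
    raise ValueError(f"Unknown access category: {info.access!r}")


draft = render_das(
    DataAvailability("open", repository="[Name]", identifier="[DOI/URL]", licence="CC BY 4.0")
)
print(draft)
```

The generated text is only a starting point: authors remain responsible for checking it against ethics approvals, consent wording and registry statements.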

Data types & recommended formats

Using formats that support long-term access and reuse.

Authors should share data in formats that are as open, non-proprietary and widely readable as possible. Examples include:

  • Tabular data: CSV, TSV or other plain text formats (with clear encoding and column descriptions), alongside codebooks or data dictionaries.
  • Statistical data: if proprietary statistical formats are used (e.g. SPSS, Stata, SAS), consider including exported open formats plus code to reproduce key outputs.
  • Textual/qualitative data: de-identified transcripts in widely readable text formats (e.g. TXT, DOCX, PDF with appropriate metadata).
  • Images: standard formats such as TIFF, PNG, JPEG or DICOM where appropriate, with anonymisation applied at the source.
  • Omics data: deposit in domain-specific repositories that define their own recommended formats (e.g. FASTQ, BAM, VCF), and cite the accession numbers.

Metadata describing how data were collected, processed and labelled are essential for interpretation. Authors should provide sufficient documentation to allow other researchers to understand variables, units, coding schemes and any transformations applied.
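One practical way to check documentation before deposit is to confirm that every column in a shared table has an entry in the accompanying data dictionary. The sketch below assumes a CSV data file and a CSV data dictionary with a `variable` column; the layout, the function name `undocumented_columns` and the example values are invented for illustration only.

```python
import csv
import io

# Made-up example content; in practice these would be read from files
# opened with an explicit encoding, e.g. open(path, encoding="utf-8", newline="").
DATA_CSV = """participant_id,age_years,sbp_mmhg,smoker
P001,54,132,1
P002,61,145,0
"""

DICTIONARY_CSV = """variable,description,unit,coding
participant_id,Study pseudonym,,
age_years,Age at enrolment,years,
sbp_mmhg,Systolic blood pressure,mmHg,
"""


def undocumented_columns(data_text: str, dictionary_text: str) -> list:
    """Return data columns that have no row in the data dictionary."""
    header = next(csv.reader(io.StringIO(data_text)))
    documented = {
        row["variable"] for row in csv.DictReader(io.StringIO(dictionary_text))
    }
    return [column for column in header if column not in documented]


print(undocumented_columns(DATA_CSV, DICTIONARY_CSV))  # ['smoker']
```

Here the `smoker` column is flagged because the dictionary does not yet explain its coding (for example, whether 1 means current smoker).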

Repositories & persistent identifiers

Where to deposit data and how to ensure citability.

When sharing data, authors should use repositories that:

  • provide persistent identifiers (e.g. DOIs or stable accession numbers);
  • are recognised and trusted within the relevant discipline; and
  • offer long-term preservation and clear access conditions.

Options include:

  • Discipline-specific repositories (e.g. for genomic, proteomic, imaging or clinical trial data);
  • Institutional repositories operated by universities or research organisations; and
  • General-purpose repositories that issue DOIs and support open or controlled access models.

The chosen repository should be cited in the manuscript’s Data Availability Statement and, where appropriate, in the reference list. Each dataset can be cited as a research output in its own right, giving the author(s), year, dataset title, repository name and DOI.
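The citation elements listed above can be assembled programmatically, for instance when preparing a reference list. This is only a sketch: the punctuation and the "[Data set]" label are an assumed style rather than one prescribed by this policy, `format_dataset_citation` is a hypothetical helper, and the DOI shown is a placeholder that does not resolve.

```python
def format_dataset_citation(authors, year, title, repository, doi):
    """Combine author(s), year, title, repository and DOI into one string.

    The output style is an assumption; follow the target journal's
    reference style where it differs.
    """
    if not authors:
        raise ValueError("At least one author is required.")
    author_text = "; ".join(authors)
    return (
        f"{author_text} ({year}). {title} [Data set]. "
        f"{repository}. https://doi.org/{doi}"
    )


citation = format_dataset_citation(
    authors=["Author A", "Author B"],
    year=2025,
    title="Example dataset title",
    repository="Example Repository",
    doi="10.0000/placeholder",  # placeholder, not a real DOI
)
print(citation)
```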

Analytical code, software & pipelines

Sharing the logic behind your analyses.

Analytical code and software scripts are an essential part of many modern studies. Where feasible, IUMS journals encourage authors to:

  • share analysis scripts (for example in R, Python, Stata, SAS, MATLAB) used to generate key figures and tables;
  • document dependencies (software versions, packages, libraries) and execution instructions;
  • use version-control platforms (e.g. institutional Git servers) and archive a release version in a repository that assigns a DOI; and
  • provide example input files or simulated data where real data are too sensitive to share.
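For analyses written in Python, dependencies can be documented by recording the interpreter and package versions at the time the analysis is run. The sketch below uses only the standard library; the function name `describe_environment` and the package names passed to it are examples, not a required list, and other languages have their own equivalents.

```python
import platform
from importlib import metadata


def describe_environment(packages):
    """Return the Python version and the installed version of each package."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the gap instead of failing, so the report is complete.
            versions[name] = "not installed"
    return versions


# Example package names; replace with those the analysis actually imports.
environment = describe_environment(["numpy", "pandas"])
for name, version in environment.items():
    print(f"{name}=={version}")
```

Saving this output alongside the archived release gives readers a concrete starting point for rebuilding the environment.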

Code and software should be licensed under terms that clarify what reuse is permitted (for example, MIT, GPL or other recognised open-source licences), taking into account institutional and funder policies.

When code cannot be shared (for example, due to third-party restrictions), authors should explain this in the Data Availability Statement and describe the main steps of the analysis as transparently as possible.

Sensitive, clinical & identifiable data

Balancing openness with participant privacy and legal requirements.

IUMS journals recognise that not all data can or should be openly shared. Special care is required when data involve:

  • patients or study participants in clinical or social research;
  • genetic, genomic or family-related information;
  • rare diseases, small or vulnerable populations where individuals may be re-identified even after de-identification; or
  • information subject to legal or contractual restrictions.

In such cases:

  • any shared dataset must be appropriately de-identified or anonymised, consistent with ethics approvals and local regulations;
  • access may need to be mediated by data access committees, secure data environments or controlled-use agreements;
  • re-identification attempts by data users are strictly prohibited; and
  • authors should clearly explain in the DAS what level of access is possible and under which conditions.
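Before sharing a de-identified dataset, one simple screening step is to count how many records share each combination of indirect identifiers (such as age band, sex and district); combinations held by very few people carry a higher re-identification risk. The sketch below illustrates that idea only. The threshold `k`, the function name `small_groups` and the example records are assumptions, and passing such a check does not by itself establish that data are anonymised under any ethics approval or regulation.

```python
from collections import Counter


def small_groups(records, quasi_identifiers, k=5):
    """Return combinations of quasi-identifier values shared by fewer than k records."""
    counts = Counter(
        tuple(record[name] for name in quasi_identifiers) for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Made-up records for illustration; direct identifiers are already removed.
records = [
    {"age_band": "40-49", "sex": "F", "district": "A"},
    {"age_band": "40-49", "sex": "F", "district": "A"},
    {"age_band": "70-79", "sex": "M", "district": "B"},
]

risky = small_groups(records, ["age_band", "sex", "district"], k=2)
print(risky)  # {('70-79', 'M', 'B'): 1}
```

Flagged combinations might be handled by coarsening categories, suppressing values or moving the dataset to controlled access, in consultation with the relevant ethics or data governance body.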

If no data can be shared beyond aggregate results presented in the article and supplementary materials, authors must explain why and confirm that this is consistent with the original informed consent and ethics approval. This should be aligned with the Consent & privacy policy.

Licensing & conditions of reuse

Clarifying how others may reuse data and code.

To encourage legitimate reuse and citation, authors should assign clear licences to shared data and code, taking into account:

  • institutional policies;
  • funder requirements; and
  • compatibility with the article’s open access licence.

Common options for data include Creative Commons licences such as CC BY or CC BY-NC, or more restrictive terms where justified. Analytical code is often shared under recognised open-source licences (e.g. MIT, GPL, Apache).

The chosen licence should be indicated in the repository record and, where appropriate, mentioned in the Data Availability Statement. Users of shared data and code are expected to:

  • respect licence conditions and any additional access agreements;
  • avoid attempts at re-identification of individuals; and
  • acknowledge and cite data and code creators in any derived work.

Roles & responsibilities

Who is responsible for implementing this policy.

Responsibilities for data and code sharing are shared among several actors:

  • Authors and principal investigators:
    • plan data management and sharing from the start of the project;
    • ensure that data, code and metadata are organised and documented so that sharing is feasible;
    • select appropriate repositories and licences; and
    • provide accurate Data Availability Statements.
  • Institutions and sponsors:
    • support sustainable data management infrastructures and training;
    • ensure that contractual and legal frameworks enable appropriate data sharing where possible.
  • Editors & journals:
    • set expectations through clear policies;
    • check that manuscripts include appropriate DAS and repository links where required;
    • follow up on concerns about misrepresented or unavailable data.
  • Reviewers:
    • may comment on whether data and code seem sufficiently available to allow verification of key results;
    • can flag inconsistencies between statements in the manuscript and actual repository records.

Non-compliance & corrections

How IUMS journals respond when data sharing expectations are not met.

If, during review or after publication, editors become aware that:

  • Data Availability Statements are inaccurate or misleading;
  • promised data or code are not actually available and no justified reason has been given;
  • shared data contain identifiable information that should not be public; or
  • data underlying key results appear to be missing or inconsistent,

journals may:

  • request clarification, corrections or additional documentation from authors;
  • require that a dataset be deposited in an appropriate repository;
  • publish a correction, addendum or editorial note describing changes to data availability;
  • in serious cases, follow procedures outlined in the Misconduct and Corrections & retractions policies.

Where necessary, journals may contact institutions or funders to support formal investigations into potential research or publication misconduct.

Support & contact

Getting help with data and code sharing decisions.

Authors who are unsure how to apply this policy in specific circumstances—such as sensitive clinical datasets, multi-country data sharing, or complex software dependencies—are encouraged to:

  • consult their institutional data management or ethics support services; and
  • contact the editorial office of the target IUMS journal early in the process, ideally before submission.

General questions about this policy across IUMS journals can be directed to:

This page will be reviewed periodically as funder requirements, repository services and best practices in data sharing evolve.