8 Reproducibility Resources

Model releases whose performance claims cannot be reproduced, or whose code is unavailable, incomplete, or difficult to run, cost the scientific community time and effort. The following resources help others replicate and verify those claims.

  • LM Evaluation Harness

    Orchestration framework for standardizing LM prompted evaluation, supporting hundreds of subtasks.

  • Anaconda

    An environment and dependency management tool.

    Text Speech Vision
  • Colab Notebooks

    A tool to execute and share reproducible code snippets.

    Text Speech Vision
  • Docker

    A containerization tool for packaging an environment and its dependencies.

    Text Speech Vision
  • Jupyter Notebooks

    A tool to execute and share reproducible code snippets.

    Text Speech Vision
  • Semver

    A widely used protocol for versioning software, which makes it easier to reproduce results against a known release.

    Text Speech Vision
  • Reforms

    Reporting Standards for ML-based Science.

    Text Speech Vision
  • Croissant

    Croissant is an open-source metadata format developed by MLCommons to standardise the description of machine learning (ML) datasets, enhancing their discoverability, portability, and interoperability across various tools and platforms. It builds upon the schema.org vocabulary, extending it to encapsulate ML-specific attributes, including dataset structure, resources, and semantics. Croissant is particularly relevant in scenarios requiring consistent dataset documentation to facilitate seamless integration into ML workflows.

    Text Vision Speech Video Tabular
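
To illustrate the versioning convention listed under Semver, here is a minimal Python sketch of how a release script might check that an installed dependency matches the version a result was produced with. It assumes plain MAJOR.MINOR.PATCH strings only; pre-release tags and build metadata are deliberately not handled. The names `Version`, `parse_version`, and `is_compatible` are hypothetical and do not come from any of the tools above.

```python
import re
from typing import NamedTuple

# Core MAJOR.MINOR.PATCH only, with no leading zeros.
# Assumption: pre-release and build metadata are out of scope for this sketch.
_SEMVER_CORE = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse a 'MAJOR.MINOR.PATCH' string into a comparable tuple."""
    match = _SEMVER_CORE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return Version(*(int(part) for part in match.groups()))


def is_compatible(installed: str, required: str) -> bool:
    """Hypothetical helper: same major version, and installed is not older."""
    have, need = parse_version(installed), parse_version(required)
    return have.major == need.major and have >= need


if __name__ == "__main__":
    print(is_compatible("1.5.0", "1.4.2"))
    print(is_compatible("2.0.0", "1.4.2"))
```

Comparing parsed tuples rather than raw strings avoids ordering "1.10.0" before "1.9.0". For exact reproduction of a reported number, pinning the full version is usually the safer choice; a compatibility check like this one is better suited to flagging a mismatch early.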