About the QCArchive

The Quantum Chemistry Archive (QCArchive) aims to provide an open, community-wide quantum chemistry database. It is designed to both facilitate and capture hundreds of millions of hours of computing time, enabling large-scale force field construction, physical property prediction, new methodology assessment, and machine learning from data that would otherwise end up siloed or inaccessible.

Manage quantum chemistry computations at scale

I'd like to do machine learning on dataset X from a single line.

I want to generate bespoke parameters for a new drug from an existing database.

Which DFT method gives the best results for my set of reactions?

These are some of the use cases we at MolSSI heard from the quantum chemistry community when evaluating how best to serve this area. The emerging trend was a need for three things: a regular structure for quantum chemistry results, the ability to access them on demand without downloading several GB of data, and the ability to run new calculations when needed, ideally without repeating what someone has already done. To meet these goals, the QCArchive project was started to unite this otherwise isolated data, which may previously have been shared only through massive, non-regular files or through publications and supplementary material. Funded by the NSF, MolSSI seeks to provide solutions to these and other use cases in the quantum chemistry field through the QCArchive project.
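The goal of not repeating what someone has already done can be illustrated with a small sketch. This is not QCArchive's actual implementation; it is a toy, in-memory example that assumes a calculation is fully identified by a molecule label, a method, and a basis set. The names `task_key`, `ResultStore`, and `fake_energy` are hypothetical and exist only for this illustration.

```python
import hashlib
import json


def task_key(molecule: str, method: str, basis: str) -> str:
    """Return a deterministic key for one calculation specification."""
    # Assumption: method and basis names are case-insensitive.
    spec = {"molecule": molecule, "method": method.lower(), "basis": basis.lower()}
    canonical = json.dumps(spec, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ResultStore:
    """Toy in-memory store that runs each unique specification only once."""

    def __init__(self):
        self._results = {}
        self.computed = 0  # number of calculations actually executed

    def get_or_compute(self, molecule, method, basis, compute):
        key = task_key(molecule, method, basis)
        if key not in self._results:
            # Only run the (expensive) calculation for unseen specifications.
            self._results[key] = compute(molecule, method, basis)
            self.computed += 1
        return self._results[key]


def fake_energy(molecule, method, basis):
    # Placeholder standing in for a real quantum chemistry program call.
    return -1.0 * len(molecule)


store = ResultStore()
first = store.get_or_compute("H2O", "B3LYP", "def2-SVP", fake_energy)
# Same specification with different capitalization: served from the store.
second = store.get_or_compute("H2O", "b3lyp", "DEF2-SVP", fake_energy)
```

Here the second request returns the stored value instead of triggering a new calculation. A shared archive would apply the same idea across many users and a persistent database rather than a single Python dictionary.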

The QCArchive is an Open-Source Ecosystem

The QCArchive project is designed from the start to be open-source and available to the computational molecular sciences community. The entire code base is public through GitHub repositories, and we contribute back to other open-source projects as good members of the community. Rather than a monolithic piece of software, the QCArchive software ecosystem is decomposed into several components, so it can be used either as a full ecosystem to build on top of or in support of community software. See a full list of software components here.

A multilayered ecosystem

The QCArchive ecosystem is made up of a series of layers to suit a variety of use cases:

Support libraries for quantum chemistry, from physical constants to abstract program execution engines.
A distributed task engine for running and organizing arbitrary computations.
The MolSSI QCArchive instance for the community to share and archive data.
High-level organizational layers known as Collections to compute and analyze common operations.

See our getting started page for help on finding the right place to start:

Get Started!

We support software best practices

QCArchive's projects are all based on the same software best practices. Both users and developers can follow the same code structure across all of the projects, and these practices are provided back to the community through external projects such as the CMS-Cookiecutter. This cookiecutter supports:

Continuous integration with automated testing and testing coverage
Automated versioning and package distribution via Conda-Forge and PyPI
Code quality and linting checks for beautiful code
Automatic package organization and documentation setup

Private data

The QCArchive software stack is fully open-source. The primary database and job distribution engine, QCFractal, can be set up locally on your hardware and connected to your compute cluster without ever communicating outside your firewall.

Code quality is of utmost importance

The quality of the QCArchive code is checked through rigorous continuous integration, code linters, and test coverage tools, which help reduce the chance that bugs are introduced. We take data accuracy very seriously: all results and calculations are versioned and provenance-tracked, and will be preserved across any updates of the database.

Technology Stack

Help contribute to the QCArchive

Want to see your code in QCArchive? Found a bug? Have general questions? Head to the GitHub page and contribute!

To QCArchive GitHub