
software & databases



Chemical Variational Autoencoder (chemical_VAE) is free, open-source software for machine learning of molecular properties. chemical_VAE encodes molecular SMILES strings into a continuous code-vector (latent) representation and decodes that representation back into SMILES. The autoencoder may also be trained jointly with property prediction to help shape the latent space. The resulting latent space can then be searched by optimization for molecules with the best values of the properties of interest. chemical_VAE is currently being extended in conjunction with MOFid to capture adsorption of molecules in porous materials.
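
The string-to-array round trip that such a model starts from can be sketched as follows. This is a minimal illustration that assumes a simple character-level one-hot encoding; the vocabulary, padding character, and function names are hypothetical and are not taken from chemical_VAE, and the learned encoder and decoder networks are not shown.

```python
# Illustrative character-level one-hot encoding of SMILES strings.
# Assumption: one token per character (so "Cl" becomes "C" + "l"); this is
# a simplification for illustration, not chemical_VAE's own tokenizer.

PAD = " "  # padding character; a space never occurs in a valid SMILES string


def build_vocab(smiles_list):
    """Return the padding character followed by the sorted set of characters."""
    chars = sorted({ch for s in smiles_list for ch in s})
    return [PAD] + chars


def one_hot_encode(smiles, vocab, max_len):
    """Encode a SMILES string as a max_len x len(vocab) matrix of 0/1 values."""
    if len(smiles) > max_len:
        raise ValueError("SMILES string is longer than max_len")
    index = {ch: i for i, ch in enumerate(vocab)}
    padded = smiles.ljust(max_len, PAD)
    return [[1 if index[ch] == j else 0 for j in range(len(vocab))]
            for ch in padded]


def one_hot_decode(matrix, vocab):
    """Invert one_hot_encode by taking the position of the 1 in each row."""
    chars = [vocab[row.index(max(row))] for row in matrix]
    return "".join(chars).rstrip(PAD)


example_smiles = ["CCO", "c1ccccc1", "CC(=O)O"]
example_vocab = build_vocab(example_smiles)
example_matrix = one_hot_encode("CCO", example_vocab, max_len=10)
```

In an autoencoder, a matrix like `example_matrix` would be the encoder input, and the decoder output would be mapped back to a string in the same way as `one_hot_decode`.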

chemical_VAE can be downloaded from:

R. Gómez-Bombarelli, J. Wei, D. Duvenaud, J. Hernández-Lobato, B. Sánchez-Lengeling, D. Sheberla, J. Aguilera-Iparraguirre, T. Hirzel, R. Adams, and A. Aspuru-Guzik, "Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules," ACS Cent. Sci. 4, 268-276 (2018). DOI: 10.1021/acscentsci.7b00572


Computation-Ready Experimental (CoRE) MOF is a set of databases that enable high-throughput computational screening by using GitHub's versioning system to manage and curate the data. CoRE MOF 2014 provides cleaned atomic coordinates and pore characteristics of 5109 structures while resolving issues such as residual solvent molecules and partially occupied or disordered atoms in experimental crystal structures. DDEC partial charges and geometry-optimized structures are available for subsets (2900 and 502 MOFs) of the CoRE MOF 2014 database. CoRE MOF 2019 includes 9869 porous MOFs with only free solvent molecules removed and 14,142 porous MOFs with all solvent molecules removed, together with pore characteristics and open-metal-site detection.

The CoRE MOF databases are available for download from


CP2K is a free, open-source quantum chemistry software package designed to perform molecular dynamics and Monte Carlo simulations of clusters and periodic systems. CP2K can be run in both MPI and OpenMP modes, and built-in farming procedures allow for capacity jobs at DOE Leadership Computing Facilities. The NMGC team contributes to CP2K through the development of first-principles Monte Carlo (FPMC) modules for simulations of phase, adsorption, and chemical equilibria and through algorithms for the incorporation of nuclear quantum effects.

CP2K can be downloaded from the CP2K project website, and NMGC will make workflows for FPMC simulations of adsorption equilibria available to the community.

Funding for the Siepmann group's development of modules for CP2K through grants from the Department of Energy (DE-FG02-12ER16362 and DE-FG02-17ER16362 for FPMC simulations of adsorption equilibria; Lawrence Livermore National Laboratory for FPMC simulations in the canonical and isobaric-isothermal ensembles) and the National Science Foundation (FPMC simulations of vapor-liquid equilibria and of reaction equilibria) is gratefully acknowledged.

J. Hutter, M. Iannuzzi, F. Schiffmann, and J. VandeVondele, "CP2K: Atomistic simulations of condensed matter systems," WIREs Comput. Mol. Sci. 4, 15-25 (2014). DOI: 10.1002/wcms.1159

E. O. Fetisov, M. S. Shah, J. R. Long, M. Tsapatsis, and J. I. Siepmann, "First principles Monte Carlo simulations of unary and binary adsorption: CO2, N2, and H2O in Mg-MOF-74," Chem. Commun. 54, 10816-10819 (2018). DOI: 10.1039/C8CC06178E

Master of Filtering

Master of Filtering (MOF) is a game being developed by the NMGC to engage youth and citizen scientists with concepts of porous materials and separations. A video tutorial of the game is available online. The long-range goal for this game is the crowdsourcing of nanoporous material design through gamification.


MCCCS-MN is free, open-source Monte Carlo software tailored for simulations of phase and adsorption equilibria in the Gibbs ensemble using the TraPPE force field. MCCCS-MN is particularly efficient for equilibria involving multiple condensed phases and articulated molecules. MCCCS-MN uses hybrid MPI/OpenMP for parallel execution and has been adapted to processors with high-bandwidth MCDRAM; workflows with specific I/O handling allow for capacity jobs at DOE Leadership Computing Facilities.

More information about MCCCS-MN can be found at

Funding for the development of MCCCS-MN through grants from the National Science Foundation (simulation of fluid phase equilibria) and the Department of Energy (simulation of adsorption equilibria and high-throughput workflows) is gratefully acknowledged.

MCCCS-Towhee, a more user-friendly but slower version with support for a variety of force fields, is freely available for download.

P. Bai, M. Y. Jeon, L. Ren, C. Knight, M. W. Deem, M. Tsapatsis, and J. I. Siepmann, "Discovery of optimal zeolites for challenging separations and chemical transformations using predictive materials modeling," Nat. Commun. 6, 5912 (2015). DOI: 10.1038/ncomms6912

MCCCS-MN is available under the GNU General Public License. The specific versions of MCCCS-MN used for a given publication are made available as part of the Supporting Information of the following publications:

Y. Sun, R. F. DeJaco, and J. I. Siepmann, "Deep neural network learning of complex binary sorption equilibria from molecular simulation data," Chem. Sci. 10, 4377–4388 (2019). DOI: 10.1039/C8SC05340E

T. R. Josephson, R. Singh, M. S. Minkara, and J. I. Siepmann, "Partial molar properties from molecular simulation using multiple linear regression," Mol. Phys. 117, 3589-3602 (2019). DOI: 10.1080/00268976.2019.1648898


MemPy v1.0 is a Python-based software tool for simulating the performance of gas separations with spiral-wound membranes. It supports a range of calculation types: variables may depend on one or two spatial dimensions, the gas phase may be described by the Peng-Robinson or the ideal gas equation of state, and permeance may be described by a linear or nonlinear model. The models have been validated by comparison with an experimental system for air separation. As such, the software is useful for process intensification of gas separation with spiral-wound membranes.
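
To make the equation-of-state choice concrete, the sketch below evaluates the compressibility factor Z of a pure gas from the standard Peng-Robinson equation; the ideal gas case corresponds to Z = 1. It is a textbook illustration only: the function name is hypothetical, it is not MemPy code, and the Newton iteration started at Z = 1 is an assumption that suits gas-phase conditions away from saturation rather than a general root selector.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)


def peng_robinson_z(temperature, pressure, t_crit, p_crit, acentric):
    """Gas-phase compressibility factor from the Peng-Robinson equation.

    temperature and t_crit in K, pressure and p_crit in Pa.
    """
    kappa = 0.37464 + 1.54226 * acentric - 0.26992 * acentric ** 2
    alpha = (1.0 + kappa * (1.0 - math.sqrt(temperature / t_crit))) ** 2
    a = 0.45724 * R ** 2 * t_crit ** 2 / p_crit * alpha
    b = 0.07780 * R * t_crit / p_crit

    # Dimensionless groups of the cubic Z^3 + c2 Z^2 + c1 Z + c0 = 0
    a_dim = a * pressure / (R * temperature) ** 2
    b_dim = b * pressure / (R * temperature)
    c2 = -(1.0 - b_dim)
    c1 = a_dim - 3.0 * b_dim ** 2 - 2.0 * b_dim
    c0 = -(a_dim * b_dim - b_dim ** 2 - b_dim ** 3)

    # Newton iteration from the ideal gas value Z = 1 (assumed gas-like root)
    z = 1.0
    for _ in range(100):
        f = ((z + c2) * z + c1) * z + c0
        df = (3.0 * z + 2.0 * c2) * z + c1
        step = f / df
        z -= step
        if abs(step) < 1e-12:
            break
    return z


# Nitrogen at 300 K and 10 bar (critical constants are literature values)
z_n2 = peng_robinson_z(300.0, 10.0e5, t_crit=126.2, p_crit=33.98e5, acentric=0.0372)
```

The molar density that enters a membrane mass balance would then follow as P / (Z R T), which reduces to the ideal gas expression when Z is set to 1.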

MemPy v1.0 is free for download at

R. F. DeJaco, K. Loprete, K. Pennisi, S. Majumdar, J. I. Siepmann, P. Daoutidis, H. Murnen, and M. Tsapatsis, "Modeling and simulation of gas separations with spiral-wound membranes," AIChE J. online, e16274 (2020). DOI: 10.1002/aic.16274

MN Database 2019
Geometries for Minnesota Database 2019

Minnesota Database 2019 comprises a diverse set of chemical data that can be used for benchmarking electronic structure calculations and/or optimizing density functionals or wave function methods. The reference values of the data have been published [P. Verma et al., J. Phys. Chem. A 123, 2966-2990 (2019)], and the present compendium provides the molecular geometries, basis set information, and settings that we have used for calculations to compare to the reference data. There are 56 subdatabases in Database 2019, and the data include a variety of atomic and molecular properties, including atomization energies, reaction energies, bond dissociation energies, isomerization energies, noncovalent complexation energies, proton affinities, electron affinities, ionization potentials, barrier heights, thermochemistry of hydrocarbons, absolute atomic energies, vertical and adiabatic electronic excitation energies, and geometries of molecules; both main-group and transition-metal-containing systems are present.

More information can be found at:

P. Verma, Y. Wang, S. Ghosh, X. He, and D. G. Truhlar, "Revised M11 Exchange-Correlation Functional for Electronic Excitation Energies and Ground-State Properties," J. Phys. Chem. A 123, 2966-2990 (2019). DOI: 10.1021/acs.jpca.8b11499


MOFid and MOFkey form a system for rapid identification and analysis of metal-organic frameworks. It is open-source software for deconstructing MOFs into their building blocks and underlying topological network. The code comprises three parts: a main C++ code for deconstructing MOF structures into their building blocks, Python code to assemble the MOFid/MOFkey identifiers, and various analysis utilities.

MOFid and MOFkey is available at
A version that runs in your web browser is available at

B. J. Bucior, A. S. Rosen, M. Haranczyk, Z. Yao, M. E. Ziebel, O. K. Farha, J. T. Hupp, J. I. Siepmann, A. Aspuru-Guzik, and R. Q. Snurr, "Identification Schemes for Metal–Organic Frameworks To Enable Rapid Search and Cheminformatics Analysis," Cryst. Growth Des. 19, 6682-6697 (2019). DOI: 10.1021/acs.cgd.9b01050


The Nanoporous Materials Adsorption Energy (NMAE) Database is a freely available database currently under development that provides a repository for adsorption energies (internal energy of adsorption, enthalpy of adsorption, Gibbs free energy of adsorption) predicted and measured by the NMGC team.

Visit the NMAE page for more information.

The Materials Project
Nanoporous Materials Explorer

The Nanoporous Materials Explorer App is a database of computed properties for nanoporous materials. The application aims to present this accumulated data in a new, interactive way. The data in the App are predicted, measured, and maintained by the NMGC in partnership with the Materials Project. Currently, more than 500,000 nanoporous materials have data, such as structures and point charges, recorded in a searchable format through the Nanoporous Materials Explorer.

The App (requires registration) and a detailed manual are available at


pyIAST is a user-friendly, open-source Python package for ideal adsorbed solution theory (IAST) calculations. It can fit data to analytical isotherm models or use interpolation to characterize the pure-component adsorption isotherms.
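
The kind of calculation involved can be sketched for the simplest case: a binary mixture in which both pure-component isotherms follow the Langmuir model, whose spreading-pressure integral has a closed form. This is a self-contained textbook illustration with hypothetical function names and parameters; it does not use or reproduce pyIAST's API or implementation.

```python
import math


def langmuir_loading(pressure, n_sat, k):
    """Pure-component Langmuir loading n = M K p / (1 + K p)."""
    return n_sat * k * pressure / (1.0 + k * pressure)


def langmuir_spreading_pressure(pressure, n_sat, k):
    """Reduced spreading pressure: integral of n(p)/p from 0 to p = M ln(1 + K p)."""
    return n_sat * math.log1p(k * pressure)


def iast_binary_langmuir(p1, p2, iso1, iso2, tol=1e-12):
    """Mixture loadings (n1, n2) from IAST for two Langmuir isotherms.

    p1, p2 are positive partial pressures; iso1, iso2 are (n_sat, k) tuples.
    The adsorbed-phase mole fraction x1 is found by bisection so that both
    components have equal spreading pressure at p_i / x_i.
    """
    def mismatch(x1):
        return (langmuir_spreading_pressure(p1 / x1, *iso1)
                - langmuir_spreading_pressure(p2 / (1.0 - x1), *iso2))

    lo, hi = 1e-12, 1.0 - 1e-12  # mismatch decreases monotonically with x1
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    x1 = 0.5 * (lo + hi)
    x2 = 1.0 - x1

    # Total loading from 1/n_total = x1/n1_pure + x2/n2_pure
    n1_pure = langmuir_loading(p1 / x1, *iso1)
    n2_pure = langmuir_loading(p2 / x2, *iso2)
    n_total = 1.0 / (x1 / n1_pure + x2 / n2_pure)
    return x1 * n_total, x2 * n_total


# Hypothetical isotherm parameters (n_sat, k) in consistent units
n1, n2 = iast_binary_langmuir(0.3, 0.7, iso1=(3.0, 2.0), iso2=(3.0, 0.5))
```

When both components share the same saturation capacity, as in this example, IAST reduces to the extended Langmuir model, which gives a convenient check on the numerical solution.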

pyIAST is hosted on GitHub and is free to download, with additional documentation available online. Users may also contribute to the source code via GitHub. Communication with the pyIAST authors is by email and GitHub's messaging system.

C. Simon, B. Smit, and M. Haranczyk, "pyIAST: Ideal Adsorbed Solution Theory (IAST) Python Package," Comput. Phys. Commun. 200, 364-380 (2016). DOI: 10.1016/j.cpc.2015.11.016


Python Isotherm Prediction (PyIsoP) is an open-source software package that uses a fast and accurate semi-analytical algorithm to calculate the adsorption of single-site molecules in nanoporous materials (NPMs) using energy grids. The method is about 100 times faster than atomistic grand canonical Monte Carlo simulations and is useful for obtaining quick estimates of adsorption for high-throughput screening of large databases.

PyIsoP can be downloaded from:

A. Gopalan, B. J. Bucior, N. S. Bobbitt, and R. Q. Snurr, "Prediction of hydrogen adsorption in nanoporous materials from the energy distribution of adsorption sites," Mol. Phys. 117, 3683–3694 (2019). DOI: 10.1080/00268976.2019.1658910


PySCF is a free, open-source quantum chemistry and solid-state physics software package designed to perform electronic structure calculations in molecular and periodic systems. The NMGC team contributes to PySCF through the development of approaches for quantum embedding calculations. This code is hosted on GitHub and is free to download. Future goals are the development of robust quantum methods for highly accurate calculations in large systems.

Q. Sun, T. C. Berkelbach, N. S. Blunt, G. H. Booth, S. Guo, Z. Li, J. Liu, J. McClain, E. R. Sayfutyarova, S. Sharma, S. Wouters, and G. K.-L. Chan, "PySCF: the Python‐based simulations of chemistry framework," WIREs Comput. Mol. Sci. 8, e1340 (2018). DOI: 10.1002/wcms.1340

D. V. Chulhai and J. D. Goodpaster, "Projection-Based Correlated Wave Function in Density Functional Theory Embedding for Periodic Systems," J. Chem. Theory Comput. 14, 1928-1942 (2018). DOI: 10.1021/acs.jctc.7b01154


QMMM is a computer program for performing single-point calculations (energies, gradients, and Hessians), geometry optimizations, and molecular dynamics using combined quantum mechanics (QM) and molecular mechanics (MM) methods. The boundary between the QM and MM regions can be treated by a number of schemes, including the redistributed charge (RC) scheme, the redistributed charge and dipole (RCD) scheme, the polarized-boundary RC (PBRC) scheme, the polarized-boundary RCD (PBRCD) scheme, the flexible-boundary RC (FBRC) scheme, and the flexible-boundary RCD (FBRCD) scheme. QMMM calls a QM package and an MM package to perform the required single-level calculations. QMMM has been tested with GAMESS, Gaussian (both Gaussian 09 and Gaussian 16), and ORCA for the QM package and with TINKER for the MM package; it contains 156 sample runs that can be used to learn and test the program.

After completing a free license form, QMMM can be downloaded at no cost from:

QMMM 2017 by H. Lin, Y. Zhang, S. Pezeshki, B. Wang, X.-P. Wu, L. Gagliardi, and D. G. Truhlar, University of Minnesota, Minneapolis, 2017.

QMOF Database

The Quantum Metal–Organic Framework (QMOF) database contains quantum-chemical properties for over 14,000 experimental MOF crystal structures, computed using periodic density functional theory calculations.

More information can be found at the QMOF GitHub repository.

A. Rosen, S. Iyer, D. Ray, Z. Yao, A. Aspuru-Guzik, L. Gagliardi, et al., "Machine Learning the Quantum-Chemical Properties of Metal–Organic Frameworks for Accelerated Materials Discovery with a New Electronic Structure Database," ChemRxiv preprint (2020). DOI: 10.26434/chemrxiv.13147616.v1


RASPA is a software package for simulating adsorption and diffusion of molecules in flexible nanoporous materials. The code implements state-of-the-art algorithms for molecular dynamics and Monte Carlo in various ensembles. Applications of RASPA include computing coexistence properties, adsorption isotherms for single and multiple components, self- and collective diffusivities, and visualization. RASPA is particularly efficient for gas adsorption in a wide variety of porous materials. The NMGC team contributes to the development of RASPA.

RASPA is available for download from a git server. Information on RASPA is provided at

D. Dubbeldam, S. Calero, D. E. Ellis, and R. Q. Snurr, "RASPA: Molecular simulation software for adsorption and diffusion in flexible nanoporous materials," Molec. Sim. 42, 81-101 (2016). DOI: 10.1080/08927022.2015.1010082


SorbMetaML is an open‐source meta-learning model for the prediction of unary adsorption for nanoporous materials based on example adsorption data for a material. SorbMetaML has been used to identify the optimal hydrogen storage temperature with the highest working capacity for a given pressure difference for diverse nanoporous materials. Datasets for the hydrogen adsorption of all-silica zeolites, hyper-cross-linked polymers, and metal-organic frameworks are provided.
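
The notion of an optimal storage temperature can be illustrated with a simple analytical stand-in for the learned model. The sketch below assumes a Langmuir isotherm with a van 't Hoff temperature dependence and scans temperature for the largest working capacity between two pressures; all parameter values and function names are hypothetical, and this is not the SorbMetaML meta-learning model.

```python
import math

R = 8.314462618  # gas constant, J/(mol K)


def langmuir_loading(pressure, temperature, n_sat, k0, heat):
    """Langmuir loading with K(T) = k0 * exp(heat / (R T)); heat > 0 in J/mol."""
    k = k0 * math.exp(heat / (R * temperature))
    return n_sat * k * pressure / (1.0 + k * pressure)


def working_capacity(temperature, p_high, p_low, **params):
    """Loading delivered when cycling between p_high and p_low at fixed T."""
    return (langmuir_loading(p_high, temperature, **params)
            - langmuir_loading(p_low, temperature, **params))


def best_storage_temperature(temperatures, p_high, p_low, **params):
    """Temperature from the candidate list with the largest working capacity."""
    return max(temperatures,
               key=lambda t: working_capacity(t, p_high, p_low, **params))


# Hypothetical adsorbent: pressures in bar, k0 in 1/bar, heat in J/mol
params = dict(n_sat=30.0, k0=1.0e-4, heat=6000.0)
t_opt = best_storage_temperature(range(50, 301), p_high=100.0, p_low=5.0, **params)
```

For this model the optimum can also be found analytically, since the working capacity is largest when K(T) equals 1 / sqrt(p_high * p_low); the scan is shown because it carries over unchanged to isotherms predicted by a data-driven model.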

SorbMetaML can be downloaded from:


SorbNet is an open‐source deep neural network for the prediction of adsorption data for binary mixtures over large temperature and pressure ranges that can be used to optimize adsorption/desorption conditions. Example datasets and Python notebooks are provided.

SorbNet can be downloaded from:

Y. Sun, R. F. DeJaco, and J. I. Siepmann, "Deep Neural Network Learning of Complex Binary Sorption Equilibria from Molecular Simulation Data," Chem. Sci. 10, 4377–4388 (2019). DOI: 10.1039/C8SC05340E


SupramolecularVAE is an open-source multi-component variational autoencoder for the property-guided inverse design of reticular frameworks including metal-organic frameworks and covalent-organic frameworks. Example datasets and Python notebooks are provided. The NMGC team is the sole developer.

For more information or to download SupramolecularVAE go to:


Zeo++ is open-source software for performing high-throughput geometry-based analysis of porous materials and their voids. Future plans for Zeo++ include the addition of functionality for hard and soft nanoporous materials. Zeo++ serves approximately 800 registered users, who can communicate with the developer via email.

Registration is required by LBNL for downloading the code from

T. F. Willems, C. H. Rycroft, M. Kazi, J. C. Meza, and M. Haranczyk, "Algorithms and tools for high-throughput geometry-based analysis of crystalline porous materials," Microporous Mesoporous Mater. 149, 134-141 (2012). DOI: 10.1016/j.micromeso.2011.08.020



This research is supported by the U.S. Department of Energy, Office of Basic Energy Sciences, Division of Chemical Sciences, Geosciences and Biosciences under Award DE-FG02-17ER16362 (Predictive Hierarchical Modeling of Chemical Separations and Transformations in Functional Nanoporous Materials: Synergy of Electronic Structure Theory, Molecular Simulations, Machine Learning, and Experiment) and was previously supported by DE-FG02-12ER16362 (Nanoporous Materials Genome: Methods and Software to Optimize Gas Storage, Separations, and Catalysis).

©2020 Regents of the University of Minnesota. All rights reserved. The University of Minnesota is an equal opportunity educator and employer.