The quantum chemical search for novel materials and the issue of data processing: The InfoMol project

Document Type

Article

Abstract

In the search for novel materials, quantum chemical modeling and simulation have taken on an important role. Molecular properties are computed on the basis of first-principles methods and screened against pre-defined criteria. Alternatively, the results of these computations are used as source data to enhance the predictions of data-centric models. Whichever modeling strategy is applied, data-intensive steps are involved in the process. One key bottleneck in this regard is the lack of machine-readable output from virtually all quantum chemistry codes. The results of computations need to be extracted manually or using scripts and parsers, instead of being written directly in a machine-readable format that can be imported into a database for archival, analysis, and exchange. We present two solutions, implemented in two selected examples: the TURBOMOLE and PSI4 program packages. In addition to the standard output, both codes generate Extensible Markup Language (XML) output files, but in two different ways. The generation of machine-readable output in a structured format can easily be implemented, and, as long as the data can be transformed, the choice of data format is secondary. The concept is illustrated for two different use cases from method benchmarking and drug design. A third illustration addresses the definition of a data processing and exchange protocol for screening libraries of light-harvesting compounds.
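
The point that the data format is secondary as long as the data can be transformed can be illustrated with a minimal Python sketch. It reads a small XML fragment and converts it into a plain record that could be stored in a database or exported as JSON. The element names, attribute names, and numeric values below are hypothetical placeholders chosen for illustration; they do not reproduce the XML schema written by TURBOMOLE or PSI4, and the function name `xml_to_record` is likewise an assumption.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical XML fragment. The tags, attributes, and values are invented
# placeholders, not the actual TURBOMOLE or PSI4 output schema or real results.
SAMPLE_XML = """
<calculation program="example-code" method="example-method">
  <molecule name="example-molecule"/>
  <property name="total_energy" unit="hartree">-1.25</property>
  <property name="dipole_moment" unit="debye">0.50</property>
</calculation>
"""


def xml_to_record(xml_text):
    """Convert the XML fragment into a plain dict (database- or JSON-ready)."""
    root = ET.fromstring(xml_text.strip())
    record = {
        "program": root.get("program"),
        "method": root.get("method"),
        "molecule": root.find("molecule").get("name"),
        "properties": {},
    }
    # Each <property> element becomes a named entry with a value and a unit.
    for prop in root.findall("property"):
        record["properties"][prop.get("name")] = {
            "value": float(prop.text),
            "unit": prop.get("unit"),
        }
    return record


record = xml_to_record(SAMPLE_XML)
# The same record can be re-serialized in another format, here JSON.
print(json.dumps(record, indent=2))
```

Because the record is an ordinary dictionary once parsed, writing it out as JSON, inserting it into a database table, or re-emitting it as XML is a matter of choosing a serializer rather than re-parsing the program's text output.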

Department(s)

Chemistry

Publication Title

Journal of Computational Science

Volume

15

First Page

65

Last Page

73

Publication Date

7-1-2016

DOI

10.1016/j.jocs.2015.10.003

ISSN

1877-7503
