The quantum chemical search for novel materials and the issue of data processing: The InfoMol project
In the search for novel materials, quantum chemical modeling and simulation have taken on an important role. Molecular properties are computed on the basis of first-principles methods and screened against pre-defined criteria. Alternatively, the results of these computations are used as source data to enhance the predictions of data-centric models. Whichever modeling strategy is applied, the process involves data-intensive steps. One key bottleneck in this regard is the lack of machine-readable output from virtually all quantum chemistry codes. The results of computations need to be extracted manually or with scripts and parsers, instead of being written out directly in a machine-readable format that can be imported into a database for archival, analysis and exchange. We present two solutions, implemented in two selected examples, the TURBOMOLE and PSI4 program packages. In addition to the standard output, both codes generate Extensible Markup Language (XML) output files, but in two different ways. The generation of machine-readable output in a structured format can easily be implemented, and, as long as the data can be transformed, the choice of data format is secondary. The concept is illustrated for two different use cases, from method benchmarking and drug design. A third illustration addresses the definition of a data processing and exchange protocol for screening libraries of light-harvesting compounds.
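To make the idea of structured output concrete, the following is a minimal Python sketch of how XML results could be read and loaded into a database without a hand-written text parser. The XML layout, the element and attribute names, the table schema and the numerical values are all assumptions made up for illustration; they are not the schemas emitted by TURBOMOLE or PSI4, and the numbers are placeholders rather than computed results.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML layout, for illustration only. This is NOT the schema
# produced by TURBOMOLE or PSI4, and the values are placeholders.
SAMPLE_XML = """
<calculation program="example-qc" method="HF" basis="cc-pVDZ">
  <molecule name="water"/>
  <property name="total_energy" units="hartree">-1.0</property>
  <property name="dipole_moment" units="debye">0.5</property>
</calculation>
"""


def parse_properties(xml_text):
    """Return one (molecule, method, basis, property, value, units) row per property."""
    root = ET.fromstring(xml_text.strip())
    molecule = root.find("molecule").get("name")
    rows = []
    for prop in root.findall("property"):
        rows.append((
            molecule,
            root.get("method"),
            root.get("basis"),
            prop.get("name"),
            float(prop.text),
            prop.get("units"),
        ))
    return rows


def store(conn, rows):
    """Create the (hypothetical) results table if needed and insert the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "molecule TEXT, method TEXT, basis TEXT, "
        "property TEXT, value REAL, units TEXT)"
    )
    conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()


# Parse the sample document and archive it in an in-memory database.
conn = sqlite3.connect(":memory:")
rows = parse_properties(SAMPLE_XML)
store(conn, rows)

# Screening against a criterion then becomes an ordinary query.
hits = conn.execute(
    "SELECT molecule, value FROM results WHERE property = ? AND value < ?",
    ("total_energy", 0.0),
).fetchall()
```

Because the data arrive already labeled with names and units, the same few lines would work for any property the code writes out, and converting to another format (for example JSON) would amount to a transformation of the parsed rows.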
Journal of Computational Science
Lüthi, Hans P.; Heinen, Stefan; Schneider, Gisbert; Glöss, Andreas; Brändle, Martin P.; King, Rollin A.; Pyzer-Knapp, Edward; Alharbi, Fahhad H.; and Kais, Sabre, "The quantum chemical search for novel materials and the issue of data processing: The InfoMol project" (2016). Chemistry Faculty Publications. 6.