by David C Young
August 2025

Here are some observations related to training and using neural networks to generate accurate numerical values, particularly with applications to chemistry.
- A neural network is not a complete law of nature. It will give good results only for the properties, the types of molecules, and the range of values it was trained on.
- Use neural networks for interpolation, not extrapolation.
- The amount of training data should be at least 10X the number of parameters in the AI model.
- Avoid noisy or incomplete data.
- Expect the neural network's predictions to be somewhat less accurate than the training data itself. Thus training-data accuracy is as important as quantity.
- Test simple models before trying complex ones. For numerical data, a simple neural network or deep learning network may do better than a GPT model.
- Experiment with the number of layers and the number of neurons in each layer. Too many of either leads to overfitting and worse accuracy, particularly when there isn't enough training data.
- Look for a way to normalize the data. A few large outliers can skew the results.
- Consider the best loss function. Should it be less sensitive to outliers?
- The output layer can have as few as one neuron per output variable.
- Have a test set that contains typical cases, trivial cases, extreme cases, out-of-distribution cases, and experimental results.
- Use domain knowledge to impose conservation laws, physical constraints, etc. Use a domain-specific loss function if one is available.
- Consider algorithms where a non-AI approximate calculation (e.g. semiempirical or molecular dynamics) is done first, and the neural network then provides a correction that more closely matches the results from high-accuracy methods. The AIQM1 and AIQM2 methods are examples of applying this strategy well.
- Consider combining a tree based model with a neural network.
- Multiple neural networks can be combined in an ensemble approach (as in the ANI-1ccx method).
- Consider fine-tuning a large pretrained model, such as one built on PubChemQC data or the Universal Model for Atoms (UMA), to more accurately predict a specific property using your high-accuracy training data.
- For extrapolation beyond the effective range of the neural network model, transition to using fundamental laws, theories, or mathematics for correct asymptotic behavior.
- Consider including an uncertainty estimate to indicate when the model is not working well.
- Expect to do a lot of work finding best practices for your problem, and delving into all of the nuances and aspects of neural network training. Casual creation of models easily leads to poor results.
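The 10X rule of thumb above can be made concrete by counting the trainable parameters of a fully connected network. The layer sizes below are hypothetical, chosen only for illustration:

```python
def mlp_param_count(layer_sizes):
    """Count weights + biases for a fully connected network.

    layer_sizes: e.g. [32, 64, 64, 1] means 32 inputs, two hidden
    layers of 64 neurons each, and 1 output neuron.
    """
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

params = mlp_param_count([32, 64, 64, 1])  # 6,337 parameters
min_examples = 10 * params                 # the 10X rule of thumb
print(params, min_examples)
```

Even this small architecture suggests on the order of 60,000 training examples, which shows why parameter count matters as much as architecture choice.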
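The normalization advice above is most often applied as z-score scaling, fit on the training set only. The values here are made up to show the round trip:

```python
import numpy as np

def zscore_fit(x):
    """Return (mean, std) computed on the training set only."""
    return x.mean(axis=0), x.std(axis=0)

def zscore_apply(x, mean, std):
    return (x - mean) / std

# Hypothetical training targets containing one large outlier.
y_train = np.array([1.2, 0.9, 1.1, 1.0, 15.0])
mean, std = zscore_fit(y_train)
y_norm = zscore_apply(y_train, mean, std)

# Predictions in normalized space are mapped back with:
y_back = y_norm * std + mean
```

The same (mean, std) pair must be reused for validation, test, and production inputs; refitting it on new data silently shifts the model's input distribution.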
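One standard choice of loss function that is less sensitive to outliers, as suggested above, is the Huber loss. A minimal NumPy version:

```python
import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    """Quadratic for small residuals, linear for large ones,
    so a single outlier pulls the fit less than squared error would."""
    r = np.abs(y_true - y_pred)
    quad = 0.5 * r ** 2
    lin = delta * r - 0.5 * delta ** 2
    return np.where(r <= delta, quad, lin).mean()
```

With delta=1, a residual of 10 contributes 9.5 to this loss instead of the 50 it would contribute to mean squared error, which is the whole point: large errors grow linearly rather than quadratically.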
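Imposing a physical constraint through the loss, as suggested above, can be done with a penalty term. This sketch constrains predicted atomic charges to sum to the known molecular charge; the function names and the penalty weight are illustrative, not from any specific package:

```python
import numpy as np

def constrained_loss(pred_charges, ref_charges, total_charge=0.0, weight=10.0):
    """Data-fit term plus a penalty enforcing that predicted atomic
    charges sum to the known total molecular charge (a conservation
    constraint supplied by domain knowledge)."""
    data = np.mean((pred_charges - ref_charges) ** 2)
    penalty = (pred_charges.sum() - total_charge) ** 2
    return data + weight * penalty
```

A soft penalty like this only discourages violations; when the constraint must hold exactly, it is usually better to build it into the architecture (e.g. predicting charge corrections that sum to zero by construction).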
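The baseline-plus-correction strategy above (learn the difference between a cheap method and an expensive one, not the raw property) can be sketched in a few lines. Both "methods" here are stand-in toy functions, and a tiny polynomial fit stands in for the neural network:

```python
import numpy as np

def cheap_baseline(x):
    """Stand-in for a fast approximate method (e.g. semiempirical)."""
    return 2.0 * x

def expensive_reference(x):
    """Stand-in for a high-accuracy method affordable on few points."""
    return 2.0 * x + 0.3 * x ** 2

# Train the correction model on the *difference*, not the raw property.
x_train = np.linspace(0.0, 1.0, 20)
delta = expensive_reference(x_train) - cheap_baseline(x_train)
coeffs = np.polyfit(x_train, delta, deg=2)  # toy model in place of a NN

def corrected(x):
    return cheap_baseline(x) + np.polyval(coeffs, x)
```

The correction is typically smoother and smaller in magnitude than the raw property, so it needs far less training data to learn accurately.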
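The ensemble and uncertainty-estimation points above combine naturally: the spread across ensemble members doubles as an uncertainty estimate. In this sketch a one-parameter-per-coefficient polynomial stands in for each network, and members differ by bootstrap resampling; all data here is synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_member(x_train, y_train):
    """Train one ensemble member on a bootstrap resample of the data.
    A degree-1 polynomial stands in for a full neural network."""
    idx = rng.integers(0, len(x_train), len(x_train))
    c = np.polyfit(x_train[idx], y_train[idx], deg=1)
    return lambda x: np.polyval(c, x)

x_train = np.linspace(-1.0, 1.0, 50)
y_train = 1.5 * x_train + rng.normal(0.0, 0.1, 50)
members = [make_member(x_train, y_train) for _ in range(8)]

def predict(x):
    preds = np.array([m(x) for m in members])
    return preds.mean(axis=0), preds.std(axis=0)  # value, uncertainty

mean_in, std_in = predict(np.array([0.0]))     # inside the training range
mean_out, std_out = predict(np.array([10.0]))  # far outside it
```

The member disagreement grows far from the training data, which is exactly the signal needed to flag predictions the model should not be trusted on.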
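The hand-off from the neural network to known asymptotic physics, mentioned above, is often done with a smooth switching function. The cutoff distances and the toy energy functions below are illustrative assumptions:

```python
import numpy as np

def switch(r, r_on=4.0, r_off=5.0):
    """Smoothly goes from 1 (trust the NN) to 0 (trust the asymptote)
    between r_on and r_off; the cutoffs here are illustrative."""
    t = np.clip((r - r_on) / (r_off - r_on), 0.0, 1.0)
    return 1.0 - t ** 2 * (3.0 - 2.0 * t)  # smoothstep

def blended_energy(r, nn_energy, asymptote):
    """Blend a NN prediction with a physically correct long-range form."""
    s = switch(r)
    return s * nn_energy(r) + (1.0 - s) * asymptote(r)
```

Inside r_on the model is used unchanged; beyond r_off the prediction is entirely the physical form (e.g. a dispersion tail), so the combined function has correct asymptotic behavior no matter what the network does out there.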
Some methods that apply these principles to chemistry
- If trained on high-precision data, neural networks can typically give results within 1-5% of experimental values for physics or chemistry applications, and some do a bit better. However, state-of-the-art methods often do not achieve chemical accuracy (<1 kcal/mol relative to experiment) while remaining generalizable to a wide range of molecules, even when limited to organic compounds.
- The AIQM2 method is reported to generate molecular geometry and energy data more accurate than popular DFT methods, but less accurate than CCSD(T)/CBS. It is parameterized for organic molecules containing C, H, N, and O. Given high-end compute hardware with many GPUs, it can scale up to systems of 1,000,000 atoms at this accuracy.
- The ANI-1ccx method has multiple AI models for predicting the energetics of molecules containing C, H, N, and O. It can predict enthalpy of formation, reaction thermochemistry, and conformational energies.
- The UAIQM framework automatically chooses among AIQM1, AIQM2, and ANI-1ccx. It incorporates continuous self-improvement. The makers are working on extending it to more elements such as F, Cl, and S.
- Some researchers make small, special purpose models such as making a small neural network to interpolate the data points on a potential energy surface for one specific chemical system.
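A small special-purpose model of the kind just described can be illustrated by interpolating points on a one-dimensional potential energy surface. The Morse parameters below are illustrative (loosely H2-like), and a polynomial fit stands in for the small neural network:

```python
import numpy as np

def morse(r, d_e=4.6, a=1.9, r_e=0.74):
    """Morse potential with illustrative parameters (eV, angstroms)."""
    return d_e * (1.0 - np.exp(-a * (r - r_e))) ** 2 - d_e

# "Training" points sampled along the bond-stretch coordinate.
r_train = np.linspace(0.5, 2.5, 15)
e_train = morse(r_train)

# A polynomial fit stands in for the small special-purpose network.
coeffs = np.polyfit(r_train, e_train, deg=8)

def pes(r):
    return np.polyval(coeffs, r)

# Interpolation inside the sampled range is accurate...
err_in = abs(pes(1.0) - morse(1.0))
# ...but extrapolation beyond it degrades quickly.
err_out = abs(pes(4.0) - morse(4.0))
```

This also demonstrates the interpolation-not-extrapolation rule from the first list: the same fitted model that is accurate inside the sampled range fails badly outside it.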
Other good sources
- “Generative AI for computational chemistry: A roadmap to predicting emergent phenomena,” Pratyush Tiwary, Lukas Herron, Richard John, Suemin Lee, Disha Sanwal, and Ruiyu Wang, PNAS 2025, Vol. 122, No. 41. https://www.pnas.org/doi/pdf/10.1073/pnas.2415655121