Keywords
chemical datasets
drug-like molecules
machine learning
multilayer perceptron
neural network
overfitting
partial atomic charges
random forest
transferability
Abstract
Overfitting in machine learning models that predict partial atomic charges was investigated for the highly heterogeneous datasets common in medicinal chemistry. Random forest and multilayer perceptron models were trained and validated on a specially clustered dataset of drug-like molecules. Analysis of standard quality metrics for reproducing RESP charges showed no evidence of overfitting in the trained models.
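As a rough illustration of the kind of check the abstract describes, the sketch below holds out whole clusters for validation and compares training and validation RMSE. It is not the authors' protocol: the synthetic records, the per-atom-type mean predictor (a stand-in for the random forest and multilayer perceptron models), and all function names are assumptions made for the example.

```python
import math
import random
from collections import defaultdict


def make_synthetic_records(n_clusters=8, atoms_per_cluster=20, seed=1):
    """Build toy (cluster, atom_type, charge) records; values are illustrative only."""
    rng = random.Random(seed)
    base = {"C": -0.1, "H": 0.2, "N": -0.6, "O": -0.5}  # made-up numbers, not RESP data
    records = []
    for cluster in range(n_clusters):
        for _ in range(atoms_per_cluster):
            atom_type = rng.choice(sorted(base))
            charge = base[atom_type] + rng.gauss(0.0, 0.05)
            records.append({"cluster": cluster, "atom_type": atom_type, "charge": charge})
    return records


def split_by_cluster(records, held_out_fraction=0.25, seed=0):
    """Send whole clusters to train or validation so no cluster appears in both."""
    clusters = sorted({r["cluster"] for r in records})
    random.Random(seed).shuffle(clusters)
    n_val = max(1, round(len(clusters) * held_out_fraction))
    val_clusters = set(clusters[:n_val])
    train = [r for r in records if r["cluster"] not in val_clusters]
    val = [r for r in records if r["cluster"] in val_clusters]
    return train, val


def fit_mean_by_atom_type(train):
    """Hypothetical stand-in model: predict the mean training charge per atom type."""
    sums, counts = defaultdict(float), defaultdict(int)
    for r in train:
        sums[r["atom_type"]] += r["charge"]
        counts[r["atom_type"]] += 1
    global_mean = sum(sums.values()) / sum(counts.values())
    means = {t: sums[t] / counts[t] for t in sums}
    return lambda atom_type: means.get(atom_type, global_mean)


def rmse(pred, true):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))


def score(model, records):
    """RMSE of the model's predicted charges against the reference charges."""
    preds = [model(r["atom_type"]) for r in records]
    truth = [r["charge"] for r in records]
    return rmse(preds, truth)


records = make_synthetic_records()
train, val = split_by_cluster(records)
model = fit_mean_by_atom_type(train)
rmse_train, rmse_val = score(model, train), score(model, val)
gap = rmse_val - rmse_train  # a large positive gap would suggest overfitting
print(f"train RMSE={rmse_train:.4f}  validation RMSE={rmse_val:.4f}  gap={gap:.4f}")
```

With a real model, for instance scikit-learn's RandomForestRegressor or MLPRegressor, only `fit_mean_by_atom_type` would change. The cluster-wise split and the train-versus-validation comparison stay the same.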
Funders
Ministry of Education and Science of the Russian Federation
121021000105-7