
Consistency of QSAR models: Correct split of training and test sets, ranking of models and performance parameters



SAR and QSAR in Environmental Research, Volume 26 (7-9): 18 – Sep 2, 2015

Abstract

Recent implementations of QSAR modelling software provide the user with numerous models and a wealth of information. In this work, we provide some guidance on how one should interpret the results of QSAR modelling, compare and assess the resulting models, and select the best and most consistent ones. Two QSAR datasets are applied as case studies for the comparison of model performance parameters and model selection methods. We demonstrate the capabilities of sum of ranking differences (SRD) in model selection and ranking, and identify the best performance indicators and models. While the exchange of the original training and (external) test sets does not affect the ranking of performance parameters, it provides improved models in certain cases (despite the lower number of molecules in the training set). Performance parameters for external validation are substantially separated from the other merits in SRD analyses, highlighting their value in data fusion.
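The sum of ranking differences (SRD) method used in the abstract compares each model (or performance parameter) against a reference by ranking the objects — here, molecules or merit values — and summing the absolute differences between each column's ranks and the reference ranks. A minimal sketch, assuming a simple row-average reference and no tie handling (full SRD implementations use average ranks for ties; the function name and signature are illustrative, not from the paper):

```python
import numpy as np

def sum_of_ranking_differences(X, reference=None):
    """Sum of ranking differences (SRD) for each column of X.

    X         : 2-D array; rows are objects (e.g. molecules or merit
                values), columns are the methods/models to compare.
    reference : 1-D array of reference values per object; if None, the
                row-wise average is used (a common choice when no gold
                standard exists).

    Returns a 1-D array of SRD values, one per column; a lower SRD
    means that column ranks the objects more like the reference does.
    Note: ties are broken by position here, not by average ranks as in
    the full SRD procedure -- this is a simplification.
    """
    X = np.asarray(X, dtype=float)
    if reference is None:
        reference = X.mean(axis=1)
    # argsort of argsort converts values to 0-based ranks
    ref_rank = np.argsort(np.argsort(np.asarray(reference)))
    col_ranks = np.argsort(np.argsort(X, axis=0), axis=0)
    return np.abs(col_ranks - ref_rank[:, None]).sum(axis=0)
```

For example, a column whose values order the objects exactly like the reference gets SRD = 0, while a fully reversed ordering gets the maximum SRD; models and performance parameters can then be ranked by how close their SRD is to zero.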



Publisher
Taylor & Francis
Copyright
© 2015 Taylor & Francis
ISSN
1029-046X
eISSN
1062-936X
DOI
10.1080/1062936X.2015.1084647


Journal

SAR and QSAR in Environmental Research, Taylor & Francis

Published: Sep 2, 2015

Keywords: model selection; performance parameters; ranking; cross-validation; sum of ranking differences
