Assessing the calibration in toxicological in vitro models with conformal prediction
In Silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité Universitätsmedizin, Berlin, Germany.
Alzheimer's Research UK UCL Drug Discovery Institute, London, UK.
Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, Sweden; Division of Computational Science and Technology, KTH, Stockholm, Sweden.
2021 (English). In: Journal of Cheminformatics, E-ISSN 1758-2946, Vol. 13, no 1, article id 35. Article in journal (Refereed). Published.
Abstract [en]

Machine learning methods are widely used in drug discovery and toxicity prediction. Although they generally perform well in cross-validation studies, their predictive power often drops when the query samples have drifted from the training data's descriptor space. Thus, the assumption underlying machine learning algorithms, that training and test data stem from the same distribution, may not always be fulfilled. In this work, conformal prediction is used to assess the calibration of the models. Deviations from the expected error may indicate that training and test data originate from different distributions. Using the Tox21 datasets as an example, which consist of the chronologically released Tox21Train, Tox21Test and Tox21Score subsets, we observed that while internally valid models could be trained using cross-validation on Tox21Train, predictions on the external Tox21Score data resulted in higher error rates than expected. To improve the predictions on the external sets, we successfully introduced a strategy that exchanges the calibration set for more recent data, such as Tox21Test. We conclude that conformal prediction can be used to diagnose data drifts and other issues related to model calibration. The proposed improvement strategy, exchanging the calibration data only, is convenient because it does not require retraining the underlying model.
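
The recalibration idea in the abstract can be illustrated with a minimal sketch of an inductive conformal classifier. This is not the authors' implementation: the function names (`calibrate`, `p_values`, `prediction_set`, `error_rate`), the nonconformity measure (one minus the predicted probability of the class), the class-conditional p-values, and all numbers below are assumptions made up for illustration. The "model" is represented only by its fixed predicted probabilities for the active class, which stands in for an already trained classifier that is never retrained.

```python
from typing import Dict, List, Sequence, Set


def calibrate(probs: Sequence[float], labels: Sequence[int]) -> Dict[int, List[float]]:
    """Collect nonconformity scores per class from a calibration set.

    probs[i] is the (fixed) model's predicted probability of class 1.
    Assumed nonconformity: 1 - predicted probability of the true class.
    """
    scores: Dict[int, List[float]] = {0: [], 1: []}
    for p1, y in zip(probs, labels):
        p_true = p1 if y == 1 else 1.0 - p1
        scores[y].append(1.0 - p_true)
    return scores


def p_values(p1: float, cal_scores: Dict[int, List[float]]) -> Dict[int, float]:
    """Class-conditional conformal p-value for each candidate class."""
    result: Dict[int, float] = {}
    for c in (0, 1):
        p_c = p1 if c == 1 else 1.0 - p1
        alpha = 1.0 - p_c
        n_ge = sum(1 for a in cal_scores[c] if a >= alpha)
        result[c] = (n_ge + 1) / (len(cal_scores[c]) + 1)
    return result


def prediction_set(p1: float, cal_scores: Dict[int, List[float]], eps: float) -> Set[int]:
    """Classes whose p-value exceeds the significance level eps."""
    return {c for c, p in p_values(p1, cal_scores).items() if p > eps}


def error_rate(probs: Sequence[float], labels: Sequence[int],
               cal_scores: Dict[int, List[float]], eps: float) -> float:
    """Fraction of samples whose true label is missing from the prediction set."""
    errors = sum(1 for p1, y in zip(probs, labels)
                 if y not in prediction_set(p1, cal_scores, eps))
    return errors / len(labels)


# Toy data (invented): the model is confident on older data ...
old_cal = calibrate([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
# ... but less confident on more recent data drawn after a drift.
new_cal = calibrate([0.65, 0.5, 0.5, 0.35], [1, 1, 0, 0])

# External (drifted) samples scored by the same, unchanged model.
ext_probs = [0.6, 0.55, 0.45, 0.4]
ext_labels = [1, 1, 0, 0]

eps = 0.4  # significance level, i.e. the error rate a valid predictor should not exceed
print("old calibration:", error_rate(ext_probs, ext_labels, old_cal, eps))
print("new calibration:", error_rate(ext_probs, ext_labels, new_cal, eps))
```

In this toy setting the observed error with the old calibration set is far above the chosen significance level, which is the kind of deviation the abstract describes as a sign of drift. Swapping in the more recent calibration scores brings the error back below that level while the predicted probabilities stay untouched. Repeating the comparison over a grid of `eps` values would give the points of a calibration plot (observed error against significance level).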

Place, publisher, year, edition, pages
BioMed Central, 2021. Vol. 13, no 1, article id 35
Keywords [en]
Applicability domain, Calibration plots, Conformal prediction, Data drifts, Tox21 datasets, Toxicity prediction
National Category
Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:oru:diva-91683
DOI: 10.1186/s13321-021-00511-5
ISI: 000645643800001
PubMedID: 33926567
Scopus ID: 2-s2.0-85105178991
OAI: oai:DiVA.org:oru-91683
DiVA, id: diva2:1553579
Funder
Swedish Research Council Formas, 2018-00924
Swedish Research Council, 2020-03731, 2020-01865
Swedish Foundation for Strategic Research, BD150008
Note

Funding Agencies:
Projekt DEAL
FUBright Mobility Allowances
HaVo-Stiftung
Federal Ministry of Education & Research (BMBF) 031A262C
Alzheimer's Research UK (ARUK) 560832

Available from: 2021-05-10. Created: 2021-05-10. Last updated: 2024-01-16. Bibliographically approved.

Authority records

Norinder, Ulf