Örebro University Publications
1 - 48 of 48
  • 1.
    Ahlberg, Ernst
    Predictive Compound ADME & Safety, Drug Safety & Metabolism, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Winiwarter, Susanne
    Predictive Compound ADME & Safety, Drug Safety & Metabolism, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Boström, Henrik
    Department of Computer and Systems Sciences, Stockholm University, Sweden.
    Linusson, Henrik
    Department of Information Technology, University of Borås, Sweden.
    Löfström, Tuve
    Högskolan i Jönköping, JTH. Forskningsmiljö Datavetenskap och informatik, Jönköping, Sweden.
    Norinder, Ulf
    Swetox, Karolinska Institutet, Unit of Toxicology Sciences, Stockholm, Sweden.
    Johansson, Ulf
    Högskolan i Jönköping, JTH, Datateknik och informatik, Jönköping, Sweden.
    Engkvist, Ola
    External Sciences, Discovery Sciences, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Hammar, Oscar
    Quantitative Biology, Discovery Sciences, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Bendtsen, Claus
    Quantitative Biology, Discovery Sciences, AstraZeneca IMED Biotech Unit, Cambridge, England.
    Carlsson, Lars
    Quantitative Biology, Discovery Sciences, AstraZeneca IMED Biotech Unit, Mölndal, Sweden.
    Using conformal prediction to prioritize compound synthesis in drug discovery. 2017. In: Proceedings of Machine Learning Research: Volume 60: Conformal and Probabilistic Prediction and Applications, 13-16 June 2017, Stockholm, Sweden / [ed] Alex Gammerman, Vladimir Vovk, Zhiyuan Luo, and Harris Papadopoulos, Stockholm, 2017, p. 174-184. Conference paper (Refereed)
    Abstract [en]

    The choice of how much money and resources to spend to understand certain problems is of high interest in many areas. This work illustrates how computational models can be more tightly coupled with experiments to generate decision data at lower cost without reducing the quality of the decision. Several different strategies are explored to illustrate the trade-off between lowering costs and quality in decisions.

    AUC is used as a performance metric and the number of objects that can be learnt from is constrained. Some of the strategies described reach AUC values over 0.9 and outperform strategies that are more random. The strategies that use conformal predictor p-values show varying results, although some are top performing.

    The application studied is taken from the drug discovery process. In the early stages of this process, compounds that could potentially become marketed drugs are routinely tested in experimental assays to understand their distribution and interactions in humans.

  • 2.
    Alijagic, A.
    Örebro University, School of Science and Technology. Inflammatory Response and Infection Susceptibility Center (iRiSC).
    Scherbak, N.
    Örebro University, School of Science and Technology.
    Kotlyar, O.
    Örebro University, School of Science and Technology.
    Karlsson, P.
    Örebro University, School of Science and Technology. Department of Mechanical Engineering.
    Persson, A.
    Örebro University, School of Medical Sciences. Inflammatory Response and Infection Susceptibility Center (iRiSC).
    Hedbrant, A.
    Örebro University, School of Medical Sciences. Inflammatory Response and Infection Susceptibility Center (iRiSC).
    Norinder, U.
    Örebro University, School of Science and Technology.
    Larsson, M.
    Örebro University, School of Science and Technology.
    Felth, J.
    Uddeholms AB, Hagfors, Sweden.
    Andersson, L.
    Örebro University, School of Medical Sciences. Örebro University Hospital. Inflammatory Response and Infection Susceptibility Center (iRiSC); Department of Occupational and Environmental Medicine.
    Särndahl, E.
    Örebro University, School of Medical Sciences. Inflammatory Response and Infection Susceptibility Center (iRiSC).
    Engwall, M.
    Örebro University, School of Science and Technology.
    Cell Painting unveils cell response signatures to (nano)particles formed in additive manufacturing. 2022. In: Toxicology Letters, ISSN 0378-4274, E-ISSN 1879-3169, Vol. 368, no Suppl. 1, p. S226-S227, article id P17-01. Article in journal (Other academic)
  • 3.
    Alvarsson, Jonathan
    Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
    Arvidsson McShane, Staffan
    Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
    Norinder, Ulf
    Örebro University, School of Science and Technology. Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, Sweden; Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Spjuth, Ola
    Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
    Predicting with confidence: Using conformal prediction in drug discovery. 2021. In: Journal of Pharmaceutical Sciences, ISSN 0022-3549, E-ISSN 1520-6017, Vol. 110, no 1, p. 42-49. Article in journal (Refereed)
    Abstract [en]

    One of the challenges with predictive modeling is how to quantify the reliability of the models' predictions on new objects. In this work we give an introduction to conformal prediction, a framework that sits on top of traditional machine learning algorithms and which outputs valid confidence estimates to predictions from QSAR models in the form of prediction intervals that are specific to each predicted object. For regression, a prediction interval consists of an upper and a lower bound. For classification, a prediction interval is a set that contains none, one, or many of the potential classes. The size of the prediction interval is affected by a user-specified confidence/significance level, and by the nonconformity of the predicted object; i.e., the strangeness as defined by a nonconformity function. Conformal prediction provides a rigorous and mathematically proven framework for in silico modeling with guarantees on error rates as well as a consistent handling of the models' applicability domain intrinsically linked to the underlying machine learning model. Apart from introducing the concepts and types of conformal prediction, we also provide an example application for modeling ABC transporters using conformal prediction, as well as a discussion on general implications for drug discovery.
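
The classification case described in this abstract, where a prediction is a set that may hold none, one, or many classes depending on the chosen significance level, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a class-conditional (Mondrian) inductive conformal predictor, and it assumes nonconformity scores have already been computed, for example as one minus the model's predicted probability for the class. The function names, class labels, and all numbers are hypothetical.

```python
import numpy as np


def conformal_p_value(calibration_scores, test_score):
    """Fraction of scores (calibration plus the test object itself)
    that are at least as nonconforming as the test object."""
    calibration_scores = np.asarray(calibration_scores, dtype=float)
    at_least_as_strange = np.sum(calibration_scores >= test_score)
    return (at_least_as_strange + 1) / (len(calibration_scores) + 1)


def prediction_set(calibration_by_class, test_scores_by_class, significance):
    """Keep every class whose p-value exceeds the significance level."""
    return {
        label
        for label, score in test_scores_by_class.items()
        if conformal_p_value(calibration_by_class[label], score) > significance
    }


# Hypothetical nonconformity scores of calibration compounds, per true class.
cal_scores = {
    "active": [0.10, 0.20, 0.30, 0.40],
    "inactive": [0.10, 0.15, 0.20, 0.60],
}

# Hypothetical nonconformity of one new compound under each candidate label.
test_scores = {"active": 0.25, "inactive": 0.75}

for significance in (0.10, 0.25, 0.70):
    labels = sorted(prediction_set(cal_scores, test_scores, significance))
    print(f"significance={significance:.2f} -> {labels}")
```

With these made-up numbers the p-values are 0.6 for "active" and 0.2 for "inactive", so the set holds both classes at significance 0.10, only "active" at 0.25, and is empty at 0.70. A lower significance level (higher confidence) gives larger sets.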

  • 4.
    Arvidsson McShane, Staffan
    Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, 75124, Sweden.
    Norinder, Ulf
    Örebro University, School of Science and Technology. Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, 75124, Sweden; Department of Computer and Systems Sciences, Stockholm University, Stockholm, 10587, Sweden.
    Alvarsson, Jonathan
    Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, 75124, Sweden.
    Ahlberg, Ernst
    Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, 75124, Sweden; Department of Computer Science, Royal Holloway University of London, Egham, TW20 0EX, UK.
    Carlsson, Lars
    Department of Computer Science, Royal Holloway University of London, Egham, TW20 0EX, UK; Department of Computing, Jönköping University, Jönköping, 55111, Sweden.
    Spjuth, Ola
    Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, 75124, Sweden.
    CPSign: conformal prediction for cheminformatics modeling. 2024. In: Journal of Cheminformatics, E-ISSN 1758-2946, Vol. 16, no 1, article id 75. Article in journal (Refereed)
    Abstract [en]

    Conformal prediction has seen many applications in pharmaceutical science, being able to calibrate outputs of machine learning models and producing valid prediction intervals. We here present the open source software CPSign that is a complete implementation of conformal prediction for cheminformatics modeling. CPSign implements inductive and transductive conformal prediction for classification and regression, and probabilistic prediction with the Venn-ABERS methodology. The main chemical representation is signatures but other types of descriptors are also supported. The main modeling methodology is support vector machines (SVMs), but additional modeling methods are supported via an extension mechanism, e.g. DeepLearning4J models. We also describe features for visualizing results from conformal models including calibration and efficiency plots, as well as features to publish predictive models as REST services. We compare CPSign against other common cheminformatics modeling approaches including random forest, and a directed message-passing neural network. The results show that CPSign produces robust predictive performance with comparative predictive efficiency, with superior runtime and lower hardware requirements compared to neural network based models. CPSign has been used in several studies and is in production-use in multiple organizations. The ability to work directly with chemical input files, perform descriptor calculation and modeling with SVM in the conformal prediction framework, with a single software package having a low footprint and fast execution time makes CPSign a convenient and yet flexible package for training, deploying, and predicting on chemical data. CPSign can be downloaded from GitHub at https://github.com/arosbio/cpsign.

    Scientific contribution: CPSign provides a single software package that allows users to perform data preprocessing, modeling and make predictions directly on chemical structures, using conformal and probabilistic prediction. Building and evaluating new models can be achieved at a high abstraction level, without sacrificing flexibility and predictive performance, as showcased with a method evaluation against contemporary modeling approaches, where CPSign performs on par with a state-of-the-art deep learning based model.

  • 5.
    Attoff, Kristina
    Department of Neurochemistry, Stockholm University, Stockholm, Sweden.
    Gliga, Anda
    Institute of Environmental Medicine, Karolinska Institute, Stockholm, Sweden.
    Lundqvist, Jessica
    Department of Neurochemistry, Stockholm University, Stockholm, Sweden; Swetox, Karolinska Institute, Unit of Toxicology Sciences, Södertälje, Sweden.
    Norinder, Ulf
    Swetox, Karolinska Institute, Unit of Toxicology Sciences, Södertälje, Sweden.
    Forsby, Anna
    Department of Neurochemistry, Stockholm University, Stockholm, Sweden; Swetox, Karolinska Institute, Unit of Toxicology Sciences, Södertälje, Sweden.
    Whole genome microarray analysis of neural progenitor C17.2 cells during differentiation and validation of 30 neural mRNA biomarkers for estimation of developmental neurotoxicity. 2017. In: PLOS ONE, E-ISSN 1932-6203, Vol. 12, no 12, article id e0190066. Article in journal (Refereed)
    Abstract [en]

    Despite its high relevance, developmental neurotoxicity (DNT) is one of the least studied forms of toxicity. Current guidelines for DNT testing are based on in vivo testing and they require extensive resources. Transcriptomic approaches using relevant in vitro models have been suggested as a useful tool for identifying possible DNT-generating compounds. In this study, we performed whole genome microarray analysis on the murine progenitor cell line C17.2 following 5 and 10 days of differentiation. We identified 30 genes that are strongly associated with neural differentiation. The C17.2 cell line can be differentiated into a co-culture of both neurons and neuroglial cells, giving a more relevant picture of the brain than using neuronal cells alone. Among the most highly upregulated genes were genes involved in neurogenesis (CHRDL1), axonal guidance (BMP4), neuronal connectivity (PLXDC2), axonogenesis (RTN4R) and astrocyte differentiation (S100B). The 30 biomarkers were further validated by exposure to non-cytotoxic concentrations of two DNT-inducing compounds (valproic acid and methylmercury) and one neurotoxic chemical possessing a possible DNT activity (acrylamide). Twenty-eight of the 30 biomarkers were altered by at least one of the neurotoxic substances, proving the importance of these biomarkers during differentiation. These results suggest that gene expression profiling using a predefined set of biomarkers could be used as a sensitive tool for initial DNT screening of chemicals. Using a predefined set of mRNA biomarkers, instead of the whole genome, makes this model affordable and high-throughput. The use of such models could help speed up the initial screening of substances, possibly indicating alerts that need to be further studied in more sophisticated models.

  • 6.
    Benfenati, Emilio
    Istituto di Ricerche Farmacologiche Mario Negri (IRCCS), Milano, Italy.
    Golbamaki, Azadi
    Istituto di Ricerche Farmacologiche Mario Negri (IRCCS), Milano, Italy.
    Raitano, Giuseppa
    Istituto di Ricerche Farmacologiche Mario Negri (IRCCS), Milano, Italy.
    Roncaglioni, Alessandra
    Istituto di Ricerche Farmacologiche Mario Negri (IRCCS), Milano, Italy.
    Manganelli, Serena
    Istituto di Ricerche Farmacologiche Mario Negri (IRCCS), Milano, Italy; Nestlé Research Center, Lausanne, Switzerland.
    Lemke, Fabian
    KnowledgeMiner, Berlin, Germany.
    Norinder, Ulf
    Swetox, Södertälje, Sweden; Dept of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Lo Piparo, Elena
    Nestlé Research Center, Lausanne, Switzerland.
    Honma, Masamitsu
    National Institute of Health Sciences, Kawasaki, Japan.
    Manganaro, Alberto
    KODE, Pisa, Italy.
    Gini, Giuseppina
    Politecnico di Milano, Milano, Italy.
    A large comparison of integrated SAR/QSAR models of the Ames test for mutagenicity. 2018. In: SAR and QSAR in environmental research (Print), ISSN 1062-936X, E-ISSN 1029-046X, Vol. 29, no 8, p. 591-611. Article in journal (Refereed)
    Abstract [en]

    Results from the Ames test are the first outcome considered to assess the possible mutagenicity of substances. Many QSAR models and structural alerts are available to predict this endpoint. From a regulatory point of view, the recommendation from international authorities is to consider the predictions of more than one model and to combine results in order to develop conclusions about the mutagenicity risk posed by chemicals. However, the results of those models are often conflicting, and the existing inconsistency in the predictions requires intelligent strategies to integrate them. In our study, we evaluated different strategies for combining results of models for Ames mutagenicity, starting from a set of 10 diverse individual models, each built on a dataset of around 6000 compounds. The novelty of our study is that we collected a much larger set of about 18,000 compounds and used the new data to build a family of integrated models. These integrations used probabilistic approaches, decision theory, machine learning, and voting strategies in the integration scheme. Results are discussed considering balanced or conservative perspectives, regarding the possible uses for different purposes, including screening of large collections of substances for prioritization.

  • 7.
    Béquignon, Olivier J. M.
    Leiden Academic Centre for Drug Research, Leiden University, Wassenaarseweg 76, 2333 AL Leiden, The Netherlands.
    Gómez-Tamayo, Jose C.
    Research Programme on Biomedical Informatics (GRIB), Department of Medicine and Life Sciences, Hospital del Mar Medical Research Institute, Universitat Pompeu Fabra, Carrer del Dr. Aiguader 88, 08002 Barcelona, Spain.
    Lenselink, Eelke B.
    Leiden Academic Centre for Drug Research, Leiden University, Wassenaarseweg 76, 2333 AL Leiden, The Netherlands.
    Wink, Steven
    Leiden Academic Centre for Drug Research, Leiden University, Wassenaarseweg 76, 2333 AL Leiden, The Netherlands.
    Hiemstra, Steven
    Leiden Academic Centre for Drug Research, Leiden University, Wassenaarseweg 76, 2333 AL Leiden, The Netherlands.
    Lam, Chi Chung
    Leiden Academic Centre for Drug Research, Leiden University, Wassenaarseweg 76, 2333 AL Leiden, The Netherlands.
    Gadaleta, Domenico
    Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, IRCCS - Istituto di Ricerche Farmacologiche Mario Negri, Via la Masa 19, 20156 Milano, Italy.
    Roncaglioni, Alessandra
    Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, IRCCS - Istituto di Ricerche Farmacologiche Mario Negri, Via la Masa 19, 20156 Milano, Italy.
    Norinder, Ulf
    Örebro University, School of Science and Technology.
    Water, Bob van de
    Leiden Academic Centre for Drug Research, Leiden University, Wassenaarseweg 76, 2333 AL Leiden, The Netherlands.
    Pastor, Manuel
    Research Programme on Biomedical Informatics (GRIB), Department of Medicine and Life Sciences, Hospital del Mar Medical Research Institute, Universitat Pompeu Fabra, Carrer del Dr. Aiguader 88, 08002 Barcelona, Spain.
    van Westen, Gerard J. P.
    Leiden Academic Centre for Drug Research, Leiden University, Wassenaarseweg 76, 2333 AL Leiden, The Netherlands.
    Collaborative SAR Modeling and Prospective In Vitro Validation of Oxidative Stress Activation in Human HepG2 Cells. 2023. In: Journal of Chemical Information and Modeling, ISSN 1549-9596, E-ISSN 1549-960X, Vol. 63, no 17, p. 5433-5445. Article in journal (Refereed)
    Abstract [en]

    Oxidative stress is the consequence of an abnormal increase of reactive oxygen species (ROS). ROS are generated mainly during the metabolism in both normal and pathological conditions as well as from exposure to xenobiotics. Xenobiotics can, on the one hand, disrupt molecular machinery involved in redox processes and, on the other hand, reduce the effectiveness of the antioxidant activity. Such dysregulation may lead to oxidative damage when combined with oxidative stress overpassing the cell capacity to detoxify ROS. In this work, a green fluorescent protein (GFP)-tagged nuclear factor erythroid 2-related factor 2 (NRF2)-regulated sulfiredoxin reporter (Srxn1-GFP) was used to measure the antioxidant response of HepG2 cells to a large series of drug and drug-like compounds (2230 compounds). These compounds were then classified as positive or negative depending on cellular response and distributed among different modeling groups to establish structure-activity relationship (SAR) models. A selection of models was used to prospectively predict oxidative stress induced by a new set of compounds subsequently experimentally tested to validate the model predictions. Altogether, this exercise exemplifies the different challenges of developing SAR models of a phenotypic cellular readout, model combination, chemical space selection, and results interpretation.

  • 8.
    Capuccini, Marco
    Uppsala University, Uppsala, Sweden.
    Carlsson, Lars
    AstraZeneca R&D, Sweden.
    Norinder, Ulf
    Swedish Toxicology Sciences Research Center, Södertälje, Sweden.
    Spjuth, Ola
    Uppsala University, Uppsala, Sweden.
    Conformal prediction in Spark: Large-scale machine learning with confidence. 2015. In: Proc. 2nd International Symposium on Big Data Computing / [ed] Raicu, I.; Rana, O.; Buyya, R., IEEE Computer Society, 2015, Vol. 1, p. 61-67. Conference paper (Refereed)
  • 9.
    Eklund, Martin
    Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden; AstraZeneca Research and Development, Mölndal, Sweden.
    Norinder, Ulf
    H Lundbeck & Co AS, Valby, Denmark.
    Boyer, Scott
    AstraZeneca Research and Development, Mölndal, Sweden.
    Carlsson, Lars
    AstraZeneca Research and Development, Mölndal, Sweden.
    Choosing Feature Selection and Learning Algorithms in QSAR. 2014. In: Journal of Chemical Information and Modeling, ISSN 1549-9596, E-ISSN 1549-960X, Vol. 54, no 3, p. 837-843. Article in journal (Refereed)
    Abstract [en]

    Feature selection is an important part of contemporary QSAR analysis. In a recently published paper, we investigated the performance of different feature selection methods in a large number of in silico experiments conducted using real QSAR datasets. However, an interesting question that we did not address is whether certain feature selection methods are better than others in combination with certain learning methods, in terms of producing models with high prediction accuracy. In this report we extend our work from the previous investigation by using four different feature selection methods (wrapper, ReliefF, MARS, and elastic nets), together with eight learners (MARS, elastic net, random forest, SVM, neural networks, multiple linear regression, PLS, kNN) in an empirical investigation to address this question. The results indicate that state-of-the-art learners (random forest, SVM, and neural networks) do not gain prediction accuracy from feature selection, and we found no evidence that a certain feature selection is particularly well-suited for use in combination with a certain learner.

  • 10.
    Eklund, Martin
    Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden; AstraZeneca Research and Development, Mölndal, Sweden.
    Norinder, Ulf
    Lundbeck Corporation, Valby, Denmark.
    Boyer, Scott
    AstraZeneca Research and Development, Mölndal, Sweden.
    Carlsson, Lars
    AstraZeneca Research and Development, Mölndal, Sweden.
    The application of conformal prediction to the drug discovery process. 2015. In: Annals of Mathematics and Artificial Intelligence, ISSN 1012-2443, E-ISSN 1573-7470, Vol. 74, no 1-2, p. 117-132. Article in journal (Refereed)
    Abstract [en]

    QSAR modeling is a method for predicting properties, e.g. the solubility or toxicity, of chemical compounds using machine learning techniques. QSAR is in widespread use within the pharmaceutical industry to prioritize compounds for experimental testing or to alert for potential toxicity during the drug discovery process. However, the confidence or reliability of predictions from a QSAR model is difficult to assess accurately. We frame the application of QSAR to preclinical drug development in an off-line inductive conformal prediction framework and apply it prospectively to historical data collected from four different assays within AstraZeneca over a time course of about five years. The results indicate weakened validity of the conformal predictor due to violations of the randomness assumption. The validity can be strengthened by adopting semi-off-line conformal prediction. The non-randomness of the data prevents exactly valid predictions, but comparisons to the results of a traditional QSAR procedure applied to the same data indicate that conformal predictions are highly useful in the drug discovery process.

  • 11.
    Escher, Sylvia E.
    Fraunhofer Institute for Toxicology and Experimental Medicine, Chemical Safety and Toxicology, Germany.
    Aguayo-Orozco, Alejandro
    Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Denmark.
    Benfenati, Emilio
    Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milano, Italy.
    Bitsch, Annette
    Fraunhofer Institute for Toxicology and Experimental Medicine, Chemical Safety and Toxicology, Germany.
    Braunbeck, Thomas
    Aquatic Ecology and Toxicology Group, Center for Organismal Studies, University of Heidelberg, Heidelberg, Germany.
    Brotzmann, Katharina
    Aquatic Ecology and Toxicology Group, Center for Organismal Studies, University of Heidelberg, Heidelberg, Germany.
    Bois, Frederic
    Certara UK Ltd, Simcyp Division, Sheffield, United Kingdom.
    van der Burg, Bart
    BioDetection Systems, Amsterdam, the Netherlands.
    Castel, Jose
    Instituto de Investigación Sanitaria La Fe, Valencia, Spain.
    Exner, Thomas
    Edelweiss Connect GmbH, Basel, Switzerland.
    Gadaleta, Domenico
    Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milano, Italy.
    Gardner, Iain
    Certara UK Ltd, Simcyp Division, Sheffield, United Kingdom.
    Goldmann, Daria
    University of Vienna, Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Vienna, Austria.
    Hatley, Oliver
    Certara UK Ltd, Simcyp Division, Sheffield, United Kingdom.
    Golbamaki, Nazanin
    Lhasa Limited, Leeds, United Kingdom.
    Graepel, Rabea
    Leiden Academic Centre for Drug Research (LACDR), Leiden University, Leiden, the Netherlands.
    Jennings, Paul
    Vrije Universiteit Amsterdam, Amsterdam, the Netherlands.
    Limonciel, Alice
    Vrije Universiteit Amsterdam, Amsterdam, the Netherlands.
    Long, Anthony
    Lhasa Limited, Leeds, United Kingdom.
    Maclennan, Richard
    Cyprotex, Cheshire, United Kingdom.
    Mombelli, Enrico
    Instituto de Investigación Sanitaria La Fe, Valencia, Spain.
    Norinder, Ulf
    Örebro University, School of Science and Technology.
    Jain, Sankalp
    University of Vienna, Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Vienna, Austria.
    Capinha, Liliana Santos
    Vrije Universiteit Amsterdam, Amsterdam, the Netherlands.
    Taboureau, Olivier T.
    Université de Paris, France.
    Tolosa, Laia
    Instituto de Investigación Sanitaria La Fe, Valencia, Spain.
    Vrijenhoek, Nanette G.
    Leiden Academic Centre for Drug Research (LACDR), Leiden University, Leiden, the Netherlands.
    van Vugt-Lussenburg, Barbara M. A.
    BioDetection Systems, Amsterdam, the Netherlands.
    Walker, Paul
    Cyprotex, Cheshire, United Kingdom.
    van de Water, Bob
    Leiden Academic Centre for Drug Research (LACDR), Leiden University, Leiden, the Netherlands.
    Wehr, Matthias
    Fraunhofer Institute for Toxicology and Experimental Medicine, Chemical Safety and Toxicology, Germany.
    White, Andrew
    Unilever Safety and Environmental Assurance Centre, Sharnbrook, Bedfordshire, United Kingdom.
    Zdrazil, Barbara
    University of Vienna, Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Vienna, Austria.
    Fisher, Ciarán
    Certara UK Ltd, Simcyp Division, Sheffield, United Kingdom.
    Integrate mechanistic evidence from new approach methodologies (NAMs) into a read-across assessment to characterise trends in shared mode of action. 2022. In: Toxicology in Vitro, ISSN 0887-2333, E-ISSN 1879-3177, Vol. 79, article id 105269. Article in journal (Refereed)
    Abstract [en]

    Read-across approaches often remain inconclusive as they do not provide sufficient evidence on a common mode of action across the category members. This read-across case study on thirteen, structurally similar, branched aliphatic carboxylic acids investigates the concept of using human-based new approach methods, such as in vitro and in silico models, to demonstrate biological similarity.

    Five out of the thirteen analogues have preclinical in vivo studies. Three out of them induced lipid accumulation or hypertrophy in preclinical studies with repeated exposure, which leads to the read-across hypothesis that the analogues can potentially induce hepatic steatosis.

    To confirm the selection of analogues, the expression patterns of the induced differentially expressed genes (DEGs) were analysed in a human liver model. With increasing dose, the expression patterns of the tested analogues became more similar, which serves as a first indication of a common mode of action and suggests differences in the potency of the analogues.

    Hepatic steatosis is a well-known adverse outcome, for which over 55 adverse outcome pathways have been identified. The resulting adverse outcome pathway (AOP) network comprised a total of 43 MIEs/KEs and enabled the design of an in vitro testing battery. From the AOP network, ten MIEs, early and late KEs were tested to systematically investigate a common mode of action among the grouped compounds.

    The targeted testing of AOP-specific MIEs/KEs shows that biological activity in the category decreases with side chain length. A similar trend was evident in measuring liver alterations in zebrafish embryos. However, activation of single MIEs or early KEs at in vivo relevant doses did not necessarily progress to the late KE “lipid accumulation”. KEs not related to the read-across hypothesis, testing for example general mitochondrial stress responses in liver cells, showed no trend or biological similarity.

    Testing scope is a key issue in the design of in vitro test batteries. The Dempster-Shafer decision theory predicted those analogues with in vivo reference data correctly using one human liver model or the CALUX reporter assays.

    The case study shows that the read-across hypothesis is the key element to designing the testing strategy. In the case of a good mechanistic understanding, an AOP facilitates the selection of reliable human in vitro models to demonstrate a common mode of action. Testing DEGs, MIEs and early KEs served to show biological similarity, whereas the late KEs become important for confirmation, as progression from MIEs to AO is not always guaranteed.

  • 12.
    Forreryd, Andy
    Department of Immunotechnology, Lund University, Lund, Sweden.
    Norinder, Ulf
    Swetox, Karolinska Institute, Unit of Toxicology Sciences, Södertälje, Sweden; Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Lindberg, Tim
    Department of Immunotechnology, Lund University, Lund, Sweden.
    Lindstedt, Malin
    Department of Immunotechnology, Lund University, Lund, Sweden.
    Predicting skin sensitizers with confidence: Using conformal prediction to determine applicability domain of GARD. 2018. In: Toxicology in Vitro, ISSN 0887-2333, E-ISSN 1879-3177, Vol. 48, p. 179-187. Article in journal (Refereed)
    Abstract [en]

    GARD (Genomic Allergen Rapid Detection) is a cell-based alternative to animal testing for identification of skin sensitizers. The assay is based on a biomarker signature comprising 200 genes measured in an in vitro model of dendritic cells following chemical stimulation, and consistently reports predictive performances close to 90% for classification of external test sets. Within the field of in vitro skin sensitization testing, definition of applicability domain is often neglected by test developers, and assays are often considered applicable across the entire chemical space. This study complements previous assessments of model performance with an estimate of confidence in individual classifications, as well as a statistically valid determination of the applicability domain for the GARD assay. Conformal prediction was implemented into current GARD protocols, and a large external test dataset (n = 70) was classified at a confidence level of 85%, to generate a valid model with a balanced accuracy of 88%, with none of the tested chemical reactivity domains identified as outside the applicability domain of the assay. In conclusion, results presented in this study complement previously reported predictive performances of GARD with a statistically valid assessment of uncertainty in each individual prediction, thus allowing for classification of skin sensitizers with confidence.

  • 13.
    Garcia de Lomana, Marina
    et al.
    BASF SE, Ludwigshafen am Rhein, Germany; Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna, Vienna, Austria.
    Morger, Andrea
    In Silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité Universitätsmedizin Berlin, Berlin, Germany.
    Norinder, Ulf
    Örebro University, School of Science and Technology.
    Buesen, Roland
    BASF SE, Ludwigshafen am Rhein, Germany.
    Landsiedel, Robert
    BASF SE, Ludwigshafen am Rhein, Germany.
    Volkamer, Andrea
    In Silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité Universitätsmedizin Berlin, Berlin, Germany.
    Kirchmair, Johannes
    Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna, Vienna, Austria.
    Mathea, Miriam
    BASF SE, Ludwigshafen am Rhein, Germany.
    ChemBioSim: Enhancing Conformal Prediction of In Vivo Toxicity by Use of Predicted Bioactivities. 2021. In: Journal of Chemical Information and Modeling, ISSN 1549-9596, E-ISSN 1549-960X, Vol. 61, no 7, p. 3255-3272. Article in journal (Refereed)
    Abstract [en]

    Computational methods such as machine learning approaches have a strong track record of success in predicting the outcomes of in vitro assays. In contrast, their ability to predict in vivo endpoints is more limited due to the high number of parameters and processes that may influence the outcome. Recent studies have shown that the combination of chemical and biological data can yield better models for in vivo endpoints. The ChemBioSim approach presented in this work aims to enhance the performance of conformal prediction models for in vivo endpoints by combining chemical information with (predicted) bioactivity assay outcomes. Three in vivo toxicological endpoints, capturing genotoxic (MNT), hepatic (DILI), and cardiological (DICC) issues, were selected for this study due to their high relevance for the registration and authorization of new compounds. Since the sparsity of available biological assay data is challenging for predictive modeling, predicted bioactivity descriptors were introduced instead. Thus, a machine learning model for each of the 373 collected biological assays was trained and applied on the compounds of the in vivo toxicity data sets. Besides the chemical descriptors (molecular fingerprints and physicochemical properties), these predicted bioactivities served as descriptors for the models of the three in vivo endpoints. For this study, a workflow based on a conformal prediction framework (a method for confidence estimation) built on random forest models was developed. Furthermore, the most relevant chemical and bioactivity descriptors for each in vivo endpoint were preselected with lasso models. The incorporation of bioactivity descriptors increased the mean F1 scores of the MNT model from 0.61 to 0.70 and for the DICC model from 0.72 to 0.82 while the mean efficiencies increased by roughly 0.10 for both endpoints. In contrast, for the DILI endpoint, no significant improvement in model performance was observed. 
    Besides pure performance improvements, an analysis of the most important bioactivity features allowed detection of novel and less intuitive relationships between the predicted biological assay outcomes used as descriptors and the in vivo endpoints. This study presents how the prediction of in vivo toxicity endpoints can be improved by the incorporation of biological information (which is not necessarily captured by chemical descriptors) in an automated workflow without the need for adding experimental workload for the generation of bioactivity descriptors, as predicted outcomes of bioactivity assays were utilized. All bioactivity CP models for deriving the predicted bioactivities, as well as the in vivo toxicity CP models, can be freely downloaded from https://doi.org/10.5281/zenodo.4761225.

  • 14.
    Geylan, Gökçe
    et al.
    Molecular AI, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden; Division of Systems and Synthetic Biology, Department of Life Sciences, Chalmers University of Technology, Gothenburg, Sweden.
    De Maria, Leonardo
    Medicinal Chemistry, Research and Early Development, Respiratory & Immunology, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden.
    Engkvist, Ola
    Molecular AI, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden; Department of Computer Science and Engineering, Chalmers University of Technology and University of Gothenburg, Gothenburg, Sweden.
    David, Florian
    Division of Systems and Synthetic Biology, Department of Life Sciences, Chalmers University of Technology, Gothenburg, Sweden.
    Norinder, Ulf
    Örebro University, School of Science and Technology. Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden; Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden.
    A methodology to correctly assess the applicability domain of cell membrane permeability predictors for cyclic peptides. 2024. In: Digital Discovery, E-ISSN 2635-098X, Vol. 3, no 9, p. 1761-1775. Article in journal (Refereed)
    Abstract [en]

    Being able to predict the cell permeability of cyclic peptides is essential for unlocking their potential as a drug modality for intracellular targets. With a wide range of studies of cell permeability but a limited number of data points, the reliability of the machine learning (ML) models to predict previously unexplored chemical spaces becomes a challenge. In this work, we systematically investigate the predictive capability of ML models from the perspective of their extrapolation to never-before-seen applicability domains, with a particular focus on the permeability task. Four predictive algorithms, namely Support-Vector Machine, Random Forest, LightGBM and XGBoost, jointly with a conformal prediction framework were employed to characterize and evaluate the applicability through uncertainty quantification. Efficiency and validity of the models' predictions with multiple calibration strategies were assessed with respect to several external datasets from different parts of the chemical space through a set of experiments. The experiments showed that predictors generalizing well to the applicability domain defined by the training data can fail to achieve similar model performance on other parts of the chemical space. Our study proposes an approach to overcome such limitations by means of improving the efficiency of models without sacrificing the validity. The trade-off between the reliability and informativeness was balanced when the models were calibrated with a subset of the data from the new targeted domain. This study outlines an approach to enable the extrapolation of predictive power and restore the models' reliability via a recalibration strategy without the need for retraining the underlying model. This work outlines a predictive modeling methodology for peptides with conformal prediction, focusing on the extrapolation task. Calibrating on the unseen chemical space recovers efficiency and validity, enabling reliable predictions without retraining the models.

  • 15.
    Honma, Masamitsu
    et al.
    Division of Genetics and Mutagenesis, National Institute of Health Sciences, Kawasaki Ku, Japan.
    Kitazawa, Airi
    Division of Genetics and Mutagenesis, National Institute of Health Sciences, Kawasaki Ku, Japan.
    Cayley, Alex
    Lhasa Limited, Leeds, England.
    Williams, Richard V.
    Lhasa Limited, Leeds, England.
    Barber, Chris
    Lhasa Limited, Leeds, England.
    Hanser, Thierry
    Lhasa Limited, Leeds, England.
    Saiakhov, Roustem
    MultiCASE Inc., Beachwood, USA.
    Chakravarti, Suman
    MultiCASE Inc., Beachwood, USA.
    Myatt, Glenn J.
    Leadscope Inc., Columbus, USA.
    Cross, Kevin P.
    Leadscope Inc., Columbus, USA.
    Benfenati, Emilio
    Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milano, Italy.
    Raitano, Giuseppa
    Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milano, Italy.
    Mekenyan, Ovanes
    Laboratory of Mathematical Chemistry, As Zlatarov University, Bourgas, Bulgaria.
    Petkov, Petko
    Laboratory of Mathematical Chemistry, As Zlatarov University, Bourgas, Bulgaria.
    Bossa, Cecilia
    Istituto Superiore di Sanita', Rome, Italy.
    Benigni, Romualdo
    Istituto Superiore di Sanita', Rome, Italy; Alpha-Pretox, Rome, Italy.
    Battistelli, Chiara Laura
    Istituto Superiore di Sanita', Rome, Italy.
    Giuliani, Alessandro
    Istituto Superiore di Sanita', Rome, Italy.
    Tcheremenskaia, Olga
    Istituto Superiore di Sanita', Rome, Italy.
    DeMeo, Christine
    Prous Institute, Barcelona, Spain.
    Norinder, Ulf
    Unit of Toxicology Sciences, Karolinska Institute, Södertälje, Sweden; Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Koga, Hiromi
    Fujitsu Kyushu Systems Limited, Fukuoka, Japan.
    Jose, Ciloy
    Fujitsu Kyushu Systems Limited, Fukuoka, Japan.
    Jeliazkova, Nina
    IdeaConsult Ltd., Sofia, Bulgaria.
    Kochev, Nikolay
    IdeaConsult Ltd., Sofia, Bulgaria; Department of Analytical Chemistry and Computer Chemistry, University of Plovdiv, Plovdiv, Bulgaria.
    Paskaleva, Vesselina
    Department of Analytical Chemistry and Computer Chemistry, University of Plovdiv, Plovdiv, Bulgaria.
    Yang, Chihae
    Molecular Networks GmbH, Nürnberg, Germany; Altamira LLC, Columbus, USA.
    Daga, Pankaj R.
    Simulations Plus Inc., Lancaster, USA.
    Clark, Robert D.
    Simulations Plus Inc., Lancaster, USA.
    Rathman, James
    Molecular Networks GmbH, Nürnberg, Germany; Altamira LLC, Columbus, USA; Ohio State University, Columbus, USA.
    Improvement of quantitative structure-activity relationship (QSAR) tools for predicting Ames mutagenicity: outcomes of the Ames/QSAR International Challenge Project. 2019. In: Mutagenesis, ISSN 0267-8357, E-ISSN 1464-3804, Vol. 34, no 1, p. 3-16. Article in journal (Refereed)
    Abstract [en]

    The International Conference on Harmonization (ICH) M7 guideline allows the use of in silico approaches for predicting Ames mutagenicity for the initial assessment of impurities in pharmaceuticals. This is the first international guideline that addresses the use of quantitative structure–activity relationship (QSAR) models in lieu of actual toxicological studies for human health assessment. Therefore, QSAR models for Ames mutagenicity now require higher predictive power for identifying mutagenic chemicals. To increase the predictive power of QSAR models, larger experimental datasets from reliable sources are required. The Division of Genetics and Mutagenesis, National Institute of Health Sciences (DGM/NIHS) of Japan recently established a unique proprietary Ames mutagenicity database containing 12140 new chemicals that have not been previously used for developing QSAR models. The DGM/NIHS provided this Ames database to QSAR vendors to validate and improve their QSAR tools. The Ames/QSAR International Challenge Project was initiated in 2014 with 12 QSAR vendors testing 17 QSAR tools against these compounds in three phases. We now present the final results. All tools were considerably improved by participation in this project. Most tools achieved >50% sensitivity (positive prediction among all Ames positives) and predictive power (accuracy) was as high as 80%, almost equivalent to the inter-laboratory reproducibility of Ames tests. To further increase the predictive power of QSAR tools, accumulation of additional Ames test data is required as well as re-evaluation of some previous Ames test results. Indeed, some Ames-positive or Ames-negative chemicals may have previously been incorrectly classified because of methodological weakness, resulting in false-positive or false-negative predictions by QSAR tools. These incorrect data hamper prediction and are a source of noise in the development of QSAR models.
    It is thus essential to establish a large benchmark database consisting only of well-validated Ames test results to build more accurate QSAR models.

  • 16.
    Jesús Naveja, J.
    et al.
    Department of Pharmacy, Universidad Nacional Autónoma de México, Mexico City, Mexico; PECEM, Universidad Nacional Autónoma de México, Mexico City, Mexico; Department of Life Science Informatics, University of Bonn, Bonn, Germany.
    Norinder, Ulf
    Unit of Toxicology Sciences, Karolinska Institute, Södertälje, Sweden; Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Mucs, Daniel
    Unit of Toxicology Sciences, Karolinska Institute, Södertälje, Sweden; Unit of Work Environment Toxicology, Karolinska Institute, Stockholm, Sweden.
    López-López, Edgar
    Department of Pharmacy, Universidad Nacional Autónoma de México, Mexico City, Mexico; Medicinal Chemistry Laboratory, University of Veracruz, Veracruz, Mexico.
    Medina-Franco, Jose L.
    Department of Pharmacy, Universidad Nacional Autónoma de México, Mexico City, Mexico.
    Chemical space, diversity and activity landscape analysis of estrogen receptor binders. 2018. In: RSC Advances, E-ISSN 2046-2069, Vol. 8, no 67, p. 38229-38237. Article in journal (Refereed)
    Abstract [en]

    Understanding the structure-activity relationships (SAR) of endocrine-disrupting chemicals is of major importance in toxicology. Despite the fact that classifiers and predictive models have been developed for estrogens for the past 20 years, to the best of our knowledge, there are no studies of their activity landscape or the identification of activity cliffs. Herein, we report the first SAR of a public dataset of 121 chemicals with reported estrogen receptor binding affinities using activity landscape modeling. To this end, we conducted a systematic quantitative and visual analysis of the chemical space of the 121 chemicals. The global diversity of the dataset was characterized by means of a Consensus Diversity Plot, a recently developed method. Adding pairwise activity difference information to the chemical space gave rise to the activity landscape of the data set, uncovering a heterogeneous SAR, in particular for some structural classes. At least eight compounds were identified with a high propensity to form activity cliffs. The findings of this work further expand the current knowledge of the underlying SAR of estrogenic compounds and can be the starting point to develop novel and potentially improved predictive models.

  • 17.
    Karunaratne, Thashmee
    et al.
    Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Boström, Henrik
    Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Norinder, Ulf
    AstraZeneca Research and Development, Södertälje, Sweden; Department of Pharmacy, Uppsala University, Uppsala, Sweden; Department of Computational Chemistry, H. Lundbeck A/S, Valby, Denmark.
    Comparative analysis of the use of chemoinformatics-based and substructure-based descriptors for quantitative structure-activity relationship (QSAR) modeling. 2013. In: Intelligent Data Analysis, ISSN 1088-467X, E-ISSN 1571-4128, Vol. 17, no 2, p. 327-341. Article in journal (Refereed)
    Abstract [en]

    Quantitative structure-activity relationship (QSAR) models have gained popularity in the pharmaceutical industry due to their potential to substantially decrease drug development costs by reducing expensive laboratory and clinical tests. QSAR modeling consists of two fundamental steps, namely, descriptor discovery and model building. Descriptor discovery methods are either based on chemical domain knowledge or purely data-driven. The former, chemoinformatics-based, and the latter, substructure-based, methods for QSAR modeling have been developed quite independently. As a consequence, evaluations involving both types of descriptor discovery method are rarely seen. In this study, a comparative analysis of chemoinformatics-based and substructure-based approaches is presented. Two chemoinformatics-based approaches, ECFI and SELMA, are compared to five approaches for substructure discovery, CP, graphSig, MFI, MoFa and SUBDUE, using 18 QSAR datasets. The empirical investigation shows that one of the chemoinformatics-based approaches, ECFI, results in significantly more accurate models compared to all other methods, when used on their own. Results from combining descriptor sets are also presented, showing that the addition of ECFI descriptors to any other descriptor set leads to improved predictive performance for that set, while the use of ECFI descriptors in many cases also can be improved by adding descriptors generated by the other methods.

  • 18.
    Kensert, Alexander
    et al.
    Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden.
    Alvarsson, Jonathan
    Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden.
    Norinder, Ulf
    Unit of Toxicology Sciences, Karolinska Institutet, Swetox, Södertälje, Sweden; Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Spjuth, Ola
    Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden.
    Evaluating parameters for ligand-based modeling with random forest on sparse data sets. 2018. In: Journal of Cheminformatics, E-ISSN 1758-2946, Vol. 10, article id 49. Article in journal (Refereed)
    Abstract [en]

    Ligand-based predictive modeling is widely used to generate predictive models aiding decision making in e.g. drug discovery projects. With growing data sets and requirements on low modeling time comes the necessity to analyze data sets efficiently to support rapid and robust modeling. In this study we analyzed four data sets and studied the efficiency of machine learning methods on sparse data structures, utilizing Morgan fingerprints of different radii and hash sizes, and compared them with molecular signature descriptors of different heights. We specifically evaluated the effect these parameters had on modeling time, predictive performance, and memory requirements using two implementations of random forest: Scikit-learn and FEST. We also compared with a support vector machine implementation. Our results showed that unhashed fingerprints yield significantly better accuracy than hashed fingerprints (p <= 0.05), with no pronounced deterioration in modeling time and memory usage. Furthermore, the fast execution and low memory usage of the FEST algorithm suggest that it is a good alternative for large, high dimensional sparse data. Both support vector machines and random forest performed equally well, but results indicate that the support vector machine was better at using the extra information from larger values of the Morgan fingerprint's radius.

  • 19.
    Lindh, Martin
    et al.
    Department of Medicinal Chemistry, Uppsala University, Uppsala, Sweden.
    Karlen, Anders
    Department of Medicinal Chemistry, Uppsala University, Uppsala, Sweden.
    Norinder, Ulf
    Swetox, Karolinska Institute, Unit of Toxicology Sciences, Södertälje, Sweden; Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Predicting the Rate of Skin Penetration Using an Aggregated Conformal Prediction Framework. 2017. In: Molecular Pharmaceutics, ISSN 1543-8384, E-ISSN 1543-8392, Vol. 14, no 5, p. 1571-1576. Article in journal (Refereed)
    Abstract [en]

    Skin serves as a drug administration route, and skin permeability of chemicals is of significant interest in the pharmaceutical and cosmetic industries. An aggregated conformal prediction (ACP) framework was used to build models for predicting the permeation rate (log Kp) of chemical compounds through human skin. The conformal prediction method gives as an output the prediction range at a given level of confidence for each compound, which enables the user to make a more informed decision when, for example, suggesting the next compound to prepare. Predictive models were built using both the random forest and the support vector machine methods and were based on experimentally derived permeability data on 211 diverse compounds. The derived models were of similar predictive quality as compared to earlier published models but have the extra advantage of not only presenting a single predicted value for each compound but also a reliable, individually assigned prediction range. The models use calculated descriptors and can quickly predict the skin permeation rate of new compounds.

  • 20.
    Linusson, Henrik
    et al.
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT, Borås, Sweden.
    Norinder, Ulf
    Swetox, Karolinska Institutet, Stockholm, Sweden.
    Boström, Henrik
    Department of Computer Science and Informatics, Stockholm University, Stockholm, Sweden.
    Johansson, Ulf
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT, Borås, Sweden.
    Löfström, Tuve
    Högskolan i Borås, Akademin för bibliotek, information, pedagogik och IT, Borås, Sweden.
    On the Calibration of Aggregated Conformal Predictors. 2017. In: Proceedings of Machine Learning Research, 2017. Conference paper (Refereed)
    Abstract [en]

    Conformal prediction is a learning framework that produces models that associate with each of their predictions a measure of statistically valid confidence. These models are typically constructed on top of traditional machine learning algorithms. An important result of conformal prediction theory is that the models produced are provably valid under relatively weak assumptions; in particular, their validity is independent of the specific underlying learning algorithm on which they are based. Since validity is automatic, much research on conformal predictors has been focused on improving their informational and computational efficiency. As part of the efforts in constructing efficient conformal predictors, aggregated conformal predictors were developed, drawing inspiration from the field of classification and regression ensembles. Unlike early definitions of conformal prediction procedures, the validity of aggregated conformal predictors is not fully understood: while it has been shown that they might attain empirical exact validity under certain circumstances, their theoretical validity is conditional on additional assumptions that require further clarification. In this paper, we show why validity is not automatic for aggregated conformal predictors, and provide a revised definition of aggregated conformal predictors that gains approximate validity conditional on properties of the underlying learning algorithm.

  • 21.
    Ljunggren, Stefan A.
    et al.
    Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden.
    Helmfrid, Ingela
    Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden.
    Norinder, Ulf
    Swedish Toxicology Sciences Research Center, Södertälje, Sweden.
    Fredriksson, Mats
    Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden.
    Wingren, Gun
    Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden.
    Karlsson, Helen
    Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden.
    Lindahl, Mats
    Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden.
    Alterations in high-density lipoprotein proteome and function associated with persistent organic pollutants. 2017. In: Environment International, ISSN 0160-4120, E-ISSN 1873-6750, Vol. 98, p. 204-211. Article in journal (Refereed)
    Abstract [en]

    There is a growing body of evidence that persistent organic pollutants (POPs) may increase the risk for cardiovascular disease (CVD), but the mechanisms remain unclear. High-density lipoprotein (HDL) protects against CVD by different processes, and we have earlier found that HDL from subjects with CVD contains higher levels of POPs than healthy controls. In the present study, we have expanded analyses on the same individuals living in a contaminated community and investigated the relationship between the HDL POP levels and protein composition/function. HDL from 17 subjects was isolated by ultracentrifugation. HDL protein composition, using nanoliquid chromatography tandem mass spectrometry, and antioxidant activity were analyzed. The associations of 16 POPs, including polychlorinated biphenyls (PCBs) and organochlorine pesticides, with HDL proteins/functions were investigated by partial least squares and multiple linear regression analysis. Proteomic analyses identified 118 HDL proteins, of which ten were significantly (p < 0.05) and positively associated with the combined level of POPs or with highly chlorinated PCB congeners. Among these, cholesteryl ester transfer protein and phospholipid transfer protein, as well as the inflammatory marker serum amyloid A, were found. The serum paraoxonase/arylesterase 1 activity was inversely associated with POPs. Pathway analysis demonstrated that up-regulated proteins were associated with biological processes involving lipoprotein metabolism, while down-regulated proteins were associated with processes such as negative regulation of proteinases, acute phase response, platelet degranulation, and complement activation. These results indicate an association between POP levels, especially highly chlorinated PCBs, and HDL protein alterations that may result in a less functional particle. Further studies are needed to determine causality and the importance of other environmental factors.
    Nevertheless, this study provides a first insight into a possible link between exposure to POPs and risk of CVD.

  • 22.
    Lupu, Diana
    et al.
    Department of Toxicology, Iuliu Haţieganu University of Medicine and Pharmacy, Cluj-Napoca, Romania; Unit of Toxicology Sciences, Karolinska Institute, Södertälje, Sweden.
    Varshney, Mukesh K.
    Department of Biosciences and Nutrition, Karolinska Institute, Huddinge, Sweden.
    Mucs, Daniel
    Unit of Work Environment Toxicology, Karolinska Institute, Stockholm, Sweden; Unit of Toxicology Sciences, Karolinska Institute, Södertälje, Sweden.
    Inzunza, Jose
    Department of Biosciences and Nutrition, Karolinska Institute, Huddinge, Sweden.
    Norinder, Ulf
    Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden; Unit of Toxicology Sciences, Karolinska Institute, Södertälje, Sweden.
    Loghin, Felicia
    Department of Toxicology, Iuliu Haţieganu University of Medicine and Pharmacy, Cluj-Napoca, Romania.
    Nalvarte, Ivan
    Department of Biosciences and Nutrition, Karolinska Institute, Huddinge, Sweden.
    Ruegg, Joelle
    Department of Clinical Neuroscience, Karolinska Institute, Stockholm, Sweden; Unit of Toxicology Sciences, Karolinska Institute, Södertälje, Sweden.
    Fluoxetine Affects Differentiation of Midbrain Dopaminergic Neurons In Vitro. 2018. In: Molecular Pharmacology, ISSN 0026-895X, E-ISSN 1521-0111, Vol. 94, no 4, p. 1220-1231. Article in journal (Refereed)
    Abstract [en]

    Recent meta-analyses found an association between prenatal exposure to the antidepressant fluoxetine (FLX) and an increased risk of autism in children. This developmental disorder has been related to dysfunctions in the brain's reward circuitry, which, in turn, has been linked to dysfunctions in dopaminergic (DA) signaling. The present study investigated whether FLX affects processes involved in dopaminergic neuronal differentiation. Mouse neuronal precursors were differentiated into midbrain dopaminergic precursor cells (mDPCs) and concomitantly exposed to clinically relevant doses of FLX. Subsequently, dopaminergic precursors were evaluated for expression of differentiation and stemness markers using quantitative polymerase chain reaction. FLX treatment led to increases in early regional specification markers orthodenticle homeobox 2 (Otx2) and homeobox engrailed-1 and -2 (En1 and En2). On the other hand, two transcription factors essential for midbrain dopaminergic (mDA) neurogenesis, LIM homeobox transcription factor 1 alpha (Lmx1a) and paired-like homeodomain transcription factor 3 (Pitx3), were downregulated by FLX treatment. The stemness marker nestin (Nes) was increased, whereas the neuronal differentiation marker beta 3-tubulin (Tubb3) decreased. Additionally, we observed that FLX modulates the expression of several genes associated with autism spectrum disorder and downregulates the estrogen receptors (ERs) alpha and beta. Further investigations using ER beta knockout (BERKO) mDPCs showed that FLX had no or even opposite effects on several of the genes analyzed. These findings suggest that FLX affects differentiation of the dopaminergic system by increasing production of dopaminergic precursors, yet decreasing their maturation, partly via interference with the estrogen system.

  • 23.
    Morger, Andrea
    et al.
    In Silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité Universitätsmedizin Berlin, Berlin, Germany.
    Garcia de Lomana, Marina
    BASF SE, Ludwigshafen, Germany; Division of Pharmaceutical Chemistry, Department of Pharmaceutical Sciences, University of Vienna, Vienna, Austria.
    Norinder, Ulf
    Örebro University, School of Science and Technology. Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden; Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Svensson, Fredrik
    Alzheimer's Research UK UCL Drug Discovery Institute, London, UK.
    Kirchmair, Johannes
    Division of Pharmaceutical Chemistry, Department of Pharmaceutical Sciences, University of Vienna, Vienna, Austria.
    Mathea, Miriam
    BASF SE, Ludwigshafen, Germany.
    Volkamer, Andrea
    In Silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité Universitätsmedizin Berlin, Berlin, Germany.
    Studying and mitigating the effects of data drifts on ML model performance at the example of chemical toxicity data. 2022. In: Scientific Reports, E-ISSN 2045-2322, Vol. 12, no 1, article id 7244. Article in journal (Refereed)
    Abstract [en]

    Machine learning models are widely applied to predict molecular properties or the biological activity of small molecules on a specific protein. Models can be integrated in a conformal prediction (CP) framework which adds a calibration step to estimate the confidence of the predictions. CP models present the advantage of ensuring a predefined error rate under the assumption that test and calibration set are exchangeable. In cases where the test data have drifted away from the descriptor space of the training data, or where assay setups have changed, this assumption might not be fulfilled and the models are not guaranteed to be valid. In this study, the performance of internally valid CP models when applied to either newer time-split data or to external data was evaluated. In detail, temporal data drifts were analysed based on twelve datasets from the ChEMBL database. In addition, discrepancies between models trained on publicly-available data and applied to proprietary data for the liver toxicity and MNT in vivo endpoints were investigated. In most cases, a drastic decrease in the validity of the models was observed when applied to the time-split or external (holdout) test sets. To overcome the decrease in model validity, a strategy for updating the calibration set with data more similar to the holdout set was investigated. Updating the calibration set generally improved the validity, restoring it completely to its expected value in many cases. The restored validity is the first requisite for applying the CP models with confidence. However, the increased validity comes at the cost of a decrease in model efficiency, as more predictions are identified as inconclusive. This study presents a strategy to recalibrate CP models to mitigate the effects of data drifts. Updating the calibration sets without having to retrain the model has proven to be a useful approach to restore the validity of most models.
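    The calibration and recalibration steps summarized in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a class-conditional (Mondrian) inductive conformal classifier, a nonconformity score of one minus the predicted probability of a label, and made-up probabilities standing in for the output of an already trained model. All function names and numbers are hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple


def nonconformity(prob_of_label: float) -> float:
    """Assumed score: a low predicted probability means a 'strange' example."""
    return 1.0 - prob_of_label


def calibrate(examples: Sequence[Tuple[float, int]]) -> Dict[int, List[float]]:
    """Collect calibration scores per true class.

    Each example is (model probability of the true label, true label).
    The underlying model is not touched, only its outputs are used.
    """
    scores: Dict[int, List[float]] = {}
    for prob_true, label in examples:
        scores.setdefault(label, []).append(nonconformity(prob_true))
    return scores


def p_value(cal_scores: List[float], score: float) -> float:
    """Share of calibration scores at least as large as the test score."""
    n_ge = sum(1 for s in cal_scores if s >= score)
    return (n_ge + 1) / (len(cal_scores) + 1)


def prediction_set(cal: Dict[int, List[float]],
                   probs: Dict[int, float],
                   significance: float) -> Set[int]:
    """Keep every label whose p-value exceeds the significance level."""
    return {
        label
        for label, cal_scores in cal.items()
        if p_value(cal_scores, nonconformity(probs[label])) > significance
    }


# Hypothetical calibration data drawn from the original training domain,
# where the model is confident about the true label.
old_cal = calibrate([(0.9, 0), (0.8, 0), (0.6, 0), (0.95, 0),
                     (0.9, 1), (0.85, 1), (0.8, 1), (0.95, 1)])

# Hypothetical calibration data closer to the drifted (holdout) domain,
# where the same model is less reliable.
new_cal = calibrate([(0.6, 0), (0.4, 0), (0.75, 0), (0.3, 0),
                     (0.5, 1), (0.25, 1), (0.2, 1), (0.8, 1)])

# Model output for one query compound; the model itself is never retrained.
query = {0: 0.7, 1: 0.3}

old_set = prediction_set(old_cal, query, significance=0.2)
new_set = prediction_set(new_cal, query, significance=0.2)
print(old_set, new_set)
```

    With these invented numbers, the original calibration set yields a single-label prediction for the query, while the updated calibration set yields both labels. This mirrors the trade-off described in the abstract: swapping the calibration set can restore validity on drifted data, but more predictions become inconclusive.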

  • 24.
    Morger, Andrea
    et al.
    In Silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité Universitätsmedizin, Berlin, Germany.
    Svensson, Fredrik
    Alzheimer's Research UK UCL Drug Discovery Institute, London, UK.
    Arvidsson McShane, Staffan
    Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
    Gauraha, Niharika
    Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, Sweden; Division of Computational Science and Technology, KTH, Stockholm, Sweden.
    Norinder, Ulf
    Örebro University, School of Science and Technology. Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, Sweden; Dept. Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Spjuth, Ola
    Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
    Volkamer, Andrea
    In Silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité Universitätsmedizin, Berlin, Germany.
    Assessing the calibration in toxicological in vitro models with conformal prediction
    2021. In: Journal of Cheminformatics, E-ISSN 1758-2946, Vol. 13, no 1, article id 35. Article in journal (Refereed)
    Abstract [en]

    Machine learning methods are widely used in drug discovery and toxicity prediction. While showing overall good performance in cross-validation studies, their predictive power (often) drops in cases where the query samples have drifted from the training data's descriptor space. Thus, the assumption for applying machine learning algorithms, that training and test data stem from the same distribution, might not always be fulfilled. In this work, conformal prediction is used to assess the calibration of the models. Deviations from the expected error may indicate that training and test data originate from different distributions. Exemplified on the Tox21 datasets, composed of chronologically released Tox21Train, Tox21Test and Tox21Score subsets, we observed that while internally valid models could be trained using cross-validation on Tox21Train, predictions on the external Tox21Score data resulted in higher error rates than expected. To improve the prediction on the external sets, a strategy exchanging the calibration set with more recent data, such as Tox21Test, has successfully been introduced. We conclude that conformal prediction can be used to diagnose data drifts and other issues related to model calibration. The proposed improvement strategy (exchanging the calibration data only) is convenient as it does not require retraining of the underlying model.
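
    One way to read "deviations from the expected error" is as a calibration curve: for each significance level, compare the observed error rate with the level itself. The sketch below assumes the p-values of the true labels are already available; the values and the function names `calibration_curve` and `max_excess_error` are illustrative only.

```python
def calibration_curve(true_label_p_values, significance_levels):
    """Return (significance, observed error rate) pairs for a set of test predictions."""
    n = len(true_label_p_values)
    curve = []
    for eps in significance_levels:
        # An error occurs when the true label is excluded, i.e. its p-value <= eps.
        errors = sum(1 for p in true_label_p_values if p <= eps)
        curve.append((eps, errors / n))
    return curve


def max_excess_error(curve):
    """Largest amount by which the observed error exceeds the expected error."""
    return max(observed - eps for eps, observed in curve)


# Hypothetical p-values of the true labels on two test sets.
exchangeable = [(i + 0.5) / 100 for i in range(100)]  # roughly uniform on (0, 1)
drifted = [p / 2 for p in exchangeable]               # systematically too small

levels = [0.1, 0.2, 0.3]
for name, p_values in [("exchangeable", exchangeable), ("drifted", drifted)]:
    curve = calibration_curve(p_values, levels)
    print(name, curve, "max excess:", round(max_excess_error(curve), 3))
```

    A curve that follows the diagonal suggests the exchangeability assumption is plausible; a curve well above it, as for the second set here, is the kind of signal the abstract uses to diagnose drift.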

  • 25.
    Norinder, Ulf
    et al.
    Swetox, Unit of Toxicology Sciences, Karolinska Institute, Södertälje, Sweden; Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Ahlberg, Ernst
    AstraZeneca R&D Gothenburg, Mölndal, Sweden.
    Carlsson, Lars
    Computer Learning Research Centre, University of London Egham, Surrey, England.
    Predicting Ames Mutagenicity Using Conformal Prediction in the Ames/QSAR International Challenge Project
    2019. In: Mutagenesis, ISSN 0267-8357, E-ISSN 1464-3804, Vol. 34, no 1, p. 33-40. Article in journal (Refereed)
    Abstract [en]

    Valid and predictive models for classifying Ames mutagenicity have been developed using conformal prediction. The models are Random Forest models using signature molecular descriptors. The investigation indicates, on excluding not-strongly mutagenic compounds (class B), that the validity for mutagenic compounds is increased for the predictions based on both public and the Division of Genetics and Mutagenesis, National Institute of Health Sciences of Japan (DGM/NIHS) data while less so when using only the latter data source. The former models only result in valid predictions for the majority, non-mutagenic, class whereas the latter models are valid for both classes, i.e. mutagenic and non-mutagenic compounds. These results demonstrate the importance of data consistency manifested through the superior predictive quality and validity of the models based only on DGM/NIHS generated data compared to a combination of this data with public data sources.

  • 26.
    Norinder, Ulf
    et al.
    Department of Pharmacy, Uppsala University, Uppsala, Sweden; AstraZeneca R&D, Södertälje, Sweden.
    Boström, Henrik
    Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Representing descriptors derived from multiple conformations as uncertain features for machine learning
    2013. In: Journal of Molecular Modeling, ISSN 1610-2940, E-ISSN 0948-5023, Vol. 19, no 6, p. 2679-2685. Article in journal (Refereed)
    Abstract [en]

    Uncertainty was introduced into the chemical descriptors of 11 datasets by conformational analysis in order to incorporate three-dimensional information and to investigate the resulting predictive performance of a state-of-the-art machine learning method, random forests, for binary classification tasks. A number of strategies for handling uncertainty in random forests were evaluated. The study showed that when incorporating three-dimensional information as uncertainty into chemical descriptors, the use of uniform probability distributions over the range of possible values, in conjunction with fractional distribution of compounds clearly outperforms the use of normal distributions as well as sampling from both normal and uniform distributions. The main conclusion of this study is that, even when distributions of uncertain values are provided, the random forest method can generate models that are almost as accurate from the expected values of these distributions alone. Hence, there seems to be little advantage to using the more elaborate methods of incorporating uncertainty in chemical descriptors when using random forests rather than replacing the distributions with single-point values. The results also show that random forest models with similar performances can also be generated using three-dimensional descriptor information derived from single (lowest-energy or Corina-derived) conformations.

  • 27.
    Norinder, Ulf
    et al.
    Unit of Toxicology Sciences, Karolinska Institute, Södertälje, Sweden; Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Jesús Naveja, J.
    Department of Pharmacy, Universidad Nacional Autónoma de México, Mexico City, Mexico; PECEM, Universidad Nacional Autónoma de México, Mexico City, Mexico; Department of Life Science Informatics, University of Bonn, Bonn, Germany.
    Lopez-Lopez, Edgar
    Department of Pharmacy, Universidad Nacional Autónoma de México, Mexico City, Mexico.
    Mucs, Dániel
    Unit of Toxicology Sciences, Karolinska Institute, Södertälje, Sweden; Unit of Work Environment Toxicology, Karolinska Institute, Stockholm, Sweden.
    Medina-Franco, José L.
    Department of Pharmacy, Universidad Nacional Autónoma de México, Mexico City, Mexico.
    Conformal prediction of HDAC inhibitors
    2019. In: SAR and QSAR in environmental research (Print), ISSN 1062-936X, E-ISSN 1029-046X, Vol. 30, no 4, p. 265-277. Article in journal (Refereed)
    Abstract [en]

    The growing interest in epigenetic probes and drug discovery, as revealed by several epigenetic drugs in clinical use or in the lineup of the drug development pipeline, is boosting the generation of screening data. In order to maximize the use of structure-activity relationships there is a clear need to develop robust and accurate models to understand the underlying structure-activity relationship. Similarly, accurate models should be able to guide the rational screening of compound libraries. Herein we introduce a novel approach for epigenetic quantitative structure-activity relationship (QSAR) modelling using conformal prediction. As a case study, we discuss the development of models for 11 sets of inhibitors of histone deacetylases (HDACs), which are one of the major epigenetic target families that have been screened. It was found that all derived models, for every HDAC endpoint and all three significance levels, are valid with respect to predictions for the external test sets as well as the internal validation of the corresponding training sets. Furthermore, the efficiencies for the predictions are above 80% for most data sets and above 90% for four data sets at different significance levels. The findings of this work encourage prospective applications of conformal prediction for other epigenetic target data sets.

  • 28.
    Norinder, Ulf
    et al.
    Örebro University, School of Science and Technology. Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Lowry, Stephanie
    Örebro University, School of Science and Technology.
    Predicting Larch Casebearer damage with confidence using Yolo network models and conformal prediction
    2023. In: Remote Sensing Letters, ISSN 2150-704X, E-ISSN 2150-7058, Vol. 14, no 10, p. 1023-1035. Article in journal (Refereed)
    Abstract [en]

    This investigation shows that successful forecasting models for monitoring forest health status with respect to Larch Casebearer damage can be derived by combining a confidence predictor framework (Conformal Prediction) with a deep learning architecture (Yolo v5). A confidence predictor framework can predict the current types of diseases used to develop the model and also provide an indication of new, unseen, types or degrees of disease. The user of the models is also, at the same time, provided with reliable predictions and a well-established applicability domain for the model where such reliable predictions can and cannot be expected. Furthermore, the framework gracefully handles class imbalances without explicit over- or under-sampling or category weighting, which may be of crucial importance in cases of highly imbalanced datasets. The present approach also provides an indication of when insufficient information has been provided as input to the model at the level of accuracy (reliability) needed by the user to make subsequent decisions based on the model predictions.

  • 29.
    Norinder, Ulf
    et al.
    Swetox, Karolinska Institute, Unit of Toxicology Sciences, Södertälje, Sweden; Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Munic Kos, Vesna
    Swetox, Karolinska Institute, Unit of Toxicology Sciences, Södertälje, Sweden; Department of Physiology and Pharmacology, Karolinska Institute, Stockholm, Sweden.
    QSAR Models for Predicting Five Levels of Cellular Accumulation of Lysosomotropic Macrocycles
    2019. In: International Journal of Molecular Sciences, ISSN 1661-6596, E-ISSN 1422-0067, Vol. 20, no 23, article id 5938. Article in journal (Refereed)
    Abstract [en]

    Drugs that accumulate in lysosomes reach very high tissue concentrations, which is evident in the high volume of distribution and often lower clearance of these compounds. Such a pharmacokinetic profile is beneficial for indications where high tissue penetration and a less frequent dosing regime is required. Here, we show how the level of lysosomotropic accumulation in cells can be predicted solely from molecular structure. To develop quantitative structure-activity relationship (QSAR) models, we used cellular accumulation data for 69 lysosomotropic macrocycles, the pharmaceutical class for which this type of prediction model is extremely valuable due to the importance of cellular accumulation for their anti-infective and anti-inflammatory applications as well as due to the fact that they are extremely difficult to model by computational methods because of their large size (Mw > 500). For the first time, we show that five levels of intracellular lysosomotropic accumulation (as measured by liquid chromatography coupled to tandem mass spectrometry, LC-MS/MS), from low/no to extremely high, can be predicted with 60% balanced accuracy solely from the compound's structure. Although largely built on macrocycles, the eight non-macrocyclic compounds that were added to the set were found to be well incorporated by the models, indicating their possible broader application. By uncovering the link between the molecular structure and cellular accumulation as the key process in tissue distribution of lysosomotropic compounds, these models are applicable for directing the drug discovery process and prioritizing the compounds for synthesis with fine-tuned accumulation properties, according to the desired pharmacokinetic profile.

  • 30.
    Norinder, Ulf
    et al.
    Swetox, Karolinska Institute, Unit of Toxicology Sciences, Södertälje, Sweden; Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Myatt, Glenn
    Leadscope, Columbus, USA.
    Ahlberg, Ernst
    AstraZeneca R&D Gothenburg, Mölndal, Sweden.
    Predicting Aromatic Amine Mutagenicity With Confidence: A Case Study Using Conformal Prediction
    2018. In: Biomolecules, E-ISSN 2218-273X, Vol. 8, no 3, article id 85. Article in journal (Refereed)
    Abstract [en]

    The occurrence of mutagenicity in primary aromatic amines has been investigated using conformal prediction. The results of the investigation show that it is possible to develop mathematically proven valid models using conformal prediction and that the existence of uncertain classes of prediction, such as both (both classes assigned to a compound) and empty (no class assigned to a compound), provides the user with additional information on how to use, further develop, and possibly improve future models. The study also indicates that the use of different sets of fingerprints results in models, for which the ability to discriminate varies with respect to the set level of acceptable errors.
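
    The "both" and "empty" outcomes mentioned above follow directly from how a conformal classifier turns per-class p-values into a prediction set. A minimal sketch, with invented p-values and hypothetical function names:

```python
def prediction_set(p_values, significance):
    """Keep every class whose p-value exceeds the chosen significance level."""
    return {label for label, p in p_values.items() if p > significance}


def region(p_values, significance):
    """Name the outcome: a single class, 'both', or 'empty'."""
    labels = prediction_set(p_values, significance)
    if not labels:
        return "empty"
    if len(labels) == 1:
        return next(iter(labels))
    return "both"


# Hypothetical p-values for one aromatic amine.
compound = {"mutagenic": 0.45, "non-mutagenic": 0.30}

for significance in (0.2, 0.4, 0.5):
    print(significance, region(compound, significance))
# 0.2 -> both, 0.4 -> mutagenic, 0.5 -> empty
```

    Lowering the accepted error level (a smaller significance value) yields more "both" predictions; raising it yields more "empty" ones. This is the trade-off a user can inspect when deciding how to use or extend a model.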

  • 31.
    Norinder, Ulf
    et al.
    Swedish Toxicology Sciences Research Center, Södertälje, Sweden; Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Rybacka, Aleksandra
    Department of Chemistry, Umeå University, Umeå, Sweden.
    Andersson, Patrik L.
    Department of Chemistry, Umeå University, Umeå, Sweden.
    Conformal prediction to define applicability domain: A case study on predicting ER and AR binding
    2016. In: SAR and QSAR in environmental research (Print), ISSN 1062-936X, E-ISSN 1029-046X, Vol. 27, no 4, p. 303-316. Article in journal (Refereed)
    Abstract [en]

    A fundamental element when deriving a robust and predictive in silico model is not only the statistical quality of the model in question but, equally important, the estimate of its predictive boundaries. This work presents a new method, conformal prediction, for applicability domain estimation in the field of endocrine disruptors. The method is applied to binders and non-binders related to the oestrogen and androgen receptors. Ensembles of decision trees are used as statistical method and three different sets (dragon, rdkit and signature fingerprints) are investigated as chemical descriptors. The conformal prediction method results in valid models where there is an excellent balance in quality between the internally validated training set and the corresponding external test set, both in terms of validity and with respect to sensitivity and specificity. With this method the level of confidence can be readily altered by the user and the consequences thereof immediately inspected. Furthermore, the predictive boundaries for the derived models are rigorously defined by using the conformal prediction framework, thus no ambiguity exists as to the level of similarity needed for new compounds to be in or out of the predictive boundaries of the derived models where reliable predictions can be expected.

  • 32.
    Norinder, Ulf
    et al.
    Örebro University, School of Science and Technology. Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, Sweden; Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Spjuth, Ola
    Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
    Svensson, Fredrik
    Alzheimer's Research UK UCL Drug Discovery Institute, University College London, The Cruciform Building, Gower Street, London, UK.
    Synergy conformal prediction applied to large-scale bioactivity datasets and in federated learning
    2021. In: Journal of Cheminformatics, E-ISSN 1758-2946, Vol. 13, no 1, article id 77. Article in journal (Refereed)
    Abstract [en]

    Confidence predictors can deliver predictions with the associated confidence required for decision making and can play an important role in drug discovery and toxicity predictions. In this work we investigate a recently introduced version of conformal prediction, synergy conformal prediction, focusing on the predictive performance when applied to bioactivity data. We compare the performance to other variants of conformal predictors for multiple partitioned datasets and demonstrate the utility of synergy conformal predictors for federated learning where data cannot be pooled in one location. Our results show that synergy conformal predictors based on training data randomly sampled with replacement can compete with other conformal setups, while using completely separate training sets often results in worse performance. However, in a federated setup where no method has access to all the data, synergy conformal prediction is shown to give promising results. Based on our study, we conclude that synergy conformal predictors are a valuable addition to the conformal prediction toolbox.

  • 33.
    Norinder, Ulf
    et al.
    Örebro University, School of Science and Technology. Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden; Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden.
    Spjuth, Ola
    Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden; Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
    Svensson, Fredrik
    The Alzheimer's Research UK University College London Drug Discovery Institute, London, U.K.
    Using Predicted Bioactivity Profiles to Improve Predictive Modeling
    2020. In: Journal of Chemical Information and Modeling, ISSN 1549-9596, E-ISSN 1549-960X, Vol. 60, no 6, p. 2830-2837. Article in journal (Refereed)
    Abstract [en]

    Predictive modeling is a cornerstone in early drug development. Using information for multiple domains or across prediction tasks has the potential to improve the performance of predictive modeling. However, aggregating data often leads to incomplete data matrices that might be limiting for modeling. In line with previous studies, we show that by generating predicted bioactivity profiles, and using these as additional features, prediction accuracy of biological endpoints can be improved. Using conformal prediction, a type of confidence predictor, we present a robust framework for the calculation of these profiles and the evaluation of their impact. We report on the outcomes from several approaches to generate the predicted profiles on 16 datasets in cytotoxicity and bioactivity and show that efficiency is improved the most when including the p-values from conformal prediction as bioactivity profiles.

  • 34.
    Norinder, Ulf
    et al.
    Swetox, Karolinska Institute, Unit of Toxicology Sciences, Södertälje, Sweden; Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Svensson, Fredrik
    Alzheimer's Research UK UCL Drug Discovery Institute, University College, London, England; Francis Crick Institute, London, England.
    Multitask Modeling with Confidence Using Matrix Factorization and Conformal Prediction
    2019. In: Journal of Chemical Information and Modeling, ISSN 1549-9596, E-ISSN 1549-960X, Vol. 59, no 4, p. 1598-1604. Article in journal (Refereed)
    Abstract [en]

    Multitask prediction of bioactivities is often faced with challenges relating to the sparsity of data and imbalance between different labels. We propose class conditional (Mondrian) conformal predictors using underlying Macau models as a novel approach for large scale bioactivity prediction. This approach handles both high degrees of missing data and label imbalances while still producing high quality predictive models. When applied to ten assay end points from PubChem, the approach generated valid models with an efficiency of 74.0-80.1% at the 80% confidence level with similar performance both for the minority and majority class. Also, when deleting progressively larger portions of the available data (0-80%), the performance of the models remained robust with only minor deterioration (reduction in efficiency between 5 and 10%). Compared to using Macau without conformal prediction, the method presented here significantly improves the performance on imbalanced data sets.
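
    A class-conditional (Mondrian) conformal predictor calibrates each class against calibration examples of that class only, which is why a small minority class is not swamped by the majority. The sketch below assumes nonconformity scores are already computed (in the paper they would come from the Macau models; here they are invented). It also takes efficiency to be the fraction of single-label predictions, a common definition that the abstract itself does not spell out.

```python
def mondrian_p_values(calibration, test_scores):
    """Per-class p-values, each computed against that class's own calibration scores.

    calibration: dict mapping label -> nonconformity scores of calibration
                 examples whose true label is that label.
    test_scores: dict mapping label -> nonconformity score of the test
                 compound when tentatively assigned that label.
    """
    p_values = {}
    for label, score in test_scores.items():
        scores = calibration[label]
        n_at_least = sum(1 for s in scores if s >= score)
        p_values[label] = (n_at_least + 1) / (len(scores) + 1)
    return p_values


def efficiency(all_p_values, significance):
    """Fraction of predictions that contain exactly one label (assumed definition)."""
    single = sum(
        1 for p_values in all_p_values
        if sum(p > significance for p in p_values.values()) == 1
    )
    return single / len(all_p_values)


# Hypothetical, imbalanced calibration data: few actives, more inactives.
calibration = {
    "active": [0.1, 0.3, 0.5, 0.7],
    "inactive": [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45],
}
compound_a = mondrian_p_values(calibration, {"active": 0.2, "inactive": 0.6})
compound_b = mondrian_p_values(calibration, {"active": 0.6, "inactive": 0.12})

print(compound_a, compound_b)
print("efficiency at 80% confidence:", efficiency([compound_a, compound_b], 0.2))
```

    Here the first compound receives a single-label prediction and the second receives both labels, so the efficiency at the 80% confidence level is 0.5.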

  • 35.
    Norinder, Ulf
    et al.
    Örebro University, School of Science and Technology.
    Tuck, Astrud
    Department of Physiology and Pharmacology, Karolinska Institutet, Stockholm, Sweden.
    Norgren, Kalle
    Department of Physiology and Pharmacology, Karolinska Institutet, Stockholm, Sweden.
    Munic Kos, Vesna
    Department of Physiology and Pharmacology, Karolinska Institutet, Stockholm, Sweden.
    Existing highly accumulating lysosomotropic drugs with potential for repurposing to target COVID-19
    2020. In: Biomedicine and Pharmacotherapy, ISSN 0753-3322, E-ISSN 1950-6007, Vol. 130, article id 110582. Article in journal (Refereed)
    Abstract [en]

    Given the speed of viral infection spread, repurposing of existing drugs has been given the highest priority in combating the ongoing COVID-19 pandemic. Only drugs that are already registered or close to registration, and therefore have passed lengthy safety assessments, have a chance to be tested in clinical trials and reach patients quickly enough to help in the current disease outbreak.

    Here, we have reviewed available evidence and possible ways forward to identify already existing pharmaceuticals displaying modest broad-spectrum antiviral activity which is likely linked to their high accumulation in cells. Several well studied examples indicate that these drugs accumulate in lysosomes, endosomes and biological membranes in general, and thereby interfere with endosomal pathway and intracellular membrane trafficking crucial for viral infection. With the aim to identify other lysosomotropic drugs with possible inherent antiviral activity, we have applied a set of clear physicochemical, pharmacokinetic and molecular criteria on 530 existing drugs. In addition to publicly available data, we have also used our in silico model for the prediction of accumulation in lysosomes and endosomes. By this approach we have identified 36 compounds with possible antiviral effects, also against coronaviruses. For 14 of them evidence of broad-spectrum antiviral activity has already been reported, adding support to the value of this approach.

    Presented pros and cons, knowledge gaps and methods to identify lysosomotropic antivirals, can help in the evaluation of many drugs currently in clinical trials considered for repurposing to target COVID-19, as well as open doors to finding more potent and safer alternatives.

  • 36.
    Over, Bjorn
    et al.
    AstraZeneca R&D Gothenburg, Mölndal, Sweden.
    Matsson, Pär
    Department of Pharmacy, Uppsala University, Uppsala, Sweden; Uppsala University Drug Optimization and Pharmaceutical Profiling Platform (UDOPP), Uppsala University, Uppsala, Sweden.
    Tyrchan, Christian
    AstraZeneca R&D Gothenburg, Mölndal, Sweden.
    Artursson, Per
    Department of Pharmacy, Uppsala University, Uppsala, Sweden; Uppsala University Drug Optimization and Pharmaceutical Profiling Platform (UDOPP), Uppsala University, Uppsala, Sweden.
    Doak, Bradley C.
    Department of Chemistry, Uppsala University, Uppsala, Sweden.
    Foley, Michael A.
    Broad Institute, Cambridge, USA; Tri-Institutional Therapeutics Discovery Institute, New York, USA.
    Hilgendorf, Constanze
    AstraZeneca R&D Gothenburg, Mölndal, Sweden.
    Johnston, Stephen E.
    Broad Institute, Cambridge, USA.
    Lee, Maurice D.
    Broad Institute, Cambridge, USA; Ensemble Therapeutics, Cambridge, USA.
    Lewis, Richard J.
    AstraZeneca R&D Gothenburg, Mölndal, Sweden.
    McCarren, Patrick
    Broad Institute, Cambridge, USA.
    Muncipinto, Giovanni
    Broad Institute, Cambridge, USA; Ensemble Therapeutics, Cambridge, USA.
    Norinder, Ulf
    Swedish Toxicology Sciences Research Center, Södertälje, Sweden.
    Perry, Matthew W. D.
    AstraZeneca R&D Gothenburg, Mölndal, Sweden.
    Duvall, Jeremy R.
    Broad Institute, Cambridge, USA; Ensemble Therapeutics, Cambridge, USA.
    Kihlberg, Jan
    Department of Chemistry, Uppsala University, Uppsala, Sweden.
    Structural and conformational determinants of macrocycle cell permeability
    2016. In: Nature Chemical Biology, ISSN 1552-4450, E-ISSN 1552-4469, Vol. 12, no 12, p. 1065-1074. Article in journal (Refereed)
    Abstract [en]

    Macrocycles are of increasing interest as chemical probes and drugs for intractable targets like protein-protein interactions, but the determinants of their cell permeability and oral absorption are poorly understood. To enable rational design of cell-permeable macrocycles, we generated an extensive data set under consistent experimental conditions for more than 200 nonpeptidic, de novo-designed macrocycles from the Broad Institute's diversity-oriented screening collection. This revealed how specific functional groups, substituents and molecular properties impact cell permeability. Analysis of energy-minimized structures for stereo- and regioisomeric sets provided fundamental insight into how dynamic, intramolecular interactions in the 3D conformations of macrocycles may be linked to physicochemical properties and permeability. Combined use of quantitative structure-permeability modeling and the procedure for conformational analysis now, for the first time, provides chemists with a rational approach to design cell-permeable non-peptidic macrocycles with potential for oral absorption.

  • 37.
    Sapounidou, M.
    et al.
    Umeå University, Chemistry Department, Umeå, Sweden.
    Norinder, Ulf
    Örebro University, School of Science and Technology. Stockholm University, Computer and Systems Sciences Department, Kista, Sweden.
    Andersson, P. L.
    Umeå University, Chemistry Department, Umeå, Sweden.
    Application of conformal prediction for in silico definition of molecular initiating events linked to endocrine disruption
    2021. In: Toxicology Letters, ISSN 0378-4274, E-ISSN 1879-3169, Vol. 350, no Suppl., p. S86-S86. Article in journal (Other academic)
    Abstract [en]

    The adverse outcome pathway (AOP) paradigm has brought mechanism of action into the spotlight of regulatory toxicology, linking biochemical interactions on cellular level (i.e. molecular initiating event, MIE) via key events to adverse outcomes (AOs) on population level. Developments in the mechanistic understanding of endocrine disruption (ED) have brought forward MIEs associated with early neurodevelopmental interference [1] and metabolic disruption [2], describing agonistic and antagonistic interactions with receptors such as constitutive androstane receptor (CAR), estrogen receptor alpha (ERα), farnesoid X receptor (FXR), and glucocorticoid receptor (GR). High confidence in in silico predictions is dictated by high quality training data on mechanistically relevant endpoints, where well-defined chemistry is covered. Based on Tox21 in vitro assays describing events of agonism and antagonism for 13 receptors linked to ED, 23 in silico models were developed using Random Forest Classification. To quantify measures of uncertainty per prediction a Conformal Prediction framework was employed. In order to assess whether currently available models can confidently predict endocrine disrupting chemicals (EDCs), screening of EURION reference chemicals was conducted. The EURION cluster is a constellation of 8 research consortia aiming to improve endocrine disruption identification. Preliminary results revealed strengths in the use of in silico models for screening of the current ED chemical landscape, and data gaps that need to be considered for next steps.

  • 38.
    Sapounidou, Maria
    et al.
    Chemistry Department, Umeå University, 901 87 Umeå, Sweden.
    Norinder, Ulf
    Örebro University, School of Science and Technology. Department of Computer and Systems Sciences, Stockholm University, Box 7003, 164 07 Kista, Sweden; Department of Pharmaceutical Biosciences, Uppsala University, Box 591, 75 124 Uppsala, Sweden.
    Andersson, Patrik L
    Chemistry Department, Umeå University, 901 87 Umeå, Sweden.
    Predicting Endocrine Disruption Using Conformal Prediction: A Prioritization Strategy to Identify Hazardous Chemicals with Confidence
    2023. In: Chemical Research in Toxicology, ISSN 0893-228X, E-ISSN 1520-5010, Vol. 36, no 1, p. 53-65. Article in journal (Refereed)
    Abstract [en]

    Receptor-mediated molecular initiating events (MIEs) and their relevance in endocrine activity (EA) have been highlighted in literature. More than 15 receptors have been associated with neurodevelopmental adversity and metabolic disruption. MIEs describe chemical interactions with defined biological outcomes, a relationship that could be described with quantitative structure-activity relationship (QSAR) models. QSAR uncertainty can be assessed using the conformal prediction (CP) framework, which provides similarity (i.e., nonconformity) scores relative to the defined classes per prediction. CP calibration can indirectly mitigate data imbalance during model development, and the nonconformity scores serve as intrinsic measures of chemical applicability domain assessment during screening. The focus of this work was to propose an in silico predictive strategy for EA. First, 23 QSAR models for MIEs associated with EA were developed using high-throughput data for 14 receptors. To handle the data imbalance, five protocols were compared, and CP provided the most balanced class definition. Second, the developed QSAR models were applied to a large data set (∼55,000 chemicals), comprising chemicals representative of potential risk for human exposure. Using CP, it was possible to assess the uncertainty of the screening results and identify model strengths and out of domain chemicals. Last, two clustering methods, t-distributed stochastic neighbor embedding and Tanimoto similarity, were used to identify compounds with potential EA using known endocrine disruptors as reference. The cluster overlap between methods produced 23 chemicals with suspected or demonstrated EA potential. The presented models could be utilized for first-tier screening and identification of compounds with potential biological activity across the studied MIEs.
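
    For readers unfamiliar with the Tanimoto similarity used in the clustering step, it is the size of the intersection of two fingerprints divided by the size of their union. A minimal sketch over sets of "on" bit positions follows; the fingerprints are invented and the abstract does not say which fingerprint type was used.

```python
def tanimoto(fingerprint_a, fingerprint_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    a, b = set(fingerprint_a), set(fingerprint_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


# Hypothetical on-bit positions for a reference endocrine disruptor and a query chemical.
reference = {1, 4, 7, 9}
query = {1, 4, 8, 9, 12}

print(tanimoto(reference, query))  # 3 shared bits / 6 distinct bits = 0.5
```

    In practice a cheminformatics toolkit would generate the fingerprints and compute the similarity; the set arithmetic above is only meant to make the measure concrete.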

  • 39.
    Svensson, Fredrik
    et al.
    Department of Chemistry, University of Cambridge, Cambridge, England; IOTA Pharmaceuticals, Cambridge, England.
    Afzal, Avid M.
    Department of Chemistry, University of Cambridge, Cambridge, England.
    Norinder, Ulf
    Unit of Toxicology Sciences, Karolinska Institute, Södertälje, Sweden; Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Bender, Andreas
    Department of Chemistry, University of Cambridge, Cambridge, England.
    Maximizing gain in high-throughput screening using conformal prediction
    2018. In: Journal of Cheminformatics, E-ISSN 1758-2946, Vol. 10, no 1, article id 7. Article in journal (Refereed)
    Abstract [en]

    Iterative screening has emerged as a promising approach to increase the efficiency of screening campaigns compared to traditional high throughput approaches. By learning from a subset of the compound library, inferences on what compounds to screen next can be made by predictive models, resulting in more efficient screening. One way to evaluate screening is to consider the cost of screening compared to the gain associated with finding an active compound. In this work, we introduce a conformal predictor coupled with a gain-cost function with the aim to maximise gain in iterative screening. Using this setup we were able to show that by evaluating the predictions on the training data, very accurate predictions on what settings will produce the highest gain on the test data can be made. We evaluate the approach on 12 bioactivity datasets from PubChem training the models using 20% of the data. Depending on the settings of the gain-cost function, the settings generating the maximum gain were accurately identified in 8-10 out of the 12 datasets. Broadly, our approach can predict what strategy generates the highest gain based on the results of the cost-gain evaluation: to screen the compounds predicted to be active, to screen all the remaining data, or not to screen any additional compounds. When the algorithm indicates that the predicted active compounds should be screened, our approach also indicates what confidence level to apply in order to maximize gain. Hence, our approach facilitates decision-making and allocation of the resources where they deliver the most value by indicating in advance the likely outcome of a screening campaign.
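
    The three-way decision described above (screen the predicted actives, screen everything that remains, or stop) can be illustrated with a simple gain-cost function. The linear form used here, a fixed gain per hit minus a fixed cost per screened compound, is an assumption made for illustration; the abstract does not give the exact function, and all counts below are invented.

```python
def net_gain(n_screened, n_hits, gain_per_hit, cost_per_compound):
    """Assumed linear gain-cost function: reward for hits minus cost of screening."""
    return n_hits * gain_per_hit - n_screened * cost_per_compound


def best_strategy(predicted_active, remaining, gain_per_hit, cost_per_compound):
    """Compare the three strategies; each input is (n_compounds, expected_hits)."""
    options = {
        "screen predicted actives": net_gain(*predicted_active, gain_per_hit, cost_per_compound),
        "screen all remaining": net_gain(*remaining, gain_per_hit, cost_per_compound),
        "screen nothing": 0,
    }
    return max(options, key=options.get), options


# Hypothetical estimates, e.g. extrapolated from results on the training data.
predicted_active = (1000, 150)  # 1000 compounds predicted active, 150 expected hits
remaining = (8000, 300)         # the whole unscreened library, 300 expected hits

for gain_per_hit in (5, 20, 100):
    choice, _ = best_strategy(predicted_active, remaining, gain_per_hit, cost_per_compound=2)
    print(f"gain per hit {gain_per_hit:>3}: {choice}")
```

    With a cost of 2 per compound, a low gain per hit favours stopping, an intermediate value favours screening only the predicted actives, and a high value favours screening everything. This mirrors the dependence on gain-cost settings that the abstract describes.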

  • 40.
    Svensson, Fredrik
    et al.
    Department of Chemistry, University of Cambridge, Cambridge, England; IOTA Pharmaceut, Cambridge, England.
    Aniceto, Natalia
    Department of Chemistry, University of Cambridge, Cambridge, England.
    Norinder, Ulf
    Unit of Toxicology Sciences, Karolinska Institute, Södertälje, Sweden; Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Cortes-Ciriano, Isidro
    Department of Chemistry, University of Cambridge, Cambridge, England.
    Spjuth, Ola
    Uppsala University, Uppsala, Sweden.
    Carlsson, Lars
    AstraZeneca, Mölndal, Sweden; Department of Computer Science, University of London, Surrey, England.
    Bender, Andreas
    Department of Chemistry, University of Cambridge, Cambridge, England.
    Conformal Regression for Quantitative Structure-Activity Relationship Modeling-Quantifying Prediction Uncertainty. 2018. In: Journal of Chemical Information and Modeling, ISSN 1549-9596, E-ISSN 1549-960X, Vol. 58, no 5, p. 1132-1140. Article in journal (Refereed)
    Abstract [en]

    Making predictions with an associated confidence is highly desirable as it facilitates decision making and resource prioritization. Conformal regression is a machine learning framework that allows the user to define the required confidence and delivers predictions that are guaranteed to be correct to the selected extent. In this study, we apply conformal regression to model molecular properties and bioactivity values and investigate different ways to scale the outputted prediction intervals to create as efficient (i.e. narrow) regressors as possible. Different algorithms to estimate the prediction uncertainty were used to normalize the prediction ranges and the different approaches were evaluated on 29 publicly available datasets. Our results show that the most efficient conformal regressors are obtained when using the natural exponential of the ensemble standard deviation from the underlying random forest to scale the prediction intervals. This approach afforded an average prediction range of 1.65 pIC50 units at the 80 % confidence level when applied to bioactivity modeling. The choice of nonconformity function has a pronounced impact on the average prediction range with a difference of close to one log unit in bioactivity between the tightest and widest prediction range. Overall, conformal regression is a robust approach to generate bioactivity predictions with associated confidence.
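
    The scaling idea in this abstract can be sketched as an inductive conformal regressor whose nonconformity score is the absolute error divided by the natural exponential of an uncertainty estimate. The sketch assumes the point predictions and the per-compound ensemble standard deviation (`sigma`) are already available from some underlying model; all names are hypothetical and this is not the authors' implementation.

```python
import math


def calibrate(y_true, y_pred, sigma, confidence):
    """Return the normalized nonconformity threshold for the given confidence.

    Score per calibration example: |y - y_hat| / exp(sigma).
    """
    scores = sorted(abs(y - p) / math.exp(s)
                    for y, p, s in zip(y_true, y_pred, sigma))
    n = len(scores)
    # Rank of the score that yields at least `confidence` coverage.
    k = math.ceil((n + 1) * confidence)
    if k > n:
        # Too few calibration examples for this confidence level.
        return math.inf
    return scores[k - 1]


def predict_interval(y_pred, sigma, threshold):
    """Scale the threshold back by exp(sigma) to get a per-compound interval."""
    half_width = threshold * math.exp(sigma)
    return y_pred - half_width, y_pred + half_width


if __name__ == "__main__":
    # Illustrative calibration data only.
    threshold = calibrate([5.0, 6.0, 7.0, 8.0], [5.5, 6.0, 6.0, 8.2],
                          [0.0, 0.0, 0.0, 0.0], confidence=0.75)
    print(predict_interval(6.0, 0.0, threshold))
```

    Compounds with a larger `sigma` receive wider intervals and compounds with a smaller `sigma` receive narrower ones, which is how normalization can improve efficiency.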

  • 41.
    Svensson, Fredrik
    et al.
    Department of Chemistry, University of Cambridge, Cambridge, England.
    Norinder, Ulf
    Karolinska Institute, Unit of Toxicology Sciences, Södertälje, Sweden; Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Bender, Andreas
    Department of Chemistry, University of Cambridge, Cambridge, England.
    Improving Screening Efficiency through Iterative Screening Using Docking and Conformal Prediction. 2017. In: Journal of Chemical Information and Modeling, ISSN 1549-9596, E-ISSN 1549-960X, Vol. 57, no 3, p. 439-444. Article in journal (Refereed)
    Abstract [en]

    High-throughput screening, where thousands of molecules can be rapidly assessed for activity against a protein, has been the dominant approach in drug discovery for many years. However, these methods are costly and require much time and effort. To improve on this situation, in this study we apply an iterative screening process in which an initial set of compounds is selected for screening based on molecular docking. The outcome of the initial screen is then used to classify the remaining compounds through a conformal predictor. The approach was retrospectively validated using 41 targets from the Directory of Useful Decoys, Enhanced (DUD-E), ensuring scaffold diversity among the active compounds. The results show that 57% of the remaining active compounds could be identified while only screening 9.4% of the database. The overall hit rate (7.6%) was also higher than when using docking alone (5.2%). When limiting the search to the top scored compounds from docking, 39.6% of the active compounds could be identified, compared to 13.5% when screening the same number of compounds solely based on docking. The use of conformal predictors also gives a clear indication of the number of compounds to screen in the next iteration. These results indicate that iterative screening based on molecular docking and conformal prediction can be an efficient way to find active compounds while screening only a small part of the compound collection.

  • 42.
    Svensson, Fredrik
    et al.
    Department of Chemistry, University of Cambridge, Cambridge, England.
    Norinder, Ulf
    Swedish Toxicology Sciences Research Center, Södertälje, Sweden; Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Bender, Andreas
    Department of Chemistry, University of Cambridge, Cambridge, England.
    Modelling compound cytotoxicity using conformal prediction and PubChem HTS data. 2017. In: Toxicology Research, ISSN 2045-452X, E-ISSN 2045-4538, Vol. 6, no 1, p. 73-80. Article in journal (Refereed)
    Abstract [en]

    The assessment of compound cytotoxicity is an important part of the drug discovery process. Accurate predictions of cytotoxicity have the potential to expedite decision making and save considerable time and effort. In this work we apply class conditional conformal prediction to model the cytotoxicity of compounds based on 16 high throughput cytotoxicity assays from PubChem. The data span 16 cell lines and comprise more than 440 000 unique compounds. The data sets are heavily imbalanced with only 0.8% of the tested compounds being cytotoxic. We trained one classification model for each cell line and validated the performance with respect to validity and accuracy. The generated models deliver high quality predictions for both toxic and non-toxic compounds despite the imbalance between the two classes. On external data collected from the same assay provider as one of the investigated cell lines the model had a sensitivity of 74% and a specificity of 65% at the 80% confidence level among the compounds assigned to a single class. Compared to previous approaches for large scale cytotoxicity modelling, this represents a balanced performance in the prediction of the toxic and non-toxic classes. The conformal prediction framework also allows the modeller to control the error frequency of the predictions, allowing predictions of cytotoxicity outcomes with confidence.
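
    Class-conditional (Mondrian) conformal prediction, as named in this abstract, computes a p-value for each class using only the calibration examples of that class, which is what keeps the error rate controlled for the minority class as well. A minimal sketch follows, assuming nonconformity scores (for example, one minus the predicted class probability) have already been computed; the function names and the example scores are hypothetical.

```python
def mondrian_p_value(calibration_scores, test_score):
    """p-value for one class, using only that class's calibration scores."""
    at_least_as_strange = sum(1 for s in calibration_scores if s >= test_score)
    return (at_least_as_strange + 1) / (len(calibration_scores) + 1)


def prediction_set(calibration_by_class, test_scores, significance):
    """Return every class whose p-value exceeds the significance level."""
    return {
        label
        for label, score in test_scores.items()
        if mondrian_p_value(calibration_by_class[label], score) > significance
    }


if __name__ == "__main__":
    # Illustrative nonconformity scores only.
    calibration = {
        "toxic": [0.1, 0.4, 0.6, 0.9],
        "nontoxic": [0.05, 0.1, 0.2, 0.3],
    }
    test = {"toxic": 0.5, "nontoxic": 0.5}
    print(prediction_set(calibration, test, significance=0.25))
```

    A prediction set with exactly one label corresponds to a compound "assigned to a single class"; empty sets and sets with both labels signal that no confident single-class call can be made at the chosen significance level.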

  • 43.
    Tuerkova, Alzbeta
    et al.
    Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Vienna, Austria.
    Bongers, Brandon J.
    Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Leiden, The Netherlands.
    Norinder, Ulf
    Örebro University, School of Science and Technology. Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden.
    Ungvári, Orsolya
    Drug Resistance Research Group, Institute of Enzymology, RCNS, Eötvös Loránd Research Network, Budapest, Hungary; Doctoral School of Biology and Institute of Biology, ELTE Eötvös Loránd University, Budapest, Hungary.
    Székely, Virág
    Drug Resistance Research Group, Institute of Enzymology, RCNS, Eötvös Loránd Research Network, Budapest, Hungary.
    Tarnovskiy, Andrey
    Enamine Ltd., Kyiv, Ukraine.
    Szakács, Gergely
    Drug Resistance Research Group, Institute of Enzymology, RCNS, Eötvös Loránd Research Network, Budapest, Hungary; Department of Medicine I, Institute of Cancer Research, Comprehensive Cancer Center, Medical University of Vienna, Vienna, Austria.
    Özvegy-Laczka, Csilla
    Drug Resistance Research Group, Institute of Enzymology, RCNS, Eötvös Loránd Research Network, Budapest, Hungary.
    van Westen, Gerard J. P.
    Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, Leiden, The Netherlands.
    Zdrazil, Barbara
    Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Vienna, Austria.
    Identifying Novel Inhibitors for Hepatic Organic Anion Transporting Polypeptides by Machine Learning-Based Virtual Screening. 2022. In: Journal of Chemical Information and Modeling, ISSN 1549-9596, E-ISSN 1549-960X, Vol. 62, no 24, p. 6323-6335. Article in journal (Refereed)
    Abstract [en]

    Integration of statistical learning methods with structure-based modeling approaches is a contemporary strategy to identify novel lead compounds in drug discovery. Hepatic organic anion transporting polypeptides (OATP1B1, OATP1B3, and OATP2B1) are classical off-targets, and it is well recognized that their ability to interfere with a wide range of chemically unrelated drugs, environmental chemicals, or food additives can lead to unwanted adverse effects like liver toxicity and drug-drug or drug-food interactions. Therefore, the identification of novel (tool) compounds for hepatic OATPs by virtual screening approaches and subsequent experimental validation is a major asset for elucidating structure-function relationships of (related) transporters: they enhance our understanding about molecular determinants and structural aspects of hepatic OATPs driving ligand binding and selectivity. In the present study, we performed a consensus virtual screening approach by using different types of machine learning models (proteochemometric models, conformal prediction models, and XGBoost models for hepatic OATPs), followed by molecular docking of preselected hits using previously established structural models for hepatic OATPs. Screening the diverse REAL drug-like set (Enamine) shows a comparable hit rate for OATP1B1 (36% actives) and OATP1B3 (32% actives), while the hit rate for OATP2B1 was even higher (66% actives). Percentage inhibition values for 44 selected compounds were determined using dedicated in vitro assays and guided the prioritization of several highly potent novel hepatic OATP inhibitors: six (strong) OATP2B1 inhibitors (IC50 values ranging from 0.04 to 6 μM), three OATP1B1 inhibitors (2.69 to 10 μM), and five OATP1B3 inhibitors (1.53 to 10 μM) were identified. Strikingly, two novel OATP2B1 inhibitors were uncovered (C7 and H5) which show high affinity (IC50 values: 40 nM and 390 nM) comparable to the recently described estrone-based inhibitor (IC50 = 41 nM). A molecularly detailed explanation for the observed differences in ligand binding to the three transporters is given by means of structural comparison of the detected binding sites and docking poses.

  • 44.
    Vandenberg, Laura N.
    et al.
    Department of Environmental Health Sciences, University of Massachusetts, Amherst, USA.
    Agerstrand, Marlene
    Department of Environmental Science and Analytical Chemistry, Stockholm University, Stockholm, Sweden.
    Beronius, Anna
    Institute of Environmental Medicine, Karolinska Institute, Stockholm, Sweden.
    Beausoleil, Claire
    ANSES, French Agency for Food, Environmental and Occupational Health & Safety, Maisons-Alfort, France.
    Bergman, Åke
    Department of Environmental Science and Analytical Chemistry, Stockholm University, Stockholm, Sweden; Swedish Toxicology Sciences Research Center, Södertälje, Sweden.
    Bero, Lisa A.
    University of Sydney, Sydney, Australia.
    Bornehag, Carl-Gustaf
    Department of Health Sciences, Karlstad University, Karlstad, Sweden; Icahn School of Medicine at Mount Sinai, New York City, USA.
    Boyer, C. Scott
    Swedish Toxicology Sciences Research Center, Södertälje, Sweden.
    Cooper, Glinda S.
    US Environmental Protection Agency, Washington, DC, USA.
    Cotgreave, Ian
    Swedish Toxicology Sciences Research Center, Södertälje, Sweden.
    Gee, David
    Institute of Environment, Health and Societies, Brunel University London, Uxbridge, England.
    Grandjean, Philippe
    Department of Environmental Medicine, University of Southern Denmark, Odense, Denmark.
    Guyton, Kathryn Z.
    International Agency for Research on Cancer, Lyon, France.
    Hass, Ulla
    National Food Institute, Technical University of Denmark, Søborg, Denmark.
    Heindel, Jerrold J.
    National Institute of Environmental Health Sciences, NC, USA.
    Jobling, Susan
    Institute of Environment, Health and Societies, Brunel University London, Uxbridge, England.
    Kidd, Karen A.
    Biology Department and Canadian Rivers Institute, University of New Brunswick, Saint John, Canada.
    Kortenkamp, Andreas
    Institute of Environment, Health and Societies, Brunel University London, Uxbridge, England.
    Macleod, Malcolm R.
    Centre for Clinical Brain Sciences, University of Edinburgh, Scotland.
    Martin, Olwenn V.
    Institute of Environment, Health and Societies, Brunel University London, Uxbridge, England.
    Norinder, Ulf
    Swedish Toxicology Sciences Research Center, Södertälje, Sweden.
    Scheringer, Martin
    Institute for Chemical and Bioengineering, ETH Zürich, Zürich, Switzerland.
    Thayer, Kristina A.
    Department of Health and Human Services, National Institute of Environmental Health Sciences, NC, USA.
    Toppari, Jorma
    University of Turku, Turku University Hospital, Turku, Finland.
    Whaley, Paul
    Lancaster Environment Centre, Lancaster University, Lancaster, England.
    Woodruff, Tracey J.
    Program on Reproductive Health and the Environment, University of California, San Francisco, USA.
    Rudén, Christina
    Department of Environmental Science and Analytical Chemistry, Stockholm University, Stockholm, Sweden.
    A proposed framework for the systematic review and integrated assessment (SYRINA) of endocrine disrupting chemicals. 2016. In: Environmental Health, E-ISSN 1476-069X, Vol. 15, no 1, article id 74. Article in journal (Refereed)
    Abstract [en]

    Background: The issue of endocrine disrupting chemicals (EDCs) is receiving wide attention from both the scientific and regulatory communities. Recent analyses of the EDC literature have been criticized for failing to use transparent and objective approaches to draw conclusions about the strength of evidence linking EDC exposures to adverse health or environmental outcomes. Systematic review methodologies are ideal for addressing this issue as they provide transparent and consistent approaches to study selection and evaluation. Objective methods are needed for integrating the multiple streams of evidence (epidemiology, wildlife, laboratory animal, in vitro, and in silico data) that are relevant in assessing EDCs.

    Methods: We have developed a framework for the systematic review and integrated assessment (SYRINA) of EDC studies. The framework was designed for use with the International Program on Chemical Safety (IPCS) and World Health Organization (WHO) definition of an EDC, which requires appraisal of evidence regarding 1) association between exposure and an adverse effect, 2) association between exposure and endocrine disrupting activity, and 3) a plausible link between the adverse effect and the endocrine disrupting activity.

    Results: Building from existing methodologies for evaluating and synthesizing evidence, the SYRINA framework includes seven steps: 1) Formulate the problem; 2) Develop the review protocol; 3) Identify relevant evidence; 4) Evaluate evidence from individual studies; 5) Summarize and evaluate each stream of evidence; 6) Integrate evidence across all streams; 7) Draw conclusions, make recommendations, and evaluate uncertainties. The proposed method is tailored to the IPCS/WHO definition of an EDC but offers flexibility for use in the context of other definitions of EDCs.

    Conclusions: When using the SYRINA framework, the overall objective is to provide the evidence base needed to support decision making, including any action to avoid/minimise potential adverse effects of exposures. This framework allows for the evaluation and synthesis of evidence from multiple evidence streams. Finally, a decision regarding regulatory action is not only dependent on the strength of evidence, but also the consequences of action/inaction, e.g. limited or weak evidence may be sufficient to justify action if consequences are serious or irreversible.

  • 45.
    Wilm, Anke
    et al.
    Center for Bioinformatics (ZBH), Department of Informatics, Universität Hamburg, Hamburg, Germany; HITeC e.V., Hamburg, Germany.
    Garcia de Lomana, Marina
    Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna, Vienna, Austria.
    Stork, Conrad
    Center for Bioinformatics (ZBH), Department of Informatics, Universität Hamburg, Hamburg, Germany.
    Mathai, Neann
    Computational Biology Unit (CBU), Department of Chemistry, University of Bergen, Bergen, Norway.
    Hirte, Steffen
    Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna, Vienna, Austria.
    Norinder, Ulf
    Örebro University, School of Science and Technology. Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden; Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden.
    Kühnl, Jochen
    Front End Innovation, Beiersdorf AG, Hamburg, Germany.
    Kirchmair, Johannes
    Center for Bioinformatics (ZBH), Department of Informatics, Universität Hamburg, Hamburg, Germany; Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna, Vienna, Austria.
    Predicting the Skin Sensitization Potential of Small Molecules with Machine Learning Models Trained on Biologically Meaningful Descriptors. 2021. In: Pharmaceuticals, E-ISSN 1424-8247, Vol. 14, no 8, article id 790. Article in journal (Refereed)
    Abstract [en]

    In recent years, a number of machine learning models for the prediction of the skin sensitization potential of small organic molecules have been reported and become available. These models generally perform well within their applicability domains but, as a result of the use of molecular fingerprints and other non-intuitive descriptors, the interpretability of the existing models is limited. The aim of this work is to develop a strategy to replace the non-intuitive features by predicted outcomes of bioassays. We show that such replacement is indeed possible and that as few as ten interpretable, predicted bioactivities are sufficient to reach competitive performance. On a holdout data set of 257 compounds, the best model ("Skin Doctor CP:Bio") obtained an efficiency of 0.82 and an MCC of 0.52 (at the significance level of 0.20). Skin Doctor CP:Bio is available free of charge for academic research. The modeling strategies explored in this work are easily transferable and could be adopted for the development of more interpretable machine learning models for the prediction of the bioactivity and toxicity of small organic compounds.

  • 46.
    Wilm, Anke
    et al.
    Center for Bioinformatics (ZBH), Department of Informatics, Universität Hamburg, Hamburg, Germany; HITeC e.V., Hamburg, Germany.
    Norinder, Ulf
    Örebro University, School of Science and Technology. Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden; Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden.
    Agea, M. Isabel
    Department of Informatics and Chemistry, University of Chemistry and Technology Prague, Prague, Czech Republic.
    de Bruyn Kops, Christina
    Center for Bioinformatics (ZBH), Department of Informatics, Universität Hamburg, Hamburg, Germany.
    Stork, Conrad
    Center for Bioinformatics (ZBH), Department of Informatics, Universität Hamburg, Hamburg, Germany.
    Kühnl, Jochen
    Front End Innovation, Beiersdorf AG, Hamburg, Germany.
    Kirchmair, Johannes
    Center for Bioinformatics (ZBH), Department of Informatics, Universität Hamburg, Hamburg, Germany; Department of Pharmaceutical Chemistry, University of Vienna, Vienna, Austria.
    Skin Doctor CP: Conformal Prediction of the Skin Sensitization Potential of Small Organic Molecules. 2021. In: Chemical Research in Toxicology, ISSN 0893-228X, E-ISSN 1520-5010, Vol. 34, no 2, p. 330-344. Article in journal (Refereed)
    Abstract [en]

    Skin sensitization potential or potency is an important end point in the safety assessment of new chemicals and new chemical mixtures. Formerly, animal experiments such as the local lymph node assay (LLNA) were the main form of assessment. Today, however, the focus lies on the development of nonanimal testing approaches (i.e., in vitro and in chemico assays) and computational models. In this work, we investigate, based on publicly available LLNA data, the ability of aggregated, Mondrian conformal prediction classifiers to differentiate between non-sensitizing and sensitizing compounds as well as between two levels of skin sensitization potential (weak to moderate sensitizers, and strong to extreme sensitizers). The advantage of the conformal prediction framework over other modeling approaches is that it assigns compounds to activity classes only if a defined minimum level of confidence is reached for the individual predictions. This eliminates the need for applicability domain criteria that often are arbitrary in their nature and less flexible. Our new binary classifier, named Skin Doctor CP, differentiates nonsensitizers from sensitizers with a higher reliability-to-efficiency ratio than the corresponding nonconformal prediction workflow that we presented earlier. When tested on a set of 257 compounds at the significance levels of 0.10 and 0.30, the model reached an efficiency of 0.49 and 0.92, and an accuracy of 0.83 and 0.75, respectively. In addition, we developed a ternary classification workflow to differentiate nonsensitizers, weak to moderate sensitizers, and strong to extreme sensitizers. Although this model achieved satisfactory overall performance (accuracies of 0.90 and 0.73, and efficiencies of 0.42 and 0.90, at significance levels 0.10 and 0.30, respectively), it did not obtain satisfying class-wise results (at a significance level of 0.30, the validities obtained for nonsensitizers, weak to moderate sensitizers, and strong to extreme sensitizers were 0.70, 0.58, and 0.63, respectively). We argue that the model is, in consequence, unable to reliably identify strong to extreme sensitizers and suggest that other ternary models derived from the currently accessible LLNA data might suffer from the same problem. Skin Doctor CP is available via a public web service at https://nerdd.zbh.uni-hamburg.de/skinDoctorII/.

  • 47.
    Zhang, Jin
    et al.
    Department of Chemistry, Umeå University, Umeå, Sweden.
    Mucs, Daniel
    Unit of Toxicology Sciences, Karolinska Institute, Södertälje, Sweden.
    Norinder, Ulf
    Unit of Toxicology Sciences, Karolinska Institute, Södertälje, Sweden; Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
    Svensson, Fredrik
    Drug Discovery Institute, London, England.
    LightGBM: An Effective and Scalable Algorithm for Prediction of Chemical Toxicity–Application to the Tox21 and Mutagenicity Data Sets. 2019. In: Journal of Chemical Information and Modeling, ISSN 1549-9596, E-ISSN 1549-960X, Vol. 59, no 10, p. 4150-4158. Article in journal (Refereed)
    Abstract [en]

    Machine learning algorithms have attained widespread use in assessing the potential toxicities of pharmaceuticals and industrial chemicals because of their faster speed and lower cost compared to experimental bioassays. Gradient boosting is an effective algorithm that often achieves high predictivity, but historically its relatively long computational time limited its application to predicting large compound libraries or developing in silico predictive models that require frequent retraining. LightGBM, a recent improvement of the gradient boosting algorithm, inherited its high predictivity but resolved its scalability limitations and long computational time by adopting a leaf-wise tree growth strategy and introducing novel techniques. In this study, we compared the predictive performance and the computational time of LightGBM to deep neural networks, random forests, support vector machines, and XGBoost. All algorithms were rigorously evaluated on publicly available Tox21 and mutagenicity data sets using a Bayesian optimization integrated nested 10-fold cross-validation scheme that performs hyperparameter optimization while examining model generalizability and transferability to new data. The evaluation results demonstrated that LightGBM is an effective and highly scalable algorithm offering the best predictive performance while consuming significantly shorter computational time than the other investigated algorithms across all Tox21 and mutagenicity data sets. We recommend LightGBM for applications of in silico safety assessment and also other areas of cheminformatics to fulfill the ever-growing demand for accurate and rapid prediction of various toxicity or activity related end points of large compound libraries present in the pharmaceutical and chemical industry.

  • 48.
    Zhang, Jin
    et al.
    Department of Drug Metabolism and Pharmacokinetics, Janssen Pharmaceutica NV, Beerse, Belgium.
    Norinder, Ulf
    Örebro University, School of Science and Technology. Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden; Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden.
    Svensson, Fredrik
    The Alzheimer's Research UK University College London Drug Discovery Institute, The Cruciform Building, London, U.K.
    Deep Learning-Based Conformal Prediction of Toxicity. 2021. In: Journal of Chemical Information and Modeling, ISSN 1549-9596, E-ISSN 1549-960X, Vol. 61, no 6, p. 2648-2657. Article in journal (Refereed)
    Abstract [en]

    Predictive modeling for toxicity can help reduce risks in a range of applications and potentially serve as the basis for regulatory decisions. However, the utility of these predictions can be limited if the associated uncertainty is not adequately quantified. With recent studies showing great promise for deep learning-based models also for toxicity predictions, we investigate the combination of deep learning-based predictors with the conformal prediction framework to generate highly predictive models with well-defined uncertainties. We use a range of deep feedforward neural networks and graph neural networks in a conformal prediction setting and evaluate their performance on data from the Tox21 challenge. We also compare the results from the conformal predictors to those of the underlying machine learning models. The results indicate that highly predictive models can be obtained that result in very efficient conformal predictors even at high confidence levels. Taken together, our results highlight the utility of conformal predictors as a convenient way to deliver toxicity predictions with confidence, adding both statistical guarantees on the model performance as well as better predictions of the minority class compared to the underlying models.
