Comparative analysis of the use of chemoinformatics-based and substructure-based descriptors for quantitative structure-activity relationship (QSAR) modeling
Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
Department of Computer and Systems Sciences, Stockholm University, Kista, Sweden.
AstraZeneca Research and Development, Södertälje, Sweden; Department of Pharmacy, Uppsala University, Uppsala, Sweden; Department of Computational Chemistry, H. Lundbeck A/S, Valby, Denmark. ORCID iD: 0000-0003-3107-331X
2013 (English). In: Intelligent Data Analysis, ISSN 1088-467X, E-ISSN 1571-4128, Vol. 17, no 2, p. 327-341. Article in journal (Refereed). Published
Abstract [en]

Quantitative structure-activity relationship (QSAR) models have gained popularity in the pharmaceutical industry due to their potential to substantially decrease drug development costs by reducing expensive laboratory and clinical tests. QSAR modeling consists of two fundamental steps, namely descriptor discovery and model building. Descriptor discovery methods are either based on chemical domain knowledge or purely data-driven. The former, chemoinformatics-based methods, and the latter, substructure-based methods, have been developed quite independently. As a consequence, evaluations involving both types of descriptor discovery method are rarely seen. In this study, a comparative analysis of chemoinformatics-based and substructure-based approaches is presented. Two chemoinformatics-based approaches, ECFI and SELMA, are compared to five approaches for substructure discovery, CP, graphSig, MFI, MoFa and SUBDUE, using 18 QSAR datasets. The empirical investigation shows that one of the chemoinformatics-based approaches, ECFI, results in significantly more accurate models than all other methods when each descriptor set is used on its own. Results from combining descriptor sets are also presented, showing that adding ECFI descriptors to any other descriptor set improves predictive performance for that set, while models built on ECFI descriptors can in many cases also be improved by adding descriptors generated by the other methods.
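
As a rough illustration of what "combining descriptor sets" can mean in practice, the following minimal Python sketch builds one toy substructure-style descriptor set and joins it with a second set by concatenating the per-molecule vectors. It is not the authors' procedure and does not reproduce any of the methods named in the abstract: the molecule strings, the fragment list, the string-length descriptor, and both function names are hypothetical, and plain substring matching on SMILES-like strings stands in for real subgraph matching only to keep the example self-contained.

```python
from typing import List


def substructure_descriptors(molecule: str, fragments: List[str]) -> List[int]:
    """Binary presence/absence vector over a fixed fragment list.

    Assumption: naive substring search on a SMILES-like string, used here
    only as a toy stand-in for genuine substructure (subgraph) matching.
    """
    return [1 if fragment in molecule else 0 for fragment in fragments]


def combine_descriptor_sets(*descriptor_sets: List[List[float]]) -> List[List[float]]:
    """Concatenate descriptor vectors from several sets, molecule by molecule."""
    n_molecules = len(descriptor_sets[0])
    if any(len(s) != n_molecules for s in descriptor_sets):
        raise ValueError("all descriptor sets must describe the same molecules")
    return [
        sum((list(s[i]) for s in descriptor_sets), [])
        for i in range(n_molecules)
    ]


# Hypothetical toy data, not taken from the study.
molecules = ["CCO", "CC(=O)O", "c1ccccc1O"]
fragments = ["=O", "CO", "c1ccccc1"]

# Descriptor set A: data-driven, substructure-style indicators.
set_a = [substructure_descriptors(m, fragments) for m in molecules]

# Descriptor set B: a placeholder for a knowledge-based descriptor
# (here simply the string length, chosen only for illustration).
set_b = [[float(len(m))] for m in molecules]

# Each molecule's combined vector would be the input to a model-building step.
combined = combine_descriptor_sets(set_a, set_b)
print(combined)
```

Each row of `combined` holds the indicators from the first set followed by the value from the second, so any learner that accepts a fixed-length feature vector could be trained on either set alone or on the combination.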

Place, publisher, year, edition, pages
IOS Press, 2013. Vol. 17, no 2, p. 327-341
Keywords [en]
QSAR modeling, chemical descriptors, graph mining
National Category
Natural Sciences Computer Sciences
Identifiers
URN: urn:nbn:se:oru:diva-83046
DOI: 10.3233/IDA-130581
ISI: 000319344300010
Scopus ID: 2-s2.0-84873594844
OAI: oai:DiVA.org:oru-83046
DiVA, id: diva2:1439321
Note

Research funder: Swedish Foundation for Strategic Research through the project High-Performance Data Mining for Drug Effect Detection at Stockholm University, Sweden, Grant Number: IIS11-0053

Available from: 2013-07-01. Created: 2020-06-12. Last updated: 2024-01-16. Bibliographically approved.

Authority records

Norinder, Ulf
