Biomarker discovery: classification using pooled samples
Leibniz Institute for Farm Animal Biology, Dummerstorf, Germany.
Leibniz Institute for Farm Animal Biology, Dummerstorf, Germany. ORCID iD: 0000-0002-7173-5579
Leibniz Institute for Farm Animal Biology, Dummerstorf, Germany.
2013 (English). In: Computational statistics (Zeitschrift), ISSN 0943-4062, E-ISSN 1613-9658, Vol. 28, no. 1, pp. 67-106. Article in journal (Refereed). Published
Abstract [en]

RNA-sample pooling is sometimes inevitable, but should be avoided in classification tasks like biomarker studies. Our simulation framework investigates a two-class classification study based on gene expression profiles to point out how strongly the outcomes of single-sample designs differ from those of pooling designs. The results show how the effects of pooling depend on pool size, discriminating pattern, number of informative features and the statistical learning method used (support vector machines with linear and radial kernel, random forest (RF), linear discriminant analysis, powered partial least squares discriminant analysis (PPLS-DA) and partial least squares discriminant analysis (PLS-DA)). As measures of the pooling effect, we consider prediction error (PE) and the coincidence of important feature sets for classification based on PLS-DA, PPLS-DA and RF. In general, PPLS-DA and PLS-DA show constant PE with increasing pool size and low PE for patterns for which the convex hull of one class is not a cover of the other class. The coincidence of important feature sets is larger for PLS-DA and PPLS-DA than it is for RF. RF shows the best results for patterns in which the convex hull of one class is a cover of the other class, but these depend strongly on the pool size. We complete the PE results with experimental data which we pool artificially. The PE of PPLS-DA and PLS-DA is again least influenced by pooling and is low. Additionally, we show under which assumption the PLS-DA loading weights, as a measure for importance of features regarding classification, are equal for the different designs.
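The artificial pooling mentioned in the abstract can be illustrated with a minimal sketch. It assumes that a pool is modelled as the feature-wise mean of `pool_size` single-sample profiles from the same class; this is a common idealization and not necessarily the procedure used in the article. The function name `pool_samples`, the toy data and all parameter values are hypothetical.

```python
import random
from statistics import mean


def pool_samples(profiles, pool_size, rng):
    """Average disjoint random groups of `pool_size` profiles from one class.

    Assumption: a pool is the feature-wise mean of its members.
    Profiles that do not fill a complete pool are dropped.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    shuffled = profiles[:]
    rng.shuffle(shuffled)
    n_pools = len(shuffled) // pool_size
    pools = []
    for i in range(n_pools):
        group = shuffled[i * pool_size:(i + 1) * pool_size]
        # Feature-wise mean over the samples in this pool.
        pools.append([mean(column) for column in zip(*group)])
    return pools


rng = random.Random(0)
# Hypothetical data: 12 single samples of one class with 3 features each.
single = [[rng.gauss(1.0, 1.0) for _ in range(3)] for _ in range(12)]
pooled = pool_samples(single, pool_size=3, rng=rng)
print(len(single), len(pooled))  # 12 4
```

Under this assumption, pooling leaves the class mean unchanged while reducing both the number of observations and their spread, which is one way the pool size could influence a classifier trained on pooled profiles.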

Place, publisher, year, edition, pages
Heidelberg, Germany: Springer, 2013. Vol. 28, no. 1, pp. 67-106
Keywords [en]
Sample pooling, biomarker search, statistical learning methods, partial least squares discriminant analysis, prediction error
National category (HSV)
Identifiers
URN: urn:nbn:se:oru:diva-40740
DOI: 10.1007/s00180-011-0302-0
ISI: 000315163600006
Scopus ID: 2-s2.0-84874221732
OAI: oai:DiVA.org:oru-40740
DiVA, id: diva2:778578
Available from: 2015-01-11 Created: 2015-01-11 Last updated: 2018-01-30 Bibliographically approved

Open Access in DiVA

Full text is not available in DiVA

Person records (BETA)

Repsilber, Dirk
