Learning biomarkers of pluripotent stem cells in mouse
Institute of Computer Science, University of Osnabrück, Osnabrück, Germany.
Institute for Biostatistics and Informatics in Medicine and Ageing Research, University of Rostock, Rostock, Germany.
Leibniz Institute for Farm Animal Biology (FBN Dummerstorf), Dummerstorf, Germany. ORCID iD: 0000-0002-7173-5579
Institute for Biostatistics and Informatics in Medicine and Ageing Research, University of Rostock, Rostock, Germany; Department of Intelligent Systems, Jozef Stefan Institute, Ljubljana, Slovenia.
2011 (English) In: DNA research, ISSN 1340-2838, E-ISSN 1756-1663, Vol. 18, no 4, p. 233-51
Article in journal (Refereed) Published
Abstract [en]

Pluripotent stem cells are able to self-renew and to differentiate into all adult cell types. Many studies report data describing these cells and characterize them in molecular terms. Machine learning yields classifiers that can accurately identify pluripotent stem cells, but there is a lack of studies yielding minimal sets of best biomarkers (genes/features). We assembled gene expression data of pluripotent stem cells and non-pluripotent cells from the mouse. After normalization and filtering, we applied machine learning, classifying samples into pluripotent and non-pluripotent with high cross-validated accuracy. Furthermore, to identify minimal sets of best biomarkers, we used three methods: information gain, random forests, and a wrapper combining a genetic algorithm with a support vector machine (GA/SVM). We demonstrate that the GA/SVM biomarkers work best in combination with each other; pathway and enrichment analyses show that they cover the widest variety of processes implicated in pluripotency. The GA/SVM wrapper yields the best biomarkers, no matter which classification method is used. The consensus best biomarker based on the three methods is Tet1, which was implicated in pluripotency only recently. The best biomarker based on the GA/SVM wrapper approach alone is Fam134b, possibly a missing link between pluripotency and some standard surface markers of unknown function processed by the Golgi apparatus.
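
As an illustration of the first of the three feature-selection methods named in the abstract, the following is a minimal sketch of ranking genes by information gain. It is not the authors' implementation: the median split used to discretize expression values, the function names, and the toy data (`gene_a`, `gene_b`, and their values) are all assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one feature after a median split (assumed discretization)."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    median = ordered[mid] if len(ordered) % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    high = [lab for v, lab in zip(values, labels) if v > median]
    low = [lab for v, lab in zip(values, labels) if v <= median]
    # Weighted entropy of the two groups produced by the split.
    remainder = sum(len(part) / len(labels) * entropy(part) for part in (high, low) if part)
    return entropy(labels) - remainder


def rank_features(expression, labels):
    """Return feature names sorted by decreasing information gain."""
    scores = {name: information_gain(vals, labels) for name, vals in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: 1 = pluripotent, 0 = non-pluripotent.
labels = [1, 1, 1, 0, 0, 0]
expression = {
    "gene_a": [9.1, 8.7, 9.4, 2.0, 2.5, 1.8],  # separates the two classes
    "gene_b": [5.0, 1.0, 5.2, 4.9, 1.1, 5.1],  # carries little class information
}
ranking = rank_features(expression, labels)
```

Information gain scores each gene independently, which is one plausible reading of why the abstract contrasts it with the GA/SVM wrapper, whose biomarkers are reported to work best in combination with each other.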

Place, publisher, year, edition, pages
Oxford, UK: Oxford University Press, 2011. Vol. 18, no 4, p. 233-51
Keywords [en]
Pluripotency; machine learning; feature selection; genetic algorithm; support vector machine
National Category
Bioinformatics and Systems Biology
Identifiers
URN: urn:nbn:se:oru:diva-40621
DOI: 10.1093/dnares/dsr016
ISI: 000294936100004
PubMedID: 21791477
Scopus ID: 2-s2.0-80051964025
OAI: oai:DiVA.org:oru-40621
DiVA, id: diva2:777969
Available from: 2015-01-09 Created: 2015-01-09 Last updated: 2018-05-06
Bibliographically approved

Authority records

Repsilber, Dirk
