Integrating functional knowledge during sample clustering for microarray data using unsupervised decision trees
Max Planck Institute for Molecular Plant Physiology, Golm, Germany.
University of Potsdam, Potsdam, Germany. ORCID iD: 0000-0002-7173-5579
Institute for Informatics, Ludwig Maximilians University, Munich, Germany.
Max Planck Institute for Molecular Plant Physiology, Golm, Germany; University of Potsdam, Potsdam, Germany.
2007 (English). In: Biometrical Journal, ISSN 0323-3847, E-ISSN 1521-4036, Vol. 49, no 2, p. 214-29. Article in journal (Refereed). Published
Abstract [en]

Clustering of microarray gene expression data is performed routinely, for genes as well as for samples. Clustering of genes can exhibit functional relationships between genes; clustering of samples on the other hand is important for finding e.g. disease subtypes, relevant patient groups for stratification or related treatments. Usually this is done by first filtering the genes for high-variance under the assumption that they carry most of the information needed for separating different sample groups. If this assumption is violated, important groupings in the data might be lost. Furthermore, classical clustering methods do not facilitate the biological interpretation of the results. Therefore, we propose to methodologically integrate the clustering algorithm with prior biological information. This is different from other approaches as knowledge about classes of genes can be directly used to ease the interpretation of the results and possibly boost clustering performance. Our approach computes dendrograms that resemble decision trees with gene classes used to split the data at each node which can help to find biologically meaningful differences between the sample groups. We have tested the proposed method both on simulated and real data and conclude its usefulness as a complementary method, especially when assumptions of few differentially expressed genes along with an informative mapping of genes to different classes are met.
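The idea of a dendrogram whose nodes split the samples on a gene class can be illustrated with a small sketch. This is not the authors' UDT algorithm: the per-class summary (mean expression of the class's genes), the split criterion (between-group share of the total sum of squares), the `min_size` and `min_score` stopping rules, and all names and toy values below are assumptions made for illustration only.

```python
from statistics import mean


def split_score(values):
    """Best two-group split of 1-D values.

    Returns (score, threshold), where score is the between-group sum of
    squares divided by the total sum of squares (an assumed criterion).
    """
    order = sorted(values)
    total_mean = mean(order)
    total_ss = sum((v - total_mean) ** 2 for v in order)
    if total_ss == 0:
        return 0.0, None
    best = (0.0, None)
    for i in range(1, len(order)):
        if order[i] == order[i - 1]:
            continue  # no threshold can separate equal values
        left, right = order[:i], order[i:]
        between = (len(left) * (mean(left) - total_mean) ** 2
                   + len(right) * (mean(right) - total_mean) ** 2)
        score = between / total_ss
        if score > best[0]:
            best = (score, (order[i - 1] + order[i]) / 2)
    return best


def build_tree(expr, gene_classes, samples, min_size=2, min_score=0.8):
    """Recursively split samples on the gene class that separates them best.

    expr: dict gene -> dict sample -> expression value
    gene_classes: dict class name -> list of genes (hypothetical mapping)
    """
    if len(samples) < 2 * min_size:
        return {"leaf": sorted(samples)}
    best = None
    for cls, genes in gene_classes.items():
        # Summarise each class per sample by its mean expression (assumption).
        summary = {s: mean(expr[g][s] for g in genes) for s in samples}
        score, thr = split_score(list(summary.values()))
        if thr is not None and (best is None or score > best[0]):
            best = (score, thr, cls, summary)
    if best is None or best[0] < min_score:
        return {"leaf": sorted(samples)}
    _, thr, cls, summary = best
    low = [s for s in samples if summary[s] <= thr]
    high = [s for s in samples if summary[s] > thr]
    if len(low) < min_size or len(high) < min_size:
        return {"leaf": sorted(samples)}
    return {
        "split_on": cls,
        "threshold": thr,
        "low": build_tree(expr, gene_classes, low, min_size, min_score),
        "high": build_tree(expr, gene_classes, high, min_size, min_score),
    }


# Toy data: four samples, two hypothetical gene classes.
expr = {
    "g1": {"s1": 1.0, "s2": 1.2, "s3": 5.0, "s4": 5.2},
    "g2": {"s1": 0.8, "s2": 1.0, "s3": 4.8, "s4": 5.0},
    "g3": {"s1": 3.0, "s2": 3.1, "s3": 3.0, "s4": 3.1},
    "g4": {"s1": 2.0, "s2": 2.1, "s3": 2.1, "s4": 2.0},
}
gene_classes = {"stress_response": ["g1", "g2"], "housekeeping": ["g3", "g4"]}
tree = build_tree(expr, gene_classes, ["s1", "s2", "s3", "s4"])
print(tree["split_on"], tree["low"], tree["high"])
```

Because each internal node records the gene class it split on, the resulting tree can be read much like a decision tree, which is the interpretability benefit the abstract describes.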

Place, publisher, year, edition, pages
Berlin, Germany: Akademie Verlag, 2007. Vol. 49, no 2, p. 214-29
Keywords [en]
Clustering; functional ontologies; gene expression; integrative data analysis; microarray; samplewise clustering; UDT; unsupervised decision trees
National Category
Bioinformatics and Computational Biology
Identifiers
URN: urn:nbn:se:oru:diva-40639
DOI: 10.1002/bimj.200610278
ISI: 000245911800004
PubMedID: 17476945
Scopus ID: 2-s2.0-34247388808
OAI: oai:DiVA.org:oru-40639
DiVA, id: diva2:778031
Available from: 2015-01-09 Created: 2015-01-09 Last updated: 2025-02-07 Bibliographically approved

Authority records

Repsilber, Dirk