oru.se Publications
1 - 10 of 10
  • 1.
    Dalevi, Daniel
    et al.
    Department of Computing Science and Engineering, Chalmers University of Technology, Gothenburg.
    Eriksen, Niklas
    Department of Mathematical Sciences, Gothenburg University and Chalmers University of Technology, Gothenburg.
    Expected Gene Order Distances and Model Selection in Bacteria. 2008. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 24, no 11, p. 1332-1338. Article in journal (Refereed)
    Abstract [en]

    Motivation: The evolutionary distance inferred from gene order comparisons of related bacteria is dependent on the model. Therefore, it is highly important to establish reliable assumptions before inferring its magnitude.

    Results: We investigate the patterns of dotplots between species of bacteria with the purpose of model selection in gene order problems. We find several categories of data which can be explained by carefully weighing the contributions of reversals, transpositions, symmetrical reversals, single gene transpositions, and single gene reversals. We also derive method of moments distance estimates for some previously uncomputed cases, such as symmetrical reversals, single gene reversals and their combinations, as well as the single gene transpositions edit distance.

  • 2.
    Demissie, Meaza
    et al.
    Örebro University, Swedish Business School at Örebro University.
    Mascialino, Barbara
    Calza, Stefano
    Pawitan, Yudi
    Unequal group variances in microarray data analyses. 2008. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 24, no 9, p. 1168-1174. Article in journal (Refereed)
    Abstract [en]

    Motivation: In searching for differentially expressed (DE) genes in microarray data, we often observe a fraction of the genes to have unequal variability between groups. This is not an issue in large samples, where a valid test exists that uses individual variances separately. The problem arises in the small-sample setting, where the approximately valid Welch test lacks sensitivity, while the more sensitive moderated t-test assumes equal variance.

    Methods: We introduce a moderated Welch test (MWT) that allows unequal variance between groups. It is based on (i) weighting of pooled and unpooled standard errors and (ii) improved estimation of the gene-level variance that exploits the information from across the genes.

    Results: When a non-trivial proportion of genes has unequal variability, false discovery rate (FDR) estimates based on the standard t and moderated t-tests are often too optimistic, while the standard Welch test has low sensitivity. The MWT is shown to (i) perform better than the standard t, the standard Welch and the moderated t-tests when the variances are unequal between groups and (ii) perform similarly to the moderated t, and better than the standard t and Welch tests when the group variances are equal. These results mean that MWT is more reliable than other existing tests over a wider range of data conditions.

    Availability: An R package to perform MWT is available at http://www.meb.ki.se/~yudpaw

    Contact: yudi.pawitan@ki.se

    Supplementary information: Supplementary data are available at Bioinformatics online.

  • 3.
    Elo, Laura L.
    et al.
    Department of Mathematics, University of Turku, Turku, Finland; Turku Centre for Biotechnology, Turku, Finland.
    Järvenpää, Henna
    Turku Centre for Biotechnology, Turku, Finland.
    Oresic, Matej
    Turku Centre for Biotechnology, Turku, Finland; VTT Biotechnology, Espoo, Finland.
    Lahesmaa, Riitta
    Turku Centre for Biotechnology, Turku, Finland.
    Aittokallio, Tero
    Department of Mathematics, University of Turku, Turku, Finland; Turku Centre for Biotechnology, Turku, Finland; Systems Biology Unit, Institut Pasteur, Paris, France.
    Systematic construction of gene coexpression networks with applications to human T helper cell differentiation process. 2007. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 23, no 16, p. 2096-2103. Article in journal (Refereed)
    Abstract [en]

    MOTIVATION: Coexpression networks have recently emerged as a novel holistic approach to microarray data analysis and interpretation. Choosing an appropriate cutoff threshold, above which a gene-gene interaction is considered as relevant, is a critical task in most network-centric applications, especially when two or more networks are being compared.

    RESULTS: We demonstrate that the performance of traditional approaches, which are based on a pre-defined cutoff or significance level, can vary drastically depending on the type of data and application. Therefore, we introduce a systematic procedure for estimating a cutoff threshold of coexpression networks directly from their topological properties. Both synthetic and real datasets show clear benefits of our data-driven approach under various practical circumstances. In particular, the procedure provides a robust estimate of individual degree distributions, even from multiple microarray studies performed with different array platforms or experimental designs, which can be used to discriminate the corresponding phenotypes. Application to human T helper cell differentiation process provides useful insights into the components and interactions controlling this process, many of which would have remained unidentified on the basis of expression change alone. Moreover, several human-mouse orthologs showed conserved topological changes in both systems, suggesting their potential importance in the differentiation process.

    SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

  • 4.
    Gopalacharyulu, Peddinti V.
    et al.
    VTT Biotechnology, Espoo, Finland.
    Lindfors, Erno
    VTT Biotechnology, Espoo, Finland.
    Bounsaythip, Catherine
    VTT Biotechnology, Espoo, Finland.
    Kivioja, Teemu
    VTT Biotechnology, Espoo, Finland.
    Yetukuri, Laxman
    VTT Biotechnology, Espoo, Finland.
    Hollmén, Jaakko
    Helsinki University of Technology, Laboratory of Computer and Information Science, Espoo, Finland.
    Oresic, Matej
    VTT Biotechnology, Espoo, Finland.
    Data integration and visualization system for enabling conceptual biology. 2005. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 21 Suppl 1, p. i177-i185. Article in journal (Refereed)
    Abstract [en]

    MOTIVATION: Integration of heterogeneous data in life sciences is a growing and recognized challenge. The problem is not only to enable the study of such data within the context of a biological question but also more fundamentally, how to represent the available knowledge and make it accessible for mining.

    RESULTS: Our integration approach is based on the premise that relationships between biological entities can be represented as a complex network. The context dependency is achieved by a judicious use of distance measures on these networks. The biological entities and the distances between them are mapped for the purpose of visualization into the lower dimensional space using the Sammon's mapping. The system implementation is based on a multi-tier architecture using a native XML database and a software tool for querying and visualizing complex biological networks. The functionality of our system is demonstrated with two examples: (1) A multiple pathway retrieval, in which, given a pathway name, the system finds all the relationships related to the query by checking available metabolic pathway, transcriptional, signaling, protein-protein interaction and ontology annotation resources and (2) A protein neighborhood search, in which given a protein name, the system finds all its connected entities within a specified depth. These two examples show that our system is able to conceptually traverse different databases to produce testable hypotheses and lead towards answers to complex biological questions.

  • 5.
    Huopaniemi, Ilkka
    et al.
    School of Science and Technology, Department of Information and Computer Science, Aalto University, Espoo, Finland; Helsinki Institute for Information Technology HIIT, Helsinki, Finland.
    Suvitaival, Tommi
    School of Science and Technology, Department of Information and Computer Science, Aalto University, Espoo, Finland; Helsinki Institute for Information Technology HIIT, Helsinki, Finland.
    Nikkilä, Janne
    School of Science and Technology, Department of Information and Computer Science, Aalto University, Espoo, Finland; Helsinki Institute for Information Technology HIIT, Helsinki, Finland; Department of Veterinary Biosciences, Faculty of Veterinary Medicine, University of Helsinki, Helsinki, Finland.
    Oresic, Matej
    Örebro University, School of Medical Sciences. VTT Technical Research Centre of Finland, Espoo, Finland.
    Kaski, Samuel
    School of Science and Technology, Department of Information and Computer Science, Aalto University, Espoo, Finland; Helsinki Institute for Information Technology HIIT, Helsinki, Finland.
    Multivariate multi-way analysis of multi-source data. 2010. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 26, no 12, p. i391-i398. Article in journal (Refereed)
    Abstract [en]

    MOTIVATION: Analysis of variance (ANOVA)-type methods are the default tool for the analysis of data with multiple covariates. These tools have been generalized to the multivariate analysis of high-throughput biological datasets, where the main challenge is the problem of small sample size and high dimensionality. However, the existing multi-way analysis methods are not designed for the currently increasingly important experiments where data is obtained from multiple sources. Common examples of such settings include integrated analysis of metabolic and gene expression profiles, or metabolic profiles from several tissues in our case, in a controlled multi-way experimental setup where disease status, medical treatment, gender and time-series are usual covariates.

    RESULTS: We extend the applicability area of multivariate, multi-way ANOVA-type methods to multi-source cases by introducing a novel Bayesian model. The method is capable of finding covariate-related dependencies between the sources. It assumes the measurements consist of groups of similarly behaving variables, and estimates the multivariate covariate effects and their interaction effects for the discovered groups of variables. In particular, the method partitions the effects to those shared between the sources and to source-specific ones. The method is specifically designed for datasets with small sample sizes and high dimensionality. We apply the method to a lipidomics dataset from a lung cancer study with two-way experimental setup, where measurements from several tissues with mostly distinct lipids have been taken. The method is also directly applicable to gene expression and proteomics.

    AVAILABILITY: An R-implementation is available at http://www.cis.hut.fi/projects/mi/software/multiWayCCA/.

  • 6.
    Kankainen, Matti
    et al.
    VTT Technical Research Centre of Finland, Espoo, Finland.
    Gopalacharyulu, Peddinti
    VTT Technical Research Centre of Finland, Espoo, Finland.
    Holm, Liisa
    Institute of Biotechnology, Department of Biological Sciences, University of Helsinki, Helsinki, Finland.
    Oresic, Matej
    Örebro University, School of Medical Sciences. VTT Technical Research Centre of Finland, Espoo, Finland.
    MPEA--metabolite pathway enrichment analysis. 2011. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 27, no 13, p. 1878-1879. Article in journal (Refereed)
    Abstract [en]

    We present metabolite pathway enrichment analysis (MPEA) for the visualization and biological interpretation of metabolite data at the system level. Our tool follows the concept of gene set enrichment analysis (GSEA) and tests whether metabolites involved in some predefined pathway occur towards the top (or bottom) of a ranked query compound list. In particular, MPEA is designed to handle many-to-many relationships that may occur between the query compounds and metabolite annotations. For a demonstration, we analysed metabolite profiles of 14 twin pairs with differing body weights. MPEA found significant pathways from data that had no significant individual query compounds, its results were congruent with those discovered from transcriptomics data and it detected more pathways than the competing metabolic pathway method did.

    AVAILABILITY: The web server and source code of MPEA are available at http://ekhidna.biocenter.helsinki.fi/poxo/mpea/.

  • 7.
    Katajamaa, Mikko
    et al.
    VTT Technical Research Centre of Finland, Espoo, Finland.
    Miettinen, Jarkko
    VTT Technical Research Centre of Finland, Espoo, Finland.
    Oresic, Matej
    Turku Centre for Biotechnology, Turku, Finland.
    MZmine: toolbox for processing and visualization of mass spectrometry based molecular profile data. 2006. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 22, no 5, p. 634-636. Article in journal (Refereed)
    Abstract [en]

    Summary: New additional methods are presented for processing and visualizing mass spectrometry based molecular profile data, implemented as part of the recently introduced MZmine software. They include new features and extensions such as support for the mzXML data format, capability to perform batch processing for a large number of files, support for parallel processing, new methods for calculating peak areas using a post-alignment peak picking algorithm and implementation of Sammon's mapping and curvilinear distance analysis for data visualization and exploratory analysis.

    Availability: MZmine is available under GNU Public license from http://mzmine.sourceforge.net/.

  • 8.
    Sköld, Martin
    et al.
    Örebro University, Department of Business, Economics, Statistics and Informatics.
    Rydén, Tobias
    Lund University.
    Samuelsson, Viktoria
    Lund University.
    Bratt, Charlotte
    Lund University.
    Ekblad, Lars
    Lund University.
    Olsson, Håkan
    Lund University.
    Baldetorp, Bo
    Lund University.
    Regression analysis and modelling of data acquisition for SELDI-TOF mass spectrometry. 2007. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 23, no 11, p. 1401-1409. Article in journal (Refereed)
    Abstract [en]

    Motivation: Pre-processing of SELDI-TOF mass spectrometry data is currently performed on a largely ad hoc basis. This makes comparison of results from independent analyses troublesome and does not provide a framework for distinguishing different sources of variation in data.

    Results: In this article, we consider the task of pooling a large number of single-shot spectra, a task commonly performed automatically by the instrument software. By viewing the underlying statistical problem as one of heteroscedastic linear regression, we provide a framework for introducing robust methods and for dealing with missing data resulting from a limited span of recordable intensity values provided by the instrument. Our framework provides an interpretation of currently used methods as a maximum-likelihood estimator and allows theoretical derivation of its variance. We observe that this variance depends crucially on the total number of ionic species, which can vary considerably between different pooled spectra. This variation in variance can potentially invalidate the results from naive methods of discrimination/classification and we outline appropriate data transformations. Introducing methods from robust statistics did not improve the standard errors of the pooled samples. Imputing missing values however—using the EM algorithm—had a notable effect on the result; for our data, the pooled height of peaks which were frequently truncated increased by up to 30%.

  • 9.
    Stjernqvist, Susann
    et al.
    Lund University.
    Rydén, Tobias
    Lund University.
    Sköld, Martin
    Örebro University, Department of Business, Economics, Statistics and Informatics.
    Staaf, Johan
    Continuous-index hidden Markov modelling of array CGH copy number data. 2007. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 23, no 8, p. 1006-1014. Article in journal (Refereed)
    Abstract [en]

    Motivation: In recent years, a range of techniques for analysis and segmentation of array comparative genomic hybridization (aCGH) data have been proposed. For array designs in which clones are of unequal lengths, are unevenly spaced or overlap, the discrete-index view typically adopted by such methods may be questionable or could be improved.

    Results: We describe a continuous-index hidden Markov model for aCGH data as well as a Monte Carlo EM algorithm to estimate its parameters. It is shown that for a dataset from the BT-474 cell line analysed on 32K BAC tiling microarrays, this model yields considerably better model fit in terms of lag-1 residual autocorrelations compared to a discrete-index HMM, and it is also shown how to use the model for e.g. estimation of change points on the base-pair scale and for estimation of conditional state probabilities across the genome. In addition, the model is applied to the Glioblastoma Multiforme data used in the comparative study by Lai et al. (Lai,W.R. et al. (2005) Comparative analysis of algorithms for identifying amplifications and deletions in array CGH data. Bioinformatics, 21, 3763–3770.) giving results similar to theirs but with certain features highlighted in the continuous-index setting.

  • 10.
    Sysi-Aho, Marko
    et al.
    VTT Technical Research Centre of Finland, Espoo, Finland.
    Vehtari, Aki
    Helsinki University of Technology, Espoo, Finland.
    Velagapudi, Vidya R.
    VTT Technical Research Centre of Finland, Espoo, Finland.
    Westerbacka, Jukka
    Helsinki University Hospital, Biomedicum, Helsinki, Finland.
    Yetukuri, Laxman
    VTT Technical Research Centre of Finland, Espoo, Finland.
    Bergholm, Robert
    Helsinki University Hospital, Biomedicum, Helsinki, Finland.
    Taskinen, Marja-Riitta
    Helsinki University Hospital, Biomedicum, Helsinki, Finland.
    Yki-Järvinen, Hannele
    Department of Medicine, University of Helsinki, Helsinki, Finland.
    Oresic, Matej
    VTT Technical Research Centre of Finland, Espoo, Finland.
    Exploring the lipoprotein composition using Bayesian regression on serum lipidomic profiles. 2007. In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 23, no 13, p. i519-i528. Article in journal (Refereed)
    Abstract [en]

    MOTIVATION: Serum lipids have been traditionally studied in the context of lipoprotein particles. Today's emerging lipidomics technologies afford sensitive detection of individual lipid molecular species, i.e. to a much greater detail than the scale of lipoproteins. However, such global serum lipidomic profiles do not inherently contain any information on where the detected lipid species are coming from. Since it is too laborious and time consuming to routinely perform serum fractionation and lipidomics analysis on each lipoprotein fraction separately, this presents a challenge for the interpretation of lipidomic profile data. An exciting and medically important new bioinformatics challenge today is therefore how to build on extensive knowledge of lipid metabolism at lipoprotein levels in order to develop better models and bioinformatics tools based on high-dimensional lipidomic data becoming available today.

    RESULTS: We developed a hierarchical Bayesian regression model to study lipidomic profiles in serum and in different lipoprotein classes. As background data for the model building, we utilized lipidomic data for each of the lipoprotein fractions from 5 subjects with metabolic syndrome and 12 healthy controls. We clustered the lipid profiles and applied a regression model within each cluster separately. We found that the amount of a lipid in serum can be adequately described by the amounts of lipids in the lipoprotein classes. In addition to improved ability to interpret lipidomic data, we expect that our approach will also facilitate dynamic modelling of lipid metabolism at the individual molecular species level.
