Evaluation of microarray data normalization procedures using spike-in experiments
Umeå University, Faculty of Medicine, Department of Clinical Microbiology, Clinical Bacteriology. Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics. (Patrik Rydén)
Umeå University, Faculty of Medicine, Department of Clinical Microbiology, Clinical Bacteriology.
Umeå University, Faculty of Medicine, Department of Clinical Microbiology, Clinical Bacteriology.
Umeå University, Faculty of Medicine, Department of Clinical Microbiology, Clinical Bacteriology.
2006 (English). In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105, Vol. 7, no 300, 17- p. Article in journal (Refereed). Published
Abstract [en]

Background: Recently, a large number of methods for the analysis of microarray data have been proposed but there are few comparisons of their relative performances. By using so-called spike-in experiments, it is possible to characterize the analyzed data and thereby enable comparisons of different analysis methods.

Results: A spike-in experiment using eight in-house produced arrays was used to evaluate established and novel methods for filtration, background adjustment, scanning, channel adjustment, and censoring. The S-plus package EDMA, a stand-alone tool providing characterization of analyzed cDNA-microarray data obtained from spike-in experiments, was developed and used to evaluate 252 normalization methods. For all analyses, the sensitivities at low false positive rates were observed together with estimates of the overall bias and the standard deviation. In general, there was a trade-off between the ability of the analyses to identify differentially expressed genes (i.e. the analyses' sensitivities) and their ability to provide unbiased estimators of the desired ratios. Virtually all analyses underestimated the magnitude of the regulations; often less than 50% of the true regulations were observed. Moreover, the bias depended on the underlying mRNA concentration; low concentrations resulted in high bias. Many of the analyses had relatively low sensitivities, but analyses that used either the constrained model (i.e. a procedure that combines data from several scans) or partial filtration (a novel method for treating data from so-called not-found spots) had, with few exceptions, high sensitivities. These methods gave considerably higher sensitivities than some commonly used analysis methods.
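The three quantities reported for each analysis (sensitivity at a low false positive rate, bias, and standard deviation) can be illustrated with a minimal sketch. This is not the EDMA package and not the authors' procedure: the function name `evaluate_spike_in`, its parameters, the ranking of genes by absolute observed log-ratio, and the summary of bias as the average fraction of the true log-ratio that is recovered are all assumptions made for illustration.

```python
from statistics import mean, stdev


def evaluate_spike_in(observed, truth, max_fpr=0.05):
    """Summarize one analysis of spike-in data (illustrative sketch only).

    observed, truth: equal-length sequences of log2-ratios, one per gene.
    truth is 0 for unregulated genes and non-zero for regulated spike-ins.
    """
    pairs = list(zip(observed, truth))
    # Scores of unregulated genes, largest first (assumed ranking: |log-ratio|).
    null_scores = sorted((abs(o) for o, t in pairs if t == 0), reverse=True)
    regulated = [(o, t) for o, t in pairs if t != 0]

    # Number of unregulated genes that may be called by mistake at this rate.
    allowed = int(max_fpr * len(null_scores))
    # Genes scoring strictly above the cutoff are called differentially expressed.
    cutoff = null_scores[allowed] if allowed < len(null_scores) else float("-inf")

    # Sensitivity: share of regulated genes that exceed the cutoff.
    sensitivity = mean(1.0 if abs(o) > cutoff else 0.0 for o, _ in regulated)
    # Values below 1 mean the regulation is underestimated on average.
    recovered_fraction = mean(o / t for o, t in regulated)
    # Spread of the observed log-ratios around their true values.
    residual_sd = stdev(o - t for o, t in pairs)

    return {
        "sensitivity": sensitivity,
        "recovered_fraction": recovered_fraction,
        "residual_sd": residual_sd,
    }


if __name__ == "__main__":
    # Made-up toy numbers, not data from the study.
    truth = [0] * 8 + [2, 2, -2, -2]
    observed = [0.1, -0.2, 0.05, 0.3, -0.1, 0.0, 0.15, -0.25,
                1.0, 0.9, -1.1, -0.2]
    print(evaluate_spike_in(observed, truth, max_fpr=0.125))
```

In a sketch like this, a method that shrinks log-ratios towards zero can still rank the regulated genes well, which is one way the trade-off between sensitivity and bias described above can arise.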

Conclusion: The use of spike-in experiments is a powerful approach for evaluating microarray preprocessing procedures. Analyzed data are characterized by properties of the observed log-ratios and the analysis' ability to detect differentially expressed genes. If bias is not a major problem, we recommend the use of either the CM-procedure or partial filtration.

 

Place, publisher, year, edition, pages
London: BioMed Central Ltd, 2006. Vol. 7, no 300, 17- p.
Keyword [en]
microarray, data analysis, normalization, evaluation, spike-in experiments
National Category
Computational Mathematics
Research subject
Mathematical Statistics
Identifiers
URN: urn:nbn:se:umu:diva-30806
DOI: 10.1186/1471-2105-7-300
OAI: oai:DiVA.org:umu-30806
DiVA: diva2:287189
Available from: 2010-01-19 Created: 2010-01-18 Last updated: 2017-12-12 Bibliographically approved
In thesis
1. Normalization and analysis of high-dimensional genomics data
2012 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

In the mid-1990s, microarray technology was introduced. The technology allowed for genome-wide analysis of gene expression in a single experiment. Since its introduction, similar high-throughput methods have been developed in other fields of molecular biology. These high-throughput methods provide measurements for hundreds up to millions of variables in a single experiment, and rigorous data analysis is necessary in order to answer the underlying biological questions.

Further complications arise in data analysis because the complexity of the experimental procedures introduces technical variation into the data. This technical variation needs to be removed in order to draw relevant biological conclusions from the data. The process of removing the technical variation is referred to as normalization or pre-processing. During the last decade, a large number of normalization and data analysis methods have been proposed.

In this thesis, data from two types of high-throughput methods are used to evaluate the effect that pre-processing methods have on further analyses. In areas where problems in current methods are identified, novel normalization methods are proposed. The evaluations of known and novel methods are performed on simulated data, real data and data from an in-house produced spike-in experiment.

Place, publisher, year, edition, pages
Umeå: Umeå Universitet, 2012. 43 p.
Keyword
normalization, pre-processing, microarray, downstream analysis, evaluation, sensitivity, bias, genomics data, gene expression, spike-in data, ChIP-chip
National Category
Bioinformatics and Systems Biology Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
URN: urn:nbn:se:umu:diva-53486
ISBN: 978-91-7459-402-7
Public defence
2012-04-20, MA121, Mit-huset, Umeå Universitet, Umeå, 19:25 (English)
Available from: 2012-03-30 Created: 2012-03-28 Last updated: 2012-04-02 Bibliographically approved

By author/editor
Rydén, Patrik; Andersson, Henrik; Landfors, Mattias; Näslund, Linda; Noppa, Laila; Sjöstedt, Anders