MC-normalization: a novel method for dye-normalization of two-channel microarray data
Umeå University, Faculty of Medicine, Department of Clinical Microbiology, Clinical Bacteriology. Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics. (Patrik Rydén)
Umeå University, Faculty of Social Sciences, Department of Statistics. (Patrik Rydén)
Umeå University, Faculty of Social Sciences, Department of Statistics. Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics. (Patrik Rydén)
2009 (English). In: Statistical Applications in Genetics and Molecular Biology, ISSN 1544-6115, E-ISSN 1544-6115, Vol. 8, no. 1, p. 42-. Article in journal (Refereed). Published
Abstract [en]

Motivation: Pre-processing plays a vital role in two-color microarray data analysis. An analysis is characterized by its ability to identify differentially expressed genes (its sensitivity) and its ability to provide unbiased estimators of the true regulation (its bias). It has been shown that microarray experiments regularly underestimate the true regulation of differentially expressed genes. We introduce the MC-normalization, where C stands for channel-wise normalization, which has considerably lower bias than the commonly used standard methods.
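
The two quality measures named above can be illustrated with a minimal Python sketch. It assumes spike-in data where the true log2-ratio of each spiked gene is known. The function names, the top-n detection rule, and the example numbers are hypothetical and are not taken from the paper, which may define the measures differently.

```python
from statistics import mean


def estimate_bias(observed_log_ratios, true_log_ratio):
    """Mean observed log2-ratio minus the known true log2-ratio.

    A negative value for up-regulated genes means the analysis
    underestimates the true regulation.
    """
    return mean(observed_log_ratios) - true_log_ratio


def estimate_sensitivity(scores, is_spiked, n_top):
    """Fraction of truly regulated (spiked) genes found among the
    n_top genes with the largest absolute score (assumed rule)."""
    ranked = sorted(range(len(scores)), key=lambda i: abs(scores[i]), reverse=True)
    found = sum(1 for i in ranked[:n_top] if is_spiked[i])
    total = sum(1 for flag in is_spiked if flag)
    return found / total


# Hypothetical example: three genes spiked at a 4-fold change (true log2-ratio 2).
bias = estimate_bias([1.5, 1.7, 1.6], true_log_ratio=2.0)

# Hypothetical example: five genes, three of them spiked, top two called.
sensitivity = estimate_sensitivity(
    scores=[2.1, 0.1, -1.8, 0.3, 0.2],
    is_spiked=[True, False, True, False, True],
    n_top=2,
)
print(bias, sensitivity)
```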

Methods: The idea behind the MC-normalization is that the channels’ individual intensities determine the correction, rather than the average intensity, which is the case for the widely used MA-normalization. The two methods were evaluated using spike-in data from an in-house produced cDNA-experiment and a publicly available Agilent-experiment. The methods were applied to background-corrected and non-background-corrected data. For the cDNA-experiment the methods were either applied separately to data from each of the print-tips or applied to the complete array data. Altogether 24 analyses were evaluated. For each analysis the sensitivity, the bias and two variance measures were estimated.
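
For orientation, the MA-normalization that serves as the baseline can be sketched as follows. This is a simplified illustration and not the paper's implementation. It uses the usual definitions M = log2(R/G) and A = (log2 R + log2 G)/2, and it replaces the loess fit normally used in practice with a crude binned mean so that the example stays self-contained. The function names and the bin count are assumptions. The abstract does not give the exact form of the MC correction, so only the average-intensity (MA) case is shown; a channel-wise variant would use the individual channel log-intensities as covariates instead of A.

```python
import math


def ma_values(red, green):
    """Return (M, A): M = log2(R/G), A = mean of log2(R) and log2(G)."""
    m = [math.log2(r) - math.log2(g) for r, g in zip(red, green)]
    a = [0.5 * (math.log2(r) + math.log2(g)) for r, g in zip(red, green)]
    return m, a


def binned_correction(m, covariate, n_bins=4):
    """Subtract from each M the mean M of its covariate bin.

    The binned mean is a stand-in for the loess curve used in practice.
    """
    if not m:
        return []
    order = sorted(range(len(m)), key=lambda i: covariate[i])
    size = math.ceil(len(m) / n_bins)
    corrected = [0.0] * len(m)
    for start in range(0, len(order), size):
        idx = order[start:start + size]
        centre = sum(m[i] for i in idx) / len(idx)
        for i in idx:
            corrected[i] = m[i] - centre
    return corrected


# Hypothetical intensities with a constant dye effect: red is twice green.
green = [100, 200, 400, 800, 1600, 3200, 6400, 12800]
red = [2 * g for g in green]
m, a = ma_values(red, green)
m_normalized = binned_correction(m, a)  # correction driven by average intensity A
print(m_normalized)
```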

Results: We prove that the MC-normalization has lower bias than the MA-normalization. The spike-in data confirmed the theoretical result and suggest that the difference is significant. Furthermore, the empirical data suggest that the MC- and MA-normalization have similar sensitivity. A striking result is that print-tip normalizations had considerably higher sensitivity than analyses using the complete array data.

Place, publisher, year, edition, pages
Berkeley: The Berkeley Electronic Press (bepress), 2009. Vol. 8, no. 1, p. 42-
Keywords [en]
microarray analysis, dye-normalization, background correction, gene expression, spike-in data, agilent
National subject category
Bioinformatics (Computational Biology)
Research subject
mathematical statistics; statistics
Identifiers
URN: urn:nbn:se:umu:diva-26566; DOI: 10.2202/1544-6115.1459; OAI: oai:DiVA.org:umu-26566; DiVA, id: diva2:272575
Available from: 2009-12-15. Created: 2009-10-15. Last updated: 2018-06-08. Bibliographically approved
Part of thesis
1. Essays on spatial point processes and bioinformatics
2010 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This thesis consists of two separate parts. The first part consists of one paper and considers problems concerning spatial point processes and the second part includes three papers in the field of bioinformatics.

The first part of the thesis is based on a forestry problem of estimating the number of trees in a region by using the information in an aerial photo, showing the area covered by the trees. The positions of the trees are assumed to follow either a binomial point process or a hard-core Strauss process. Furthermore, discs of equal size are used to represent the tree-crowns. We provide formulas for the expectation and the variance of the relative vacancy for both processes. The formulas are approximate for the hard-core Strauss process. Simulations indicate that the approximations are accurate. 
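
As a rough illustration of relative vacancy under a binomial point process, the Python sketch below simulates n uniformly placed disc centres and measures the uncovered fraction of the region. It is not the thesis's derivation. To avoid edge effects it assumes the unit square with periodic (torus) boundaries and a radius below 0.5, in which case the expected vacancy is (1 - pi r^2)^n. The function names, the grid resolution, and the parameter values are hypothetical.

```python
import math
import random


def expected_vacancy_torus(n_points, radius):
    """Expected uncovered fraction of the unit torus for n uniform disc centres."""
    return (1.0 - math.pi * radius ** 2) ** n_points


def torus_dist2(x, y, cx, cy):
    """Squared distance between two points on the unit torus."""
    dx = abs(x - cx)
    dy = abs(y - cy)
    dx = min(dx, 1.0 - dx)
    dy = min(dy, 1.0 - dy)
    return dx * dx + dy * dy


def simulate_vacancy(n_points, radius, grid=30, rng=None):
    """Uncovered fraction of a grid of test points for one simulated pattern."""
    rng = rng or random.Random()
    centres = [(rng.random(), rng.random()) for _ in range(n_points)]
    r2 = radius ** 2
    uncovered = 0
    for i in range(grid):
        for j in range(grid):
            x, y = (i + 0.5) / grid, (j + 0.5) / grid
            if not any(torus_dist2(x, y, cx, cy) <= r2 for cx, cy in centres):
                uncovered += 1
    return uncovered / grid ** 2


# Compare the average of repeated simulations with the closed-form expectation.
rng = random.Random(1)
runs = [simulate_vacancy(20, 0.1, rng=rng) for _ in range(40)]
print(sum(runs) / len(runs), expected_vacancy_torus(20, 0.1))
```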

The second part of this thesis focuses on pre-processing of microarray data. The microarray technology can be used to measure the expression of thousands of genes simultaneously in a single experiment. The technique is used to identify genes that are differentially expressed between two populations, e.g. diseased versus healthy individuals. This information can be used in several different ways, for example as diagnostic tools and in drug discovery.

The microarray technique involves a number of complex experimental steps, where each step introduces variability in the data. Pre-processing aims to reduce this variation and is a crucial part of the data analysis. Paper II gives a review of several pre-processing methods. Spike-in data are used to describe how the different methods affect the sensitivity and bias of the experiment.

An important step in pre-processing is dye-normalization. This normalization aims to remove the systematic differences due to the use of different dyes for coloring the samples. In Paper III a novel dye-normalization, the MC-normalization, is proposed. The idea behind this normalization is to let the channels’ individual intensities determine the correction, rather than the average intensity, which is the case for the commonly used MA-normalization. Spike-in data showed that the MC-normalization reduced the bias for the differentially expressed genes compared to the MA-normalization.

The standard method for preserving patient samples for diagnostic purposes is fixation in formalin followed by embedding in paraffin (FFPE). In Paper IV we used tongue-cancer microRNA-microarray data to study the effect of FFPE-storage. We suggest that the microRNAs are not equally affected by the storage time and propose a novel procedure to remove this bias. The procedure improves the ability of the analysis to detect differentially expressed microRNAs.

Place, publisher, year, edition, pages
Umeå: Statistiska institutionen, 2010. p. 32
Series
Statistical studies, ISSN 1100-8989 ; 42
Keywords
Coverage process, vacancy, microarray, pre-processing, sensitivity, bias, dye-normalization, FFPE, storage time effects
National subject category
Probability Theory and Statistics
Research subject
statistics
Identifiers
urn:nbn:se:umu:diva-33452 (URN); 978-91-7264-966-8 (ISBN)
Public defence
2010-05-21, Samhällsvetarhuset, lecture hall D, Umeå University, Umeå, 10:00 (English)
Available from: 2010-04-29. Created: 2010-04-26. Last updated: 2018-06-08. Bibliographically approved
2. Normalization and analysis of high-dimensional genomics data
2012 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Microarray technology was introduced in the mid-1990s. The technology allowed for genome-wide analysis of gene expression in one experiment. Since its introduction, similar high-throughput methods have been developed in other fields of molecular biology. These high-throughput methods provide measurements for hundreds up to millions of variables in a single experiment, and a rigorous data analysis is necessary in order to answer the underlying biological questions.

Further complications arise in data analysis as technological variation is introduced in the data, due to the complexity of the experimental procedures in these experiments. This technological variation needs to be removed in order to draw relevant biological conclusions from the data. The process of removing the technical variation is referred to as normalization or pre-processing. During the last decade a large number of normalization and data analysis methods have been proposed.

In this thesis, data from two types of high-throughput methods are used to evaluate the effect pre-processing methods have on further analyses. In areas where problems in current methods are identified, novel normalization methods are proposed. The evaluations of known and novel methods are performed on simulated data, real data and data from an in-house produced spike-in experiment.

Place, publisher, year, edition, pages
Umeå: Umeå Universitet, 2012. p. 43
Keywords
normalization, pre-processing, microarray, downstream analysis, evaluation, sensitivity, bias, genomics data, gene expression, spike-in data, ChIP-chip
National subject category
Bioinformatics and Systems Biology; Probability Theory and Statistics
Research subject
mathematical statistics
Identifiers
urn:nbn:se:umu:diva-53486 (URN); 978-91-7459-402-7 (ISBN)
Public defence
2012-04-20, MA121, Mit-huset, Umeå University, Umeå, 19:25 (English)
Available from: 2012-03-30. Created: 2012-03-28. Last updated: 2018-06-08. Bibliographically approved

Authors

Landfors, Mattias; Fahlén, Jessica; Rydén, Patrik