Essays on spatial point processes and bioinformatics
Umeå University, Faculty of Social Sciences, Department of Statistics.
2010 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This thesis consists of two separate parts. The first part consists of one paper and considers problems concerning spatial point processes and the second part includes three papers in the field of bioinformatics.

The first part of the thesis is based on a forestry problem of estimating the number of trees in a region by using the information in an aerial photo, showing the area covered by the trees. The positions of the trees are assumed to follow either a binomial point process or a hard-core Strauss process. Furthermore, discs of equal size are used to represent the tree-crowns. We provide formulas for the expectation and the variance of the relative vacancy for both processes. The formulas are approximate for the hard-core Strauss process. Simulations indicate that the approximations are accurate. 
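
The binomial case described above can be illustrated with a small simulation. This is a minimal sketch, not the thesis's own derivation: it assumes a unit square with periodic (torus) boundaries so that edge effects vanish, in which case the expected relative vacancy for n discs of radius r < 0.5 is (1 - pi r^2)^n. The function names and parameter values are hypothetical.

```python
import math
import random


def expected_vacancy(n, r):
    """Expected relative vacancy for n uniform disc centres on the unit torus.

    Assumes r < 0.5, so each disc covers exactly pi * r**2 of the unit area.
    """
    return (1.0 - math.pi * r * r) ** n


def torus_dist2(p, q):
    """Squared distance between two points on the unit torus."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    dx = min(dx, 1.0 - dx)
    dy = min(dy, 1.0 - dy)
    return dx * dx + dy * dy


def simulated_vacancy(n, r, n_patterns=100, n_probes=200, seed=1):
    """Monte Carlo estimate of the mean relative vacancy.

    For each simulated pattern, random probe points are tested for whether
    they lie outside every disc; the uncovered fraction estimates the vacancy.
    """
    rng = random.Random(seed)
    r2 = r * r
    uncovered = 0
    for _ in range(n_patterns):
        centres = [(rng.random(), rng.random()) for _ in range(n)]
        for _ in range(n_probes):
            probe = (rng.random(), rng.random())
            if all(torus_dist2(probe, c) > r2 for c in centres):
                uncovered += 1
    return uncovered / (n_patterns * n_probes)


if __name__ == "__main__":
    n, r = 50, 0.05
    print("formula:   ", round(expected_vacancy(n, r), 4))
    print("simulation:", round(simulated_vacancy(n, r), 4))
```

The hard-core Strauss case is not covered by this sketch, since the abstract states only that the corresponding formulas are approximate and does not give them.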

The second part of this thesis focuses on pre-processing of microarray data. The microarray technology can be used to measure the expression of thousands of genes simultaneously in a single experiment. The technique is used to identify genes that are differentially expressed between two populations, e.g. diseased versus healthy individuals. This information can be used in several different ways, for example as diagnostic tools and in drug discovery.

The microarray technique involves a number of complex experimental steps, where each step introduces variability in the data. Pre-processing aims to reduce this variation and is a crucial part of the data analysis. Paper II gives a review of several pre-processing methods. Spike-in data are used to describe how the different methods affect the sensitivity and bias of the experiment.

An important step in pre-processing is dye-normalization. This normalization aims to remove the systematic differences due to the use of different dyes for coloring the samples. In Paper III a novel dye-normalization, the MC-normalization, is proposed. The idea behind this normalization is to let the channels' individual intensities determine the correction, rather than the average intensity, which is the case for the commonly used MA-normalization. Spike-in data showed that the MC-normalization reduced the bias for the differentially expressed genes compared to the MA-normalization.
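
For orientation, the MA quantities mentioned above are conventionally defined per spot as M = log2(R) - log2(G) and A = (log2(R) + log2(G)) / 2 for red and green channel intensities R and G. The sketch below illustrates an MA-style correction in which a trend in M is estimated as a function of the average intensity A and subtracted. It is an assumption-laden illustration: the trend is a simple binned median rather than the loess fit usually used in practice, the function names are hypothetical, and the MC-normalization itself is not implemented because the abstract does not specify it.

```python
import math
from statistics import median


def ma_transform(red, green):
    """Return per-spot log-ratios M and average log-intensities A."""
    m = [math.log2(r) - math.log2(g) for r, g in zip(red, green)]
    a = [(math.log2(r) + math.log2(g)) / 2 for r, g in zip(red, green)]
    return m, a


def ma_normalize(red, green, n_bins=10):
    """Subtract an intensity-dependent trend from M.

    The trend is the median of M within bins of spots ordered by A
    (a crude stand-in for a loess curve, chosen to keep the sketch short).
    """
    m, a = ma_transform(red, green)
    order = sorted(range(len(a)), key=a.__getitem__)
    size = max(1, math.ceil(len(order) / n_bins))
    normalized = [0.0] * len(m)
    for start in range(0, len(order), size):
        idx = order[start:start + size]
        trend = median(m[i] for i in idx)
        for i in idx:
            normalized[i] = m[i] - trend
    return normalized


if __name__ == "__main__":
    # Hypothetical intensities with a constant dye bias: red is twice green.
    green = [2.0 ** k for k in range(4, 14)]
    red = [2.0 * g for g in green]
    print(ma_normalize(red, green, n_bins=2))
```

Because the correction here depends only on A, both channels share one trend; the abstract's point is that the MC-normalization instead lets each channel's own intensity drive the correction.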

The standard method for preserving patient samples for diagnostic purposes is fixation in formalin followed by embedding in paraffin (FFPE). In Paper IV we used tongue-cancer microRNA-microarray data to study the effect of FFPE-storage. We suggest that the microRNAs are not equally affected by the storage time and propose a novel procedure to remove this bias. The procedure improves the ability of the analysis to detect differentially expressed microRNAs.

Place, publisher, year, edition, pages
Umeå: Statistiska institutionen, 2010. p. 32
Series
Statistical studies, ISSN 1100-8989 ; 42
Keywords [en]
Coverage process, vacancy, microarray, pre-processing, sensitivity, bias, dye-normalization, FFPE, storage time effects
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
URN: urn:nbn:se:umu:diva-33452
ISBN: 978-91-7264-966-8 (print)
OAI: oai:DiVA.org:umu-33452
DiVA, id: diva2:312514
Public defence
2010-05-21, Samhällsvetarhuset, Lecture hall D, Umeå University, Umeå, 10:00 (English)
Available from: 2010-04-29 Created: 2010-04-26 Last updated: 2018-06-08 Bibliographically approved
List of papers
1. Coverage problems for Strauss disc processes
2001 (English) Licentiate thesis, monograph (Other academic)
Place, publisher, year, edition, pages
Umeå: Institutionen för matematik och matematisk statistik, Umeå universitet, 2001. p. 64
Keywords
Binomial point process, coverage process, disc process, hard-core distance, McMC, Poisson point process, spatial point process, Strauss process, vacancy
National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
URN: urn:nbn:se:umu:diva-19838
ISBN: 91-7305-071-7
Presentation
(Swedish)
Available from: 2009-03-13 Created: 2009-03-11 Last updated: 2018-06-09 Bibliographically approved
2. Bioinformatics strategies for cDNA-microarray data processing
2009 (English) In: Batch effects and noise in microarray experiments: sources and solutions / [ed] Scherer, Andreas, Wiley and Sons, 2009, 1st ed., 272 p., p. 61-74. Chapter in book, part of anthology (Other academic)
Abstract [en]



Pre-processing plays a vital role in cDNA-microarray data analysis. Without proper pre-processing it is likely that the biological conclusions will be misleading. However, there are many alternatives, and in order to choose a proper pre-processing procedure it is necessary to understand the effect of different methods. This chapter discusses several pre-processing steps, including image analysis, background correction, normalization, and filtering. Spike-in data are used to illustrate how different procedures affect the analytical ability to detect differentially expressed genes and estimate their regulation. The results show that pre-processing has a major impact on both the experiment's sensitivity and its bias. However, general recommendations are hard to give, since pre-processing consists of several actions that are highly dependent on each other. Furthermore, it is likely that pre-processing has a major impact on downstream analysis, such as clustering and classification, and pre-processing methods should be developed and evaluated with this in mind.

Place, publisher, year, edition, pages
Wiley and Sons, 2009. p. 272 Edition: 1
Series
Wiley series in probability and statistics
National Category
Computational Mathematics
Research subject
Mathematical Statistics
Identifiers
URN: urn:nbn:se:umu:diva-30827
ISBN: 978-0-470-74138-2
Available from: 2010-01-18 Created: 2010-01-18 Last updated: 2018-06-08 Bibliographically approved
3. MicroRNA-microarray data analysis in the presence of FFPE storage time effects
2010 (English) Manuscript (preprint) (Other academic)
Abstract [en]

Background: The standard method for preserving patient samples for diagnostic purposes is fixation in formalin followed by embedding in paraffin (FFPE). The use of FFPE blocks makes it possible to include a large number of patients in experimental studies, since millions of FFPE blocks are stored around the world. However, FFPE storage can cause degradation and modifications of nucleic acids. In order to draw reliable biological conclusions it is therefore important to know what effect FFPE-storage has on the tissues and to have procedures that normalize this effect. In this paper, we study the effect that FFPE-storage has on microRNA-microarray data from tongue-cancer patients and propose a novel procedure for normalizing the bias introduced by FFPE-storage.

Results: MicroRNA-microarray data from 21 tongue-cancer patients and 8 control patients were used. The samples were stored in FFPE blocks and had been in storage for up to 11 years. The data contained a large amount of biologically relevant variation, yet the largest variation was due to the samples' storage times. The storage effect was shown to be significant and some results suggested that it may be causal. Moreover, the microRNAs were unequally affected by storage and this could partially be explained by sequence characteristics. The novel normalization procedure was shown to have a large impact on the ability of the analysis to identify differentially expressed microRNAs between young and old cancer patients as well as between cancer and control patients. The p-values for the top microRNA candidates were much lower for the proposed novel normalization than for a standard normalization procedure, which suggested that the novel normalization made the analysis more efficient.

Conclusions: MicroRNA-microarray data can be seriously affected by FFPE-storage and the introduced variation cannot be removed by standard normalizations. The proposed normalization removes the bias introduced by FFPE-storage and gives higher sensitivity than the standard normalization.

Keywords
microRNA, microarray, FFPE, storage time effects, normalization
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
URN: urn:nbn:se:umu:diva-33355
Available from: 2010-04-22 Created: 2010-04-22 Last updated: 2018-06-08
4. MC-normalization: a novel method for dye-normalization of two-channel microarray data
2009 (English) In: Statistical Applications in Genetics and Molecular Biology, ISSN 1544-6115, E-ISSN 1544-6115, Vol. 8, no. 1, p. 42-. Article in journal (Refereed) Published
Abstract [en]

Motivation: Pre-processing plays a vital role in two-color microarray data analysis. An analysis is characterized by its ability to identify differentially expressed genes (its sensitivity) and its ability to provide unbiased estimators of the true regulation (its bias). It has been shown that microarray experiments regularly underestimate the true regulation of differentially expressed genes. We introduce the MC-normalization, where C stands for channel-wise normalization, with considerably lower bias than the commonly used standard methods.

Methods: The idea behind the MC-normalization is that the channels' individual intensities determine the correction, rather than the average intensity, which is the case for the widely used MA-normalization. The two methods were evaluated using spike-in data from an in-house produced cDNA-experiment and a publicly available Agilent-experiment. The methods were applied to background corrected and non-background corrected data. For the cDNA-experiment the methods were either applied separately to data from each of the print-tips or applied to the complete array data. Altogether 24 analyses were evaluated. For each analysis the sensitivity, the bias and two variance measures were estimated.

Results: We prove that the MC-normalization has lower bias than the MA-normalization. The spike-in data confirmed the theoretical result and suggest that the difference is significant. Furthermore, the empirical data suggest that the MC- and MA-normalization have similar sensitivity. A striking result is that print-tip normalizations had considerably higher sensitivity than analyses using the complete array data.

Place, publisher, year, edition, pages
Berkeley: The Berkeley Electronic Press (bepress), 2009
Keywords
microarray analysis, dye-normalization, background correction, gene expression, spike-in data, agilent
National Category
Bioinformatics (Computational Biology)
Research subject
Mathematical Statistics; Statistics
Identifiers
URN: urn:nbn:se:umu:diva-26566
DOI: 10.2202/1544-6115.1459
Available from: 2009-12-15 Created: 2009-10-15 Last updated: 2018-06-08 Bibliographically approved

Open Access in DiVA

fulltext (1530 kB)
File information
File name: FULLTEXT01.pdf
File size: 1530 kB
Checksum SHA-512:
db6745f611c99d1e74a17562a55f74af61efb1537205b6190e6d81e2971043f8a393776c570f853c1cb774580dfdbdfda0041b3d0b7d5011a42baabf04aaf9c4
Type: fulltext
Mimetype: application/pdf

Author

Fahlén, Jessica
