2020 (English) Doctoral thesis, comprehensive summary (Other academic)
Dolda betydelsefulla mönster : statistiska metoder för analys av DNA och RNA data [Hidden meaningful patterns: statistical methods for the analysis of DNA and RNA data]
Abstract [en]
Understanding how genetic variation affects the characteristics and function of organisms can help researchers and medical doctors detect genetic alterations that cause disease and reveal genes that cause antibiotic resistance. The opportunities and progress associated with such data come, however, with challenges related to statistical analysis. Only with properly designed and applied tools can we extract information about hidden patterns. In this thesis we present three types of such analysis. First, the genetic variant in the gene COL17A1 that causes corneal dystrophy with recurrent erosions is revealed. By studying next-generation sequencing data, the order of the nucleotides in the DNA sequence was obtained, which enabled us to detect interesting variants in the genome. Further, we present the results of an experimental design study with the aim of making the best selection from a family that is affected by an inherited disease. In the second part of the work, we analyzed a novel antibiotic-resistant Staphylococcus epidermidis clone that is found only in northern Europe. By investigating its genetic data, we revealed similarities to a globally known antibiotic-resistant clone. As a result, the antibiotic resistance profile was established from the DNA sequences. Finally, we also focus on the challenges related to the abundance of genetic data from different sources. The increasing number of public gene expression datasets gives us the opportunity to increase our understanding by using information from multiple sources simultaneously. Naturally, this requires merging independent datasets. However, when doing so, the technical and biological variation in the joined data increases. We present a pre-processing method to construct gene co-expression networks from a large, diverse gene expression dataset.
Place, publisher, year, edition, pages
Umeå: Umeå universitet, Institutionen för matematik och matematisk statistik, 2020. p. 26
Series
Research report in mathematical statistics, ISSN 1653-0829 ; 71/20
Keywords
Genome, Next-generation sequence, statistics, microarrays, bacteria, antibiotic resistance, inherited diseases, Co-expression networks, centralization within subgroups
National Category
Probability Theory and Statistics; Biological Sciences; Medical and Health Sciences
Identifiers
urn:nbn:se:umu:diva-175242 (URN); 978-91-7855-240-5 (ISBN); 978-91-7855-241-2 (ISBN)
Public defence
2020-10-16, Hörsal B, Lindellhallen, Umeå, 09:00 (English)
Opponent
Supervisors
Available from: 2020-09-25. Created: 2020-09-22. Last updated: 2020-09-23. Bibliographically approved.