Characterisation of light affected genes in Arabidopsis using cluster analysis
Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
The aim of this work is to find and visualize gene networks. The data are gene expression measurements from Arabidopsis microarrays. Microarrays measure the expression levels of thousands of genes at once. Because arrays are costly, high-dimensional datasets with small sample sizes are common. An adjusted t-test, which is more robust for small sample sizes, was used to select a smaller set of interesting genes for further analysis.
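The abstract does not say which adjustment of the t-test was used. As an illustration only, the sketch below assumes one common variant: a pooled two-sample t-statistic with a small constant `s0` added to the standard error, so that genes with very small estimated variance do not get inflated scores. The function name `adjusted_t`, the parameter `s0`, and the example values are hypothetical and not taken from the thesis.

```python
import numpy as np


def adjusted_t(group_a, group_b, s0):
    """Pooled two-sample t-statistic per gene, with s0 added to the standard error.

    group_a, group_b: arrays of shape (genes, replicates).
    s0: non-negative constant (an assumed form of adjustment, not the thesis's).
    """
    n_a, n_b = group_a.shape[1], group_b.shape[1]
    diff = group_a.mean(axis=1) - group_b.mean(axis=1)
    # Pooled variance across the two groups, per gene.
    pooled_var = (
        (n_a - 1) * group_a.var(axis=1, ddof=1)
        + (n_b - 1) * group_b.var(axis=1, ddof=1)
    ) / (n_a + n_b - 2)
    std_err = np.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return diff / (std_err + s0)


# Hypothetical toy data: 2 genes, 3 replicates per group.
dark = np.array([[1.0, 2.0, 3.0], [5.0, 5.1, 4.9]])
light = np.array([[4.0, 5.0, 6.0], [5.2, 5.0, 5.1]])

t_plain = adjusted_t(dark, light, s0=0.0)     # ordinary pooled t-statistic
t_adjusted = adjusted_t(dark, light, s0=0.5)  # shrunk towards zero
```

With `s0 = 0` the function reduces to the ordinary pooled t-statistic; any positive `s0` shrinks every statistic towards zero, most strongly for genes with small standard errors.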
In this study, correlations based on centralised data were used to quantify the degree of communication between genes. Centralisation was done by subtracting, for each gene, the mean of the replicates from the values of those replicates. This removes the high correlations that arise when genes share, for example, an up-regulated response but are otherwise uncorrelated. Hierarchical clustering was then performed with distances based on the correlations. Each cluster of genes was presented as a graph with genes as nodes and correlations as edges. For each cluster, the mean profiles of the genes and bar charts of the percentage of differentially expressed genes were plotted.
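The centralisation, correlation and clustering steps described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes the data are arranged as an array of shape (genes, conditions, replicates), that the distance is 1 minus the Pearson correlation, and that average linkage is used. None of these choices is stated in the abstract, and the names `centralise`, `cluster_genes` and the toy data are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def centralise(expr):
    """Subtract the replicate mean, per gene and condition.

    expr has shape (genes, conditions, replicates).
    """
    return expr - expr.mean(axis=2, keepdims=True)


def cluster_genes(expr, n_clusters):
    """Cluster genes on correlations of centralised data.

    Assumptions: distance = 1 - Pearson correlation, average linkage.
    Returns the gene-by-gene correlation matrix and one cluster label per gene.
    """
    centred = centralise(expr)
    flat = centred.reshape(centred.shape[0], -1)  # one row per gene
    corr = np.corrcoef(flat)
    dist = np.clip(1.0 - corr, 0.0, 2.0)          # guard against rounding noise
    dist = (dist + dist.T) / 2.0                  # enforce exact symmetry
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist, checks=False), method="average")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return corr, labels


# Hypothetical toy data: 4 genes, 3 conditions, 4 replicates.
rng = np.random.default_rng(0)
response = np.array([0.0, 5.0, 10.0])[:, None]  # shared up-regulated response
noise_a = rng.normal(size=(3, 4))
noise_b = rng.normal(size=(3, 4))
expr = np.stack([
    response + noise_a,        # gene 0
    2.0 * noise_a,             # gene 1: follows gene 0 only in the residuals
    response + noise_b,        # gene 2: same response as gene 0, unrelated residuals
    0.5 * noise_b + 3.0,       # gene 3: follows gene 2 only in the residuals
])

corr, labels = cluster_genes(expr, n_clusters=2)
```

In the toy data, genes 0 and 2 share the same response across conditions, so their raw profiles look alike, but after centralisation only the replicate-level residuals remain and gene 0 groups with gene 1 instead.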
Place, publisher, year, edition, pages
2014. 23 p.
Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:umu:diva-97316
OAI: oai:DiVA.org:umu-97316
DiVA: diva2:771736
Bachelor of Science in Physics and Applied Mathematics
Rydén, Patrik, Universitetslektor
Yu, Jun, Professor