umu.se Publications
1 - 5 of 5
  • 1.
    Rentoft, Matilda
    et al.
    Umeå University, Faculty of Medicine, Department of Medical Biochemistry and Biophysics. Umeå University, Faculty of Science and Technology, Department of Chemistry.
    Svensson, Daniel
    Umeå University, Faculty of Science and Technology, Department of Chemistry.
    Sjödin, Andreas
    Umeå University, Faculty of Science and Technology, Department of Chemistry. Division of CBRN Security and Defence, FOI–Swedish Defence Research Agency, SE Umeå, Sweden.
    Olason, Pall I.
    Sjöström, Olle
    Umeå University, Faculty of Medicine, Department of Radiation Sciences, Oncology. Unit of research, education and development, Region Jämtland Härjedalen, SE Östersund, Sweden.
    Nylander, Carin
    Umeå University, Faculty of Medicine, Department of Radiation Sciences, Oncology.
    Osterman, Pia
    Umeå University, Faculty of Medicine, Department of Medical Biochemistry and Biophysics.
    Sjögren, Rickard
    Umeå University, Faculty of Science and Technology, Department of Chemistry.
    Netotea, Sergiu
    Umeå University, Faculty of Science and Technology, Department of Chemistry. Science for Life Laboratory, Department of Biology and Biological Engineering, Chalmers University of Technology, SE Göteborg, Sweden.
    Wibom, Carl
    Umeå University, Faculty of Medicine, Department of Radiation Sciences, Oncology.
    Cederquist, Kristina
    Umeå University, Faculty of Medicine, Department of Medical Biosciences, Medical and Clinical Genetics.
    Chabes, Andrei
    Umeå University, Faculty of Medicine, Department of Medical Biochemistry and Biophysics.
    Trygg, Johan
    Umeå University, Faculty of Science and Technology, Department of Chemistry.
    Melin, Beatrice S.
    Umeå University, Faculty of Medicine, Department of Radiation Sciences, Oncology.
    Johansson, Erik
    Umeå University, Faculty of Medicine, Department of Medical Biochemistry and Biophysics.
    A geographically matched control population efficiently limits the number of candidate disease-causing variants in an unbiased whole-genome analysis
    2019. In: PLoS ONE, ISSN 1932-6203, E-ISSN 1932-6203, Vol. 14, no 3, article id e0213350. Article in journal (Refereed)
    Abstract [en]

    Whole-genome sequencing is a promising approach for human autosomal dominant disease studies. However, the vast number of genetic variants observed by this method constitutes a challenge when trying to identify the causal variants. This is often handled by restricting disease studies to the most damaging variants, e.g. those found in coding regions, and overlooking the remaining genetic variation. Such a biased approach explains in part why the genetic causes of disease in many families with dominantly inherited diseases remain unsolved today, in spite of these families being included in whole-genome sequencing studies. Here we explore the use of a geographically matched control population to minimize the number of candidate disease-causing variants without excluding variants based on assumptions about genomic position or functional predictions. To exemplify the benefit of the geographically matched control population, we apply a typical disease variant filtering strategy in a family with an autosomal dominant form of colorectal cancer. With the use of the geographically matched control population we end up with 26 candidate variants genome-wide. This is in contrast to the tens of thousands of candidates left when only making use of available public variant datasets. The effect of the local control population is twofold: it (1) reduces the total number of candidate variants shared between affected individuals, and more importantly (2) increases the rate at which the number of candidate variants is reduced as additional affected family members are included in the filtering strategy. We demonstrate that the application of a geographically matched control population effectively limits the number of candidate disease-causing variants and may provide the means by which variants suitable for functional studies are identified genome-wide.
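
    The filtering idea in this abstract can be illustrated with a minimal sketch. It assumes that a variant is a candidate when every affected individual carries it and it is absent from the control population. The variant encoding, the function name and the toy data are hypothetical, and the study's actual filtering criteria (for example any allele-frequency thresholds) are not given in the abstract.

```python
from typing import Iterable, Set, Tuple

# Assumed encoding: (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def candidate_variants(
    affected: Iterable[Set[Variant]],
    control_population: Set[Variant],
) -> Set[Variant]:
    """Return variants carried by every affected individual and absent from the controls."""
    affected = list(affected)
    if not affected:
        return set()
    # Keep only variants shared by all affected individuals.
    shared = set.intersection(*affected)
    # Drop anything also observed in the control population.
    return shared - control_population


# Toy data, invented purely for illustration.
person_a = {("1", 100, "A", "G"), ("2", 200, "C", "T"), ("3", 300, "G", "A")}
person_b = {("1", 100, "A", "G"), ("2", 200, "C", "T"), ("4", 400, "T", "C")}
local_controls = {("2", 200, "C", "T")}

candidates = candidate_variants([person_a, person_b], local_controls)
print(candidates)
```

    Each additional affected individual can only shrink the shared set, and a larger or better matched control set can only remove more of it, which mirrors the two effects described above.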

  • 2.
    Sjögren, Rickard
    et al.
    Umeå University, Faculty of Science and Technology, Department of Chemistry.
    Stridh, Kjell
    Umeå University, Faculty of Science and Technology, Department of Chemistry.
    Skotare, Tomas
    Umeå University, Faculty of Science and Technology, Department of Chemistry.
    Trygg, Johan
    Umeå University, Faculty of Science and Technology, Department of Chemistry. Sartorius Stedim Data Analytics, Umeå, Sweden.
    Multivariate patent analysis: using chemometrics to analyze collections of chemical and pharmaceutical patents
    2018. In: Journal of Chemometrics, ISSN 0886-9383, E-ISSN 1099-128X, article id e3041. Article in journal (Refereed)
    Abstract [en]

    Patents are an important source of technological knowledge, but the number of existing patents is vast and quickly growing. This makes the development of tools and methodologies for quickly revealing patterns in patent collections important. In this paper, we describe how structured chemometric principles of multivariate data analysis can be applied in the context of text analysis in a novel combination with common machine learning preprocessing methodologies. We demonstrate our methodology in two case studies. Using principal component analysis (PCA) on a collection of 12338 patent abstracts from 25 companies in big pharma revealed the sub-fields in which the companies are active. Using PCA on a smaller collection of patents retrieved by searching for a specific term proved useful to quickly understand how patent classifications relate to the search term. By using orthogonal projections to latent structures (O-PLS) on patent classification schemes, we were able to separate patents on a more detailed level than using PCA. Lastly, we performed multi-block modeling using OnPLS on bag-of-words representations of abstracts, claims, and detailed descriptions, respectively, showing that semantic variation relating to patent classification is consistent across multiple text blocks, represented as globally joint variation. We conclude that using machine learning to transform unstructured data into structured data provides a good preprocessing tool for subsequent chemometric multivariate data analysis and yields an easily interpretable and novel workflow to understand large collections of patents. We demonstrate this on collections of chemical and pharmaceutical patents.
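
    As a rough illustration of applying PCA to text, the sketch below builds a bag-of-words count matrix from four invented sentences and computes principal component scores with a singular value decomposition. The example texts, the whitespace tokenization and the use of raw mean-centered counts are assumptions made for this sketch; the preprocessing used in the paper is not described in the abstract.

```python
import numpy as np

# Invented stand-ins for patent abstracts.
docs = [
    "kinase inhibitor compound for cancer treatment",
    "kinase inhibitor compound for tumor treatment",
    "antibody formulation for vaccine delivery",
    "antibody formulation for vaccine storage",
]

# Bag-of-words: one row per document, one column per vocabulary term.
vocab = sorted({word for doc in docs for word in doc.split()})
column = {word: j for j, word in enumerate(vocab)}
counts = np.zeros((len(docs), len(vocab)))
for i, doc in enumerate(docs):
    for word in doc.split():
        counts[i, column[word]] += 1

# PCA via SVD of the mean-centered count matrix.
centered = counts - counts.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
scores = U * S                       # document coordinates on each component
explained = S**2 / np.sum(S**2)      # fraction of variance per component

print("PC1 scores:", np.round(scores[:, 0], 3))
print("PC1 explained variance:", round(float(explained[0]), 3))
```

    In this toy case the first component separates the two small-molecule texts from the two antibody texts. The sign of a component is arbitrary, so only the grouping is meaningful.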

  • 3.
    Skotare, Tomas
    et al.
    Umeå University, Faculty of Science and Technology, Department of Chemistry.
    Sjögren, Rickard
    Umeå University, Faculty of Science and Technology, Department of Chemistry.
    Surowiec, Izabella
    Umeå University, Faculty of Science and Technology, Department of Chemistry.
    Nilsson, David
    Umeå University, Faculty of Science and Technology, Department of Chemistry.
    Trygg, Johan
    Umeå University, Faculty of Science and Technology, Department of Chemistry. Sartorius Stedim Data Analytics, 907 36 Umeå, Sweden.
    Visualization of descriptive multiblock analysis
    2018. In: Journal of Chemometrics, ISSN 0886-9383, E-ISSN 1099-128X. Article in journal (Refereed)
    Abstract [en]

    Understanding and making the most of complex data collected from multiple sources is a challenging task. Data integration is the procedure of describing the main features in multiple data blocks, and several methods for multiblock analysis have been previously developed, including OnPLS and JIVE. One of the main challenges is how to visualize and interpret the results of multiblock analyses because of the increased model complexity and sheer size of the data. In this paper, we present novel visualization tools that simplify interpretation and overview of multiblock analysis. We introduce a correlation matrix plot that provides an overview of the relationships between blocks found by multiblock models. We also present a multiblock scatter plot, a metadata correlation plot, and a variation distribution plot that simplify the interpretation of multiblock models. We demonstrate our visualizations on an industrial case study in vibration spectroscopy (NIR, UV, and Raman datasets) as well as a multiomics integration study (transcript, metabolite, and protein datasets). We conclude that our visualizations provide useful tools to harness the complexity of multiblock analysis and enable better understanding of the investigated system.
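
    The numbers behind a block-to-block correlation overview can be sketched as follows. This is only an assumption about what such a plot might summarize: each block is reduced to its first principal component score vector (a simple stand-in, not OnPLS or JIVE), and the pairwise correlations of those score vectors are printed rather than plotted. Block names, sizes and data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 50
latent = rng.normal(size=n_samples)  # variation shared by blocks A and B only

# Three synthetic blocks (samples x variables); loadings are fixed by hand.
blocks = {
    "A": np.outer(latent, [1.0, -0.8, 0.6, 1.2, -1.0, 0.9])
         + 0.1 * rng.normal(size=(n_samples, 6)),
    "B": np.outer(latent, [0.7, 1.1, -0.9, 1.0])
         + 0.1 * rng.normal(size=(n_samples, 4)),
    "C": rng.normal(size=(n_samples, 5)),  # unrelated to the shared variation
}


def first_score(X: np.ndarray) -> np.ndarray:
    """First principal component score vector of a mean-centered block."""
    centered = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(centered, full_matrices=False)
    return U[:, 0] * S[0]


scores = np.column_stack([first_score(X) for X in blocks.values()])
corr = np.corrcoef(scores, rowvar=False)  # 3 x 3 block-by-block matrix

# Absolute values, because the sign of a component is arbitrary.
print(list(blocks))
print(np.round(np.abs(corr), 2))
```

    A heat map of this matrix would show a strong A-B link and weak links to C, which is the kind of overview the abstract describes.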

  • 4.
    Surowiec, Izabella
    et al.
    Umeå University, Faculty of Science and Technology, Department of Chemistry. Sartorius Stedim Data Analytics, Tvistevägen 48, 907 36 Umeå, Sweden.
    Skotare, Tomas
    Umeå University, Faculty of Science and Technology, Department of Chemistry.
    Sjögren, Rickard
    Umeå University, Faculty of Science and Technology, Department of Chemistry. Sartorius Stedim Data Analytics, Tvistevägen 48, 907 36 Umeå, Sweden.
    Gouveia-Figueira, Sandra C.
    Umeå University, Faculty of Science and Technology, Department of Chemistry.
    Orikiiriza, Judy Tatwan
    Bergström, Sven
    Umeå University, Faculty of Medicine, Department of Molecular Biology (Faculty of Medicine).
    Normark, Johan
    Umeå University, Faculty of Medicine, Department of Molecular Biology (Faculty of Medicine).
    Trygg, Johan
    Umeå University, Faculty of Science and Technology, Department of Chemistry. Sartorius Stedim Data Analytics, Tvistevägen 48, 907 36 Umeå, Sweden.
    Joint and unique multiblock analysis of biological data: multiomics malaria study
    2019. In: Faraday discussions (Online), ISSN 1359-6640, E-ISSN 1364-5498, Vol. 218, p. 268-283. Article in journal (Refereed)
    Abstract [en]

    Modern profiling technologies yield large amounts of data that can later be used for a comprehensive understanding of the studied system. Proper evaluation of such data is challenging and cannot be achieved by analysing each dataset separately. Integrated approaches are necessary, because only data integration allows finding correlation trends common to all studied data sets and revealing hidden structures not known a priori. This improves understanding and interpretation of complex systems. Joint and Unique MultiBlock Analysis (JUMBA) is an analysis method based on the OnPLS algorithm that decomposes a set of matrices into joint parts, containing variation shared with other connected matrices, and variation that is unique to each single matrix. Mapping unique variation is important from a data integration perspective, since it certainly cannot be expected that all variation co-varies. In this work we used JUMBA for integrated analysis of lipidomic, metabolomic and oxylipin datasets obtained from profiling of plasma samples from children infected with P. falciparum malaria. P. falciparum is one of the primary contributors to childhood mortality and obstetric complications in the developing world, which makes the development of new diagnostic and prognostic tools, as well as a better understanding of the disease, of utmost importance. In the presented work JUMBA made it possible to detect already known trends related to disease progression, but also to discover new structures in the data connected to food intake and personal differences in metabolism. By separating the variation in each data set into joint and unique parts, JUMBA reduced the complexity of the analysis, facilitated detection of samples and variables corresponding to specific structures across multiple datasets and thereby enabled fast interpretation of the studied system. All this makes JUMBA a perfect choice for multiblock analysis of systems biology data.
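
    The split into joint and unique variation can be illustrated with a deliberately simplified sketch. It is not OnPLS or JUMBA: as an assumption, the joint score vector is taken as the leading left singular vector of the concatenated, mean-centered blocks, and each block is then divided into its projection onto that vector (joint part) and the remainder (unique part). All data and names are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples = 40
shared = rng.normal(size=(n_samples, 1))   # variation present in both blocks
own_1 = rng.normal(size=(n_samples, 1))    # variation only in block 1
own_2 = rng.normal(size=(n_samples, 1))    # variation only in block 2

X1 = shared @ np.array([[1.0, -1.0, 0.5]]) + own_1 @ np.array([[0.3, 0.3, 0.3]])
X2 = shared @ np.array([[0.8, 1.2]]) + own_2 @ np.array([[0.4, -0.4]])
blocks = [X - X.mean(axis=0) for X in (X1, X2)]

# Stand-in joint score: leading left singular vector of the concatenated blocks.
U, _, _ = np.linalg.svd(np.hstack(blocks), full_matrices=False)
t = U[:, :1]  # unit-length column vector, shape (n_samples, 1)

parts = []
for X in blocks:
    joint = t @ (t.T @ X)   # part of X explained by the joint score
    unique = X - joint      # what remains is treated as block-specific
    parts.append((joint, unique))

for name, (joint, unique) in zip(("X1", "X2"), parts):
    total = np.sum(joint**2) + np.sum(unique**2)
    print(name, "joint fraction:", round(float(np.sum(joint**2) / total), 3))
```

    By construction the two parts add back up to the block, and the unique part is orthogonal to the joint score, so the variation of each block is partitioned without overlap.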

  • 5.
    Svensson, Daniel
    et al.
    Umeå University, Faculty of Science and Technology, Department of Chemistry.
    Sjögren, Rickard
    Umeå University, Faculty of Science and Technology, Department of Chemistry. Corporate Research, Sartorius AG, Umeå, Sweden.
    Sundell, David
    Sjödin, Andreas
    Trygg, Johan
    Umeå University, Faculty of Science and Technology, Department of Chemistry. Corporate Research, Sartorius AG, Umeå, Sweden.
    doepipeline: a systematic approach to optimizing multi-level and multi-step data processing workflows
    2019. In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105, Vol. 20, no 1, article id 498. Article in journal (Refereed)
    Abstract [en]

    Background: Selecting the proper parameter settings for bioinformatic software tools is challenging. Not only will each parameter have an individual effect on the outcome, but there are also potential interaction effects between parameters. Both of these effects may be difficult to predict. To make the situation even more complex, multiple tools may be run in a sequential pipeline where the final output depends on the parameter configuration for each tool in the pipeline. Because of the complexity and difficulty of predicting outcomes, in practice parameters are often left at default settings or set based on personal or peer experience obtained in a trial and error fashion. To allow for the reliable and efficient selection of parameters for bioinformatic pipelines, a systematic approach is needed.

    Results: We present doepipeline, a novel approach to optimizing bioinformatic software parameters, based on core concepts of the Design of Experiments methodology and recent advances in subset designs. Optimal parameter settings are first approximated in a screening phase using a subset design that efficiently spans the entire search space, then optimized in the subsequent phase using response surface designs and OLS modeling. Doepipeline was used to optimize parameters in four use cases: 1) de-novo assembly, 2) scaffolding of a fragmented genome assembly, 3) k-mer taxonomic classification of Oxford Nanopore Technologies MinION reads, and 4) genetic variant calling. In all four cases, doepipeline found parameter settings that produced a better outcome with respect to the characteristic measured when compared to using default values. Our approach is implemented and available in the Python package doepipeline.

    Conclusions: Our proposed methodology provides a systematic and robust framework for optimizing software parameter settings, in contrast to labor- and time-intensive manual parameter tweaking. Implementation in doepipeline makes our methodology accessible and user-friendly, and allows for automatic optimization of tools in a wide range of cases. The source code of doepipeline is available at https://github.com/clicumu/doepipeline and it can be installed through conda-forge.
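
    The two-phase idea described in the Results (coarse screening, then a response surface design fitted by OLS) can be sketched generically with NumPy. This does not use the doepipeline API. The function run_pipeline is a hypothetical stand-in for a real pipeline run, the screening uses a plain full factorial grid instead of a subset design, and the parameter ranges and step size are invented.

```python
import itertools
import numpy as np


def run_pipeline(x1: float, x2: float) -> float:
    """Hypothetical stand-in for one pipeline run; returns a score to maximize."""
    return -(x1 - 4.0) ** 2 - 2.0 * (x2 - 6.0) ** 2 + 0.5 * (x1 - 4.0) * (x2 - 6.0)


# Phase 1 (screening): a coarse design spanning the whole search space.
coarse_levels = [0.0, 5.0, 10.0]
screening = [
    (a, b, run_pipeline(a, b))
    for a, b in itertools.product(coarse_levels, repeat=2)
]
best_x1, best_x2, _ = max(screening, key=lambda run: run[2])

# Phase 2 (optimization): three-level design centered on the best screening point.
step = 2.0
design = np.array(list(itertools.product(
    [best_x1 - step, best_x1, best_x1 + step],
    [best_x2 - step, best_x2, best_x2 + step],
)))
response = np.array([run_pipeline(a, b) for a, b in design])

# OLS fit of a full quadratic model: intercept, linear, squared and interaction terms.
x1, x2 = design[:, 0], design[:, 1]
model_matrix = np.column_stack([np.ones(len(design)), x1, x2, x1**2, x2**2, x1 * x2])
coef, *_ = np.linalg.lstsq(model_matrix, response, rcond=None)
b0, b1, b2, b11, b22, b12 = coef

# Stationary point of the fitted surface: set its gradient to zero and solve.
hessian = np.array([[2 * b11, b12], [b12, 2 * b22]])
optimum = np.linalg.solve(hessian, -np.array([b1, b2]))

print("best screening point:", (best_x1, best_x2))
print("fitted optimum:", np.round(optimum, 3))
```

    Because the stand-in score is itself quadratic, the fitted surface recovers its maximum exactly. With a real pipeline the fit would only approximate the response, and the stationary point would need checking (it may be a saddle point or fall outside the design region).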
