A scalable, data analytics workflow for image-based morphological profiles
Umeå University, Faculty of Science and Technology, Department of Chemistry.ORCID iD: 0000-0002-1898-4453
Sartorius Corporate Research, Bordeaux, France.
Sartorius Corporate Research, Umeå, Sweden.ORCID iD: 0000-0001-8357-5018
Sartorius BioAnalytics, Royston, United Kingdom.
2024 (English). In: Chemometrics and Intelligent Laboratory Systems, ISSN 0169-7439, E-ISSN 1873-3239, Vol. 254, article id 105232. Article in journal (Refereed). Published
Abstract [en]

Cell Painting is an established community-based microscopy-assay platform that provides high-throughput, high-content data for biological readouts. In November 2022, the JUMP-Cell Painting Consortium released the largest publicly available Cell Painting dataset with CellProfiler features, comprising more than 2 billion cell images. This dataset is designed for predicting the activity and toxicity of 115k drug compounds, with the aim of making cell images as computable as genomes and transcriptomes. In this context, our paper introduces a scalable and computationally efficient data analytics workflow created to meet the needs of researchers. This data-driven workflow facilitates the comparison of drug treatment effects through significant and biologically relevant insights. The workflow consists of two parts: first, the Equivalence score (Eq. score), a straightforward yet sophisticated metric that highlights relevant deviations from negative controls based on cell image morphology; second, the scalability of the workflow, which applies the Eq. scores on a large scale to predict and classify subtle morphological changes in cell image profiles. By doing so, we show classification improvements over the raw CellProfiler features on the CPJUMP1-pilot dataset for three types of perturbations. We hope that our workflow's contributions will enhance drug screening efficiency and streamline the drug development process. As this process is resource-intensive, every incremental improvement is valuable. Through our collective efforts in advancing the understanding of high-throughput image-based data, we aim to reduce both the time and cost of developing new, life-saving treatments.
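The abstract does not define the Eq. score, so the sketch below does not attempt to reproduce it. It only illustrates the general idea of scoring a treated profile by its deviation from negative controls, using a per-feature robust z-score (median and MAD of the control wells), which is a common baseline in morphological profiling. The function name, the toy feature values, and the choice of robust z-score are all assumptions made for illustration.

```python
from statistics import median


def robust_z_vs_controls(treated, controls):
    """Per-feature robust z-score of one treated profile against control wells.

    NOTE: illustrative stand-in only; this is NOT the Eq. score from the paper.
    `treated` is a list of feature values, `controls` a list of such lists.
    """
    scores = []
    for j, value in enumerate(treated):
        column = [well[j] for well in controls]
        center = median(column)
        mad = median(abs(x - center) for x in column)
        scale = 1.4826 * mad  # MAD rescaled to match the std. dev. under normality
        # A feature that is constant across controls carries no scale; report 0.
        scores.append((value - center) / scale if scale > 0 else 0.0)
    return scores


# Hypothetical toy data: three control wells, two features each.
controls = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
print(robust_z_vs_controls([2.0, 12.0], controls))
```

Scaling by the controls' own spread puts all features on a comparable footing, so a large score indicates a feature that moved far outside the range seen in untreated cells.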

Place, publisher, year, edition, pages
Elsevier, 2024. Vol. 254, article id 105232
Keywords [en]
Cell Painting, Chemometrics, Computational Workflow, Drug discovery, High-throughput Screening, Morphological Profiling, Quantitative Image Analysis
National Category
Bioinformatics (Computational Biology) Pharmacology and Toxicology
Identifiers
URN: urn:nbn:se:umu:diva-230015
DOI: 10.1016/j.chemolab.2024.105232
ISI: 001320783800001
Scopus ID: 2-s2.0-85204373412
OAI: oai:DiVA.org:umu-230015
DiVA, id: diva2:1902720
Funder
eSSENCE - An eScience Collaboration
Available from: 2024-10-02 Created: 2024-10-02 Last updated: 2025-04-24 Bibliographically approved
In thesis
1. Chemometric strategies for supervised multi-model analysis
2025 (English). Doctoral thesis, comprehensive summary (Other academic)
Alternative title [sv]
Kemometriska strategier för guidad multi-modellanalys
Abstract [en]

Understanding biological processes is inherently complex. The cellular machinery and biochemical pathways present significant challenges in scientific research. Advances in data collection, such as high-content imaging and omics technologies, have enabled deeper insights, but extracting meaningful conclusions from these complicated datasets remains a challenge. In this thesis, the focus has been on developing chemometric strategies and supervised modelling approaches to improve data interpretation, aiming to aid scientists in drawing conclusions from their data.

In Paper I, we show that cell imaging data, combined with chemometric tools, can effectively characterize treatment effects, leading to the development of a metric called Equivalence (Eq.) scores. This work raised two main questions: Are fluorescent labels necessary for meaningful characterization? Can living cells, imaged over time, provide deeper insights? In Paper III, we address these questions by investigating an approach based on label-free live-cell imaging data where we extended the Eq. scores to time series data. We demonstrate that time-dependent analysis reveals both early and late cellular responses and improves the prediction of drug mechanisms.

In Paper II, we address challenges arising when Orthogonal Projections to Latent Structures-Discriminant Analysis (OPLS-DA) models are used to analyse several classes, such as subtypes of diseases or different treatments. We introduce OPLS-Hierarchical Discriminant Analysis (OPLS-HDA), a method that integrates hierarchical clustering analysis (HCA) with two-class OPLS-DA models to create an OPLS-based decision tree. We demonstrated that OPLS-HDA is a strong classifier compared to eight other established methods while maintaining interpretability. Additionally, we provide Python scripts that are integrated with SIMCA®, offering a user-friendly interface for broader accessibility.

Extracting reliable insights from complex data requires intentional and structured approaches. This work highlights the benefits of modular and interpretable modelling solutions, ensuring that results are both understandable and trustworthy. By breaking down complex analytical challenges and building tools that enhance interpretability, this work contributes to the broader goal of accelerating data-driven discoveries in life sciences.
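The structure described for OPLS-HDA (hierarchical clustering of classes, then a two-class model at each split) can be sketched as a small decision tree. This is a rough illustration under stated assumptions, not the thesis implementation: a nearest-centroid rule stands in for each two-class OPLS-DA model, classes are merged by centroid distance as a stand-in for the HCA step, and all names and data below are hypothetical.

```python
from math import dist


def centroid(rows):
    """Mean vector of a list of equal-length feature rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def build_tree(samples_by_class):
    """Agglomerate classes pairwise by centroid distance into a binary tree."""
    nodes = [{"classes": [label], "rows": rows, "left": None, "right": None}
             for label, rows in samples_by_class.items()]
    while len(nodes) > 1:
        pairs = ((a, b) for a in range(len(nodes)) for b in range(a + 1, len(nodes)))
        i, j = min(pairs, key=lambda p: dist(centroid(nodes[p[0]]["rows"]),
                                             centroid(nodes[p[1]]["rows"])))
        left, right = nodes[i], nodes[j]
        merged = {"classes": left["classes"] + right["classes"],
                  "rows": left["rows"] + right["rows"],
                  "left": left, "right": right}
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]


def predict(tree, x):
    """Descend the tree; each split is a two-class decision (stand-in model)."""
    node = tree
    while node["left"] is not None:
        d_left = dist(x, centroid(node["left"]["rows"]))
        d_right = dist(x, centroid(node["right"]["rows"]))
        node = node["left"] if d_left <= d_right else node["right"]
    return node["classes"][0]


# Hypothetical toy data: classes A and B lie close together, C is far away.
data = {"A": [[0, 0], [0, 1]], "B": [[1, 0], [1, 1]], "C": [[10, 10], [10, 11]]}
tree = build_tree(data)
print(predict(tree, [0.9, 0.4]), predict(tree, [9, 9]))
```

Because every split separates only two groups, each decision can be inspected on its own, which is the interpretability argument the abstract makes for a tree of two-class models.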

Place, publisher, year, edition, pages
Umeå: Umeå University, 2025. p. 58
Keywords
Label-free live-cell imaging, Morphological profiling, Multi-class classification
National Category
Pharmacology and Toxicology Bioinformatics and Computational Biology
Identifiers
urn:nbn:se:umu:diva-236640 (URN)
978-91-8070-642-1 (ISBN)
978-91-8070-643-8 (ISBN)
Public defence
2025-04-16, Stora Hörsalen (KBE303), KBC-huset, Linnaeus väg 6, Umeå, 09:00 (English)
Funder
eSSENCE - An eScience Collaboration
Available from: 2025-03-26 Created: 2025-03-19 Last updated: 2025-03-21 Bibliographically approved

Authority records

Forsgren, Edvin; Jonsson, Pär; Trygg, Johan
