Identifying stable communities in Hi-C data using a multifractal null model
Umeå University, Faculty of Science and Technology, Department of Physics. ORCID iD: 0000-0002-3315-0633
Umeå University, Faculty of Science and Technology, Department of Physics.
Umeå University, Faculty of Science and Technology, Department of Physics. ORCID iD: 0000-0003-3174-8145
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Chromosome capture techniques like Hi-C have expanded our understanding of mammalian genome 3D architecture and how it influences gene activity. To analyze Hi-C datasets, researchers increasingly treat them as DNA-contact networks and use standard community detection techniques to identify mesoscale 3D communities. However, finding significant communities is challenging because Hi-C networks have cross-scale interactions and are almost fully connected. This paper presents a pipeline to distil 3D communities that remain intact under experimental noise. To this end, we bootstrap an ensemble of Hi-C datasets representing noisy data, extract 3D communities from each, and compare them with those of the unperturbed dataset. Notably, we extract the communities by maximizing local modularity (using the Generalized Louvain method) with a null model that accounts for the multifractal spectrum recently discovered in Hi-C maps. Our pipeline finds that communities that are stable under noise typically have above-average internal contact frequencies and tend to be enriched in active chromatin marks. We also find that they fold into more nested cross-scale hierarchies than less stable ones. Apart from showing how to systematically extract robust communities from Hi-C data, our paper offers new ways to generate null models that take advantage of the network's multifractal properties. We anticipate that this approach is broadly applicable to several network applications.
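
The bootstrap-and-compare idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the multinomial resampling scheme, the Jaccard-based stability score, and all function names (`resample_contacts`, `community_stability`, `threshold_components`) are assumptions introduced here for illustration. In particular, `threshold_components` is a toy stand-in for the community detector; the Generalized Louvain method and the multifractal null model described in the abstract are not implemented.

```python
import random


def resample_contacts(counts, rng):
    """Return one noisy replicate by redrawing all contacts with replacement.

    counts is a symmetric matrix (list of lists) of non-negative integers.
    Multinomial resampling is an assumption made for this sketch.
    """
    n = len(counts)
    pairs = [(i, j) for i in range(n) for j in range(i, n) if counts[i][j] > 0]
    replicate = [[0] * n for _ in range(n)]
    if not pairs:
        return replicate
    weights = [counts[i][j] for i, j in pairs]
    for i, j in rng.choices(pairs, weights=weights, k=sum(weights)):
        replicate[i][j] += 1
        if i != j:
            replicate[j][i] += 1
    return replicate


def threshold_components(counts):
    """Toy stand-in detector (hypothetical): connected components of the graph
    that keeps only contacts above the mean off-diagonal count."""
    n = len(counts)
    off = [counts[i][j] for i in range(n) for j in range(i + 1, n)]
    cutoff = sum(off) / len(off) if off else 0.0
    seen, communities = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(v for v in range(n)
                         if v != u and v not in comp and counts[u][v] > cutoff)
        seen |= comp
        communities.append(comp)
    return communities


def jaccard(a, b):
    """Jaccard overlap between two sets of bin indices."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def community_stability(counts, detect, n_boot=100, seed=0):
    """Score each community of the unperturbed map by its mean best-match
    Jaccard overlap across bootstrap replicates (1.0 = always recovered)."""
    rng = random.Random(seed)
    reference = [frozenset(c) for c in detect(counts)]
    totals = [0.0] * len(reference)
    for _ in range(n_boot):
        perturbed = [frozenset(c) for c in detect(resample_contacts(counts, rng))]
        for k, ref in enumerate(reference):
            totals[k] += max((jaccard(ref, p) for p in perturbed), default=0.0)
    return [(ref, total / n_boot) for ref, total in zip(reference, totals)]
```

Any detector that maps a contact matrix to a list of node sets can be passed as `detect`, so a modularity optimizer with a custom null model could replace the toy detector without changing the stability scoring.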

National Category
Other Physics Topics; Biophysics
Identifiers
URN: urn:nbn:se:umu:diva-231511
DOI: 10.48550/arXiv.2405.05425
OAI: oai:DiVA.org:umu-231511
DiVA, id: diva2:1911177
Available from: 2024-11-06 Created: 2024-11-06 Last updated: 2025-02-20 Bibliographically approved
In thesis
1. Finding a target on DNA: interplay between the genomic sequence and 3D structure
2024 (English) Doctoral thesis, comprehensive summary (Other academic)
Alternative title [sv]
Att hitta ett mål på DNA : samspelet mellan den genomiska sekvensen och 3D-strukturen
Abstract [en]

Cells are complex systems of interconnected machinery that maintain, repair and further their own growth. At the centre lie the instructions that coordinate it all: the DNA. This meter-long string of code carries the instructions that coordinate cell life, from basic maintenance to the specific function of the cell in the body.

These instructions are constantly used by different protein complexes, but the mechanisms behind several details of these processes are still not understood. For example, a specific set of instructions occupies a mere fraction of the whole genome, so how can these instructions be found quickly, and how can the complexes know that they have found the right set? Is this search problem related to how DNA is folded and stored in our cell nucleus? These questions are further complicated by the fact that different cell types use only specific instructions, which can change as the cell is affected by, for example, external forces. How can the DNA control which instruction set is available, and how does this affect the other questions we just asked?

These are some questions this thesis tackles. To take a step towards a better mechanistic understanding, this thesis combines data from biology and methods from physics to formulate computational and analytic models to understand the mechanical principles of DNA folding, as well as protein search and binding. This entails finding new hierarchical clusters in DNA, proposing explanations for discrepancies in DNA regulation, connecting sequence specificity with DNA folding and investigating how multiple cooperating parts complicate the DNA search problem.

We find that we can improve our tools to better understand the data we base our models on, and that sequence specificity and folding connect in intricate ways, giving us a more complete view of cellular function.

Abstract [sv, English translation]

Cells consist of intertwined machinery that maintains, repairs and promotes its own growth. At the centre lie the instructions that coordinate it all: the DNA. This meter-long string of code is the set of instructions that coordinates the life of the cell, from simple maintenance to the cell's specific function in the body.

These instructions are constantly used by various protein complexes, but we still lack a detailed understanding of several mechanisms behind these processes. For example, the length of a specific set of instructions on the DNA is only a fraction of the whole genome, so how can these instructions be found quickly, and how do the complexes know that they have found the right instructions? Is this search problem related to how DNA is folded and stored in our cell nucleus? These questions are further complicated by the fact that different cell types use only certain instructions, which can change when the cell is affected by, for example, external stresses. How can the DNA determine which set of instructions is used, and how does that affect the other questions we posed earlier?

These are some of the questions this thesis focuses on. To achieve a better mechanistic understanding, this thesis combines data from biology and methods from physics to formulate computational and analytical models for understanding the mechanistic principles behind DNA folding as well as protein search and binding. This includes finding new hierarchical clusters in DNA, proposing alternative explanations for discrepancies in DNA regulation, connecting sequence sensitivity with DNA folding, and investigating how cooperating components complicate the DNA search problem.

We find that we can improve our tools to better understand the data on which we base our models, and that sequence specificity and folding should be combined to better understand the mechanisms in the cell.

Place, publisher, year, edition, pages
Umeå: Umeå University, 2024. p. 69
Keywords
search processes, stochastic simulations, DNA, network science, gene regulation, target-finding problems
National Category
Physical Sciences; Biophysics
Research subject
Physical Biology; Physics
Identifiers
urn:nbn:se:umu:diva-231571 (URN)
978-91-8070-518-9 (ISBN)
978-91-8070-517-2 (ISBN)
Public defence
2024-12-06, NAT.D.450, Naturvetarhuset, Umeå, 13:00 (English)
Available from: 2024-11-15 Created: 2024-11-11 Last updated: 2025-02-20 Bibliographically approved

Authority records

Hedström, Lucas; Carcedo Martínez, Antón; Lizana, Ludvig
