Finding a target on DNA: interplay between the genomic sequence and 3D structure
Hedström, Lucas. Umeå University, Faculty of Science and Technology, Department of Physics. ORCID iD: 0000-0002-3315-0633
2024 (English). Doctoral thesis, comprehensive summary (Other academic)
Alternative title
Att hitta ett mål på DNA : samspelet mellan den genomiska sekvensen och 3D-strukturen (Swedish)
Abstract [en]

Cells are complex systems of interconnected machinery that maintain, repair, and further their own growth. At the centre lie the instructions that coordinate it all: the DNA. This meter-long string of code governs cell life, from basic maintenance to the specific function of the cell in the body.

These instructions are constantly used by different protein complexes, but the mechanisms behind several details of these processes are still not understood. For example, a specific set of instructions occupies a mere fraction of the whole genome. How can these instructions be found quickly, and how can the complexes know that they have found the right set? Is this search problem related to how DNA is folded and stored in our cell nucleus? These questions are further complicated by the fact that different cell types use only specific instructions, which can change as the cell is affected by, for example, external forces. How can the DNA control which instruction set is available, and how does this affect the other questions we just asked?

These are some of the questions this thesis tackles. To take a step towards a better mechanistic understanding, this thesis combines data from biology with methods from physics to formulate computational and analytic models of the mechanical principles of DNA folding, as well as protein search and binding. This entails finding new hierarchical clusters in DNA, proposing explanations for discrepancies in DNA regulation, connecting sequence specificity with DNA folding, and investigating how multiple cooperating parts complicate the DNA search problem.

We find that we can improve our tools to better understand the data we base our models on, and that sequence specificity and folding connect in intricate ways, giving us a more complete view of cellular function.

Abstract [sv] (translated from Swedish)

Cells consist of intertwined machinery that maintains, repairs, and promotes its own growth. At the centre lie the instructions that coordinate everything: the DNA. This meter-long string of code holds the instructions that coordinate the life of the cell, from simple maintenance to the cell's specific function in the body.

These instructions are constantly used by different protein complexes, but we still lack a detailed understanding of several mechanisms behind these processes. For example, the length of a specific set of instructions on the DNA is only a fraction of the whole genome. How can these instructions be found quickly, and how do the complexes know that they have found the right instructions? Is this search problem related to how DNA is folded and stored in our cell nucleus? These questions are further complicated by the fact that different cell types use only certain instructions, which can change when the cell is affected by, for example, external stresses. How can the DNA determine which set of instructions is used, and how does that affect the other questions we posed earlier?

These are some of the questions this thesis focuses on. To achieve a better mechanistic understanding, this thesis combines data from biology and methods from physics to formulate computational and analytical models for understanding the mechanistic principles behind DNA folding as well as protein search and binding. This includes finding new hierarchical clusters in DNA, proposing alternative explanations for discrepancies in DNA regulation, connecting sequence sensitivity with DNA folding, and investigating how cooperating components complicate the DNA search problem.

We find that we can improve our tools to better understand the data on which we base our models, and that sequence specificity and folding should be combined to better understand the mechanisms in the cell.

Place, publisher, year, edition, pages
Umeå: Umeå University, 2024, p. 69
Keywords [en]
search processes, stochastic simulations, DNA, network science, gene regulation, target-finding problems
National Category
Physical Sciences; Biophysics
Research subject
Physical Biology; Physics
Identifiers
URN: urn:nbn:se:umu:diva-231571
ISBN: 978-91-8070-518-9 (electronic)
ISBN: 978-91-8070-517-2 (print)
OAI: oai:DiVA.org:umu-231571
DiVA, id: diva2:1912016
Public defence
2024-12-06, NAT.D.450, Naturvetarhuset, Umeå, 13:00 (English)
Opponent
Supervisors
Available from: 2024-11-15. Created: 2024-11-11. Last updated: 2024-11-11. Bibliographically approved.
List of papers
1. Identifying stable communities in Hi-C data using a multifractal null model
(English). Manuscript (preprint) (Other academic)
Abstract [en]

Chromosome capture techniques like Hi-C have expanded our understanding of mammalian genome 3D architecture and how it influences gene activity. To analyze Hi-C data sets, researchers increasingly treat them as DNA-contact networks and use standard community detection techniques to identify mesoscale 3D communities. However, there are considerable challenges in finding significant communities because the Hi-C networks have cross-scale interactions and are almost fully connected. This paper presents a pipeline to distil 3D communities that remain intact under experimental noise. To this end, we bootstrap an ensemble of Hi-C datasets representing noisy data and extract 3D communities that we compare with the unperturbed dataset. Notably, we extract the communities by maximizing local modularity (using the Generalized Louvain method), which considers the multifractal spectrum recently discovered in Hi-C maps. Our pipeline finds that stable communities (under noise) typically have above-average internal contact frequencies and tend to be enriched in active chromatin marks. We also find they fold into more nested cross-scale hierarchies than less stable ones. Apart from presenting how to systematically extract robust communities in Hi-C data, our paper offers new ways to generate null models that take advantage of the network's multifractal properties. We anticipate that this approach applies broadly to other network applications.
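
The modularity score that community detection maximizes can be illustrated with a minimal sketch. It assumes the standard Newman-Girvan (configuration-model) null model, not the multifractal null model of the paper, and an undirected weighted network without self-loops. The function name `modularity` and the toy network are hypothetical.

```python
from collections import defaultdict


def modularity(weights, community):
    """Newman-Girvan modularity Q of a partition (assumed null model, not the paper's).

    weights:   dict mapping an undirected link (i, j) to its contact weight
    community: dict mapping each node to a community label
    """
    strength = defaultdict(float)   # node strength: summed weight of a node's links
    internal = defaultdict(float)   # link weight that falls inside each community
    total = 0.0                     # total link weight m
    for (i, j), w in weights.items():
        strength[i] += w
        strength[j] += w
        total += w
        if community[i] == community[j]:
            internal[community[i]] += w

    volume = defaultdict(float)     # summed node strength of each community
    for node, s in strength.items():
        volume[community[node]] += s

    # Q = sum over communities of (observed internal fraction - expected fraction)
    return sum(internal[c] / total - (volume[c] / (2 * total)) ** 2 for c in volume)


# Toy network: two triangles joined by a single link (hypothetical data).
toy_links = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0,
             (3, 4): 1.0, (4, 5): 1.0, (3, 5): 1.0,
             (2, 3): 1.0}
two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_block = {node: "a" for node in range(6)}

q_split = modularity(toy_links, two_triangles)
q_single = modularity(toy_links, one_block)
```

Splitting the toy network into its two triangles gives a positive score, while lumping all nodes into one community gives zero, which is why the split is preferred under this null model.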

National Category
Other Physics Topics; Biophysics
Identifiers
urn:nbn:se:umu:diva-231511 (URN)
10.48550/arXiv.2405.05425 (DOI)
Available from: 2024-11-06. Created: 2024-11-06. Last updated: 2024-11-11. Bibliographically approved.
2. Enhancer-insulator pairing reveals heterogeneous dynamics in long-distance 3D gene regulation
2024 (English). In: PRX Life, E-ISSN 2835-8279, Vol. 2, no 3, article id 033008. Article in journal (Refereed). Published
Abstract [en]

Cells regulate fates and complex body plans using spatiotemporal signaling cascades that alter gene expression. Short DNA sequences, known as enhancers (50–1500 base pairs), help coordinate these cascades by attracting regulatory proteins that enhance the transcription by binding to distal gene promoters. In humans, there are hundreds of thousands of enhancers dispersed across the genome, which poses a challenging coordination task to prevent unintended gene activation. To mitigate this problem, the genome contains insulator elements that block enhancer-promoter interactions. However, there is an open problem with how the insulation works, especially as enhancer-insulator pairs may be separated by millions of base pairs. Based on recent empirical data from Hi-C experiments, this paper proposes a new mechanism that challenges the common paradigm that rests on specific insulator-insulator interactions. Instead, this paper introduces a stochastic looping model where insulators bind weakly to chromatin rather than other insulators. After calibrating the model to experimental data, we use simulations to study the broad distribution of hitting times between an enhancer and a promoter when insulators are present. We find parameter regimes with large differences between average and most probable hitting times. This makes it difficult to assign a typical timescale and hints at highly defocused regulation times. We also map our computational model onto a resetting problem that allows us to derive several analytical results. Besides offering new insights into enhancer-insulator interactions, our paper advances the understanding of gene regulatory networks and causal connections between genome folding and gene activation.
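
The gap between average and most probable hitting times can be illustrated with a textbook case. This sketch is not the paper's looping model: it assumes free one-dimensional diffusion towards a target at a fixed distance, for which the first-passage density is known in closed form. The function names and parameter values are illustrative assumptions.

```python
import math


def first_passage_density(t, distance, diffusion):
    """Density of the first time a 1D Brownian particle reaches a point `distance` away."""
    prefactor = distance / math.sqrt(4 * math.pi * diffusion * t ** 3)
    return prefactor * math.exp(-distance ** 2 / (4 * diffusion * t))


def survival(t, distance, diffusion):
    """Probability that the target has not yet been hit at time t."""
    return math.erf(distance / math.sqrt(4 * diffusion * t))


distance, diffusion = 1.0, 1.0  # illustrative values, arbitrary units

# Most probable hitting time: maximise the density on a fine grid over (0, 10].
grid = [k * 1e-4 for k in range(1, 100001)]
mode = max(grid, key=lambda t: first_passage_density(t, distance, diffusion))

# Median hitting time: bisection on survival(t) = 1/2 (survival decreases with t).
low, high = 1e-6, 1e3
for _ in range(100):
    middle = 0.5 * (low + high)
    if survival(middle, distance, diffusion) > 0.5:
        low = middle
    else:
        high = middle
median = 0.5 * (low + high)

print(f"mode = {mode:.4f}, median = {median:.4f}, ratio = {median / mode:.2f}")
```

In this idealized case the most probable time is distance²/(6·diffusion), the median is several times larger, and the mean diverges because the density decays as t^(-3/2). No single number describes a typical hitting time, which is the kind of behavior the abstract calls defocused.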

Place, publisher, year, edition, pages
American Physical Society (APS), 2024
National Category
Biophysics; Other Physics Topics; Condensed Matter Physics
Identifiers
urn:nbn:se:umu:diva-231089 (URN)
10.1103/prxlife.2.033008 (DOI)
Funder
Swedish Research Council, 2017-03848
Swedish Research Council, 2021-04080
Swedish Research Council, 2022-06725
Available from: 2024-10-22. Created: 2024-10-22. Last updated: 2024-11-11. Bibliographically approved.
3. Modelling chromosome-wide target search
2023 (English). In: New Journal of Physics, E-ISSN 1367-2630, Vol. 25, no 3, article id 033024. Article in journal (Refereed). Published
Abstract [en]

The most common gene regulation mechanism is when a transcription factor (TF) protein binds to a regulatory sequence to increase or decrease RNA transcription. However, TFs face two main challenges when searching for these sequences. First, the sequences are vanishingly short relative to the genome length. Second, there are many nearly identical sequences scattered across the genome, causing proteins to suspend the search. But as pointed out in a computational study of LacI regulation in Escherichia coli, such almost-targets may lower search times if considering DNA looping. In this paper, we explore if this also occurs over chromosome-wide distances. To this end, we developed a cross-scale computational framework that combines established facilitated-diffusion models for basepair-level search and a network model capturing chromosome-wide leaps. To make our model realistic, we used Hi-C data sets as a proxy for 3D proximity between long-ranged DNA segments and binding profiles for more than 100 TFs. Using our cross-scale model, we found that median search times to individual targets critically depend on a network metric combining node strength (sum of link weights) and local dissociation rates. Also, by randomizing these rates, we found that some actual 3D target configurations stand out as considerably faster or slower than their random counterparts. This finding hints that chromosomes’ 3D structure funnels essential TFs to relevant DNA regions.

Place, publisher, year, edition, pages
Institute of Physics (IOP), 2023
Keywords
chromosome 3D folding, diffusion on networks, DNA target-search, gene regulation, Hi-C data, stochastic simulations
National Category
Bioinformatics and Systems Biology
Identifiers
urn:nbn:se:umu:diva-206375 (URN)
10.1088/1367-2630/acc127 (DOI)
000951783900001 ()
2-s2.0-85150899174 (Scopus ID)
Funder
Swedish Research Council, 2017-03848
Swedish Research Council, 2018-05973
Available from: 2023-04-04. Created: 2023-04-04. Last updated: 2024-11-11. Bibliographically approved.
4. Target search on networks-within-networks with applications to protein-DNA interactions
(English). Manuscript (preprint) (Other academic)
Abstract [en]

We present a novel framework for understanding node target search in systems organized as hierarchical networks-within-networks. Our work generalizes traditional search models on complex networks, where the mean first-passage time is typically inversely proportional to the node degree. However, real-world search processes often span multiple network layers, such as moving from an external environment into a local network, and then navigating several internal states. This multilayered complexity appears in scenarios such as international travel networks, tracking email spammers, and the dynamics of protein-DNA interactions in cells. Our theory addresses these complex systems by modeling them as a three-layer multiplex network: an external source layer, an intermediate spatial layer, and an internal state layer. We derive general closed-form solutions for the steady-state flux through a target node, which serves as a proxy for inverse mean first-passage time. Our results reveal a universal relationship between search efficiency and network-specific parameters. This work extends the current understanding of multiplex networks by focusing on systems with hierarchically connected layers. Our findings have broad implications for fields ranging from epidemiology to cellular biology and provide a more comprehensive understanding of search dynamics in complex, multilayered environments.
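
The degree dependence mentioned for traditional single-layer search models can be made concrete with a closely related quantity. This sketch assumes an unbiased random walk on a connected, undirected, unweighted network and computes the mean return time rather than the mean first-passage time from arbitrary starting nodes. It does not implement the three-layer model of the manuscript, and the function names and toy network are hypothetical.

```python
def stationary_and_return_times(adjacency):
    """Stationary probability k_i / 2m and mean return time 2m / k_i for every node."""
    degree = {node: len(neighbours) for node, neighbours in adjacency.items()}
    two_m = sum(degree.values())  # twice the number of links
    stationary = {node: k / two_m for node, k in degree.items()}
    return_time = {node: two_m / k for node, k in degree.items()}
    return stationary, return_time


def one_step(probability, adjacency):
    """Advance a probability distribution by one step of the unbiased random walk."""
    updated = {node: 0.0 for node in adjacency}
    for node, neighbours in adjacency.items():
        share = probability[node] / len(neighbours)  # walker picks a neighbour uniformly
        for other in neighbours:
            updated[other] += share
    return updated


# Toy network (hypothetical): a hub linked to three nodes, two of which are also linked.
adjacency = {
    "hub": ["a", "b", "c"],
    "a": ["hub", "b"],
    "b": ["hub", "a"],
    "c": ["hub"],
}

stationary, return_time = stationary_and_return_times(adjacency)
after_step = one_step(stationary, adjacency)  # should reproduce `stationary`
```

The distribution is unchanged by one step of the walk, confirming that it is stationary, and the mean return time is exactly inversely proportional to degree: the hub with three links is revisited three times as often as the node with one link.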

National Category
Other Physics Topics; Biophysics
Identifiers
urn:nbn:se:umu:diva-231512 (URN)
10.48550/arXiv.2411.02660 (DOI)
Available from: 2024-11-06. Created: 2024-11-06. Last updated: 2024-11-11. Bibliographically approved.
5. Exploring the benefits of DNA-target search with antenna
(English). Manuscript (preprint) (Other (popular science, discussion, etc.))
Abstract [en]

The most common gene regulation mechanism is when a protein binds to a regulatory sequence to change RNA transcription. However, these sequences are short relative to the genome length, so finding them poses a challenging search problem. This paper presents two mathematical frameworks capturing different aspects of this problem. First, we study the interplay between diffusional flux through a target where the searching proteins get sequestered on DNA far from the target because of non-specific interactions. From this model, we derive a simple formula for the optimal protein-DNA unbinding rate, maximizing the particle flux. Second, we study how the flux flows through a target on a single antenna with variable length. Here, we identify a non-trivial logarithmic correction to the linear behavior relative to the target size proposed by Smoluchowski's flux formula.

National Category
Other Physics Topics; Biophysics
Identifiers
urn:nbn:se:umu:diva-231568 (URN)
10.48550/arXiv.2311.11727 (DOI)
Available from: 2024-11-07. Created: 2024-11-07. Last updated: 2024-11-11. Bibliographically approved.

Open Access in DiVA

fulltext (2495 kB), 29 downloads
File information
File name: FULLTEXT01.pdf
File size: 2495 kB
Checksum SHA-512:
dcc4c716870d61eb3db803656f4524e7807ec45e1cc2773eddf11b5e9e54b2faf147723b8d41310579e5a506fd33bbba7a01e9096fc994f2bffee3b95f0a4d47
Type: fulltext
Mimetype: application/pdf
spikblad (87 kB), 13 downloads
File information
File name: FULLTEXT02.pdf
File size: 87 kB
Checksum SHA-512:
2505f6a38b83980b36f40edcc30cc759a4406fe93e4fc130ae7d0cef53eaeebb00e85030a75afbe8d20f151caa1e438f819edcf8104ea1b293e3a63cb4b2f928
Type: spikblad
Mimetype: application/pdf

Authority records

Hedström, Lucas