Improved gap size estimation for scaffolding algorithms
Umeå University, Faculty of Science and Technology, Department of Plant Physiology. Umeå University, Faculty of Science and Technology, Umeå Plant Science Centre (UPSC). ORCID iD: 0000-0001-6031-005X
2012 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 28, no. 17, pp. 2215-2222. Journal article (peer-reviewed). Published
Abstract [en]

Motivation: One of the important steps of genome assembly is scaffolding, in which contigs are linked using information from read-pairs. Scaffolding provides estimates of the order, relative orientation and distance between contigs. We have found that contig distance estimates are generally strongly biased and based on false assumptions. Since erroneous distance estimates can mislead subsequent analysis, it is important to provide unbiased estimates of contig distance.

Results: In this article, we show that state-of-the-art scaffolding programs use an incorrect model for gap size estimation. We discuss why current maximum likelihood estimators are biased and describe the different cases of bias that arise. Furthermore, we provide a model for the distribution of reads that span a gap and derive the maximum likelihood equation for the gap length. We explain why this estimate is sound and show empirically that it outperforms the gap estimators in popular scaffolding programs. Our results have consequences for scaffolding software, for structural variation detection and for library insert-size estimation as commonly performed by read aligners.

Place, publisher, year, edition, pages
Oxford: Oxford University Press, 2012. Vol. 28, no. 17, pp. 2215-2222
Identifiers
URN: urn:nbn:se:umu:diva-60313
DOI: 10.1093/bioinformatics/bts441
ISI: 000308019200001
OAI: oai:DiVA.org:umu-60313
DiVA, id: diva2:566647
Available from: 2012-11-09. Created: 2012-10-09. Last updated: 2018-06-08. Bibliographically approved

Person records: Street, Nathaniel
