Improved gap size estimation for scaffolding algorithms
Umeå University, Faculty of Science and Technology, Department of Plant Physiology. Umeå University, Faculty of Science and Technology, Umeå Plant Science Centre (UPSC). ORCID iD: 0000-0001-6031-005X
2012 (English) In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 28, no. 17, p. 2215-2222. Article in journal (Refereed) Published
Abstract [en]

Motivation: One of the important steps of genome assembly is scaffolding, in which contigs are linked using information from read-pairs. Scaffolding provides estimates about the order, relative orientation and distance between contigs. We have found that contig distance estimates are generally strongly biased and based on false assumptions. Since erroneous distance estimates can mislead in subsequent analysis, it is important to provide unbiased estimation of contig distance.

Results: In this article, we show that state-of-the-art programs for scaffolding are using an incorrect model of gap size estimation. We discuss why current maximum likelihood estimators are biased and describe what different cases of bias we are facing. Furthermore, we provide a model for the distribution of reads that span a gap and derive the maximum likelihood equation for the gap length. We motivate why this estimate is sound and show empirically that it outperforms gap estimators in popular scaffolding programs. Our results have consequences both for scaffolding software, structural variation detection and for library insert-size estimation as is commonly performed by read aligners.
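
The kind of bias described in the abstract can be illustrated with a small simulation. The sketch below is not the estimator derived in the article; it is a deliberately simplified model built on assumptions that do not come from this record: insert sizes are normally distributed with known mean and standard deviation, contigs are much longer than any insert, read lengths are ignored, and a fragment of length x can span a gap of size d in a number of ways proportional to x - d. All function and variable names are hypothetical. Under these assumptions, the naive estimate (library mean minus mean observed distance) underestimates the gap, while maximizing the likelihood conditioned on spanning recovers it more closely.

```python
import math
import random


def span_normalizer(gap, mu, sigma):
    """Z(d): integral over x > d of (x - d) * Normal(x; mu, sigma), in closed form."""
    z = (gap - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    tail = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(X > gap)
    return sigma * pdf + (mu - gap) * tail


def log_likelihood(gap, observations, mu, sigma):
    """Log-likelihood of the gap, up to terms that do not depend on the gap.

    Assumed density of an observation o (insert size minus gap), given that
    the pair spans the gap: Normal(o + gap; mu, sigma) * o / Z(gap).
    """
    n = len(observations)
    squared = sum((o + gap - mu) ** 2 for o in observations)
    return -squared / (2.0 * sigma ** 2) - n * math.log(span_normalizer(gap, mu, sigma))


def ml_gap(observations, mu, sigma):
    """Grid search over integer gap sizes from 0 to mu + 4 * sigma."""
    candidates = range(0, int(mu + 4 * sigma) + 1)
    return max(candidates, key=lambda d: log_likelihood(d, observations, mu, sigma))


def naive_gap(observations, mu):
    """Library mean minus mean observed distance; ignores the spanning condition."""
    return mu - sum(observations) / len(observations)


def simulate_observations(true_gap, mu, sigma, n, rng):
    """Rejection-sample spanning fragments with acceptance proportional to x - gap."""
    x_max = mu + 6 * sigma  # longer fragments are ignored (negligible mass)
    observations = []
    while len(observations) < n:
        x = rng.gauss(mu, sigma)
        if true_gap < x <= x_max and rng.random() < (x - true_gap) / (x_max - true_gap):
            observations.append(x - true_gap)
    return observations


# Hypothetical library and gap parameters, chosen only for illustration.
MU, SIGMA, TRUE_GAP = 500.0, 100.0, 400
rng = random.Random(0)
observations = simulate_observations(TRUE_GAP, MU, SIGMA, 1000, rng)
naive = naive_gap(observations, MU)
ml = ml_gap(observations, MU, SIGMA)
print(f"true gap: {TRUE_GAP}, naive estimate: {naive:.1f}, ML estimate: {ml}")
```

In this toy setting the naive estimate falls short of the true gap because only longer-than-average fragments are able to span it, so their mean is not the library mean. A model closer to a real scaffolder would also need to account for finite contig lengths and read lengths, which this sketch leaves out.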

Place, publisher, year, edition, pages
Oxford: Oxford University Press, 2012. Vol. 28, no. 17, p. 2215-2222
National Category
Biochemistry and Molecular Biology
Identifiers
URN: urn:nbn:se:umu:diva-60313
DOI: 10.1093/bioinformatics/bts441
ISI: 000308019200001
OAI: oai:DiVA.org:umu-60313
DiVA, id: diva2:566647
Available from: 2012-11-09 Created: 2012-10-09 Last updated: 2018-06-08 Bibliographically approved

Authority records

Street, Nathaniel
