Optimizing de novo transcriptome assembly from short-read RNA-Seq data: a comparative study.
- Author(s): Zhao QY; Wang Y; Kong YM; Luo D; Li X; Hao P
- Source:
BMC bioinformatics [BMC Bioinformatics] 2011 Dec 14; Vol. 12 Suppl 14, pp. S2. Date of Electronic Publication: 2011 Dec 14.
- Publication Type:
Comparative Study; Journal Article; Research Support, Non-U.S. Gov't
- Language:
English
- Additional Information
- Source:
Publisher: BioMed Central Country of Publication: England NLM ID: 100965194 Publication Model: Electronic Cited Medium: Internet ISSN: 1471-2105 (Electronic) Linking ISSN: 14712105 NLM ISO Abbreviation: BMC Bioinformatics Subsets: MEDLINE
- Publication Information:
Original Publication: [London] : BioMed Central, 2000-
- Abstract:
Background: With rapid advances in next-generation sequencing technology, high-throughput RNA sequencing has emerged as a powerful and cost-effective approach to transcriptome study. De novo assembly of transcripts provides an important solution for transcriptome analysis in organisms with no reference genome. However, there was little understanding of how different variables affected assembly outcomes, and no consensus on how to approach an optimal solution by selecting a software tool and a suitable strategy based on the properties of the RNA-Seq data.
Results: To assess the performance of different programs for transcriptome assembly, this work analyzed several important factors, including k-mer values, genome complexity, coverage depth, and directional reads. Seven program conditions were tested: four single k-mer assemblers (SK: SOAPdenovo, ABySS, Oases and Trinity) and three multiple k-mer methods (MK: SOAPdenovo-MK, trans-ABySS and Oases-MK). While small and large k-mer values performed better for reconstructing lowly and highly expressed transcripts, respectively, the MK strategy worked well across almost all expression quintiles. Among SK tools, Trinity performed well across various conditions but had the longest running time. Oases consumed the most memory, whereas SOAPdenovo had the shortest runtime but performed poorly at reconstructing full-length CDS. ABySS showed a good balance between resource usage and assembly quality.
Conclusions: Our work compared the performance of publicly available transcriptome assemblers, and analyzed important factors affecting de novo assembly. Some practical guidelines for transcript reconstruction from short-read RNA-Seq data were proposed. De novo assembly of C. sinensis transcriptome was greatly improved using some optimized methods.
- Publication Date:
Date Created: 20120301 Date Completed: 20130308 Latest Revision: 20220321
- Publication Date:
20250114
- Accession Number:
PMC3287467 (PMCID); 10.1186/1471-2105-12-S14-S2 (DOI); 22373417 (PMID)