novoRNABreak: Local Assembly for Novel Splice Junction and Fusion Transcript Detection from RNA-Seq Data
Author(s): Yukun Tan, Vakul Mohanty, Shaoheng Liang, Jinzhuang Dou, Jun Ma, Kun Hee Kim, Marc Jan Bonder, Xinghua Shi, Charles Lee, Human Genome Structural Variation Consortium, Zechen Chong, and Ken Chen.
We present novoRNABreak, a unified framework for detecting cancer-specific novel splice junctions and fusion transcripts in RNA-seq data from human cancer samples. novoRNABreak is based on a local assembly model, which offers a tradeoff between alignment-based methods and de novo whole transcriptome assembly (WTA) methods. This approach is accurate and sensitive in assembling novel junctions that are difficult to align directly or that have multiple alignments. It is also more efficient, because it focuses on junctions rather than full-length transcripts. We demonstrate the performance of novoRNABreak through a comprehensive set of experiments using synthetic data generated from the reference genome, as well as real RNA-seq data from breast cancer and prostate cancer samples. The results show that novoRNABreak achieves better performance by fully utilizing unmapped reads and by precisely identifying junctions where short reads or small exons have multiple alignments. novoRNABreak is a fully fledged program available on GitHub (https://github.com/KChen-lab/novoRNABreak).