Statistical analysis of differential transcript usage following Salmon quantification

Detection of differential transcript usage (DTU) from RNA-seq data is an important bioinformatic analysis that complements differential gene expression analysis. Researchers at the University of North Carolina at Chapel Hill have developed a simple workflow using a set of existing R/Bioconductor packages for analysis of DTU. They show how these packages can be used downstream of RNA-seq quantification using the Salmon software package. The entire pipeline is fast, benefiting from inference steps by Salmon to quantify expression at the transcript level. The workflow includes live, runnable code chunks for analysis using DRIMSeq and DEXSeq, as well as for performing two-stage testing of DTU using the stageR package, a statistical framework to screen at the gene level and then confirm which transcripts within the significant genes show evidence of DTU. The researchers evaluate these packages and other related packages on a simulated dataset with parameters estimated from real data.
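
The screen-then-confirm idea behind two-stage testing can be illustrated with a minimal Python sketch. This is not the stageR implementation, and the workflow itself is written in R. The sketch assumes gene-level and transcript-level p-values are already available (for example from DRIMSeq or DEXSeq), uses Benjamini-Hochberg adjustment for the gene-level screen, and applies a plain Holm correction within each significant gene at a significance level scaled by the fraction of genes that passed the screen. The function names and the toy p-values are hypothetical.

```python
from typing import Dict, List


def bh_adjust(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def holm_reject(pvalues: List[float], alpha: float) -> List[bool]:
    """Holm step-down procedure: True where the hypothesis is rejected."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    reject = [False] * m
    for step, i in enumerate(order):
        if pvalues[i] <= alpha / (m - step):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject


def two_stage_dtu(
    gene_p: Dict[str, float],
    tx_p: Dict[str, Dict[str, float]],
    alpha: float = 0.05,
) -> Dict[str, List[str]]:
    """Screen genes, then confirm transcripts within the significant genes."""
    genes = list(gene_p)
    adjusted = bh_adjust([gene_p[g] for g in genes])
    significant = [g for g, q in zip(genes, adjusted) if q <= alpha]
    # Assumption: confirmation level scaled by the fraction of genes screened in.
    alpha_confirm = alpha * len(significant) / len(genes)
    confirmed = {}
    for g in significant:
        txs = list(tx_p[g])
        rejected = holm_reject([tx_p[g][t] for t in txs], alpha_confirm)
        confirmed[g] = [t for t, r in zip(txs, rejected) if r]
    return confirmed


# Toy input: hypothetical p-values, not taken from the paper.
gene_p = {"geneA": 0.001, "geneB": 0.8, "geneC": 0.004, "geneD": 0.5}
tx_p = {
    "geneA": {"A.1": 0.0005, "A.2": 0.002, "A.3": 0.6},
    "geneB": {"B.1": 0.7, "B.2": 0.9},
    "geneC": {"C.1": 0.02, "C.2": 0.3},
    "geneD": {"D.1": 0.4, "D.2": 0.6},
}
confirmed = two_stage_dtu(gene_p, tx_p)
print(confirmed)
```

In this toy input, geneA and geneC pass the gene-level screen, but only geneA has transcripts that survive the confirmation stage. A gene can therefore be flagged at the screening stage without any single transcript being confirmed.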

MA plot of simulated abundances

Each point depicts a transcript, with the average log2 abundance in transcripts-per-million (TPM) on the x-axis and the difference between the two groups on the y-axis. Of the transcripts which are expressed with TPM > 1 in at least one group, 77% are null transcripts (grey), which fall by construction on the M=0 line, and 23% are differentially expressed (green, orange, or purple). As transcripts can belong to multiple categories of differential gene expression (DGE), differential transcript expression (DTE), and differential transcript usage (DTU), here the transcripts are colored by which genes they belong to (those selected to be DGE-, DTE-, or DTU-by-construction).
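
The coordinates of such a plot can be sketched in a few lines of Python. This is an illustration rather than the authors' plotting code: the caption does not state how zero abundances are handled or in which direction the difference is taken, so the pseudocount of 1 and the "group 2 minus group 1" convention below are assumptions, as are the function names.

```python
import math
from typing import Tuple


def ma_values(
    tpm_group1: float, tpm_group2: float, pseudocount: float = 1.0
) -> Tuple[float, float]:
    """Return (A, M) for one transcript.

    A is the average log2 abundance across the two groups (x-axis);
    M is the log2 difference, group 2 minus group 1 (y-axis).
    The pseudocount is an assumption that keeps log2 defined at TPM = 0.
    """
    log1 = math.log2(tpm_group1 + pseudocount)
    log2_ = math.log2(tpm_group2 + pseudocount)
    return (log1 + log2_) / 2, log2_ - log1


def is_expressed(tpm_group1: float, tpm_group2: float, threshold: float = 1.0) -> bool:
    """Filter from the caption: TPM > 1 in at least one group."""
    return tpm_group1 > threshold or tpm_group2 > threshold


# A null transcript has equal abundance in both groups, so M is 0.
print(ma_values(3.0, 3.0))
# A transcript with higher abundance in group 2 has positive M.
print(ma_values(1.0, 7.0))
```

A null transcript, whose simulated abundance is identical in both groups, lands on the M=0 line under any choice of pseudocount.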

Availability – Source code for the workflow: https://github.com/mikelove/rnaseqDTU


Love MI, Soneson C, Patro R. (2018) Swimming downstream: statistical analysis of differential transcript usage following Salmon quantification. F1000Res 7:952.
