RNA-seq stranded coverage and junctions generated by STAR 2.6.1b
Coverages were normalized using RNA-seq log-ratio normalization factors.
Afterward, the normalized coverages were averaged within each experiment group to give the group mean coverage.
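The two steps above (scale each sample, then average within each group) can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline that produced these tracks: the function name, the sample names, the toy coverage values, and the use of a single multiplicative scale factor per sample are all hypothetical. How the log-ratio normalization factors were derived and applied is not specified here.

```python
from collections import defaultdict


def group_mean_coverage(coverage, scale_factor, group_of):
    """Scale each sample's coverage vector, then average within each group.

    coverage:     sample name -> list of per-position coverage values
    scale_factor: sample name -> multiplicative factor (assumed linear scale)
    group_of:     sample name -> experiment group name
    """
    sums = {}
    counts = defaultdict(int)
    for sample, values in coverage.items():
        scaled = [v * scale_factor[sample] for v in values]
        group = group_of[sample]
        if group not in sums:
            sums[group] = scaled
        else:
            sums[group] = [a + b for a, b in zip(sums[group], scaled)]
        counts[group] += 1
    return {g: [s / counts[g] for s in total] for g, total in sums.items()}


# Hypothetical toy data: three positions, three samples, two groups.
coverage = {
    "wt_1": [10.0, 20.0, 0.0],
    "wt_2": [20.0, 40.0, 0.0],
    "mut_1": [4.0, 8.0, 12.0],
}
scale_factor = {"wt_1": 2.0, "wt_2": 1.0, "mut_1": 0.5}
group_of = {"wt_1": "wildtype", "wt_2": "wildtype", "mut_1": "mutant"}

means = group_mean_coverage(coverage, scale_factor, group_of)
```

In this toy case the two wildtype samples differ twofold in raw coverage but agree after scaling, so the group mean equals either scaled sample.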
Notes on the STAR index used for alignment:
- All samples were aligned to the same STAR index.
- The STAR index used the mouse mm39 genome and
Gencode vM33 comprehensive gene annotations.
- The STAR index contains two important features:
- A custom chromosome chrAdam19TdT was added which contains the
full mutant Adam19 construct Adam19_TdTomato obtained by Sanger sequencing.
- The wildtype Adam19 region on chr11 was not masked ("unmasked").
- Two key points follow:
- The STAR index should allow competitive alignment of RNA-seq reads
between the wildtype and mutant Adam19 exon regions.
- This STAR index differs from the approach used for the "Adam19 coverage" tracks,
which aligned RNA-seq reads from Adam19 wildtype samples only to
the wildtype Adam19 region on chr11,
and RNA-seq reads from Adam19 mutant samples only to
the mutant Adam19_TdTomato construct region on chrAdam19TdT.
Observations:
- All Adam19 wildtype and mutant samples show coverage on both Adam19 loci:
- the wildtype Adam19 region on chr11, and
- the region containing Adam19_TdTomato on chrAdam19TdT.
- In Adam19-wildtype samples:
- Roughly 90% of reads are assigned to chr11. This bias appears to arise from the STAR alignment algorithm itself.
- In Adam19-mutant samples:
- The majority of coverage on chrAdam19TdT is seen
near the TdTomato exon, which replaced wildtype exons 6 and 7 in Adam19_TdTomato.
- However, upstream of the TdTomato domain, RNA-seq reads were
preferentially aligned to the wildtype Adam19 region on chr11.
This seems consistent with the STAR alignment bias toward the wildtype
Adam19 region on chr11 for RNA-seq fragments that do not contain
sequence from the TdTomato region.
- The coverage shown by this STAR alignment strategy
was not consistent with the transcript quantitation performed by Salmon, in any sample.
- Salmon uses an expectation-maximization strategy, informed by the overall read evidence, to attribute
RNA-seq reads to transcript isoforms and thereby estimate transcript abundance.
- Salmon was provided with all Adam19 wildtype and mutant transcripts (in addition to all Gencode
vM33 comprehensive transcripts), and was free to attribute RNA-seq reads to any transcript
in that superset.
- Salmon attributed all Adam19 reads in wildtype samples to wildtype Adam19 transcripts, and nearly all
(>99.9%) Adam19 reads in mutant samples to mutant Adam19 transcripts.
- The recommendation is to refer to the "Adam19 coverage" tracks as the visual
representation of RNA-seq evidence, since they are consistent with Salmon transcript
isoform quantitation.
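
The expectation-maximization idea mentioned above for Salmon can be illustrated with a toy example. This is a heavily simplified sketch, not Salmon's implementation: Salmon additionally models effective transcript lengths, fragment-level probabilities, and biases. The transcript names, read counts, and function name below are hypothetical. The point is only that reads shared between two transcripts are apportioned according to the overall evidence, so unique reads (for example, reads spanning the TdTomato exon) pull the shared upstream reads toward the same transcript.

```python
def em_abundance(read_compat, transcripts, n_iter=200):
    """Estimate relative transcript abundance with a basic EM loop.

    read_compat: list of tuples, one per read, naming the transcripts
                 that read is compatible with
    transcripts: list of all transcript names
    """
    # Start from a uniform abundance estimate.
    theta = {t: 1.0 / len(transcripts) for t in transcripts}
    for _ in range(n_iter):
        # E-step: split each read among its compatible transcripts
        # in proportion to the current abundance estimates.
        counts = {t: 0.0 for t in transcripts}
        for compatible in read_compat:
            total = sum(theta[t] for t in compatible)
            if total == 0.0:
                continue
            for t in compatible:
                counts[t] += theta[t] / total
        # M-step: re-estimate abundances from the fractional counts.
        n = sum(counts.values())
        theta = {t: c / n for t, c in counts.items()}
    return theta


# Hypothetical mutant sample: 60 reads from shared upstream exons are
# compatible with both transcripts; 40 reads span the TdTomato exon and
# are compatible only with the mutant transcript.
transcripts = ["Adam19_wt", "Adam19_TdT"]
reads = [("Adam19_wt", "Adam19_TdT")] * 60 + [("Adam19_TdT",)] * 40

theta = em_abundance(reads, transcripts)
```

Because no read in this toy sample is unique to the wildtype transcript, the estimate for the mutant transcript converges toward 1. An aligner that places each shared read at a single locus cannot make this kind of joint inference.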