Description
This track shows predicted and experimental representations of the
SARS-CoV-2 transcriptome based on long-read Nanopore sequencing.
Display Conventions and Configuration
SARS-CoV-2 generates sub-genomic mRNAs (sgmRNAs) for all ORFs. The virus
achieves this through a recombination mechanism in which the replication machinery
jumps from one of many TRS-B sites (transcription regulatory sequence, body) to
the TRS-L (leader sequence) during negative-strand synthesis.
These negative strands are then used as templates for mRNA synthesis.
On these tracks we depict the predicted mRNAs with the excised sequence
drawn like introns. The ORFs predicted to be translated by these mRNAs are
shown as thick boxes. The thin bars represent the UTRs of that particular mRNA
species.
Multiple subtracks are available:
- TRS sites: Annotated core TRS sequence (ACGAAC) in the viral genome
that allows recombination. One site, TRS*, differs from the canonical TRS site by
1 nt but has experimental support and is required to generate a 7b mRNA.
- SARS-CoV-2 Transcripts: Canonical SARS-CoV-2 Transcripts (gRNA and
mRNA). Generated by recombining all TRS-B sites in the above track with the
leader. Note that the actual recombination breakpoints can often be drawn in
multiple ways (since the TRS core motif is identical at the leader and body sites),
and direct RNA sequencing suggests slightly different breakpoints depending on the mRNA.
See the experimental tracks if your analysis requires a detailed understanding of
the breaks. The breaks reported in this track are only approximate.
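The display conventions above correspond closely to the standard BED12 layout, in which blocks describe the drawn segments, the gap between blocks is drawn like an intron, and thickStart/thickEnd mark the thick (ORF) region. The following is a minimal sketch, assuming the transcripts are stored as BED12-style records; the helper name make_bed12 and all coordinates are hypothetical and are not taken from the track.

```python
def make_bed12(chrom, name, blocks, cds_start, cds_end, strand="+"):
    """Build one BED12 line from 0-based, half-open (start, end) blocks.

    Blocks are the drawn segments (leader and body); the gap between them
    is rendered like an intron. cds_start/cds_end delimit the thick region.
    """
    blocks = sorted(blocks)
    chrom_start = blocks[0][0]
    chrom_end = blocks[-1][1]
    block_sizes = ",".join(str(end - start) for start, end in blocks)
    # blockStarts are relative to chromStart, as the BED format requires
    block_starts = ",".join(str(start - chrom_start) for start, _ in blocks)
    fields = [chrom, chrom_start, chrom_end, name, 0, strand,
              cds_start, cds_end, "0", len(blocks), block_sizes, block_starts]
    return "\t".join(str(field) for field in fields)


# Hypothetical example: a 75-nt leader joined to a body segment,
# with a made-up ORF inside the body segment.
line = make_bed12("NC_045512v2", "example_sgmRNA",
                  [(0, 75), (1000, 2000)], 1010, 1500)
print(line)
```

Everything outside the thickStart/thickEnd interval but inside a block is drawn thin, which is how the UTRs of each mRNA species appear.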
Methods
- TRS sites: The core TRS sequence (ACGAAC) was identified using the findMotif tool.
The TRS* (AaGAAC) site identified by Kim et al. was added manually for overall
agreement with their Figure 1.
- SARS-CoV-2 Transcripts: Using the TRS track, we generated mRNAs that span
nt 1-75 (the end of the TRS-L core sequence) and resume at the 3' end of each TRS-B
sequence. Neither the 5' and 3' terminal ends of these RNAs nor their internal
breakpoints should be considered exact. CDS sequences match Figure 1 of Kim et al.
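The two steps above can be sketched in a few lines of Python. This is an illustration only, not the findMotif tool or the code used to build the track: it uses plain exact string search, a toy sequence rather than the viral genome, and it assumes the first motif hit is the TRS-L and every later hit is a TRS-B. The function names are hypothetical.

```python
TRS_CORE = "ACGAAC"  # core TRS sequence named in the track description


def find_motif(seq, motif=TRS_CORE):
    """Return the 0-based start of every exact motif occurrence in seq."""
    hits = []
    pos = seq.find(motif)
    while pos != -1:
        hits.append(pos)
        pos = seq.find(motif, pos + 1)
    return hits


def leader_body_blocks(seq_len, trs_hits, motif_len=len(TRS_CORE)):
    """Join the leader to the sequence downstream of each TRS-B.

    Assumption: the first hit is the TRS-L. The leader block runs from the
    start of the sequence to the end of that hit; each body block resumes
    at the 3' end of a later hit and runs to the end of the sequence.
    Coordinates are 0-based, half-open.
    """
    leader_hit, *body_hits = trs_hits
    leader_end = leader_hit + motif_len
    return [[(0, leader_end), (hit + motif_len, seq_len)] for hit in body_hits]


# Toy sequence with three copies of the core motif (not real genome data)
toy = "GGG" + TRS_CORE + "TTTT" + TRS_CORE + "CCCCC" + TRS_CORE + "AA"
hits = find_motif(toy)
transcripts = leader_body_blocks(len(toy), hits)
print(hits)
print(transcripts)
```

An exact search like this would not find the TRS* site, which differs from the core motif by 1 nt; that is consistent with the site having been added by hand.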
Data Access
The raw data can be explored interactively with the
Table Browser, or combined with other datasets in the
Data Integrator tool.
For automated analysis, the genome annotation is stored in
a bigBed file that can be downloaded from
the download server.
Annotations can
be converted from binary to ASCII text by our command-line tool bigBedToBed.
Instructions for downloading this command can be found on our
utilities page.
The tool can also be used to obtain features within a given range without downloading the file,
for example:
bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/wuhCor1/bbi/kim2020/TRS.bb -chrom=NC_045512v2 -start=0 -end=29902 stdout
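For scripted use, the text that bigBedToBed writes to stdout can be parsed directly. The sketch below assumes the output is tab-separated BED with at least three columns and that bigBedToBed is installed and on the PATH; the function names parse_bed and fetch_bed and the sample line are hypothetical and do not come from the track.

```python
import subprocess


def parse_bed(text):
    """Parse BED text into dicts with chrom, 0-based start, end and name."""
    records = []
    for line in text.splitlines():
        # Skip blank lines and any header-style lines
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        records.append({
            "chrom": fields[0],
            "start": int(fields[1]),
            "end": int(fields[2]),
            "name": fields[3] if len(fields) > 3 else None,
        })
    return records


def fetch_bed(url, chrom, start, end):
    """Run bigBedToBed for one region and return its stdout as text."""
    cmd = ["bigBedToBed", url, f"-chrom={chrom}",
           f"-start={start}", f"-end={end}", "stdout"]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


# Made-up sample line, used here so the parser can be tried offline
sample = "NC_045512v2\t100\t106\tTRS-example\n"
print(parse_bed(sample))
```

With the tool installed, parse_bed(fetch_bed(url, "NC_045512v2", 0, 29902)) would cover the same region as the command above, where url is the bigBed address given there.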
Please refer to our
mailing list archives
for questions, or our
Data Access FAQ
for more information.
Credits
Thanks to Jason Fernandes (Haussler-lab, UCSC) for preparing this track.
References
Kim D, Lee JY, Yang JS, Kim JW, Kim VN, Chang H.
The Architecture of SARS-CoV-2 Transcriptome.
Cell. 2020 Apr 18;.
PMID: 32330414; PMC: PMC7179501