Description
This track shows the in-silico design of crRNAs for Cas13 using
the tool nCov2019_Guide_Design, as described in Abbott et al., 2020.
To target highly conserved regions of the SARS-CoV-2 genome, an in-silico collection of
all 3,802 possible crRNAs was generated. After excluding crRNAs that are predicted
to have potential off-target binding (≤2 mismatches in the human transcriptome) or
that contain poly-T sequences that may prevent crRNA expression (≥4 consecutive Ts), a set
of 3,203 crRNAs was obtained. These crRNAs are also able to target SARS and
MERS with ≤1 mismatch.
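The poly-T exclusion mentioned above can be illustrated with a short shell sketch. It is not taken from nCov2019_Guide_Design; it assumes one crRNA sequence per line, written in the DNA alphabet, and the helper name drop_poly_t is hypothetical.

```shell
# Hypothetical helper (not part of nCov2019_Guide_Design): read one sequence
# per line on stdin and keep only those without a run of four or more Ts.
drop_poly_t() {
    grep -v -i -E 'T{4,}'
}

# Example: the second sequence contains "TTTT" and is dropped.
printf '%s\n' ACGTACGTACGTACGTACGTAC ACGTTTTACGTACGTACGTACG | drop_poly_t
```

Note that grep exits with a non-zero status when every input line is filtered out, which matters if the function is used in a script running under set -e.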
Each crRNA has been characterized with four features:
- Efficiency is predicted using the online tool at https://gitlab.com/sanjanalab/cas13
- Specificity is determined by the number of off-target loci in human mRNA with ≤2
mismatches to the crRNA
- Generality within Coronaviridae is quantified as the percentage of Coronaviridae strains
targeted by the given crRNA with perfect identity
- Generality within SARS-CoV-2 is quantified as the percentage of 1,087 SARS-CoV-2 patient
genomes downloaded on March 20, 2020 that are targeted by the given crRNA with perfect
identity
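As an illustration of the two generality features, the percentage of genomes containing an exact match can be computed along the following lines. This is a sketch under stated assumptions, not the authors' code: genomes are supplied one sequence per line on stdin, the argument is the target-site sequence as it appears in the genome, and the helper name generality_pct is hypothetical.

```shell
# Hypothetical helper: print the percentage of input genomes (one sequence per
# line on stdin) that contain the target sequence $1 as an exact substring.
generality_pct() {
    awk -v t="$1" '
        index(toupper($0), toupper(t)) { hits++ }   # exact, case-insensitive match
        END { if (NR) printf "%.1f\n", 100 * hits / NR }'
}

# Example with four toy "genomes", two of which contain ACGT.
printf '%s\n' AAACGT CCACGA ACGTTT GGGGGG | generality_pct ACGT
```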
Method
To design all possible crRNAs for the three pathogenic RNA viruses (SARS-CoV-2, SARS-CoV,
and MERS-CoV), the reference genomes of SARS-CoV and MERS-CoV, along with SARS-CoV-2 genomes
derived from 47 patients, were first aligned with MAFFT using the --auto flag.
crRNA candidates were identified by using a sliding window to extract all 22-nucleotide
(nt) sequences with perfect identity among the SARS-CoV-2 genomes.
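The sliding-window step can be sketched as follows. This is an illustrative reimplementation, not the original pipeline: it assumes the input contains only the SARS-CoV-2 rows of the alignment, one aligned sequence of equal length per line, and the helper name conserved_windows is hypothetical. The window length defaults to 22.

```shell
# Hypothetical helper: print the 1-based alignment column and sequence of every
# window of length $1 (default 22) that is identical across all rows and
# contains only A, C, G, T or U (so windows with gaps are skipped).
conserved_windows() {
    awk -v k="${1:-22}" '
        { seq[NR] = toupper($0) }
        END {
            len = length(seq[1])
            for (i = 1; i + k - 1 <= len; i++) {
                w = substr(seq[1], i, k)
                ok = (w !~ /[^ACGTU]/)              # reject gaps and ambiguity codes
                for (r = 2; ok && r <= NR; r++)
                    if (substr(seq[r], i, k) != w) ok = 0
                if (ok) print i "\t" w
            }
        }'
}

# Example with a toy two-row alignment and a window of 3.
printf '%s\n' ACGTAC ACGTTC | conserved_windows 3
```

The positions printed are alignment columns, which differ from genome coordinates wherever the alignment contains gaps upstream of the window.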
We annotated each crRNA candidate with the number of mismatches relative to the SARS-CoV
and MERS-CoV genomes, as well as the GC content. 3,802 crRNA candidates were selected with
perfect match against the 47 SARS-CoV-2 genomes and with ≤1 mismatch to SARS-CoV
or MERS-CoV sequences. To characterize the specificity of 22-nt crRNAs, we ensured
that each crRNA does not target any sequences in the human transcriptome.
We used Bowtie 1.2.2 to align crRNAs to the human transcriptome (hg38;
including non-coding RNA) and removed crRNAs that mapped to the human
transcriptome with ≤2 mismatches.
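The per-candidate annotation described earlier in this section (mismatches relative to another genome, and GC content) could be computed along these lines. This is a minimal sketch, not the authors' code: it assumes each input line holds a candidate and the corresponding aligned sequence from the other genome, tab-separated and of equal length, and the helper name annotate_candidates is hypothetical.

```shell
# Hypothetical helper: for each "candidate<TAB>other" line on stdin, print the
# candidate, its mismatch count against the other sequence, and its GC percent.
annotate_candidates() {
    awk -F '\t' '
        NF < 2 || length($1) == 0 { next }          # skip malformed lines
        {
            a = toupper($1); b = toupper($2)
            mm = 0
            for (i = 1; i <= length(a); i++)
                if (substr(a, i, 1) != substr(b, i, 1)) mm++
            gc = gsub(/[GC]/, "&", a)               # gsub returns the number of G/C bases
            printf "%s\t%d\t%.1f\n", a, mm, 100 * gc / length(a)
        }'
}

# Example: one mismatch in the last position, two of four bases are G or C.
printf 'ACGT\tACGA\n' | annotate_candidates
```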
Data Access
The raw data can be explored interactively with the
Table Browser, or combined with other datasets in the
Data Integrator tool.
For automated analysis, the genome annotation is stored in
a bigBed file that can be downloaded from
the download server.
Annotations can
be converted from binary to ASCII text by our command-line tool bigBedToBed.
Instructions for downloading this command can be found on our
utilities page.
The tool can also be used to obtain features within a given range without downloading the file,
for example:
bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/wuhCor1/bbi/cas13Crispr.bb -chrom=NC_045512v2 -start=0 -end=29902 stdout
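The tab-separated output of bigBedToBed can be piped into standard text tools. As one possible sanity check, the sketch below tallies feature lengths (end minus start) from BED records on stdin; the helper name bed_length_counts is hypothetical, and the expectation that records span 22 nt is an assumption based on the crRNA length given in the Method section.

```shell
# Hypothetical helper: read BED records on stdin and print each distinct
# feature length (column 3 minus column 2) with the number of records.
bed_length_counts() {
    awk -F '\t' '{ n[$3 - $2]++ } END { for (l in n) print l "\t" n[l] }' | sort -n
}

# Example with three toy records: two of length 22 and one of length 20.
printf 'NC_045512v2\t0\t22\ta\nNC_045512v2\t5\t27\tb\nNC_045512v2\t10\t30\tc\n' | bed_length_counts
```

With the track file, the same function would be used by replacing the printf command with the bigBedToBed command shown above.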
Please refer to our
mailing list archives
for questions, or our
Data Access FAQ
for more information.
Credits
The predictions for this track are produced by
Xueqiu Lin and Augustine Chemparathy in the Stanley Qi lab at Stanford University.
References
Abbott, Timothy R., Girija Dhamdhere, Yanxia Liu, Xueqiu Lin, Laine Goudy, Leiping Zeng,
Augustine Chemparathy, et al., 2020.
Development of CRISPR as a Prophylactic Strategy to Combat Novel Coronavirus and Influenza.
bioRxiv.