Tabula Muris
is a compendium of single cell transcriptome data from the model organism Mus
musculus, containing nearly 100,000 cells from 20 organs and tissues. The data
allow for direct and controlled comparison of gene expression in cell types
shared between tissues, such as immune cells from distinct anatomical
locations.
This track shows the results from FACS-sorted cells sequenced with the
SmartSeq2 protocol, as it has much higher transcript coverage than the
droplet-based data from the same project. The sequencing data comprise more
than 2 TB and were summarized into a track at UCSC.
Display Conventions and Configuration
As indicated by the "..." after its name, this is a 'super track', a container for
subtracks. There are three different subtracks:
Cell type expression:
A rectangle on the genome, at the location of a gene, filled with a bar graph that indicates the
gene's expression by single cell cluster. The term "cluster" refers to a cluster of
single cells, which usually represents a cell or tissue type. The height of the bar graph on the
genome is the median expression level and a click-through on the bar chart displays a boxplot of
expression level quartiles with outliers, per cluster. On the boxplot, the number of cells from
each experiment is shown.
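As a rough illustration of what the bar graph and boxplot summarize, the sketch below computes the median, quartiles and cell count per cluster for one gene. The cluster names, the expression values and the function name summarize_clusters are made up for illustration; this is not the code used to build the track.

```python
from statistics import median, quantiles


def summarize_clusters(expression_by_cluster):
    """Return {cluster: (q1, median, q3, n_cells)} for a single gene.

    expression_by_cluster maps a cluster name to a list of per-cell
    expression values. Each list needs at least two values.
    """
    summary = {}
    for cluster, values in expression_by_cluster.items():
        # quantiles(..., n=4) returns the three quartile cut points.
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        summary[cluster] = (q1, median(values), q3, len(values))
    return summary


# Hypothetical per-cell expression values for one gene.
example = {
    "B cell": [0.0, 1.0, 2.0, 3.0, 4.0],
    "T cell": [5.0, 5.0, 6.0, 8.0],
}

for cluster, stats in summarize_clusters(example).items():
    print(cluster, stats)
```

The median is what the bar height on the genome corresponds to; the quartiles and cell count are the kind of detail shown on the click-through boxplot.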
Coverage:
Bar graphs indicate the number of reads at each base pair. You may want to switch on
auto-scaling of the y-axis. For configuration options, see the graph tracks
configuration help page. These tracks are shown in "dense" mode by default; set any of
the tracks to "full" to see the detailed coverage plot.
Splice Junctions:
Thick rectangles show exons around a splice site, connected by a line that indicates the
intron. Each intron is annotated with the number of supporting reads, stored in the 'score' field.
You can use the 'score' filter on the track configuration page to show only introns with at
least a certain number of supporting reads. The maximum score shown is 1,000, even if
more reads support an intron. These tracks are shown in "dense" mode by default; set a track to
"pack" to see the individual splice junctions, then click a junction to see its score.
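The score cap and the score filter described above can be sketched as follows. This is a minimal illustration, not the Genome Browser's implementation; the record layout, the field name "reads" and the function names are assumptions made for the example.

```python
MAX_DISPLAY_SCORE = 1000  # cap on the displayed score, as described above


def display_score(read_count):
    """Score shown in the track: the read count, capped at 1000."""
    return min(read_count, MAX_DISPLAY_SCORE)


def filter_junctions(junctions, min_score):
    """Keep junctions whose displayed score is at least min_score."""
    return [j for j in junctions if display_score(j["reads"]) >= min_score]


# Hypothetical junctions with their supporting read counts.
junctions = [
    {"chrom": "chr1", "start": 1050, "end": 1250, "reads": 3},
    {"chrom": "chr1", "start": 4000, "end": 9000, "reads": 250},
    {"chrom": "chr2", "start": 700, "end": 1900, "reads": 48213},
]

for junction in filter_junctions(junctions, min_score=100):
    print(junction["chrom"], junction["start"], junction["end"],
          display_score(junction["reads"]))
```

Because of the cap, two junctions supported by 1,500 and 48,213 reads would both display a score of 1000, so a filter threshold above 1000 would hide every junction.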
Methods
BAM files were provided by the data submitters, one file (single-end) or two files (paired-end) per
cell. The BAM alignments were used as submitted. They were merged with "samtools merge"
into a single BAM file per cluster. The read group (RG) BAM tag indicates the original cell.
From the resulting merged BAM file, coverage was obtained using "wiggletools coverage", a
tool written by Daniel Zerbino, and the result was converted with the UCSC tool
"wigToBigWig".
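To make the merge and coverage steps concrete, here is a conceptual sketch in plain Python. It does not call samtools or WiggleTools and is not how the track was built; it only mimics the idea of pooling reads from all cells of a cluster, keeping the cell of origin with each read (the role of the RG tag), and counting reads per base. The cell names, coordinates and function names are hypothetical.

```python
from collections import Counter


def merge_cells(reads_by_cell):
    """Pool reads from all cells of a cluster into one list.

    Each read keeps the ID of the cell it came from, which is the role
    the RG tag plays in the merged BAM file.
    """
    return [
        (cell, start, end)
        for cell, reads in reads_by_cell.items()
        for start, end in reads
    ]


def per_base_coverage(merged_reads):
    """Count how many reads cover each base (0-based, half-open intervals)."""
    depth = Counter()
    for _cell, start, end in merged_reads:
        for pos in range(start, end):
            depth[pos] += 1
    return depth


# Hypothetical aligned read intervals for two cells of one cluster.
reads_by_cell = {
    "cellA": [(100, 105)],
    "cellB": [(103, 108), (103, 104)],
}

merged = merge_cells(reads_by_cell)
coverage = per_base_coverage(merged)
for pos in sorted(coverage):
    print(pos, coverage[pos])
```

Looping over every base is fine for a toy example but far too slow for real data, which is why dedicated tools are used on the merged BAM files.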
IntronProspector was also run on the merged BAM file, with default settings. It
retains reads with a gap longer than 70 bp and shorter than 500 kbp and merges them into annotated
splice junctions.
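The gap filter can be illustrated with a short sketch that reads CIGAR strings, keeps gaps within the stated size range and counts the reads supporting each junction. This is not IntronProspector's code: treating only CIGAR "N" operations as gaps, the exclusive bounds, and all names below are assumptions made for illustration.

```python
import re
from collections import Counter

MIN_GAP = 70        # bp; gaps must be longer than this (from the text above)
MAX_GAP = 500_000   # bp; gaps must be shorter than this (from the text above)
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def introns_from_read(ref_start, cigar):
    """Yield (start, end) for each retained gap in one alignment.

    Coordinates are 0-based and half-open. Only "N" (skipped region)
    operations are treated as gaps, which is an assumption of this sketch.
    """
    pos = ref_start
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N" and MIN_GAP < length < MAX_GAP:
            yield (pos, pos + length)
        if op in "MDN=X":  # operations that consume reference bases
            pos += length


def count_junctions(reads):
    """Merge identical gaps across reads and count supporting reads."""
    counts = Counter()
    for chrom, ref_start, cigar in reads:
        for start, end in introns_from_read(ref_start, cigar):
            counts[(chrom, start, end)] += 1
    return counts


# Hypothetical alignments: (chromosome, 0-based start, CIGAR string).
reads = [
    ("chr1", 1000, "50M200N50M"),
    ("chr1", 1020, "30M200N70M"),
    ("chr1", 5000, "40M60N60M"),   # gap too short, dropped
    ("chr1", 7000, "100M"),        # no gap
]

for (chrom, start, end), n_reads in count_junctions(reads).items():
    print(chrom, start, end, n_reads)
```

The per-junction read count produced this way corresponds to the number that ends up in the 'score' field of the splice junction track.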
Data Access
The merged BAM files, coverage bigWig files and splice junctions in bigBed format can be
downloaded from the /gbdb fileserver.
Since the splice junction .bigBed files have their scores capped at 1000, the original
IntronProspector .bed files are available in the same track hub directory. You can
also find *.calls.tsv files there with more details about each junction, e.g. the number of
uniquely mapping reads.
Credits
WiggleTools was written by Daniel Zerbino and IntronProspector by Mark Diekhans. Track
hubs were written to a large extent by Brian Raney and colleagues at the UCSC Genome Browser. Track
creation was done by Max Haeussler and tested by Jairo Navarro.