Description
This set of tracks shows the genomic positions of probes and targets from a full
suite of in-solution-capture target enrichment exome kits for Next Generation Sequencing (NGS)
applications. Also known as exome sequencing or whole exome sequencing (WES),
this technique allows high-throughput parallel sequencing of all exons (i.e., the coding regions of genes,
which affect protein function), which constitute about 1% of the human genome, or approximately 30
million base pairs.
The tracks are intended to show the major differences in target genomic regions between the
different exome capture kits from the major players in the NGS market:
Illumina Inc.,
Roche NimbleGen Inc.,
Agilent Technologies Inc.,
MGI Tech,
Twist Bioscience, and
Integrated DNA Technologies Inc.
Display Conventions and Configuration
Items are shaded according to manufacturing company:
- IDT (Integrated DNA Technologies)
- Twist Bioscience
- MGI Tech (Beijing Genomics Institute)
- Roche NimbleGen
- Agilent Technologies
- Illumina
Tracks labeled as Probes (P) indicate the footprint of the oligonucleotide probes
mapped to the human genome. This is the region technically targeted by the assay. However,
the sequenced region will be larger than this, since flanking sequences are sequenced as well.
Tracks labeled as Target Regions (T) indicate the genomic regions targeted by the
assay. This is the biologically relevant target region. Not all targeted regions
will necessarily be sequenced perfectly; there might be some capture bias at certain locations.
The Target Regions are those normally used for coverage analysis.
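As a rough illustration of how target regions feed into size and coverage calculations, the following Python sketch sums the bases covered by a set of BED intervals, counting overlapping intervals only once. It assumes the targets have been exported as standard BED text (chromosome, start, and end in the first three columns, 0-based half-open coordinates). The function names parse_bed and footprint are illustrative only and are not part of any UCSC tool.

```python
from collections import defaultdict


def parse_bed(lines):
    """Yield (chrom, start, end) from BED text lines, skipping headers and blanks."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        yield chrom, int(start), int(end)


def footprint(intervals):
    """Return the number of bases covered, counting overlapping intervals once."""
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    total = 0
    for spans in by_chrom.values():
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:
                # Overlapping or adjacent interval: extend the current block.
                cur_end = max(cur_end, end)
            else:
                # Gap found: close the current block and start a new one.
                total += cur_end - cur_start
                cur_start, cur_end = start, end
        total += cur_end - cur_start
    return total


# Toy input (made-up coordinates, not taken from any kit).
example = ["chr1\t100\t200", "chr1\t150\t300", "chr2\t0\t50"]
print(footprint(parse_bed(example)))  # prints 250
```

Running the same function on a Probes file and on a Target Regions file for one kit gives a quick way to compare the two footprints.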
Note that most exome probesets are available on hg19 only. If you are working with hg38 and cannot find
a particular probeset there, switch to hg19, configure the same track, and
check whether it exists there. If you cannot find an array, do not hesitate to send us
an email with the address of the manufacturer's website that hosts the probe file. If
an array is available on hg19 but not on hg38 and you need it for your work, we
can lift the locations. Our mailing list can be reached at [email protected].
Methods
In-solution capture of the genomic regions of interest is achieved
through the hybridization of a set of probes (oligonucleotides) with a sample of fragmented genomic
DNA in solution. The probes hybridize selectively to the genomic regions of interest,
which, after the non-hybridized DNA material has been removed, can be pulled down and
sequenced, enabling selective DNA sequencing of those regions (e.g., exons).
In-solution capture sequencing is a sensitive method to detect single nucleotide variants,
insertions and deletions, and copy number variations.
Kit | Targeted Region | Databases Used for Design | Year of Release |
IDT - xGen Exome Research Panel V1.0 | 39 Mb | Coding sequences from RefSeq (19,396 genes) | 2015 |
IDT - xGen Exome Research Panel V2.0 | 34 Mb | Coding sequences from RefSeq 109 (19,433 genes) | 2020 |
Twist - RefSeq Exome Panel | 3.6 Mb | Curated subset of protein coding genes from CCDS | N/A |
Twist - Core Exome Panel | 33 Mb | Protein coding genes from CCDS | N/A |
Twist - Comprehensive Exome Panel | 36.8 Mb | Protein coding genes from RefSeq, CCDS, and GENCODE | 2020 |
Twist - Exome Panel 2.0 | 36.4 Mb | Protein coding genes from RefSeq, CCDS, and GENCODE | 2021 |
MGI - Easy Exome Capture V4 | 59 Mb | CCDS, GENCODE, RefSeq, and miRBase | N/A |
MGI - Easy Exome Capture V5 | 69 Mb | CCDS, GENCODE, RefSeq, miRBase, and MGI Clinical Database | N/A |
Agilent - SureSelect Clinical Research Exome | 54 Mb | Disease-associated regions from OMIM, HGMD, and ClinVar | 2014 |
Agilent - SureSelect Clinical Research Exome V2 | 63.7 Mb | Disease-associated regions from OMIM, HGMD, ClinVar, and ACMG | 2017 |
Agilent - SureSelect Focused Exome | 12 Mb | Disease-associated regions from HGMD, OMIM, and ClinVar | 2016 |
Agilent - SureSelect All Exon V4 | 51 Mb | Coding regions from CCDS, RefSeq, and GENCODE v6, miRBase v17, TCGA v6, and UCSC known genes | 2011 |
Agilent - SureSelect All Exon V4 + UTRs | 71 Mb | Coding regions and 5' and 3' UTR sequences from CCDS, RefSeq, and GENCODE v6, regions from miRBase v17, TCGA v6, and UCSC known genes | 2011 |
Agilent - SureSelect All Exon V5 | 50 Mb | Coding regions from RefSeq, GENCODE, UCSC, TCGA, CCDS, and miRBase (21,522 genes) | 2012 |
Agilent - SureSelect All Exon V5 + UTRs | 74 Mb | Coding regions and 5' and 3' UTR sequences from RefSeq, GENCODE, UCSC, TCGA, CCDS, and miRBase (21,522 genes) | 2012 |
Agilent - SureSelect All Exon V6 r2 | 60 Mb | Coding regions from RefSeq, CCDS, GENCODE, HGMD, and OMIM | 2016 |
Agilent - SureSelect All Exon V6 + COSMIC r2 | 66 Mb | Coding regions from RefSeq, CCDS, GENCODE, HGMD, and OMIM, and targets from both TCGA and COSMIC | 2016 |
Agilent - SureSelect All Exon V6 + UTR r2 | 75 Mb | Coding regions and 5' and 3' UTR sequences from RefSeq, GENCODE, CCDS, and UCSC known genes, and miRNAs and lncRNA sequences | 2016 |
Agilent - SureSelect All Exon V7 | 35.7 Mb | Coding regions from RefSeq, CCDS, GENCODE, and UCSC known genes | 2018 |
Roche - KAPA HyperExome | 43 Mb | Coding regions from CCDS, RefSeq, Ensembl, GENCODE, and variants from ClinVar | 2020 |
Roche - SeqCap EZ Exome V3 | 64 Mb | Coding regions from RefSeq RefGene CDS, CCDS, and miRBase v14 databases, plus coverage of 97% Vega, 97% GENCODE, and 99% Ensembl | 2018 |
Roche - SeqCap EZ Exome V3 + UTR | 92 Mb | Coding sequences from RefSeq RefGene, CCDS, and miRBase v14, plus coverage of 97% Vega, 97% GENCODE, and 99% Ensembl, and UTRs from RefSeq RefGene table from UCSC GRCh37/hg19 March 2012 and Ensembl (GRCh37 v64) | 2018 |
Roche - SeqCap EZ MedExome | 47 Mb | Coding sequences from CCDS 17, RefSeq, Ensembl 76, VEGA 56, GENCODE 20, miRBase 21, and disease-associated regions from GeneTests, ClinVar, and based on customer input | 2014 |
Roche - SeqCap EZ MedExome + Mito | 47 Mb | Coding sequences and mitochondrial genes from CCDS 17, RefSeq, Ensembl 76, VEGA 56, GENCODE 20, and miRBase 21, disease-associated regions from GeneTests, ClinVar, and based on customer input | 2014 |
Illumina - Nextera DNA Exome V1.2 | 45 Mb | Coding regions from RefSeq, CCDS, Ensembl, and GENCODE v19 | 2015 |
Illumina - Nextera Rapid Capture Exome | 37 Mb | 212,158 targeted exonic regions with start and stop chromosome locations in GRCh37/hg19 | 2013 |
Illumina - Nextera Rapid Capture Exome V1.2 | 37 Mb | Coding regions from RefSeq, CCDS, Ensembl, and GENCODE v12 | 2014 |
Illumina - Nextera Rapid Capture Expanded Exome | 66 Mb | Coding regions from RefSeq, CCDS, Ensembl, and GENCODE v12 | 2013 |
Illumina - TruSeq DNA Exome V1.2 | 45 Mb | Coding regions from RefSeq, CCDS, and Ensembl | 2017 |
Illumina - TruSeq Rapid Exome V1.2 | 45 Mb | Coding regions from RefSeq, CCDS, Ensembl, and GENCODE v19 | 2015 |
Illumina - TruSight ONE V1.1 | 12 Mb | Coding regions of 6700 genes from HGMD, OMIM, and GeneTests | 2017 |
Illumina - TruSight Exome | 7 Mb | Disease-causing mutations as curated by HGMD | 2017 |
Illumina - AmpliSeq Exome Panel | N/A | CCDS coding regions | 2019 |
Data Access
The raw data can be explored interactively with the Table Browser
or cross-referenced with Data Integrator. The data can be
accessed from scripts through our API, with track names
found in the Table Schema page for each subtrack after "Primary Table:".
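A minimal Python sketch of script access through the API is shown here. It assumes the public REST endpoint at api.genome.ucsc.edu with its getData/track function and the genome, track, chrom, start, and end parameters. The track name "exampleExomeTrack" is a hypothetical placeholder: substitute the name listed after "Primary Table:" for the subtrack you need. The helper names track_url and fetch_track are illustrative only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the public REST API.
API_BASE = "https://api.genome.ucsc.edu/getData/track"


def track_url(genome, track, chrom, start, end):
    """Build a getData/track request URL for one genomic region."""
    query = urlencode(
        {"genome": genome, "track": track, "chrom": chrom, "start": start, "end": end}
    )
    return f"{API_BASE}?{query}"


def fetch_track(genome, track, chrom, start, end):
    """Download the region and return the decoded JSON response (needs network access)."""
    with urlopen(track_url(genome, track, chrom, start, end)) as response:
        return json.load(response)


# "exampleExomeTrack" is a placeholder, not a real table name.
print(track_url("hg19", "exampleExomeTrack", "chr1", 0, 100000))
# To actually query the server:
# data = fetch_track("hg19", "exampleExomeTrack", "chr1", 0, 100000)
```

Keeping URL construction separate from the download makes it easy to inspect the request in a web browser before running it from a script.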
For downloading the data, the annotations are stored in bigBed files that
can be accessed at
our download directory.
Regional or whole-genome text annotations can be obtained using our utility
bigBedToBed. Instructions for downloading utilities can be found
here.
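For scripted use, the bigBedToBed call can be wrapped as in the following Python sketch. It assumes bigBedToBed is installed and on the PATH and that it accepts the -chrom, -start, and -end options to restrict output to a region. The file names "exome_targets.bb" and "exome_targets.chr21.bed" are hypothetical placeholders for a bigBed file from the download directory and the output file, and bigbed_to_bed_command is an illustrative helper name.

```python
import subprocess


def bigbed_to_bed_command(source, output, chrom=None, start=None, end=None):
    """Return the argument list for a bigBedToBed call, optionally limited to a region."""
    cmd = ["bigBedToBed"]
    if chrom is not None:
        cmd.append(f"-chrom={chrom}")
    if start is not None:
        cmd.append(f"-start={start}")
    if end is not None:
        cmd.append(f"-end={end}")
    cmd += [source, output]
    return cmd


# Placeholder file names; replace them with a real bigBed file or URL.
cmd = bigbed_to_bed_command(
    "exome_targets.bb", "exome_targets.chr21.bed", chrom="chr21", start=0, end=1000000
)
print(" ".join(cmd))
# To run the conversion (requires the utility to be installed):
# subprocess.run(cmd, check=True)
```

Omitting the region arguments produces a command that converts the whole file.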
Credits
Thanks to Illumina (U.S.), Roche NimbleGen, Inc. (U.S.), Agilent Technologies (U.S.), MGI Tech
(Beijing Genomics Institute, China), Twist Bioscience (U.S.), and Integrated DNA Technologies (IDT),
Inc. (U.S.) for making these data available and to Tiana Pereira, Pranav Muthuraman, Began Nguy
and Anna Benet-Pages for engineering these tracks.