Genome Aggregation Database (gnomAD) - Structural Variants v4.1

Track collection: Genome Aggregation Database (gnomAD) Genome and Exome Variants

Data schema/format description and download
Source data version: Release 4.1 (November 01, 2023)
Assembly: Human Dec. 2013 (GRCh38/hg38)
Data last updated at UCSC: 2024-08-01 12:53:09


Description

The Genome Aggregation Database (gnomAD) - Structural Variants v4.1 track set shows structural variant calls (>=50 nucleotides) from the gnomAD v4.1 release, made on 63,046 unrelated genomes. This genome set mostly, but not entirely, overlaps with the one used for the gnomAD short variant release. For more information, see the blog post Structural variants in gnomAD.

Display Conventions and Configuration

Items are shaded according to variant type. Mousing over an item shows the affected protein-coding genes, the size of the variant (which may differ from the chromosomal coordinates in cases such as insertions), the variant type (insertion, duplication, etc.), the allele count, the allele number, and the allele frequency. When more than two genes are affected by a variant, the full list can be obtained by clicking the item and reading the details page. Variant counts by type are summarized in the table below:

Variant Type                  All SVs
Breakend (BND)                 356035
Complex (CPX)                   15189
Translocation (CTX)                99
Deletion (DEL)                1206278
Duplication (DUP)              269326
Insertion (INS)                304645
Inversion (INV)                  2193
Copy number variants (CNV)        721

The CNV color code is described in detail here. All tracks can be filtered by variant size and variant type using the track Configure options.
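
The per-type counts in the table above can, in principle, be re-derived from a local text export of the track. The following is a minimal sketch, not the procedure used to build the table. It assumes a tab-separated BED export, and `count_by_column` is a hypothetical helper name. The column holding the variant type is passed in as an argument because this page does not state its position; check the data schema before use.

```shell
# count_by_column COL
# Reads tab-separated rows on stdin and prints "value<TAB>count" for each
# distinct value found in column COL, sorted by value.
# COL is an assumption supplied by the caller; consult the track schema
# to find which column holds the variant type.
count_by_column() {
    awk -F'\t' -v col="$1" '
        !/^#/ && NF >= col { n[$col]++ }    # skip comment lines and short rows
        END { for (v in n) printf "%s\t%d\n", v, n[v] }
    ' | sort
}
```

The function reads from standard input, so it can be placed at the end of a pipeline that emits BED text, such as the output of bigBedToBed.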

Filtering Options

Three filters are available for this track:

  • Variant Size: Includes or excludes variants according to their size.
  • Non-neurological allele frequency: Includes or excludes variants according to their allele frequency among individuals who do not have a neurological condition, as identified in a case-control study.
  • Common disease control allele frequency: Includes or excludes variants according to their allele frequency among individuals not identified as cases in a case-control study of common disease.
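
The filters above are applied inside the browser. For a downloaded text export, a comparable range filter can be approximated on the command line. The sketch below is an illustration under stated assumptions, not the browser's implementation: `filter_by_range` is a hypothetical helper, the input is assumed to be tab-separated, and the column numbers must be looked up in the data schema because they are not given on this page.

```shell
# filter_by_range COL MIN MAX
# Keeps tab-separated rows from stdin whose numeric value in column COL
# lies in the closed interval [MIN, MAX]. Rows whose field is empty or
# non-numeric (for example "NA") are dropped.
# COL is an assumption supplied by the caller; it is not taken from the
# track schema.
filter_by_range() {
    awk -F'\t' -v col="$1" -v lo="$2" -v hi="$3" '
        $col ~ /^[-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?$/ &&
        ($col + 0) >= (lo + 0) && ($col + 0) <= (hi + 0)
    '
}
```

Two calls can be chained to combine a size filter with an allele frequency filter, for example `filter_by_range 4 50 1000 | filter_by_range 5 0 0.01`, where 4 and 5 are placeholder column numbers.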

Methods

The BED file was obtained from the gnomAD Google Storage bucket:

https://storage.googleapis.com/gcp-public-data--gnomad/release/4.1/genome_sv/gnomad.v4.1.sv.non_neuro_controls.sites.bed.gz
The data were then converted into a bigBed track. For the full list of commands used to make this track, please see the "gnomAD Structural Variants v4" section of the makedoc.

Data Access

The raw data can be explored interactively with the Table Browser or the Data Integrator. For automated access, this track, like all others, is available via our API. For bulk processing, however, downloading the dataset is recommended. The annotations are stored in a bigBed file that can be downloaded from the download server; the exact filenames can be found in the track configuration file. Annotations can be converted to ASCII text with our tool bigBedToBed, which can be compiled from the source code or downloaded as a precompiled binary for your system. Instructions for downloading source code and binaries can be found here. The tool can also be used to obtain only the features within a given range, for example:

bigBedToBed http://hgdownload.soe.ucsc.edu/gbdb/hg38/gnomAD/v4/structuralVariants/gnomad.v4.1.sv.non_neuro_controls.sites.bb -chrom=chr6 -start=0 -end=1000000 stdout
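
For repeated range queries, the command above can be wrapped in small shell functions. This is a minimal sketch: the URL and the `-chrom`, `-start`, and `-end` options are copied from the example above, while the names `SV_BB_URL`, `sv_region_cmd`, and `sv_region` are hypothetical and introduced here only for illustration.

```shell
# URL copied from the example above.
SV_BB_URL="http://hgdownload.soe.ucsc.edu/gbdb/hg38/gnomAD/v4/structuralVariants/gnomad.v4.1.sv.non_neuro_controls.sites.bb"

# sv_region_cmd CHROM START END
# Prints the bigBedToBed command line for a region without running it,
# so it can be inspected before any data is fetched.
sv_region_cmd() {
    printf 'bigBedToBed %s -chrom=%s -start=%s -end=%s stdout\n' \
        "$SV_BB_URL" "$1" "$2" "$3"
}

# sv_region CHROM START END
# Runs the query and writes BED rows to stdout.
# Requires bigBedToBed on PATH and network access.
sv_region() {
    if ! command -v bigBedToBed >/dev/null 2>&1; then
        echo "bigBedToBed not found on PATH" >&2
        return 1
    fi
    bigBedToBed "$SV_BB_URL" -chrom="$1" -start="$2" -end="$3" stdout
}
```

With these definitions loaded, `sv_region chr6 0 1000000 > chr6_first_mb.bed` would save the same region as the example above to a local file.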

Please refer to our mailing list archives for questions and example queries, or our Data Access FAQ for more information.

More information about using and understanding the gnomAD data can be found in the gnomAD FAQ site.

Credits

Thanks to the Genome Aggregation Database Consortium for making these data available. The data are released under the ODC Open Database License (ODbL) as described here.

References

Lek M, Karczewski KJ, Minikel EV, Samocha KE, Banks E, Fennell T, O'Donnell-Luria AH, Ware JS, Hill AJ, Cummings BB et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature. 2016 Aug 18;536(7616):285-91. PMID: 27535533; PMC: PMC5018207

Karczewski KJ, Francioli LC, Tiao G, Cummings BB, Alföldi J, Wang Q, Collins RL, Laricchia KM, Ganna A, Birnbaum DP et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature. 2020 May;581(7809):434-443. PMID: 32461654; PMC: PMC7334197

Collins RL, Brand H, Karczewski KJ, Zhao X, Alföldi J, Francioli LC, Khera AV, Lowther C, Gauthier LD, Wang H et al. A structural variation reference for medical and population genetics. Nature. 2020 May;581(7809):444-451. PMID: 32461652; PMC: PMC7334194

Cummings BB, Karczewski KJ, Kosmicki JA, Seaby EG, Watts NA, Singer-Berk M, Mudge JM, Karjalainen J, Satterstrom FK, O'Donnell-Luria AH et al. Transcript expression-aware annotation improves rare variant interpretation. Nature. 2020 May;581(7809):452-458. PMID: 32461655; PMC: PMC7334198