INTRODUCTION TO SHORT VARIANTS

Variants are found throughout all of our genomic sequences. In the past, the word "mutations" was typically used in place of "variants," but present-day nomenclature prefers the latter to avoid the negative associations of the older word. Some variants are unique to certain individuals or family lineages ("private" variants), while others are common and well documented. They can be completely innocuous, or they can result in faulty proteins with pathogenic consequences. The Genome Browser can be used to visualize variants and their consequences.

A variant is typically defined as a deviation from the sequence of the reference assembly, which itself is the sequence of one arbitrary individual at any genomic location. The gene BRCA2 is known for variants that are correlated with breast and ovarian cancer, because it functions as a tumor suppressor that helps repair damaged DNA. It also has a large number of documented variants, which makes it a good example for exploring different types of variants. We can examine some of these variants by turning on two "dbSNP" tracks, both of which tag variation sites in a gene. "SNP" is an acronym for "Single Nucleotide Polymorphism", and has been replaced by the more accurate "SNV" (Single Nucleotide Variant) to avoid misrepresenting the allele frequency: "polymorphism" implies that a variant is common in the population, which is not usually true. The database is still called dbSNP, however.

In Figures 1 and 2, the "UCSC Genes" track is displaying exon 11 of the BRCA2 gene. Below the "UCSC Genes" track are two "dbSNP" tracks, "dbSNP153" (upper) and "dbSNP151" (lower). These tracks come from two different data releases and cover largely the same rsIDs and data, with some differences between the releases. Each track also offers different details when one of its items is clicked. Checking both versions of an rsID, or "reference SNP" ID, can provide a wide range of insight into a variant.

Each colored box, labeled with an rsID, marks a documented variation site within the sequence. A green box indicates that the variant is synonymous, meaning that it does not change the resulting amino acid. A red box signifies that the nucleotide change does alter the amino acid sequence.
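The distinction between synonymous and nonsynonymous variants can be illustrated with a short Python sketch. This is only an illustration of the idea, not how the Genome Browser or dbSNP computes its annotations. The function name classify_snv and the example codons are hypothetical and are not taken from the BRCA2 sequence; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snv(codon, position, alt_base):
    """Classify a single-nucleotide change within one codon.

    codon    -- reference codon on the coding strand, e.g. "GAA"
    position -- 0-based index (0, 1, or 2) of the changed base
    alt_base -- the variant base at that position
    (Hypothetical helper for illustration only.)
    """
    alt_codon = codon[:position] + alt_base + codon[position + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[alt_codon]
    # Same amino acid: synonymous (shown in green in the tracks above).
    # Different amino acid or stop codon: nonsynonymous (shown in red).
    return "synonymous" if ref_aa == alt_aa else "nonsynonymous"


# GAA and GAG both encode glutamate (E), so this change is synonymous.
print(classify_snv("GAA", 2, "G"))
# GAA (E) to GCA (alanine, A) changes the amino acid.
print(classify_snv("GAA", 1, "C"))
```

Because the genetic code is redundant, many changes at the third base of a codon leave the amino acid unchanged, which is why synonymous variants are common at that position.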

Figure 1. Zoomed-in view of BRCA2, exon 11 (frame is 23 nucleotides across). The DNA sequence is at the top. The "UCSC Genes" track shows the corresponding amino acids. The lower tracks, labeled "All Short Genetic Variants from dbSNP Release 153" and "Simple Nucleotide Polymorphisms (dbSNP 151)", show identified variant sites.

An active Browser session for this view can be found at https://genome.ucsc.edu/s/education/hg19_BRCA2variants

 

Figure 2. How to enable the "dbSNP153" and "dbSNP151" tracks on human genome assembly hg19. The "UCSC Genes" track shows exon 11 of BRCA2.