Integration of external data sources puts you in control.

The automated integration of variant annotations from a variety of external databases and methods gives you the information you need to decide whether your variants are important. The annotation sources, with the version of each used by Seave, are listed below.

OMIM

Description: OMIM (Online Mendelian Inheritance in Man) is a comprehensive, authoritative compendium of human genes and genetic phenotypes that is freely available and updated daily. The full-text, referenced overviews in OMIM contain information on all known Mendelian disorders and over 15,000 genes. OMIM focuses on the relationship between phenotype and genotype, and its entries contain copious links to other genetics resources.
Annotated by: Seave
Current version: 20170913
Last updated: 2017-09-14

ClinVar

Description: ClinVar is a freely accessible, public archive of reports of the relationships among human variations and phenotypes, with supporting evidence. ClinVar thus facilitates access to and communication about the relationships asserted between human variation and observed health status, and the history of that interpretation. ClinVar processes submissions reporting variants found in patient samples, assertions made regarding their clinical significance, information about the submitter, and other supporting data.
Annotated by: Seave
Current version: 20170905
Last updated: 2017-09-14

COSMIC

Description: All cancers arise as a result of the acquisition of a series of fixed DNA sequence abnormalities (mutations), many of which ultimately confer a growth advantage upon the cells in which they have occurred. There is a vast amount of information available in the published scientific literature about these changes. COSMIC is designed to store and display somatic mutation information and related details, and contains information relating to human cancers.
Annotated by: Seave
Current version: v81
Last updated: 2017-06-09

RVIS

Description: RVIS (Residual Variation Intolerance Score) is a gene-based score intended to help in the interpretation of human sequence data. The intolerance score in its current form is based upon allele frequency data as represented in whole exome sequence data from the NHLBI-ESP6500 data set. The score is designed to rank genes in terms of whether they have more or less common functional genetic variation relative to the genome wide expectation given the amount of apparently neutral variation the gene has.
Annotated by: Seave
Current version: 20160312
Last updated: 2017-04-11
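
The ranking idea in the RVIS description can be illustrated with a minimal sketch: fit a straight line predicting each gene's count of common functional variants from its count of apparently neutral variants, then score each gene by its residual. The gene names and counts below are hypothetical toy values, and the plain least-squares residual is a simplifying assumption, not the published RVIS calculation or the way Seave obtains the score.

```python
from statistics import mean


def intolerance_residuals(genes):
    """Return a residual per gene from a simple least-squares fit.

    genes maps a gene name to (neutral_variants, common_functional_variants).
    A negative residual means the gene carries less common functional
    variation than the genome-wide trend predicts.
    """
    xs = [neutral for neutral, _ in genes.values()]
    ys = [functional for _, functional in genes.values()]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return {
        name: functional - (intercept + slope * neutral)
        for name, (neutral, functional) in genes.items()
    }


# Hypothetical counts, for illustration only.
toy_counts = {
    "GENE_A": (100, 10),
    "GENE_B": (200, 20),
    "GENE_C": (300, 60),
    "GENE_D": (400, 30),
}
scores = intolerance_residuals(toy_counts)

# List genes from most to least intolerant (lowest residual first).
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: {score:+.1f}")
```

In this toy data GENE_D has the lowest residual, so it would rank as the gene with the least common functional variation relative to expectation.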

COSMIC CGC

Description: The COSMIC Cancer Gene Census is an ongoing effort to catalogue those genes for which mutations have been causally implicated in cancer. The census is not static; it is updated regularly, as needed. Currently, more than 1% of all human genes are implicated via mutation in cancer. Of these, approximately 90% have somatic mutations in cancer, 20% bear germline mutations that predispose to cancer and 10% show both somatic and germline mutations.
Annotated by: Seave
Current version: 20170403
Last updated: 2017-04-03
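
The percentages in the Cancer Gene Census description overlap, because the 10% of genes with both mutation types are counted in the somatic and the germline figures. A short inclusion-exclusion check, using only the approximate figures quoted above, shows that they are consistent with each other:

```python
# Approximate percentages of census genes, as quoted in the description.
somatic = 90   # genes with somatic mutations in cancer
germline = 20  # genes with germline mutations that predispose to cancer
both = 10      # genes showing both somatic and germline mutations

# Inclusion-exclusion: subtract the overlap that was counted twice.
either = somatic + germline - both
somatic_only = somatic - both
germline_only = germline - both

print(either, somatic_only, germline_only)  # prints "100 80 10"
```

So roughly 80% of census genes are somatic only, 10% germline only and 10% both, which accounts for all of them.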

MITOMAP

Description: The Mitomap database has been an important resource for information about human mitochondrial DNA (mtDNA) for researchers, clinicians, and genetic counselors for the past 25 years. Essential information about the mitochondrial reference sequence is provided, along with an extensive compilation of mtDNA variants. The Mitomap curators search research literature for published reports of mitochondrial DNA variants and index those variants in the database. Those variants that are reported as having possible association with disease are noted. A new addition to Mitomap is the inclusion of data from full-length human mtDNA sequences in GenBank.
Annotated by: Seave
Current version: 20170330
Last updated: 2017-04-03

Orphanet

Description: Orphanet is the reference portal for information on rare diseases and orphan drugs, for all audiences. Orphanet's aim is to help improve the diagnosis, care and treatment of patients with rare diseases.
Annotated by: Seave
Current version: 5/2016
Last updated: 2016-06-09

CADD

Description: CADD is a tool for scoring the deleteriousness of single nucleotide variants as well as insertion/deletion variants in the human genome. While many variant annotation and scoring tools exist, most annotations tend to exploit a single information type (e.g. conservation) and/or are restricted in scope (e.g. to missense changes). Thus, a broadly applicable metric that objectively weights and integrates diverse information is needed. Combined Annotation Dependent Depletion (CADD) is a framework that integrates multiple annotations into one metric by contrasting variants that survived natural selection with simulated mutations.
Annotated by: GEMINI
Current version: v1.3
Last updated: 2015-11-22

HGVS

Description: Biological sequence variants are commonly represented in scientific literature, clinical reports and databases of variation using the mutation nomenclature guidelines endorsed by the Human Genome Variation Society (HGVS).
Annotated by: VEP
Current version: Calculated by VEP v79
Last updated: 2015-08-24

KCCG Exome VAFs

Description: Every sequencing platform and bioinformatics pipeline has its own biases. Seave uses an internal allele frequency database of genomes and exomes sequenced at the KCCG to highlight variants that are frequent in the human population but have not been reported by other more restricted allele frequency databases, or variants that are the result of the processes at the KCCG.
Annotated by: Seave
Current version: 17/7/15
Last updated: 2015-07-17

KCCG Genome VAFs

Description: Every sequencing platform and bioinformatics pipeline has its own biases. Seave uses an internal allele frequency database of genomes and exomes sequenced at the KCCG to highlight variants that are frequent in the human population but have not been reported by other more restricted allele frequency databases, or variants that are the result of the processes at the KCCG.
Annotated by: Seave
Current version: 17/7/15
Last updated: 2015-07-17
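
The filtering idea behind the two KCCG allele frequency entries above can be sketched as follows: flag a variant when it is frequent in the internal database but rare or absent in the public ones. The field names, variant identifiers and both thresholds are hypothetical and chosen only for illustration; this is not Seave's actual query logic.

```python
def flag_internal_only(variants, internal_min=0.05, public_max=0.01):
    """Return ids of variants frequent internally but rare or absent publicly.

    Each variant is a dict with an "id", an "internal_af" and a list of
    "public_afs" in which None means "not reported by that database".
    Both thresholds are illustrative assumptions.
    """
    flagged = []
    for variant in variants:
        reported = [af for af in variant["public_afs"] if af is not None]
        highest_public = max(reported, default=0.0)
        if variant["internal_af"] >= internal_min and highest_public <= public_max:
            flagged.append(variant["id"])
    return flagged


# Hypothetical variants, for illustration only.
toy_variants = [
    {"id": "var1", "internal_af": 0.30, "public_afs": [None, None]},   # likely pipeline artefact
    {"id": "var2", "internal_af": 0.30, "public_afs": [0.25, 0.28]},   # genuinely common
    {"id": "var3", "internal_af": 0.001, "public_afs": [None, 0.0]},   # rare everywhere
]
print(flag_internal_only(toy_variants))  # prints "['var1']"
```

A variant flagged this way is either common but under-reported elsewhere, or a systematic product of the local sequencing and analysis process; in both cases it is less likely to be the rare variant of interest.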

PolyPhen2

Description: PolyPhen-2 (Polymorphism Phenotyping v2) is a tool that predicts the possible impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations.
Annotated by: VEP
Current version: v2.2.2 (calculated by VEP v79)
Last updated: 2015-05-29

Transcript

Description: The transcript which the variant impacts to the highest degree.
Annotated by: VEP
Current version: ENSEMBL v79
Last updated: 2015-05-29

LOFTEE LoF

Description: LOFTEE (Loss-Of-Function Transcript Effect Estimator) is a VEP plugin that identifies loss-of-function (LoF) variation.
Annotated by: VEP
Current version: 9/2014
Last updated: 2015-05-29

SIFT

Description: SIFT predicts whether an amino acid substitution affects protein function. SIFT prediction is based on the degree of conservation of amino acid residues in sequence alignments derived from closely related sequences, collected through PSI-BLAST. SIFT can be applied to naturally occurring nonsynonymous polymorphisms or laboratory-induced missense mutations.
Annotated by: VEP
Current version: v5.0.2 (calculated by VEP v79)
Last updated: 2015-05-29

Gene

Description: The gene which the variant impacts to the highest degree.
Annotated by: VEP
Current version: ENSEMBL v79
Last updated: 2015-05-29

Impact

Description: The highest impact the variant has on a gene.
Annotated by: VEP
Current version: Calculated by VEP v79
Last updated: 2015-05-29

PROVEAN

Description: PROVEAN (Protein Variation Effect Analyzer) is a software tool which predicts whether an amino acid substitution or indel has an impact on the biological function of a protein. PROVEAN is useful for filtering sequence variants to identify nonsynonymous or indel variants that are predicted to be functionally important.
Annotated by: dbNSFP
Current version: v1.1
Last updated: 2015-03-26

MetaLR

Description: MetaLR is very similar to MetaSVM, with similar performance, but its model is far more interpretable.
Annotated by: dbNSFP
Current version: v1.0
Last updated: 2015-03-26

FATHMM

Description: FATHMM (Functional Analysis through Hidden Markov Models) predicts the functional consequences of both coding variants, i.e. non-synonymous single nucleotide variants (nsSNVs), and non-coding variants.
Annotated by: dbNSFP
Current version: Unknown (comes from dbNSFP v2.9)
Last updated: 2015-03-26

MetaSVM

Description: MetaSVM was developed by Coco Dong in collaboration with Dr. Xiaoming Liu. It is composed of three steps: (1) perform imputation for whole-exome variants to fill in missing scores for SIFT, PolyPhen, MutationAssessor and so on; (2) normalize all scores to the 0-1 range; (3) train a prediction model with a radial SVM, using all available scores and some population genetics parameters, and then apply the model to whole-exome variants.
Annotated by: dbNSFP
Current version: v1.0
Last updated: 2015-03-26
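
The two preprocessing steps in the MetaSVM description, imputing missing scores and rescaling every score to the 0-1 range, can be sketched as below. Mean imputation and min-max scaling are assumptions made for this illustration, since the description does not say which methods are used, and the score values are hypothetical. The final step, training a radial-kernel SVM on the prepared scores, is left out because it needs a machine learning library.

```python
from statistics import mean


def impute_missing(column):
    """Replace None with the mean of the observed values (an assumed method)."""
    observed = [value for value in column if value is not None]
    fill = mean(observed)
    return [fill if value is None else value for value in column]


def min_max(column):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    low, high = min(column), max(column)
    if high == low:
        return [0.0 for _ in column]
    return [(value - low) / (high - low) for value in column]


def prepare_features(scores):
    """Impute, then normalise, each tool's column of per-variant scores."""
    return {tool: min_max(impute_missing(column)) for tool, column in scores.items()}


# Hypothetical per-variant scores; None marks a missing prediction.
raw_scores = {
    "SIFT": [0.0, 0.5, None, 1.0],
    "PolyPhen": [0.2, None, 0.6, 1.0],
}
features = prepare_features(raw_scores)
print(features["SIFT"])  # prints "[0.0, 0.5, 0.5, 1.0]"
```

Each row across the prepared columns would then form one feature vector for the classifier.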

GERP++

Description: GERP identifies constrained elements in multiple alignments by quantifying substitution deficits. These deficits represent substitutions that would have occurred if the element were neutral DNA, but did not occur because the element has been under functional constraint.
Annotated by: dbNSFP
Current version: GERP++ (v2.0)
Last updated: 2015-03-26

UniProt

Description: UniProt is a comprehensive, high-quality and freely accessible database of protein sequence and functional information, many entries being derived from genome sequencing projects. It contains a large amount of information about the biological function of proteins derived from the research literature.
Annotated by: dbNSFP
Current version: 9/2013
Last updated: 2015-03-26

ExAC

Description: The Exome Aggregation Consortium (ExAC) is a coalition of investigators seeking to aggregate and harmonize exome sequencing data from a wide variety of large-scale sequencing projects, and to make summary data available for the wider scientific community.
Annotated by: GEMINI
Current version: 0.2
Last updated: 2015

ESP

Description: The goal of the NHLBI GO Exome Sequencing Project (ESP) is to discover novel genes and mechanisms contributing to heart, lung and blood disorders by pioneering the application of next-generation sequencing of the protein coding regions of the human genome across diverse, richly-phenotyped populations and to share these datasets and findings with the scientific community to extend and enrich the diagnosis, management and treatment of heart, lung and blood disorders.
Annotated by: GEMINI
Current version: ESP6500SI.snps_indels.vcf (3/2013)
Last updated: 2015

ENCODE

Description: The ENCODE (Encyclopedia of DNA Elements) Consortium is an international collaboration of research groups funded by the National Human Genome Research Institute (NHGRI). The goal of ENCODE is to build a comprehensive parts list of functional elements in the human genome, including elements that act at the protein and RNA levels, and regulatory elements that control cells and circumstances in which a gene is active.
Annotated by: GEMINI
Current version: Unknown (comes from GEMINI v0.11)
Last updated: 2015

1000 Genomes

Description: The 1000 Genomes Project ran between 2008 and 2015, creating the largest public catalogue of human variation and genotype data.
Annotated by: GEMINI
Current version: 1000G Phase3 (v20130502)
Last updated: 2015

dbSNP

Description: The Single Nucleotide Polymorphism Database (dbSNP) is a free public archive for genetic variation within and across different species developed and hosted by the National Center for Biotechnology Information (NCBI) in collaboration with the National Human Genome Research Institute (NHGRI).
Annotated by: GEMINI
Current version: v141
Last updated: 2015