Annotation Level Columns

  • Impact: SnpEff classification of the severity
  • Effect: Consequence type of this variation for the most damaging effect
  • Canonical Transcript Effect: Consequence type of this variation for the most damaging effect associated with the canonical transcript
  • Gene Name: HGNC gene for the most damaging effect
  • Transcript Stable Id: Transcript for the most damaging effect
  • Has CCDS Transcript: Boolean indicating whether the variant has a CCDS transcript.
  • HGVS_c: annotation from ClinEff
  • HGVS_p: annotation from ClinEff
  • Polyphen Humdiv Score: PolyPhen-2 HumDiv Score for missense variants from Ensembl server (ATAV outputs the most damaging score across all transcripts)
  • Polyphen Humdiv Prediction: PolyPhen-2 HumDiv Classification for missense variants from Ensembl server (ATAV outputs the most damaging score across all transcripts)
  • Polyphen Humvar Score: PolyPhen-2 HumVar Score for missense variants from Ensembl server (ATAV outputs the most damaging score across all transcripts)
  • Polyphen Humvar Prediction: PolyPhen-2 HumVar Classification for missense variants from Ensembl server (ATAV outputs the most damaging score across all transcripts)
  • Consequence annotations: Effect|Gene|Transcript|HGVS_c|HGVS_p|Polyphen_Humdiv|Polyphen_Humvar: The predicted amino acid change for all transcripts affected by this variant
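
The pipe-delimited layout of the Consequence annotations column can be unpacked with a few lines of Python. This is only a sketch: the field order is taken from the description above, but the separator between entries for different transcripts (`entry_sep`) and the sample values are assumptions, not something this page specifies.

```python
# Field order taken from the "Consequence annotations" description above.
FIELDS = (
    "effect", "gene", "transcript", "hgvs_c", "hgvs_p",
    "polyphen_humdiv", "polyphen_humvar",
)


def parse_consequence(entry: str) -> dict:
    """Split one pipe-delimited consequence entry into named fields."""
    parts = entry.split("|")
    if len(parts) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} fields, got {len(parts)}: {entry!r}"
        )
    return dict(zip(FIELDS, parts))


def parse_consequences(value: str, entry_sep: str = ";") -> list:
    """Parse a cell that holds one entry per affected transcript.

    ASSUMPTION: entry_sep is hypothetical; check which separator
    ATAV actually writes between entries before relying on it.
    """
    return [parse_consequence(e) for e in value.split(entry_sep) if e]
```

Calling `parse_consequences(cell)` returns one dictionary per transcript, so the per-transcript PolyPhen values can be compared directly.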

KnownVar dataset:

  • HGMD DM Site Count: Tally of how many variant-disease associations are reported in HGMD at the specific variant of interest site. Includes precise indel overlaps.
  • HGMD DM 2bpflanks Count: Tally of how many variant-disease associations are reported in HGMD at the specific variant of interest site and within 2bp flanking. Includes precise indel overlaps.
  • HGMD Disease: List of disease name(s) associated with the HGMD entry for this variant. Includes precise indel overlaps.
  • HGMD PMID: List of PubMed IDs for the publication(s) that resulted in the HGMD entries corresponding to the variant. Includes precise indel overlaps.
  • HGMD Class: List of HGMD classifications of variant (for more info see: http://www.hgmd.cf.ac.uk/ac/index.php). Includes precise indel overlaps.
  • ClinVar PLP Site Count: Tally of how many variant-disease associations are reported in ClinVar at the specific variant of interest site. Includes precise indel overlaps.
  • ClinVar PLP 2bpflanks Count: Tally of how many variant-disease associations are reported in ClinVar at the specific variant of interest site and within 2bp flanking. Includes precise indel overlaps.
  • ClinVar PLP 25bpflanks Count: Tally of how many variant-disease associations are reported in ClinVar at the specific variant of interest site and within 25bp flanking. Includes precise indel overlaps.
  • ClinVar ClinRevStar: Star level for a given ClinVar CLNREVSTAT.
  • ClinVar ClinSig: Clinical significance for this single variant.
  • ClinVar ClinSigConf: Conflicting clinical significance for this single variant.
  • ClinVar DiseaseName: ClinVar's preferred disease name for the concept specified by disease identifiers in CLNDISDB.
  • ClinVar Pathogenic Indel Count: indel "pathogenic/likely pathogenic" variants.
  • Clinvar Pathogenic CNV Count: CNV "pathogenic/likely pathogenic" variants.
  • ClinVar Pathogenic SNV Splice Count: Splice "pathogenic/likely pathogenic" variants.
  • ClinVar Pathogenic SNV Nonsense Count: Nonsense "pathogenic/likely pathogenic" variants.
  • ClinVar Pathogenic SNV Missense Count: Missense "pathogenic/likely pathogenic" variants.
  • ClinGen: Flags genes considered to be 'Haploinsufficient' by the ClinGen curation committee.
  • OMIM Disease: List of OMIM disease associations linked to the Gene name linked to the variant (for more info see: http://www.omim.org/).
  • OMIM Inheritance: List of OMIM Inheritance patterns from all disease associations.
  • ACMG: ACMG v3 disease/phenotype associations.
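
The difference between the Site Count and the 2bpflanks/25bpflanks Count columns above can be illustrated with a small window count. This is a sketch under simplifying assumptions: each reported association is represented by a single position on the same chromosome, and indel overlaps are not modelled. The function name and input layout are hypothetical and are not ATAV's implementation.

```python
from bisect import bisect_left, bisect_right


def count_within_flank(sorted_positions, site, flank=0):
    """Count known entries located within `flank` bp of `site`.

    sorted_positions: ascending list of positions, one element per
    reported variant-disease association (ASSUMPTION: SNVs only).
    flank=0 corresponds to a site count; flank=2 or flank=25 mimic
    the flanking-window columns.
    """
    lo = bisect_left(sorted_positions, site - flank)
    hi = bisect_right(sorted_positions, site + flank)
    return hi - lo
```

A site with no entry at its exact position can still have a non-zero flank count, which is why the flank columns are reported separately.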

ExAC dataset:

  • ExAC global maf: Global minor allele frequency among the collective ExAC cohort of ~60K samples (ExAC release 0.3)
  • ExAC global gts: Global genotype distribution among the collective ExAC cohort of ~60K samples (ExAC release 0.3)
  • ExAC afr maf: African/African American minor allele frequency among the collective ExAC cohort (ExAC release 0.3)
  • ExAC afr gts: African/African American genotype distribution among the collective ExAC cohort (ExAC release 0.3)
  • ExAC amr maf: Latino minor allele frequency among the collective ExAC cohort (ExAC release 0.3)
  • ExAC amr gts: Latino genotype distribution among the collective ExAC cohort (ExAC release 0.3)
  • ExAC eas maf: East Asian minor allele frequency among the collective ExAC cohort (ExAC release 0.3)
  • ExAC eas gts: East Asian genotype distribution among the collective ExAC cohort (ExAC release 0.3)
  • ExAC sas maf: South Asian minor allele frequency among the collective ExAC cohort (ExAC release 0.3)
  • ExAC sas gts: South Asian genotype distribution among the collective ExAC cohort (ExAC release 0.3)
  • ExAC fin maf: Finnish minor allele frequency among the collective ExAC cohort (ExAC release 0.3)
  • ExAC fin gts: Finnish genotype distribution among the collective ExAC cohort (ExAC release 0.3)
  • ExAC nfe maf: Non-Finnish European minor allele frequency among the collective ExAC cohort (ExAC release 0.3)
  • ExAC nfe gts: Non-Finnish European genotype distribution among the collective ExAC cohort (ExAC release 0.3)
  • ExAC oth maf: Other population minor allele frequency among the collective ExAC cohort (ExAC release 0.3)
  • ExAC oth gts: Other population genotype distribution among the collective ExAC cohort (ExAC release 0.3)
  • ExAC vqslod: The vqslod provided for this variant by the ExAC consortium
  • ExAC Mean Coverage: The mean coverage at this site among the ExAC population
  • ExAC Sample Covered 10x: The number of samples with at least 10-fold coverage at this site among the ExAC population
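
The maf and gts columns above are related: a minor allele frequency can be derived from a genotype distribution. The sketch below assumes a diploid autosomal site and takes the three genotype counts as separate integers, because this page does not specify how the gts column is encoded. The function name is hypothetical.

```python
def minor_allele_frequency(hom_ref: int, het: int, hom_alt: int) -> float:
    """Derive a minor allele frequency from diploid genotype counts.

    ASSUMPTION: autosomal, biallelic site; each sample contributes
    two alleles. Sex chromosomes would need separate handling.
    """
    total_alleles = 2 * (hom_ref + het + hom_alt)
    if total_alleles == 0:
        raise ValueError("no genotyped samples at this site")
    alt_af = (het + 2 * hom_alt) / total_alleles
    # The minor allele is whichever of ref/alt is rarer.
    return min(alt_af, 1.0 - alt_af)
```

With 90 homozygous-reference, 8 heterozygous and 2 homozygous-alternate samples, the alternate allele count is 8 + 2×2 = 12 out of 200 alleles, so the frequency is 0.06.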

RVIS dataset:

  • IGM-Roche-Avg%GeneCov "A population averaged estimate of the percentage of the protein-coding sequence for a gene that's covered with at least 10-fold coverage across IGM sequenced exomes (Roche kit)".

A higher estimate indicates that the protein-coding sequence of the gene is better captured (on average) using the Roche kit.

  • 0.1%RVIS[EVS] "EVS-based RVIS as published in PLoS Genetics paper (Petrovski et al. 2013)"
  • 0.1%RVIS%[EVS] "EVS-based RVIS percentile as published in PLoS Genetics paper (Petrovski et al. 2013)"

As published, lower RVIS and percentile scores indicate that a gene is increasingly intolerant relative to the rest of the assessed human protein-coding genes.

Given the size of the EVS cohort and of the protein-coding genes, some genes (Edge Case genes) have insufficient data to resolve their RVIS with confidence. For these genes, we recommend considering their OEratio as an alternative estimate of their genic intolerance.

  • OEratio%tile[EVS] "Alternative EVS-based genic intolerance score for EdgeCase genes flagged by indicator as Y (See: http://genic-intolerance.org/about.jsp)"
  • GenicConstraint[EVS] "EVS-based Constraint (missense z) scores as published in the Nature Genetics paper (Samocha et al. 2014)"

A percentile estimate of how constrained the gene is relative to the rest of the protein-coding exome.

  • 1.0%ncRVIS[IGM] "IGM WGS-based noncoding RVIS (ncRVIS) reflecting a gene's 5', 3' and promoter sequence intolerance to variation (Petrovski et al. 2015 [26332131])"
  • 1.0%ncRVIS%tile[IGM] "IGM WGS-based noncoding RVIS (ncRVIS) percentile reflecting a gene's 5', 3' and promoter sequence intolerance to variation (Petrovski et al. 2015 [26332131])"

A percentile estimate of how intolerant the noncoding exome sequence of a protein-coding gene is relative to the noncoding sequence of other protein-coding genes. Lower ncRVIS and percentile indicate increasingly intolerant noncoding (UTR and 250bp promoter) sequence. Genes with intolerant noncoding sequence have been described as being dosage-sensitive genes.

  • ncGERP "GERP++ based non-coding genic conservation score reflecting the average GERP++ score for a protein-coding gene's noncoding exome sequence + 250bp promoter (Petrovski et al. 2015 [26332131])"
  • ncGERP%tile "GERP++ based non-coding genic conservation percentile reflecting how conserved a protein-coding gene's noncoding exome sequence + 250bp promoter is based on all protein-coding genes (Petrovski et al. 2015 [26332131])"

A percentile estimate of how overall conserved the noncoding exome sequence of a protein-coding gene is relative to the noncoding sequence of other protein-coding genes. Higher ncGERP scores and lower ncGERP percentiles indicate increasingly conserved noncoding (UTR and 250bp promoter) sequence. Genes with highly conserved noncoding sequence have been described as being dosage-sensitive genes.

  • pcGERP "GERP++ based protein-coding genic conservation score reflecting the average GERP++ score for a protein-coding gene's coding sequence (Petrovski et al. 2015 [26332131])"
  • pcGERP%tile "GERP++ based protein-coding genic conservation percentile reflecting how conserved a protein-coding gene's coding sequence is based on all protein-coding genes (Petrovski et al. 2015 [26332131])"

A percentile estimate of how overall conserved the protein-coding exome sequence of a gene is relative to the protein-coding sequence of other genes. Higher pcGERP scores and lower pcGERP percentiles indicate increasingly conserved protein-coding genes.

The RVIS has been reformulated using the ExAC cohort (accessed: January 13th 2015 [updated scores for release 0.3]). Note that, in addition to using a larger population of sequenced samples, this updated genic score differs from the original implementation in three major ways: the 'common' MAF is defined as 0.05%, the score leverages the stratified ethnicity data provided within ExAC, and X chromosome genes are assessed independently of the autosomal genes.

  • OEratio%tile[ExAC] "Alternative ExAC-based genic intolerance score for EdgeCase genes flagged by indicator as Y"

An OEratio, as previously described, now constructed on the ExAC cohort.
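
The recommendation to fall back on the OEratio for Edge Case genes can be expressed as a small selection rule. This is a sketch only: the argument names are hypothetical, and the Edge Case indicator is assumed to be a Y/N string as suggested by "flagged by indicator as Y" above.

```python
from typing import Optional


def intolerance_percentile(
    rvis_percentile: Optional[float],
    oe_ratio_percentile: Optional[float],
    edge_case_flag: str,
) -> Optional[float]:
    """Pick the genic intolerance percentile to interpret for a gene.

    ASSUMPTION: edge_case_flag is "Y" for Edge Case genes; any other
    value means the RVIS percentile is considered usable.
    """
    if edge_case_flag.strip().upper() == "Y":
        return oe_ratio_percentile
    return rvis_percentile
```

The same rule applies whether the scores are EVS-based or ExAC-based, as long as both percentiles come from the same cohort.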


subRVIS dataset:

  • subRVIS Domain Name
  • subRVIS Domain Score
  • subRVIS Domain OEratio Percentile
  • subRVIS Exon Name
  • subRVIS Exon Score
  • subRVIS Exon OEratio Percentile

Note: the scores are ExAC 0.3 based.


LIMBR dataset:

  • LIMBR Domain Name
  • LIMBR Domain Score
  • LIMBR Domain Percentile
  • LIMBR Exon Name
  • LIMBR Exon Score
  • LIMBR Exon Percentile

MGI dataset:

  • MGI Phenotypes: Lists the distinct 'high-level' mouse phenotypes reported for an HGNC gene name, with multiple entries separated using ATAV standards. Serves as a reference.
  • MGI Essential: A 0/1 indicator of whether the ATAV gene has been linked to lethality when disrupted in mice. Serves as an enrichment reference.
  • MGI Seizure: A 0/1 indicator of whether the ATAV gene has been linked to seizures when disrupted in mice. Serves as an enrichment reference.

DiscovEHR dataset:

  • DiscovEHR AF: Allele frequency. Values are exact for MAF > 0.001; frequencies below 0.001 are binned in the source data and are reported here as 0.00099.
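
Because binned DiscovEHR frequencies are all reported as 0.00099, that value acts as a sentinel rather than a measurement. The sketch below shows one way to tell the two cases apart; the constant and function names are hypothetical.

```python
import math

# Value written for frequencies that the source data only reports as "< 0.001".
BINNED_SENTINEL = 0.00099


def is_binned_af(af: float) -> bool:
    """Return True when the AF is the binned placeholder, not an exact value."""
    return math.isclose(af, BINNED_SENTINEL, rel_tol=0.0, abs_tol=1e-9)


def passes_max_af(af: float, max_af: float) -> bool:
    """Apply an upper AF cutoff, treating binned values as 0.00099."""
    return af <= max_af
```

One consequence worth keeping in mind: any cutoff below 0.00099 rejects every binned variant, since their true frequency is unknown beyond being under 0.001.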

MTR dataset:

  • MTR:
  • MTR FDR:
  • MTR Centile:

MTR scores are output only when the variant's effect starts with 'NON_SYNONYMOUS'.
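
The prefix rule for MTR output amounts to a simple string check. A minimal sketch, assuming the effect is available as a plain string; the function name and the effect values used in the usage note are illustrative only.

```python
MTR_EFFECT_PREFIX = "NON_SYNONYMOUS"


def has_mtr_scores(effect: str) -> bool:
    """Return True when MTR columns are expected to be populated."""
    return effect.startswith(MTR_EFFECT_PREFIX)
```

For example, an effect such as NON_SYNONYMOUS_CODING would qualify, while SYNONYMOUS_CODING would not, so empty MTR columns for the latter are expected rather than a sign of missing data.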


LOFTEE dataset:

  • LOFTEE-HC in CCDS: Indicates whether the variant is annotated as high-confidence by LOFTEE in any CCDS transcript. Variants not annotated by LOFTEE (i.e. start lost, stop lost, and non-LoF) receive NA.