Annotation Level Columns

  • Transcript Stable Id: first transcript in which most damaging function was annotated
  • Effect: most damaging effect of output variant, annotated from ClinEff
  • HGVS_c: annotation from ClinEff
  • HGVS_p: annotation from ClinEff
  • Polyphen Humdiv Score: PolyPhen-2 HumDiv Score for missense variants from Ensembl server (ATAV outputs most damaging score within all transcripts)
  • Polyphen Humdiv Prediction: PolyPhen-2 HumDiv Classification for missense variants from Ensembl server (ATAV outputs most damaging score within all transcripts)
  • Polyphen Humdiv Score (CCDS): PolyPhen-2 HumDiv Score for missense variants from Ensembl server (ATAV outputs most damaging score within CCDS transcripts)
  • Polyphen Humdiv Prediction (CCDS): PolyPhen-2 HumDiv Classification for missense variants from Ensembl server (ATAV outputs most damaging score within CCDS transcripts)
  • Polyphen Humvar Score: PolyPhen-2 HumVar Score for missense variants from Ensembl server (ATAV outputs most damaging score within all transcripts)
  • Polyphen Humvar Prediction: PolyPhen-2 HumVar Classification for missense variants from Ensembl server (ATAV outputs most damaging score within all transcripts)
  • Polyphen Humvar Score (CCDS): PolyPhen-2 HumVar Score for missense variants from Ensembl server (ATAV outputs most damaging score within CCDS transcripts)
  • Polyphen Humvar Prediction (CCDS): PolyPhen-2 HumVar Classification for missense variants from Ensembl server (ATAV outputs most damaging score within CCDS transcripts)
  • Gene Name: Corresponding Gene name for variant effect
  • UpToDate Gene Name
  • Has CCDS Transcript: Boolean indicating whether variant has CCDS transcript.
  • All Effect Gene Transcript HGVS_p Polyphen_Humdiv Polyphen_Humvar: The predicted amino acid change for all transcripts affected by this variant
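
The "most damaging score" rule described for the PolyPhen columns can be sketched as follows. This is a minimal illustration and not ATAV's implementation: the `TranscriptScore` record, its fields, and `most_damaging_score` are hypothetical names, and the sketch assumes that a higher PolyPhen-2 score means a more damaging prediction.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class TranscriptScore:
    """Hypothetical per-transcript PolyPhen-2 record (not an ATAV type)."""
    transcript_id: str
    is_ccds: bool
    score: float  # PolyPhen-2 score in [0, 1]; higher is assumed more damaging


def most_damaging_score(records: Iterable[TranscriptScore],
                        ccds_only: bool = False) -> Optional[float]:
    """Return the highest score, optionally restricted to CCDS transcripts."""
    scores = [r.score for r in records if r.is_ccds or not ccds_only]
    return max(scores) if scores else None


# Placeholder transcripts, for illustration only.
records = [
    TranscriptScore("ENST_A", is_ccds=True, score=0.42),
    TranscriptScore("ENST_B", is_ccds=False, score=0.97),
]
print(most_damaging_score(records))                  # across all transcripts
print(most_damaging_score(records, ccds_only=True))  # CCDS transcripts only
```

The `ccds_only` switch mirrors the difference between the plain and "(CCDS)" columns: the same selection rule applied to a narrower set of transcripts.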

EVS dataset:

  • Evs All Maf: Minor allele frequency among the EVS population
  • Evs All Genotype Count: Genotype Distribution among the EVS population
  • Evs Filter Status: PASS if passes EVS's QC criteria; FAIL if not, which means the variant was found in the EVS samples but was excluded from their analysis

ExAC dataset:

  • ExAC global maf: Global minor allele frequency among the collective ExAC cohort of ~60K samples (ExAC release 0.3)
  • ExAC global gts: Global genotype distribution among the collective ExAC cohort of ~60K samples (ExAC release 0.3)
  • ExAC afr maf: African/African American minor allele frequency among the collective ExAC cohort (ExAC release 0.3)
  • ExAC afr gts: African/African American genotype distribution among the collective ExAC cohort (ExAC release 0.3)
  • ExAC amr maf: Latino minor allele frequency among the collective ExAC cohort (ExAC release 0.3)
  • ExAC amr gts: Latino genotype distribution among the collective ExAC cohort (ExAC release 0.3)
  • ExAC eas maf: East Asian minor allele frequency among the collective ExAC cohort (ExAC release 0.3)
  • ExAC eas gts: East Asian genotype distribution among the collective ExAC cohort (ExAC release 0.3)
  • ExAC sas maf: South Asian minor allele frequency among the collective ExAC cohort (ExAC release 0.3)
  • ExAC sas gts: South Asian genotype distribution among the collective ExAC cohort (ExAC release 0.3)
  • ExAC fin maf: Finnish minor allele frequency among the collective ExAC cohort (ExAC release 0.3)
  • ExAC fin gts: Finnish genotype distribution among the collective ExAC cohort (ExAC release 0.3)
  • ExAC nfe maf: Non-Finnish European minor allele frequency among the collective ExAC cohort (ExAC release 0.3)
  • ExAC nfe gts: Non-Finnish European genotype distribution among the collective ExAC cohort (ExAC release 0.3)
  • ExAC oth maf: Other population minor allele frequency among the collective ExAC cohort (ExAC release 0.3)
  • ExAC oth gts: Other population genotype distribution among the collective ExAC cohort (ExAC release 0.3)
  • ExAC vqslod: The vqslod provided for this variant by the ExAC consortium
  • ExAC Mean Coverage: The mean coverage at this site among the ExAC population
  • ExAC Sample Covered 10x: The number of samples with at least 10-fold coverage at this site among the ExAC population

RVIS dataset:

  • IGM-Roche-Avg%GeneCov "A population-averaged estimate of the percentage of the protein-coding sequence for a gene that is covered with at least 10-fold coverage across IGM sequenced exomes (Roche kit)".

A higher estimate indicates that the protein-coding sequence of the gene is better captured (on average) using the Roche kit.

  • 0.1%RVIS[EVS] "EVS-based RVIS as published in PLoS Genetics paper (Petrovski et al. 2013)"
  • 0.1%RVIS%[EVS] "EVS-based RVIS percentile as published in PLoS Genetics paper (Petrovski et al. 2013)"

As published, a lower RVIS and a lower percentile score indicate that a gene is increasingly intolerant relative to the rest of the assessed human protein-coding genes.

Given the size of the EVS cohort and of the protein-coding genes, some genes (Edge Case genes) have insufficient resolution to be confident of their RVIS. For these genes, we recommend considering their OEratio as an alternative estimate of their genic intolerance.
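
The recommendation above can be sketched as a small selection rule. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, and the Edge Case indicator is assumed to arrive as the string "Y" for flagged genes, as in the OEratio column description.

```python
from typing import Optional


def intolerance_percentile(rvis_percentile: Optional[float],
                           oe_ratio_percentile: Optional[float],
                           edge_case: str) -> Optional[float]:
    """Pick the percentile to consult for a gene.

    Genes flagged as Edge Case ("Y") use the OEratio percentile;
    all other genes use the RVIS percentile.
    """
    if edge_case.strip().upper() == "Y":
        return oe_ratio_percentile
    return rvis_percentile


# Placeholder values, for illustration only.
print(intolerance_percentile(12.5, 30.0, edge_case="N"))
print(intolerance_percentile(12.5, 30.0, edge_case="Y"))
```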

  • OEratio%tile[EVS] "Alternative EVS-based genic intolerance score for EdgeCase genes flagged by indicator as Y (See: http://genic-intolerance.org/about.jsp)"
  • GenicConstraint[EVS] "EVS-based Constraint (missense z) scores as published in the Nature Genetics paper (Samocha et al. 2014)"

A percentile estimate of how constrained the gene is relative to the rest of the protein-coding exome.

  • 1.0%ncRVIS[IGM] "IGM WGS-based noncoding RVIS (ncRVIS) reflecting a gene's 5', 3' and promoter sequence intolerance to variation (Petrovski et al. 2015 [26332131])"
  • 1.0%ncRVIS%tile[IGM] "IGM WGS-based noncoding RVIS (ncRVIS) percentile reflecting a gene's 5', 3' and promoter sequence intolerance to variation (Petrovski et al. 2015 [26332131])"

A percentile estimate of how intolerant the noncoding exome sequence of a protein-coding gene is relative to the noncoding sequence of other protein-coding genes. Lower ncRVIS and percentile indicate increasingly intolerant noncoding (UTR and 250bp promoter) sequence. Genes with intolerant noncoding sequence have been described as being dosage-sensitive genes.

  • ncGERP "GERP++ based non-coding genic conservation score reflecting the average GERP++ score for a protein-coding gene's noncoding exome sequence + 250bp promoter (Petrovski et al. 2015 [26332131])"
  • ncGERP%tile "GERP++ based non-coding genic conservation percentile reflecting how conserved a protein-coding gene's noncoding exome sequence + 250bp promoter is based on all protein-coding genes (Petrovski et al. 2015 [26332131])"

A percentile estimate of how conserved overall the noncoding exome sequence of a protein-coding gene is relative to the noncoding sequence of other protein-coding genes. Higher ncGERP scores and lower ncGERP percentiles indicate increasingly conserved noncoding (UTR and 250bp promoter) sequence. Genes with highly conserved noncoding sequence have been described as being dosage-sensitive genes.

  • pcGERP "GERP++ based protein-coding genic conservation score reflecting the average GERP++ score for a protein-coding gene's coding sequence (Petrovski et al. 2015 [26332131])"
  • pcGERP%tile "GERP++ based protein-coding genic conservation percentile reflecting how conserved a protein-coding gene's coding sequence is based on all protein-coding genes (Petrovski et al. 2015 [26332131])"

A percentile estimate of how overall conserved the protein-coding exome sequence of a gene is relative to the protein-coding sequence of other genes. Higher pcGERP scores and lower pcGERP percentiles indicate increasingly conserved protein-coding genes.

The RVIS has been reformulated using the ExAC cohort (accessed: January 13th 2015 [updated scores for release 0.3]). Note that, in addition to using a larger population of sequenced samples, this updated genic score differs from the original implementation in three major ways: the 'common' MAF is assigned as 0.05%; the score now leverages the stratified ethnicity data provided within ExAC; and X chromosome genes are assessed independently of the autosomal genes.

  • OEratio%tile[ExAC] "Alternative ExAC-based genic intolerance score for EdgeCase genes flagged by indicator as Y"

An OEratio, as previously described, now constructed on the ExAC cohort.


KnownVar dataset:

  • HGMDm2site: Flags how many HGMD variants (and their details) overlap the -2 position relative to the variant of interest. Multiple overlapping events are separated with a pipe "|".
  • HGMDm1site: Flags how many HGMD variants (and their details) overlap the -1 position relative to the variant of interest. Multiple overlapping events are separated with a pipe "|".
  • HGMD site: Tally of how many variant-disease associations are reported in HGMD at the specific variant of interest site. Includes precise indel overlaps.
  • HGMD Disease: List of disease name(s) associated with the HGMD entry for this variant. Includes precise indel overlaps.
  • HGMD PMID: List of Pubmed ID for the publication(s) that resulted in the HGMD entries corresponding to the variant. Includes precise indel overlaps.
  • HGMD Class: List of HGMD classifications of variant (for more info see: http://www.hgmd.cf.ac.uk/ac/index.php). Includes precise indel overlaps.
  • HGMDp1site: Flags how many HGMD variants (and their details) overlap the +1 position relative to the variant of interest. Multiple overlapping events are separated with a pipe "|".
  • HGMDp2site: Flags how many HGMD variants (and their details) overlap the +2 position relative to the variant of interest. Multiple overlapping events are separated with a pipe "|".
  • HGMD indel 9bpflanks: For indels this highlights how many indels of any classification are reported in HGMD at the site and within 9bp flanking region.
  • ClinVar: Tally of how many variant-disease associations are reported in ClinVar at the specific variant of interest site.
  • ClinVar Disease: List of disease name(s) corresponding to the ClinVar "ConceptID"(s) assigned to the ClinVar variant.
  • ClinVar Clinical Significance: List of values of clinical significance reported for this variant (for more info see: ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/README.txt).
  • ClinVar PMID: List of Pubmed ID for the publication(s) that resulted in the ClinVar entries corresponding to the variant.
  • ClinVar Other Ids: List of other identifiers or sources of information about this variant.
  • ClinVar pathogenic indels: This highlights how many "Pathogenic" indels are reported in ClinVar at the site and within the 9bp flanking region.
  • ClinVar all indels: This highlights how many indels of any classification are reported in ClinVar at the site and within the 9bp flanking region.
  • ClinVar Pathogenic Indel Count: Count of "pathogenic/likely pathogenic" indel variants.
  • ClinVar Pathogenic CNV Count: Count of "pathogenic/likely pathogenic" CNV variants.
  • ClinVar Pathogenic SNV Splice Count: Count of "pathogenic/likely pathogenic" splice variants.
  • ClinVar Pathogenic SNV Nonsense Count: Count of "pathogenic/likely pathogenic" nonsense variants.
  • ClinVar Pathogenic SNV Missense Count: Count of "pathogenic/likely pathogenic" missense variants.
  • ClinGen: Flags genes considered to be 'Haploinsufficient' by the ClinGen curation committee.
  • ClinGen HaploinsufficiencyDesc: ClinGen Dosage Sensitivity Map (see: ftp://ftp.ncbi.nlm.nih.gov/pub/dbVar/clingen/ClinGen_gene_curation_list.tsv). Restricted to Haploinsufficiency curations.

"Unlikely" = Dosage sensitivity unlikely; "No evidence" = No evidence available; "Little evidence" = Little evidence for dosage pathogenicity; "Some evidence" = Some evidence for dosage pathogenicity; "Sufficient evidence" = Sufficient evidence for dosage pathogenicity; "Recessive evidence" = Gene associated with autosomal recessive phenotype.

  • ClinGen TriplosensitivityDesc: ClinGen Dosage Sensitivity Map (see: ftp://ftp.ncbi.nlm.nih.gov/pub/dbVar/clingen/ClinGen_gene_curation_list.tsv). Restricted to Triplosensitivity curations.
  • OMIM Disease: List of OMIM disease associations linked to the Gene name linked to the variant (for more info see: http://www.omim.org/).
  • RecessiveCarrier: Curated indicator (1 = yes) of known Carrier disease genes as curated by IGM team members: Ayal Gussow and Matt Halvorsen March 2014.
  • ACMG: The annotation associated with a gene consists of three pieces of information separated by a pipe delimiter. All information is taken from Table 1 of Green et al. 2013. First field = inheritance model | second field = Type of Variation to Report | third field = ACMG associated disorder.
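
The three-field ACMG annotation described above can be split as in the following sketch. The `AcmgAnnotation` type and `parse_acmg` function are hypothetical helpers, not part of ATAV, and the sample value is a made-up placeholder rather than a real entry from Green et al. 2013.

```python
from typing import NamedTuple


class AcmgAnnotation(NamedTuple):
    """Hypothetical container for the three pipe-delimited ACMG fields."""
    inheritance_model: str
    variation_to_report: str
    disorder: str


def parse_acmg(value: str) -> AcmgAnnotation:
    """Split an ACMG annotation into its three fields, trimming whitespace."""
    fields = [field.strip() for field in value.split("|")]
    if len(fields) != 3:
        raise ValueError(f"expected 3 pipe-delimited fields, got {len(fields)}")
    return AcmgAnnotation(*fields)


# Placeholder annotation, for illustration only.
annotation = parse_acmg("AD | KP | Example disorder")
print(annotation.inheritance_model, annotation.disorder)
```

Raising on an unexpected field count is a design choice for this sketch; a pipeline that tolerates empty or NA values would need to handle those before calling the parser.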

Kaviar dataset:

  • Kaviar MAF
  • Kaviar Allele Count
  • Kaviar Allele Number

1000 Genomes dataset:

  • 1000 Genomes GLOBAL maf
  • 1000 Genomes EAS maf
  • 1000 Genomes EUR maf
  • 1000 Genomes AFR maf
  • 1000 Genomes AMR maf
  • 1000 Genomes SAS maf

subRVIS dataset:

  • subRVIS Domain Name
  • subRVIS Domain Score
  • subRVIS Domain OEratio Percentile
  • subRVIS Exon Name
  • subRVIS Exon Score
  • subRVIS Exon OEratio Percentile

Note: the scores are ExAC 0.3 based.


LIMBR dataset:

  • LIMBR Domain Name
  • LIMBR Domain Score
  • LIMBR Domain Percentile
  • LIMBR Exon Name
  • LIMBR Exon Score
  • LIMBR Exon Percentile

MGI dataset:

  • MGI Phenotypes: Lists distinct 'high-level' mouse phenotypes reported for an HGNC gene name, with multiple entries separated using ATAV standards. To serve as a reference.
  • MGI Essential: A 0/1 indicator highlighting whether the gene has been linked to lethality when disrupted in mice. To serve as an enrichment reference.
  • MGI Seizure: A 0/1 indicator highlighting whether the gene has been linked to seizures when disrupted in mice. To serve as an enrichment reference.

DiscovEHR dataset:

  • DiscovEHR AF: Allele frequency (exact for MAF > 0.001; binned for MAF < 0.001, which we converted to 0.00099)
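
Because binned frequencies are reported as the single stand-in value 0.00099, a frequency filter behaves differently above and below that value. The sketch below illustrates this; the function names are hypothetical, and treating a missing value as "not observed in DiscovEHR" is an assumption of the example, not documented ATAV behavior.

```python
from typing import Optional

# Stand-in value used for frequencies that are only known to be below 0.001.
BINNED_AF = 0.00099


def is_binned(af: float) -> bool:
    """True when the value is the stand-in for a binned (< 0.001) frequency."""
    return abs(af - BINNED_AF) < 1e-12


def passes_max_af(af: Optional[float], max_af: float) -> bool:
    """Keep a variant when its DiscovEHR AF does not exceed max_af.

    A missing AF is treated as absent from DiscovEHR and kept (assumption).
    """
    if af is None:
        return True
    return af <= max_af


print(passes_max_af(0.00099, max_af=0.001))   # binned value passes a 0.1% cutoff
print(passes_max_af(0.00099, max_af=0.0005))  # but fails any cutoff below 0.00099
```

Since the exact frequency of a binned variant is unknown, a cutoff stricter than 0.00099 excludes every binned variant, including those that are in fact rarer than the cutoff.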

MTR dataset:

  • MTR
  • MTR FDR
  • MTR Centile
MTR scores are output only when the variant's effect starts with 'NON_SYNONYMOUS'.
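
The prefix rule for MTR output can be expressed as a one-line check. This is a sketch only: `reports_mtr` is a hypothetical helper, and the effect strings in the example are illustrative values rather than a list of effects ATAV is guaranteed to emit.

```python
def reports_mtr(effect: str) -> bool:
    """True when MTR scores would be reported for this effect string."""
    return effect.startswith("NON_SYNONYMOUS")


# Illustrative effect strings only.
for effect in ("NON_SYNONYMOUS_CODING", "SYNONYMOUS_CODING", "STOP_GAINED"):
    print(effect, reports_mtr(effect))
```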


LOFTEE dataset:

  • LOFTEE-HC in CCDS: Indicates whether the variant is annotated as high-confidence by LOFTEE in any CCDS transcript. Variants not annotated by LOFTEE (i.e. start lost, stop lost, and non-LoF) receive NA.