Project

General

Profile

Diagnostic Analysis Workflow

This diagnostic analysis workflow has been developed to filter for and prioritize pathogenic variation/mutation in patient samples.

Relevant reading material:
  • Petrovski, Slavé, et al. "Whole-exome sequencing in the evaluation of fetal structural anomalies: a prospective cohort study." The Lancet (2019).
  • Zhu X, et al. "Whole exome sequencing in undiagnosed genetic diseases: Interpreting 119 trios." Genetics in Medicine (2015).

Variant filtration:

ATAV Commands:

Notes: The path to atav.sh is /nfs/goldstein/software/sh/atav.sh

Trio (or Duo) analysis

Rare variants:

atav.sh --list-trio --impact HIGH,MODERATE,LOW --min-ad-alt 3 --qual 30 --gq 20 --filter pass,likely,intermediate --max-default-control-af 0.01 --max-gnomad-exome-af 0.01 --max-gnomad-genome-af 0.01 --include-qc-missing --sample $SAMPLE --out $OUTPUT/rare_var

Modifier known variants:

atav.sh --list-trio --known-var-pathogenic-only --modifier-only --min-ad-alt 3 --qual 30 --gq 20 --filter pass,likely,intermediate --include-qc-missing --sample $SAMPLE --out $OUTPUT/modifier_known_var

Coverage Summary: 

atav.sh --coverage-summary --trio --gene-boundary /nfs/goldstein/software/atav_home/data/ccds/addjusted.CCDS.genes.index.r20.hg19.txt --min-coverage 10 --percent-region-covered .9 --sample $SAMPLE --out $OUTPUT/coverage

Relatedness:

atav.sh --ped-map --kinship --trio --variant /nfs/goldstein/software/atav_home/data/variant/informative_snps.ld_pruned.37MB.txt --min-coverage 10 --sample $SAMPLE --out $OUTPUT/relatedness

Singleton analysis

Rare variants:

atav.sh --list-singleton --impact HIGH,MODERATE,LOW --min-ad-alt 3 --qual 30 --gq 20 --filter pass,likely,intermediate --max-default-control-af 0.01 --max-gnomad-exome-af 0.01 --max-gnomad-genome-af 0.01 --include-qc-missing --sample $SAMPLE --out $OUTPUT/rare_var

Modifier known variants:

atav.sh --list-singleton --known-var-pathogenic-only --modifier-only --min-ad-alt 3 --qual 30 --gq 20 --filter pass,likely,intermediate --include-qc-missing --sample $SAMPLE --out $OUTPUT/modifier_known_var

Coverage Summary: 

atav.sh --coverage-summary --gene-boundary /nfs/goldstein/software/atav_home/data/ccds/addjusted.CCDS.genes.index.r20.hg19.txt --min-coverage 10 --percent-region-covered .9 --sample $SAMPLE --out $OUTPUT/coverage

Note: all the external dataset are used in variant prioritization / tier classification will be included by default.

Example output

/nfs/goldstein/software/atav_home/example/diagnostic

Variant prioritization:

Variant reviewing SOP

Click to check primary Diagnostic Columns

Trio (or Duo) analysis output

  • rare_variant_trio_genotypes.csv: rare variant with Tier1 or Tier2 or LoF Gene or Known Variant
  • rare_variant_trio_genotypes_noflag.csv: rare variants do not meet any of the flags above
  • modifier_known_variant_trio_genotypes.csv: all non-coding known variants

Singleton analysis output

  • rare_variant_singleton_genotypes.csv: rare variant with Tier1 or Tier2 or LoF Gene or Known Variant
  • rare_variant_singleton_genotypes_noflag.csv: rare variants do not meet any of the flags above
  • modifier_known_variant_singleton_genotypes.csv: all non-coding known variants

How to document Notes, Clinical Fit and Curation Results

  1. Save CSV file as Excel file
  2. Create two new columns at the very left of the output. Name the first column “Notes”, and the second column “Clinical Fit”
  3. Write your notes on Note column
  4. Document clinical fit in clinical fit column
  5. Highlight variants you want to present in Green, and rows you need to do more research on in yellow
  6. Highlight variants that you have analyzed and have ruled out in gray, as they may come up more than once using the filters below

SOP for Trio/Duo Analysis of Rare Coding Variants (rare_variant_trio_genotypes.csv)

  • Apply (Excel) filter to column Single Variant Prioritization to review all single variants:
    • 01_TIER1_DNM_HZ: Tier 1 De novo variants in "Hot Zone"
    • 02_TIER1_DNM: Tier 1 De novo non-HZ variants
    • 03_TIER1_HOMO_HEMI: Tier 1 Homo/Hemi variants
    • 04_TIER2_DNM: Tier 2 De novo variants
    • 05_TIER2_HOMO_HEMI: Tier 2 Homo/Hemi variants
    • 06_LOF_GENE: LoF Dominant and Haploinsufficient Gene
    • 07_KNOWN_VAR: Previously reported ClinVar P/LP variants or HGMD DM (as long as it is NOT ClinVar B/LB)
    • 08_CLINVAR_SITE: >=1 ClinVar P/LP variant at site. Includes precise indel overlaps.
    • 09_CLINVAR_2BP: >= 1 ClinVar P/LP variant within 2bp flanking. Includes precise indel overlaps.
    • 10_HGMD_SITE: >=1 HGMD DM variant at site. Includes precise indel overlaps.
    • 11_MIS_HOT_SPOT: Missense Dominant and Haploinsufficient Gene + ClinVar PLP 25bpflanks count >= 6
    • 12_TIER1_OMIM_MIS_INFRAME: High-quality ultra rare novel missense/inframe indel variants
    • 13_ACMG_GENE: ACMG v3 gene associated variants
  • Apply (Excel) filter to column Compound Het Variant Prioritization to review all compound het variants:
    • 01_TIER1
    • 02_TIER2

SOP for Trio Analysis of Rare Coding Variants that meet no Flags (rare_variant_trio_genotypes_noflag.csv)

These are variants that DO NOT meet any of the flags (Tier 1 or 2 or LoF Gene or KV). This output will be helpful for the following: you have a specific gene/gene list in mind, or if you want to look for a second variant, or if the patient has a more common phenotype or if looking for inherited variants from the parents. Note that duos are also included in this Output. In this output review all the variants/genes for clinical overlap.

SOP for Trio Analysis of Non-Coding Known Variants (modifier_known_variant_trio_genotypes.csv)

All variants meet the Known Pathogenic Variant Flag (previously reported P/LP variants on CV or DM (only if it is NOT CV B/LB)). Note that duos are also included in this Output.

  1. Apply (Excel) filter to column “Tier Flag (Single Var)” for Tier 1 or 2 variants
  2. Remove Tier filtering then go to column “ACMG Disease” and uncheck NA, review those variants
  3. Remove ACMG filter, go to “Gene Link” filter for genes that have OMIM genes
  4. Quickly review phenotypes to see if it is a phenotype fit

Singleton Analysis of Rare Coding Variants (rare_variant_singleton_genotypes.csv)

  • Apply (Excel) filter to column Single Variant Prioritization to review all single variants:
    • 01_LOF_GENE: LoF Dominant and Haploinsufficient Gene
    • 02_TIER1_HOMO_HEMI: Tier 1 Homo/Hemi variants
    • 03_KNOWN_VAR: Previously reported ClinVar P/LP variants or HGMD DM (as long as it is NOT ClinVar B/LB)
    • 04_CLINVAR_SITE: >=1 ClinVar P/LP variant at site. Includes precise indel overlaps.
    • 05_CLINVAR_2BP: >= 1 ClinVar P/LP variant within 2bp flanking. Includes precise indel overlaps.
    • 06_HGMD_SITE: >=1 HGMD DM variant at site. Includes precise indel overlaps.
    • 07_MIS_HOT_SPOT: Missense Dominant and Haploinsufficient Gene + ClinVar PLP 25bpflanks count >= 6
    • 08_TIER1_OMIM_MIS_INFRAME: High-quality ultra rare novel missense/inframe indel variants
    • 09_ACMG_GENE: ACMG v3 gene associated variants
  • Apply (Excel) filter to column Compound Het Variant Prioritization to review all compound het variants:
    • 01_TIER1
    • 02_TIER2

SOP for Singleton Analysis of Rare Coding Variants that meet no Flags (rare_variant_singleton_genotypes_noflag.csv)

These are variants that DO NOT meet any of the flags (Tier 1 or 2 or LoF Gene or KV). This list will have inherited variants, variants in specific gene/gene list in mind, or if you want to look for a second variant, or if the patient has a more common phenotype.. In this output review all the variants/genes for clinical overlap.

SOP for Singleton Analysis of Non-Coding Known Variants (modifier_known_variant_singleton_genotypes.csv)

All variants meet the Known Pathogenic Variant Flag (previously reported P/LP variants on CV or DM (only if it is NOT CV B/LB)).

  1. Apply (Excel) filter to column “Tier Flag (Single Var)” for Tier 1 or 2 variants
  2. Remove Tier filtering then go to column “ACMG Disease” and uncheck NA, review those variants
  3. Remove ACMG filter, go to “Gene Link” filter for genes that have OMIM genes
  4. Quickly review phenotypes to see if it is a phenotype fit

We suggested to mainly focus on reviewing rare coding variants output (rare_variant_trio_genotypes.csv or rare_variant_singleton_genotypes.csv).