Fisher's Exact Test

Introduction:

ATAV implements Fisher's exact test to analyze single variants, so users can screen for variants whose frequencies differ between cases and controls. The test can target either an imbalance in variant allele frequency (comparing variant allele counts in cases and controls; the allelic model) or an imbalance in the frequency of selected genotypes (for example, in a dominant model, homozygotes (AA) and heterozygotes (Aa) for a given variant are counted together and compared between cases and controls). The analysis runs Fisher's exact test under the allelic, dominant, recessive, and genotypic models.

allelic model:

(homAlleleNum + hetAlleleNum) compared to (refAlleleNum + hetAlleleNum)
Note: on chrX, counts from male samples are excluded.

dominant model:

reference allele is major: (homSampleNum + hetSampleNum) compared to refSampleNum
reference allele is minor: (refSampleNum + hetSampleNum) compared to homSampleNum

recessive model:

reference allele is major: homSampleNum compared to (hetSampleNum + refSampleNum)
reference allele is minor: refSampleNum compared to (hetSampleNum + homSampleNum)

genotypic model:

homSampleNum compared to hetSampleNum compared to refSampleNum
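
The groupings above can be sketched in Python as follows. This is a minimal illustration, not ATAV's own code: the names `fisher_exact_2x2` and `model_tables` and the dictionary keys are hypothetical, the variant is assumed to be autosomal (hom and ref each contribute 2 alleles, het contributes 1 to each side), and the two-sided p-value is taken as the sum of all tables no more likely than the observed one, which is a common convention but an assumption about what ATAV reports. The genotypic model uses a 2x3 table and is left out of the sketch.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


def model_tables(case, ctrl, ref_is_major=True):
    """Build (case_a, case_b, ctrl_a, ctrl_b) counts for each 2x2 model."""

    def allelic(g):
        # Variant alleles vs. reference alleles (autosomal assumption).
        return (2 * g["hom"] + g["het"], 2 * g["ref"] + g["het"])

    def dominant(g):
        if ref_is_major:
            return (g["hom"] + g["het"], g["ref"])
        return (g["ref"] + g["het"], g["hom"])

    def recessive(g):
        if ref_is_major:
            return (g["hom"], g["het"] + g["ref"])
        return (g["ref"], g["het"] + g["hom"])

    models = (("allelic", allelic), ("dominant", dominant), ("recessive", recessive))
    return {name: split(case) + split(ctrl) for name, split in models}


case = {"hom": 2, "het": 8, "ref": 10}  # made-up sample counts
ctrl = {"hom": 1, "het": 3, "ref": 16}
for model, table in model_tables(case, ctrl).items():
    print(model, table, fisher_exact_2x2(*table))
```

Each model only changes how the genotype counts are collapsed into two columns; the test applied to the resulting 2x2 table is the same.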

The following rules are applied when counting the allele and sample numbers above:
  1. Female & chr Y & outside Pseudoautosomal Regions --> excluded
  2. Male & Het & (chr X or chr Y) & outside Pseudoautosomal Regions --> excluded
  3. Male & (Hom or Ref) & (chr X or chr Y) & outside Pseudoautosomal Regions --> counted as one allele
  4. For sampleNum, hom, het, and ref each count as 1 (when rules 1, 2, and 3 do not apply).
  5. For alleleNum, hom counts as 2, het counts as 1, and ref counts as 2 (when rules 1, 2, and 3 do not apply).
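
The five counting rules can be read as a small decision function. The sketch below is illustrative only: `genotype_counts` and its arguments are hypothetical names, not part of ATAV, and the sample count of 1 returned for rule 3 is an assumption, since the rules only state the allele count for that case.

```python
def genotype_counts(sex, genotype, chrom, in_par=False):
    """Return (allele_count, sample_count) for one sample; (0, 0) means excluded.

    sex: "M" or "F"; genotype: "hom", "het", or "ref";
    chrom: chromosome name such as "1", "X", or "Y";
    in_par: True if the variant lies in a pseudoautosomal region.
    """
    sex_chrom_outside_par = chrom in ("X", "Y") and not in_par

    # Rule 1: female genotypes on chrY outside the PAR are excluded.
    if sex == "F" and chrom == "Y" and not in_par:
        return (0, 0)

    if sex == "M" and sex_chrom_outside_par:
        # Rule 2: male heterozygous calls on chrX/chrY outside the PAR are excluded.
        if genotype == "het":
            return (0, 0)
        # Rule 3: male hom/ref calls there count as one allele.
        # Counting the sample as 1 here is an assumption of this sketch.
        return (1, 1)

    # Rule 5: allele contribution; rule 4: every remaining sample counts once.
    alleles = {"hom": 2, "het": 1, "ref": 2}[genotype]
    return (alleles, 1)


print(genotype_counts("M", "hom", "X"))               # rule 3
print(genotype_counts("M", "het", "X", in_par=True))  # PAR, so rules 4 and 5 apply
```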

Command examples:

atav.sh --fisher --sample PATH_TO_SAMPLE_FILE --out PATH_TO_OUTPUT_DIR

Command options:

--fisher: trigger Fisher's exact test function.

--threshold-sort: specify a p-value threshold; ATAV then outputs a separate file (sorted by p-value) that includes only variants with a p-value below the threshold.
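
The effect of --threshold-sort can be pictured with the short Python sketch below. It is not ATAV's implementation: `threshold_sort` is a hypothetical helper, the "Variant ID" column and the example values are made up, and treating the threshold as a strict upper bound is an assumption.

```python
def threshold_sort(rows, threshold):
    """Keep rows whose p-value is below the threshold, sorted by ascending p-value.

    rows: dicts such as csv.DictReader yields, each with a "P value" field.
    """
    kept = [row for row in rows if float(row["P value"]) < threshold]
    return sorted(kept, key=lambda row: float(row["P value"]))


rows = [  # made-up values for illustration
    {"Variant ID": "v1", "P value": "0.2"},
    {"Variant ID": "v2", "P value": "0.03"},
    {"Variant ID": "v3", "P value": "0.001"},
]
for row in threshold_sort(rows, 0.05):
    print(row["Variant ID"], row["P value"])
```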

--case-only: variants are only listed in the output when they are present in a case.

All options listed under Command Options can be used with this function.

Output:

allelic.csv, dominant.csv, recessive.csv, genotypic.csv

  • P value
  • Odds Ratio
  • Avg Min Case Cov
  • Avg Min Ctrl Cov

Please check Output Columns for common columns.