One of the initial ideas was to use permutations to assess whether the qualifying genes for a given index case are more interconnected than expected by chance. Here, a "qualifying gene" is a gene in which one or more 'qualifying variants' have been observed in the index case. For now, this is mostly helpful as a quick 'visual' check of whether known disease genes (that are also known to interact) are interacting in a given patient.
- The protein-protein interaction (PPI) resource is based on BioGRID (http://thebiogrid.org/).
- This was originally conceived as an add-on tool for our collapsing analyses, but it can work with any ATAV genotypes file.
- For an index case, we assess 'observed' connectivity as the number of unique PPIs among the individual's set of qualifying genes.
- Each gene/protein in the database also comes with a known number of 'edges' (i.e., connecting proteins).
- Thus, for each qualifying gene observed in an index case, we randomly sample a gene from the genome with a similar number of edges. Once this has been done for every qualifying gene, we assess whether the randomly sampled set shows greater connectivity than the observed set.
- For each index case, we currently repeat this for ~1K permutations and ask how often we see more interconnectivity in the permutations than in our actual observation. Conceptually, this is much the same as what DAPPLE does.
- The GenesInPPI output lists all interacting gene pairs for each patient. This is often useful for determining whether an index case has hits in interacting genes that are relevant to the ascertainment.
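
The permutation procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ATAV's actual implementation: the function names (`build_graph`, `count_interactions`, `permutation_p_value`), the `tolerance` parameter used to define "similar number of edges", and the toy gene names are all hypothetical. Qualifying genes that are absent from the interaction network are simply skipped here, which may differ from what ATAV does.

```python
import random
from collections import defaultdict
from itertools import combinations


def build_graph(pairs):
    """Build an undirected adjacency map from (gene_a, gene_b) pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        if a != b:  # ignore self-interactions
            graph[a].add(b)
            graph[b].add(a)
    return graph


def count_interactions(graph, genes):
    """Count unique interacting pairs among a set of genes."""
    unique = sorted(set(genes))
    return sum(1 for a, b in combinations(unique, 2) if b in graph.get(a, ()))


def permutation_p_value(graph, genes, n_perm=1000, tolerance=0, seed=0):
    """Return (observed, fraction of permutations with MORE interactions).

    Assumption: "similar number of edges" means an edge count within
    `tolerance` of the qualifying gene's own edge count.
    """
    rng = random.Random(seed)
    # Assumption: genes missing from the network are skipped.
    genes = sorted(g for g in set(genes) if g in graph)
    observed = count_interactions(graph, genes)

    # Index every gene in the network by its number of edges.
    by_degree = defaultdict(list)
    for gene, neighbours in graph.items():
        by_degree[len(neighbours)].append(gene)

    more_connected = 0
    for _ in range(n_perm):
        sample = set()
        for gene in genes:
            degree = len(graph[gene])
            pool = [
                candidate
                for k in range(degree - tolerance, degree + tolerance + 1)
                for candidate in by_degree.get(k, [])
                if candidate not in sample  # draw without replacement
            ]
            if pool:  # if no candidate is left, this draw is skipped
                sample.add(rng.choice(pool))
        if count_interactions(graph, sample) > observed:
            more_connected += 1
    return observed, more_connected / n_perm


# Toy network (hypothetical gene names): A, B, C form a triangle, while
# D, E, F each have two edges but do not interact with one another.
toy_pairs = [
    ("A", "B"), ("B", "C"), ("A", "C"),
    ("D", "X1"), ("D", "X2"),
    ("E", "X3"), ("E", "X4"),
    ("F", "X5"), ("F", "X6"),
]
toy_graph = build_graph(toy_pairs)
observed, p_value = permutation_p_value(toy_graph, ["A", "B", "C"], n_perm=1000)
print(observed, p_value)  # 3 0.0: no degree-matched sample beats a full triangle
```

In this sketch the comparison is strict (a permutation only counts if it has more interactions than the observed set), following the wording above. A small `tolerance` greater than zero may be needed in practice, since highly connected genes can have few or no exact degree matches.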
atav.sh --ppi --geno PATH_TO_YOUR_GENOTYPE_FILE --perm 1000 --out PATH_TO_OUTPUT_DIR
--ppi: run the PPI function.
--geno: specify a collapsing output genotypes.csv as the input file.
--perm: specify the number of permutations for the analysis. (default: 100)
--ppi-file: specify the BioGrid file. (default: /nfs/goldstein/software/atav_home/data/ppi/BioGrid.csv)