As part of our efforts to increase operational efficiency, we are migrating the ATAV application to use the IGM cluster. One important distinction in this framework is that the submit hosts are lightweight nodes: they SHOULD NOT be used for any compute-intensive applications. Please DO NOT run ATAV or any other programs locally on a submit host, as doing so will crash the submit host and make it unavailable for all users.
We have set up two submit hosts for submitting ATAV jobs to the cluster. For internal users who can access igm-atav03 and igm-atav04, please ssh to: 10.73.52.20 (igm-atav-qsub01).
For external users who can access igm-atav05, please ssh to: 10.73.50.99 (igm-atav-qsub02).
These hosts are designed for submitting jobs with qsub, so prefix every ATAV command with qsub and your job will automatically be submitted to the cluster. For example:
qsub /nfs/goldstein/software/sh/atav.sh \
    --list-var-geno \
    --sample /nfs/goldstein/software/atav_home/data/sample/ALS_1424_DukeGr_ctrl.txt \
    --function /nfs/goldstein/software/atav_home/data/function/functional.txt \
    --gene TBK1 \
    --ctrl-maf 0.01 \
    --min-coverage 10 \
    --out ~/hello_atav
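One way to make the qsub prefix hard to forget is to build the command in a small helper. The following is a minimal Python sketch, not part of ATAV: the helper name build_qsub_command is hypothetical, and the script path and options are copied from the example above.

```python
import os

# Path to the ATAV launcher, taken from the example command above.
ATAV_SCRIPT = "/nfs/goldstein/software/sh/atav.sh"


def build_qsub_command(atav_args):
    """Return the argument list for submitting an ATAV run through qsub.

    Hypothetical helper: it only prepends qsub and the launcher path, so the
    job is sent to the cluster instead of running on the submit host.
    """
    return ["qsub", ATAV_SCRIPT, *atav_args]


atav_args = [
    "--list-var-geno",
    "--sample", "/nfs/goldstein/software/atav_home/data/sample/ALS_1424_DukeGr_ctrl.txt",
    "--function", "/nfs/goldstein/software/atav_home/data/function/functional.txt",
    "--gene", "TBK1",
    "--ctrl-maf", "0.01",
    "--min-coverage", "10",
    # Expand ~ here because no shell will do it when the list is run directly.
    "--out", os.path.expanduser("~/hello_atav"),
]

command = build_qsub_command(atav_args)
print(" ".join(command))  # review the command before submitting it
```

Passing the resulting list to subprocess.run(command, check=True) on a submit host would submit the job; the sketch only prints the command so it can be reviewed first.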
We will keep the current ATAV servers accessible for another week, after which we plan to slowly repurpose them. Feel free to contact us if you have any questions or concerns.
- Added a new filter option --min-covered-sample-binomial-p to the Collapsing function. (issue #2385)
Ex. --min-covered-sample-binomial-p 0.05
We recommend using a standard nominal significance threshold of 0.05 in order to exclude variants at sites with statistically significant coverage differences between cases and controls. Thresholds greater than 0.05 will exclude more variants. Care should be taken when using thresholds below 0.05, because inflation can become an issue as the number of permitted variants increases.
This test is performed using the user-provided --min-coverage threshold (e.g. 10x or 20x) to determine covered/not covered status for each sample at each site.
See attached PDF for a detailed description of the binomial coverage comparison test, as well as testing results from two cohorts.
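The attached PDF remains the authoritative description of the test. As a rough illustration only, the sketch below assumes one plausible formulation: each sample is marked covered when its depth at the site meets --min-coverage, and the number of covered cases among all covered samples is compared against the overall case fraction with an exact two-sided binomial test. The function names and the input layout are hypothetical and do not come from ATAV.

```python
from math import comb


def binomial_two_sided_p(k, n, p):
    """Exact two-sided binomial p-value.

    Sums the probabilities of every outcome that is no more likely than the
    observed count k (small relative tolerance for floating-point ties).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-7)
    return min(1.0, sum(q for q in pmf if q <= cutoff))


def coverage_p_value(case_depths, ctrl_depths, min_coverage=10):
    """P-value for a case/control coverage imbalance at one site (assumed form)."""
    covered_cases = sum(d >= min_coverage for d in case_depths)
    covered_ctrls = sum(d >= min_coverage for d in ctrl_depths)
    n_covered = covered_cases + covered_ctrls
    if n_covered == 0:
        return 1.0  # nothing covered, so there is no imbalance to test
    # Under the null, a covered sample is a case with the overall case fraction.
    case_fraction = len(case_depths) / (len(case_depths) + len(ctrl_depths))
    return binomial_two_sided_p(covered_cases, n_covered, case_fraction)


# Made-up depths: every case is covered at 10x, no control is.
cases = [25, 30, 18, 22, 27, 31, 19, 24, 26, 28]
ctrls = [3, 5, 2, 8, 4, 6, 1, 7, 9, 0]

p_value = coverage_p_value(cases, ctrls, min_coverage=10)
keep_site = p_value >= 0.05  # mirrors --min-covered-sample-binomial-p 0.05
print(f"p = {p_value:.6f}, keep site: {keep_site}")
```

Under this reading, a site whose p-value falls below the chosen threshold is dropped, which is why raising the threshold above 0.05 excludes more variants. The list-based probability table is fine for small cohorts; very large sample counts would call for a statistics library instead.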
- Added gnomAD Genome dataset and command options (issue #2445):
--gnomad-genome-pop --gnomad-genome-maf --gnomad-genome-as-rf-snv --gnomad-genome-as-rf-indel --include-genome-exome
- Warn users of duplicate CHGVIDs. (issue #2462)
- Added a new toolkit, Cohort Selection, to generate an ATAV input sample file:
Run /nfs/goldstein/software/cohort_selection/build_cohort.py --help to see the options.
- Update KnownVar 2017 Q2 (issue #2345)
- Polyphen score for CCDS transcripts (issue #2362)
- Added new genotype QC filter command options that can be used separately for SNVs and indels. (issue #2338)
Ex. --snv-fs 60, --indel-fs 200
For more QC filter options, check Genotype Level Filter Options.
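To illustrate how type-specific thresholds such as --snv-fs 60 and --indel-fs 200 behave, here is a minimal Python sketch. It assumes FS (the strand-bias score) is treated as an upper bound, as in common hard-filtering practice, and it classifies variants by allele length alone; the names FS_MAX, variant_type and passes_fs are hypothetical and are not ATAV internals.

```python
# Upper bounds taken from the example options: --snv-fs 60, --indel-fs 200.
FS_MAX = {"snv": 60.0, "indel": 200.0}


def variant_type(ref, alt):
    """Classify by allele length: single-base substitutions are SNVs.

    Simplification: anything else, including multi-base substitutions,
    is treated as an indel here.
    """
    return "snv" if len(ref) == 1 and len(alt) == 1 else "indel"


def passes_fs(ref, alt, fs):
    """Return True when the FS value is within the bound for its variant type."""
    return fs <= FS_MAX[variant_type(ref, alt)]


# The same FS value can fail as an SNV yet pass as an indel.
print(passes_fs("A", "G", 75.0))   # SNV, above the 60 bound
print(passes_fs("AT", "A", 75.0))  # indel, below the 200 bound
```

Keeping the two bounds separate lets the looser indel cutoff apply without relaxing the SNV filter.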
- Added gnomAD Exome dataset and command options (issue #2284):
--gnomad-exome-pop --gnomad-exome-maf --gnomad-exome-as-rf-snv --gnomad-exome-as-rf-indel --include-gnomad-exome
- All ExAC command options now use the v0.3 dataset (ATAV 6.5.5 and ATAV 6.5.6 used the gnomAD Exome dataset for the following command options):
--exac-pop --exac-maf --min-exac-vqslod-snv --min-exac-vqslod-indel --include-exac
- The --include-all-external-data command option is disabled; check here for the command options to include external datasets in your analysis.
- Fixed wrong genotype count issue and missing values (ExAC Sample Covered 10x, ExAC AB MEDIAN, ExAC GQ MEDIAN, ExAC AS RF) in ExAC v2. (issue #2213)
- Added new ExAC filters --min-exac-as-rf-snv and --min-exac-as-rf-indel. (issue #2213)
Suggested cutoffs are prob >= 0.1 (high-confidence SNVs) and prob >= 0.2 (high-confidence indels)
- Update ExAC from version 0.3 to 2.0 (issue #2213) - beta version
Added one more population: ASJ
Added four additional fields:
ExAC AB MEDIAN
ExAC GQ MEDIAN
ExAC AS RF
--min-exac-vqslod-snv and --min-exac-vqslod-indel are no longer supported, since VQSLOD doesn't perform well in large-scale joint calling.
- Invalid input genes caused ATAV crash (issue #2079)
- Kinship relatedness pruning (--kinship) integrated in --ped-map function (issue #2148)
- Add new external dataset DenovoDB (issue #2149)
Use --include-denovo-db to include denovo-db variants in the final output when running variant-based functions.
Use List Denovo DB to output denovo-db variants only.
- Update KnownVar 2017 Q1 (issue #2164)
Upgraded dataset: HGMD, ClinVar, ClinGen, OMIM, MGI
- Update trio run tier data filtering process (issue #2156)
- Output a pruned sample file when running the --ped-map function with the --eigenstrat option (issue #1960)
- Job crash when generating permutation QQ plot (issue #2035)
- --site-coverage-comparison wrote incorrect aggregated info to the log file. (issue #2007)
- Inclusion of Binomial Exact Test in Collapsing Analysis Output & Filter. (issue #2014, #2015)
- KnownVar dataset upgraded: ClinVar, HGMD, ACMG, OMIM. (issue #1951)
- Multiple Frequency Bands for --het-percent-alt-read. (issue #1884)
- Flag deletions supported by inherited genotypes in trio rules. (issue #1988)
- chip2pca replacement. (issue #1960)