Coverage Analysis Functions

Coverage Summary

Coverage Comparison

Site Coverage Summary

Site Coverage Comparison


The command options below are required by all coverage analysis functions:

--min-coverage: specify the minimum coverage (read depth). Accepted values: 10, 20, 30, 50, 200.

--sample: specify a sample file listing all samples of interest, with their case/control and family status.

File format (tab-delimited, one sample per line): Family ID, Individual ID, Paternal ID, Maternal ID, Sex, Phenotype, Sample Type, Capture Kit

Family ID: a family ID; for a sample used as a non-family control or case, use the same value as the Individual ID
Individual ID: sample name (child)
Paternal ID: sample name (father), or 0 if not available
Maternal ID: sample name (mother), or 0 if not available
Sex: 1 = male, 2 = female
Phenotype: 1 = control, 2 = case
Sample Type & Capture Kit: use the values from seqdb, including “N/A” for genome samples
Ex. /nfs/goldstein/software/atav_home/data/sample/ALS_1424_DukeGr_ctrl.txt

Note: make sure every control in your input sample file has been approved for use.
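The sample file layout described above can be checked before a run. The following is a minimal sketch, not part of ATAV itself: the function name parse_sample_lines, the column keys, and the sample names in the usage lines are hypothetical, and the "Exome"/"KitA" values stand in for whatever seqdb actually reports. It only verifies the column count and the documented codes for Sex and Phenotype.

```python
import csv

# Column keys are illustrative names for the 8 documented fields.
COLUMNS = [
    "family_id", "individual_id", "paternal_id", "maternal_id",
    "sex", "phenotype", "sample_type", "capture_kit",
]


def parse_sample_lines(lines):
    """Parse tab-delimited sample rows and check the documented codes."""
    samples = []
    for line_no, row in enumerate(csv.reader(lines, delimiter="\t"), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != len(COLUMNS):
            raise ValueError(
                f"line {line_no}: expected {len(COLUMNS)} columns, got {len(row)}"
            )
        sample = dict(zip(COLUMNS, (field.strip() for field in row)))
        if sample["sex"] not in {"1", "2"}:
            raise ValueError(f"line {line_no}: Sex must be 1 (male) or 2 (female)")
        if sample["phenotype"] not in {"1", "2"}:
            raise ValueError(f"line {line_no}: Phenotype must be 1 (control) or 2 (case)")
        samples.append(sample)
    return samples


# Hypothetical rows: a child in a trio (case) and a non-family genome control.
example_rows = [
    "fam1\tchild1\tdad1\tmom1\t1\t2\tExome\tKitA",
    "ctrl1\tctrl1\t0\t0\t2\t1\tGenome\tN/A",
]
for sample in parse_sample_lines(example_rows):
    print(sample["individual_id"], sample["phenotype"])
```

To check a real file, pass an open file handle instead of the list, for example parse_sample_lines(open(path, newline="")).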

--gene-boundary: specify a gene-boundary file that lists the exonic regions to include in the analysis. Each line of the file gives a gene name followed by its region (exon) information. When the --gene-boundary option is specified, a variant is output only if it falls within a region listed in the file and also matches the corresponding gene name or gene domain name.

Ex:

gene name: AVPR1B 1 (206224439..206225382,206230806..206231144) 1283
gene domain name: CFHR5_-_0 1 (196946795..196946852,196952015..196952209,196953091..196953095) 258

Note: The format has 4 space-separated columns. Column 1 is the gene name or gene domain name; column 2 is the chromosome (1, 2, ..., X, Y); column 3 is a comma-separated list of the regions (exons) that define the gene, enclosed in parentheses, with each region written as region_start..region_end; column 4 is the total number of sites across all regions in column 3. Start and stop positions in the gene-boundary file are 1-based.
CCDS gene boundaries file directory: /nfs/goldstein/software/atav_home/data/ccds (the currently recommended file is /nfs/goldstein/software/atav_home/data/ccds/addjusted.CCDS.genes.index.r14.txt; it has 2 bp added to the ends of each exon to capture splice sites)
Gene domain example file: /nfs/goldstein/software/atav_home/data/gene/dRVIS_domain_index_withoutUTR.txt
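
The four-column gene-boundary format can be parsed and sanity-checked with a short script. This is a minimal sketch, not ATAV's own parser: the names parse_gene_boundary and covers are hypothetical. It assumes region ends are inclusive, which is consistent with the two example lines above (for AVPR1B, 944 + 339 = 1283 sites), and it uses column 4 as a consistency check on column 3.

```python
import re

# name, chromosome, "(start..end,start..end,...)", total site count
LINE_RE = re.compile(r"^(\S+)\s+(\S+)\s+\(([^)]*)\)\s+(\d+)$")


def parse_gene_boundary(line):
    """Parse one gene-boundary line and verify the site count in column 4."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a gene-boundary line: {line!r}")
    name, chrom, region_text, total = match.groups()
    regions = []
    for part in region_text.split(","):
        start, end = part.split("..")
        regions.append((int(start), int(end)))
    # 1-based positions with inclusive ends: each region holds end - start + 1 sites.
    size = sum(end - start + 1 for start, end in regions)
    if size != int(total):
        raise ValueError(f"{name}: column 4 is {total} but regions cover {size} sites")
    return {"name": name, "chrom": chrom, "regions": regions, "size": size}


def covers(boundary, chrom, pos):
    """Return True if a 1-based position falls inside any region of the gene."""
    return chrom == boundary["chrom"] and any(
        start <= pos <= end for start, end in boundary["regions"]
    )


gene = parse_gene_boundary(
    "AVPR1B 1 (206224439..206225382,206230806..206231144) 1283"
)
print(gene["size"])                     # 1283
print(covers(gene, "1", 206224439))     # True
print(covers(gene, "1", 206225383))     # False
```

A position check like covers mirrors only the region half of the --gene-boundary filter; matching on the gene name or gene domain name is a separate condition that this sketch does not model.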