Generate Ped Map

Command examples:

atav.sh --ped-map --sample PATH_TO_SAMPLE_FILE --out PATH_TO_OUTPUT_DIR

Command options:

--ped-map: triggers the function that generates PED/MAP files.

The sequence information of the indel is retained, but the genotype is represented in the ped file using the nomenclature I (for insertion) or D (for deletion). Thus, for a deletion, the three genotypes are I/I for homozygous reference, I/D for heterozygous, and D/D for homozygous deletion. For insertions the nomenclature is reversed: homozygous reference subjects are D/D, while homozygous insertion subjects are I/I.
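The I/D convention above can be sketched in a few lines of Python. This is only an illustration of the nomenclature, not ATAV's own code: the function name `encode_indel_genotype` and its parameters are hypothetical, and the allele order used for a heterozygous insertion (reference allele first) is an assumption, since the text above only spells out the heterozygous case for deletions.

```python
def encode_indel_genotype(alt_allele_count: int, is_insertion: bool) -> str:
    """Return the I/D genotype string for an indel site.

    alt_allele_count: number of non-reference alleles carried (0, 1 or 2).
    is_insertion: True for an insertion site, False for a deletion site.
    """
    if alt_allele_count not in (0, 1, 2):
        raise ValueError("alt_allele_count must be 0, 1 or 2")
    # Deletion site: the reference allele is written I, the deleted allele D.
    # Insertion site: the reference allele is written D, the inserted allele I.
    ref, alt = ("D", "I") if is_insertion else ("I", "D")
    # Assumption: the reference allele is listed first in a heterozygote.
    alleles = [ref] * (2 - alt_allele_count) + [alt] * alt_allele_count
    return "/".join(alleles)


# Deletion site: homozygous ref, heterozygous, homozygous deletion.
print([encode_indel_genotype(n, is_insertion=False) for n in (0, 1, 2)])
# Insertion site: homozygous ref and homozygous insertion.
print(encode_indel_genotype(0, is_insertion=True),
      encode_indel_genotype(2, is_insertion=True))
```

The slash-separated form follows the notation used in the text above; the layout of the genotype columns in the ped file itself is not covered by this sketch.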

--eigenstrat: runs EIGENSOFT after generating the ped file and map file.

--flashpca: runs FlashPCA after generating the ped file and map file.

--flashpca-plink-pruning: runs plink outlier removal. The default (option absent) is false, i.e. outliers are not removed.

--flashpca-num-eigvec: number of eigenvectors for FlashPCA; default = 10.

--flashpca-num-nearest-neighbor: number of nearest neighbors used by plink outlier detection to decide whether a sample is an outlier; default = 5.

--flashpca-z-score-thresh: Z-score threshold below which a sample is classified as an outlier; default = -3.

Plink outlier detection generates an outlier.nearest file with the Z scores across the k nearest neighbors (k = the '--flashpca-num-nearest-neighbor' value) for each sample.
If the average of a sample's Z scores across its k nearest neighbors is below the '--flashpca-z-score-thresh' value, the sample is classified as an "outlier".
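The classification rule can be illustrated with a minimal Python sketch. It assumes the per-sample Z scores have already been read from the outlier.nearest file into a dictionary; the function name `find_outliers`, the dictionary layout, and the sample IDs are hypothetical, and parsing the file is left out.

```python
from statistics import mean


def find_outliers(z_scores_by_sample, z_score_thresh=-3.0):
    """Return the sample IDs whose mean nearest-neighbor Z score is below the threshold.

    z_scores_by_sample: dict mapping a sample ID to the list of Z scores
    for its k nearest neighbors (hypothetical in-memory layout).
    """
    return sorted(
        sample
        for sample, zs in z_scores_by_sample.items()
        # Strictly below the threshold, matching the "<" comparison in the text.
        if zs and mean(zs) < z_score_thresh
    )


# Made-up Z scores for three samples, with k = 5 neighbors each.
example = {
    "S1": [-0.2, 0.1, -0.4, 0.3, -0.1],
    "S2": [-3.5, -4.0, -3.2, -2.9, -3.8],
    "S3": [-3.0, -3.0, -3.0, -3.0, -3.0],
}
print(find_outliers(example))  # ['S2']
```

Only S2 is flagged: its mean Z score is -3.48. S3 sits exactly at the threshold and is kept, because the comparison is strict.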

--kinship: runs the kinship analysis after generating the ped file and map file.

This option greedily removes one individual from each pair of relatives, as described in Kinship Pruning.

--kinship-relatedness-threshold: samples are greedily removed, as described in Kinship Pruning, if they have a kinship value above this threshold; default = 0.0884.
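To give a feel for what greedy pruning against a threshold looks like, here is a small Python sketch. It is not the procedure documented in Kinship Pruning: the rule used to pick which member of a related pair to drop (the sample involved in the most remaining related pairs, ties broken by sample ID) is a stand-in assumption, and the function name `greedy_prune` and the input layout are hypothetical.

```python
def greedy_prune(pairs, threshold=0.0884):
    """Return the set of samples removed so that no related pair remains.

    pairs: iterable of (sample_a, sample_b, kinship) tuples.
    """
    # Keep only the pairs whose kinship value is above the threshold.
    related = [(a, b) for a, b, kinship in pairs if kinship > threshold]
    removed = set()
    while related:
        # Count how many related pairs each sample still takes part in.
        counts = {}
        for a, b in related:
            counts[a] = counts.get(a, 0) + 1
            counts[b] = counts.get(b, 0) + 1
        # Stand-in rule (an assumption): drop the sample in the most pairs,
        # breaking ties by sample ID so the result is deterministic.
        worst = min(counts, key=lambda s: (-counts[s], s))
        removed.add(worst)
        related = [(a, b) for a, b in related if worst not in (a, b)]
    return removed


# Made-up kinship values for illustration.
pairs = [("A", "B", 0.25), ("A", "C", 0.125), ("D", "E", 0.05), ("F", "G", 0.09)]
print(sorted(greedy_prune(pairs)))  # ['A', 'F']
```

In this example the D/E pair is ignored because 0.05 is below the default threshold, A is removed first because it is related to both B and C, and F is removed to break up the F/G pair.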

--sample-coverage-summary: optional input for the kinship process; the input file is the "sample.summary.csv" output of the Coverage Summary function.

Running the eigenstrat, flashpca or kinship script requires an LD-pruned list of variants with MAF > 1-5%. If you do not have experience creating such a list, it is recommended that you use one of the SNP sets described in SNV Data Sets as part of your command.

All of the Command Options are available in this function.