Generate Ped Map
atav.sh --ped-map --sample PATH_TO_SAMPLE_FILE --out PATH_TO_OUTPUT_DIR
--ped-map: generate PED/MAP files.
The sequence information of an indel is retained, but its genotype is represented in the PED file using the nomenclature I (for insertion) or D (for deletion). For a deletion, the three genotypes are I/I for homozygous reference, I/D for heterozygous, and D/D for homozygous deletion. For an insertion the nomenclature is reversed: homozygous reference subjects are D/D, while homozygous insertion subjects are I/I.
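The indel nomenclature above can be sketched as a small lookup. This is only an illustration, not ATAV's code: the function name `encode_indel_genotype` and its arguments are hypothetical, the `/` separator follows the notation used in the text rather than the on-disk PED layout, and writing the reference-coded allele first for a heterozygous insertion is an assumption (the text only states the heterozygous form for deletions).

```python
def encode_indel_genotype(variant_type: str, alt_allele_count: int) -> str:
    """Return the I/D genotype label for an indel (hypothetical helper).

    variant_type: "deletion" or "insertion"
    alt_allele_count: copies of the non-reference allele (0, 1 or 2)
    """
    if variant_type == "deletion":
        # Reference allele still contains the sequence, so it is coded I.
        ref_code, alt_code = "I", "D"
    elif variant_type == "insertion":
        # Reversed: the reference lacks the inserted sequence, so it is coded D.
        ref_code, alt_code = "D", "I"
    else:
        raise ValueError(f"unknown variant type: {variant_type}")

    if alt_allele_count not in (0, 1, 2):
        raise ValueError("alt_allele_count must be 0, 1 or 2")

    # Assumption: reference-coded allele is written first for heterozygotes.
    alleles = [ref_code] * (2 - alt_allele_count) + [alt_code] * alt_allele_count
    return "/".join(alleles)


if __name__ == "__main__":
    for kind in ("deletion", "insertion"):
        for count in (0, 1, 2):
            print(kind, count, encode_indel_genotype(kind, count))
```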
--eigenstrat: run EIGENSOFT after generating the PED and MAP files.
--flashpca: run FlashPCA after generating the PED and MAP files.
--flashpca-plink-pruning: run PLINK outlier removal. Default is false: when this option is absent, outliers are not removed.
--flashpca-num-eigvec: number of eigenvectors for FlashPCA; default = 10.
--flashpca-num-nearest-neighbor: number of nearest neighbors used by PLINK outlier detection to decide whether a sample is an outlier; default = 5.
--flashpca-z-score-thresh: Z-score threshold below which a sample is classified as an outlier; default = -3.
PLINK outlier detection generates an outlier.nearest file with the Z scores across the k nearest neighbors (k = the '--flashpca-num-nearest-neighbor' value) for each sample.
If the average of a sample's Z scores across all of its nearest neighbors is less than the '--flashpca-z-score-thresh' value, the sample is classified as an "outlier".
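The outlier rule above (mean Z score across the k nearest neighbors compared against the threshold) can be illustrated with a minimal sketch. It assumes the Z scores have already been read from the outlier.nearest file into a dictionary; the function name `find_outliers`, the input structure, and the sample IDs are hypothetical, and parsing of the file itself is not shown.

```python
from statistics import mean


def find_outliers(neighbor_z_scores, z_score_thresh=-3.0):
    """Return sample IDs whose mean neighbor Z score is below the threshold.

    neighbor_z_scores: dict mapping sample ID -> list of Z scores,
                       one per nearest neighbor (k values per sample).
    z_score_thresh:    mirrors the documented default of -3.
    """
    outliers = []
    for sample_id, z_scores in neighbor_z_scores.items():
        # Strictly "less than": a mean exactly equal to the threshold is kept.
        if mean(z_scores) < z_score_thresh:
            outliers.append(sample_id)
    return outliers


if __name__ == "__main__":
    # Made-up Z scores for k = 5 neighbors per sample.
    scores = {
        "S1": [-0.5, 0.2, 0.1, -0.3, 0.4],
        "S2": [-4.0, -3.5, -2.9, -3.8, -4.2],
        "S3": [-3.0, -3.0, -3.0, -3.0, -3.0],
    }
    print(find_outliers(scores))
```

In this made-up input only S2 falls below the threshold; S3 sits exactly on it and is therefore not flagged under the strict comparison.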
--kinship: run the kinship analysis after generating the PED and MAP files.
This option greedily removes one individual from a pair of relatives, as described in Kinship Pruning.
--kinship-relatedness-threshold: samples are greedily removed, as described, if they have a kinship value above this threshold; default = 0.0884.
--sample-coverage-summary: optional input for the kinship process; the input file is the "sample.summary.csv" output of the Coverage Summary function.
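To make the idea of greedy relatedness pruning concrete, here is one possible sketch. It is not the procedure documented in Kinship Pruning, whose details are not given here: the strategy of removing the sample with the most remaining relatives first, the alphabetical tie-break, the function name `greedy_prune`, and the input format are all assumptions chosen for illustration. Only the default threshold of 0.0884 comes from the text.

```python
def greedy_prune(kinship_pairs, threshold=0.0884):
    """Greedily pick samples to remove so no related pair remains.

    kinship_pairs: iterable of (sample_a, sample_b, kinship_value) tuples.
    threshold:     pairs with kinship strictly above this are "related".
    Returns the list of removed sample IDs, in removal order.
    """
    # Build an adjacency map containing only pairs above the threshold.
    related = {}
    for a, b, kinship in kinship_pairs:
        if kinship > threshold:
            related.setdefault(a, set()).add(b)
            related.setdefault(b, set()).add(a)

    removed = []
    while related:
        # Assumption: drop the sample with the most remaining relatives;
        # iterating in sorted order makes ties resolve alphabetically.
        worst = max(sorted(related), key=lambda s: len(related[s]))
        removed.append(worst)
        for other in related.pop(worst):
            related[other].discard(worst)
            if not related[other]:
                # No relatives left, so this sample is safe to keep.
                del related[other]
    return removed


if __name__ == "__main__":
    pairs = [
        ("A", "B", 0.25),
        ("A", "C", 0.25),
        ("C", "D", 0.05),  # below threshold, ignored
        ("E", "F", 0.10),
    ]
    print(greedy_prune(pairs))
```

With this made-up input, removing A resolves both of its relationships at once, and one of E or F must still be dropped; which member of a pair is preferred in practice may depend on other criteria that this sketch does not model.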
All the Command Options are available to use in this function.