Does anybody know of a way to recode SNP genotype data based on the minor allele frequency of that SNP? For example, if we have a column containing A/A, A/G, G/G genotypes and G is the minor allele, then we recode (or generate a new variable) such that A/A=0, A/G=1 and G/G=2. It's straightforward to generate a variable based on the levels of genotype, but what I can't tell is whether you can do this based on the frequency of the alleles, so that the less common allele is always the one being counted. In my understanding the convention is to have the homozygote of the common allele =0 and the homozygote of the rare allele =2 (and heterozygotes =1). Being able to iterate over a bunch of SNPs with different allele frequencies and minor alleles would be very convenient.
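To make the intended recoding concrete, here is a minimal sketch in Python, not tied to any particular statistics package. It assumes each genotype is a string such as "A/G" with "/" as the separator, that missing genotypes are stored as None, and that the SNP is biallelic. The function name recode_by_minor_allele and the alphabetical tie-break when both alleles are equally frequent are illustrative choices, not an established convention.

```python
from collections import Counter


def recode_by_minor_allele(genotypes, sep="/"):
    """Return (minor_allele, codes); each code is the number of minor-allele copies."""
    # Split every non-missing genotype into its two alleles.
    split = [g.split(sep) if g else None for g in genotypes]

    # Count alleles across all non-missing genotypes.
    counts = Counter(a for pair in split if pair for a in pair)
    if len(counts) != 2:
        raise ValueError("expected exactly two alleles (biallelic SNP)")

    # The minor allele is the less frequent one; ties go to the
    # alphabetically first allele (an arbitrary, assumed rule).
    minor = min(sorted(counts), key=lambda a: counts[a])

    # 0 = common homozygote, 1 = heterozygote, 2 = rare homozygote.
    codes = [pair.count(minor) if pair else None for pair in split]
    return minor, codes


snp = ["A/A", "A/G", "G/G", "A/A", "A/G", None]
minor, codes = recode_by_minor_allele(snp)
print(minor, codes)
```

Applying the same function to each SNP column in turn would give the per-SNP coding described above, since the minor allele is determined separately from each column's own allele counts.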