Please, can anybody help me interpret the formula used in GenABEL's ibs() function:
f_{i,j} = Σ_k \frac{(x_{i,k} - p_k) * (x_{j,k} - p_k)}{(p_k * (1 - p_k))}
I am especially having trouble understanding how to read the "\" and "frac{...".
Thanks!
Looks like tex.
http://texify.com/$f_{i,j}=\sum_k \frac{(x_{i,k} - p_k) * (x_{j,k} - p_k)}{(p_k * (1 - p_k))}$
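For anyone not used to the notation, here is a rough guide to reading it. In TeX a backslash starts a command, \frac{a}{b} is the fraction a over b, \sum_k is a sum over the index k, and an underscore followed by braces is a subscript. The comments below are only my annotation of the formula quoted in the question:

```latex
% \frac{numerator}{denominator}  ->  numerator / denominator
% \sum_k                         ->  sum over the index k
% x_{i,k}                        ->  x with subscript i,k
f_{i,j} = \sum_k \frac{(x_{i,k} - p_k)(x_{j,k} - p_k)}{p_k (1 - p_k)}
% Plain text:
% f(i,j) = sum over k of (x(i,k) - p(k)) * (x(j,k) - p(k)) / (p(k) * (1 - p(k)))
```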
The IBS matrix is calculated with the following formula:
IBS(i,j) = (2/m) sum{(g(ik)-pk)(g(jk)-pk)/(pk qk)}
where the sum runs from k=1 to m (the number of SNPs), g(ik) is 0, 1/2, or 1 if individual i is AA, AB, or BB at SNP k, pk is the frequency of allele B, and qk = 1 - pk. In fact, a bit of algebra shows that IBS(i,j)=r(i,j) is just the average correlation of genotype scores between individuals i and j, assuming independent SNPs. GenABEL uses the same formula but without the factor of 2, hence its IBS(i,j)=r(i,j)/2. I don't know the reason for this choice; I'll ask on the GenABEL forum. A slight variation of the formula uses two sums, one for the numerator and another for the denominator. The two versions give similar results, although the interpretation is different, i.e. it is no longer true that IBS(i,j)=r(i,j).
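To make the formula concrete, here is a minimal pure-Python sketch of it. This is not GenABEL's code: the function name ibs_matrix and the scale parameter are hypothetical, and I am assuming that pk is estimated as the mean genotype score in the sample and that monomorphic SNPs (pk qk = 0) are simply skipped. With scale=2 it follows the formula above; with scale=1 it drops the factor of 2, as described for GenABEL.

```python
def ibs_matrix(genotypes, scale=2.0):
    """Return scale/m * sum_k (g_ik - p_k)(g_jk - p_k) / (p_k q_k).

    genotypes: one list per individual, coded 0, 0.5, 1 for AA, AB, BB.
    scale: 2.0 for the formula above, 1.0 to drop the factor of 2.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Assumption: allele-B frequency is the sample mean of the 0, 1/2, 1 scores.
    p = [sum(ind[k] for ind in genotypes) / n for k in range(m)]
    # Assumption: skip monomorphic SNPs, where p_k * q_k would be zero.
    used = [k for k in range(m) if 0.0 < p[k] < 1.0]
    if not used:
        raise ValueError("no polymorphic SNPs to average over")
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = sum(
                (genotypes[i][k] - p[k]) * (genotypes[j][k] - p[k])
                / (p[k] * (1.0 - p[k]))
                for k in used
            )
            result[i][j] = scale * total / len(used)
    return result


if __name__ == "__main__":
    # Three individuals, three SNPs (made-up toy data).
    toy = [[0.0, 0.5, 1.0], [1.0, 0.5, 0.0], [0.5, 1.0, 0.5]]
    for row in ibs_matrix(toy):
        print(row)
```

The two-sum variant mentioned above would instead add up the numerators and the denominators separately over k and divide once at the end.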