May 16, 2011

ibs()

Can anybody help me interpret the formula used in GenABEL's ibs() function?

f_{i,j} = Σ_k \frac{(x_{i,k} - p_k) * (x_{j,k} - p_k)}{(p_k * (1 - p_k))}


I am especially unsure how to read the "\" and "frac{..." parts.


Thanks!

2 comments:

  1. Looks like TeX.


    http://texify.com/$f_{i,j}=\sum_k \frac{(x_{i,k} - p_k) * (x_{j,k} - p_k)}{(p_k * (1 - p_k))}$
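
    As a hedged reading of the markup, assuming standard LaTeX conventions: a backslash starts a command, `_` marks a subscript, and `\frac{a}{b}` typesets a divided by b. Under that reading, the expression is a sum over k of one fraction per SNP:

    ```latex
    f_{i,j} = \sum_k \frac{(x_{i,k} - p_k)\,(x_{j,k} - p_k)}{p_k\,(1 - p_k)}
    ```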

  2. The IBS matrix is calculated with the following formula:
    IBS(i,j) = (2/m) * sum_k (g(ik) - pk)(g(jk) - pk) / (pk qk)
    where the sum runs over k = 1 to m (the number of SNPs), g(ik) is 0, 1/2, or 1 if individual i is AA, AB, or BB at SNP k, pk is the frequency of allele B, and qk = 1 - pk. A bit of algebra shows that IBS(i,j) = r(i,j), the average correlation of genotype scores between individuals i and j, assuming independent SNPs. GenABEL uses the same expression but without the factor of 2, hence its IBS(i,j) = r(i,j)/2. I don't know the reason for this choice; I'll ask on the GenABEL forum. A slight variation of the formula uses two sums, one for the numerator and one for the denominator. The two versions give similar results, although the interpretation differs: with the two-sum version it is no longer true that IBS(i,j) = r(i,j).

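
The formula from the second comment can be sketched in a few lines of plain Python. This is only an illustration, not GenABEL's actual code: the function name `ibs_matrix`, the `factor` argument, the estimate of pk as the mean genotype score, and the choice to skip monomorphic SNPs (where pk qk = 0) are all assumptions made for this example.

```python
def ibs_matrix(genotypes, factor=2.0):
    """Pairwise IBS-style matrix from genotype scores (hypothetical helper).

    genotypes: one list per individual, one score per SNP,
               coded 0, 0.5 or 1 for AA, AB, BB.
    factor:    2.0 gives the formula from the comment; 1.0 drops the
               factor of 2, as the comment says GenABEL does.
    """
    n = len(genotypes)
    m = len(genotypes[0])

    # Assumption: estimate p_k (frequency of allele B) as the mean score at SNP k.
    p = [sum(ind[k] for ind in genotypes) / n for k in range(m)]

    # Assumption: skip monomorphic SNPs, where p_k * (1 - p_k) would be zero.
    usable = [k for k in range(m) if 0.0 < p[k] < 1.0]
    if not usable:
        raise ValueError("no polymorphic SNPs to average over")

    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = sum(
                (genotypes[i][k] - p[k]) * (genotypes[j][k] - p[k])
                / (p[k] * (1.0 - p[k]))
                for k in usable
            )
            # Average over the SNPs actually used, then apply the factor.
            result[i][j] = factor * total / len(usable)
    return result


# Toy data: three individuals, three SNPs.
toy = [[0.0, 0.5, 1.0], [1.0, 0.5, 0.0], [0.5, 1.0, 0.5]]
with_factor = ibs_matrix(toy)
without_factor = ibs_matrix(toy, factor=1.0)
```

Calling it with `factor=1.0` halves every entry relative to the default, which is the relationship IBS(i,j) = r(i,j)/2 described in the comment.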