Multiple testing


A test statistic is typically declared significant when the associated p-value is ≤0.05. This threshold is also the false positive rate α. In GWAS, the number of tested SNPs is of the order of 10⁴ (domestic species) to 10⁶ (humans). If the tests were completely independent of each other, i.e. no linkage disequilibrium between SNPs, the expected number of false positives would be of the order of α·10⁴ = 500 to α·10⁶ = 50,000. Hence, the appropriate thresholds would have to be of the order of 0.05/10⁴ = 5·10⁻⁶ to 0.05/10⁶ = 5·10⁻⁸. This is known as Bonferroni's correction for multiple testing. This correction is considered too conservative given the block-like structure of genomes, i.e. there is linkage disequilibrium between adjacent SNPs, so the tests are not independent. Conservative thresholds lead to high false negative error rates. Thus, the appropriate significance thresholds are a matter for further discussion.
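The arithmetic above can be checked with a few lines of Python. This is only a sketch of the Bonferroni calculation for the two SNP counts quoted in the text; the function names are my own and do not come from any GWAS package.

```python
ALPHA = 0.05  # nominal false positive rate


def expected_false_positives(alpha, n_tests):
    """Expected number of false positives among n_tests independent null tests."""
    return alpha * n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


# 10^4 SNPs (domestic species) and 10^6 SNPs (humans), as quoted in the text
for n_snps in (10**4, 10**6):
    fp = expected_false_positives(ALPHA, n_snps)
    thr = bonferroni_threshold(ALPHA, n_snps)
    print(f"{n_snps:>9} SNPs: expected false positives = {fp:.0f}, threshold = {thr:.1e}")
```

Both functions assume fully independent tests, which is exactly the assumption that linkage disequilibrium breaks.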



A fixed significance threshold for GWAS in humans

Pe'er et al. (2008) aimed to define the testing burden (tb) as the factor by which significance is exaggerated. Phased chromosomes from the Human Haplotype Map (HapMap) ENCODE regions (representing a fraction g = 1/600 of the genome) were used to randomly generate 1,000 cases and 1,000 controls (no association expected). Association statistics and p-values were calculated, and the process was repeated N = 10e7 times.

As I understand it, for a given p (a nominal p-value computed from the theoretical distribution), n(p) was calculated as the number of simulations, out of N, in which the best simulated p-value region-wide (i.e. in the region of size g) exceeded p, i.e. was more significant than p. H(p) = n(p)/(g·N) was then defined as the expected number of regions in the genome that have a SNP exceeding p (i.e. expected significant hits in the genome by chance). tb is defined as H(p)/p, and by consensus they set H(p) = 1 so that tb = 1/p. So far so good.
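To make these definitions concrete for myself, here is a toy sketch of n(p) and H(p) = n(p)/(g·N). It does not reproduce the HapMap resampling: as a loudly stated assumption, each simulated region just contains a fixed number of independent tests with uniform null p-values, and the values of g, N and the number of tests per region are placeholders chosen so that it runs quickly.

```python
import random


def hits_per_genome(best_pvalues, p, g):
    """H(p) = n(p) / (g * N) for a list of region-wide best p-values."""
    n_sims = len(best_pvalues)                           # N
    n_p = sum(1 for best in best_pvalues if best <= p)   # n(p)
    return n_p / (g * n_sims)


def simulate_best_pvalues(n_sims, tests_per_region, seed=0):
    """Toy null simulations (assumption): best of several independent uniform p-values."""
    rng = random.Random(seed)
    return [
        min(rng.random() for _ in range(tests_per_region))
        for _ in range(n_sims)
    ]


g = 1 / 20              # placeholder region fraction (the text uses g = 1/600)
tests_per_region = 50   # placeholder number of independent tests per region
best = simulate_best_pvalues(n_sims=20_000, tests_per_region=tests_per_region)

p = 1e-3
H = hits_per_genome(best, p, g)
print(f"H(p) = {H:.3f}")
print(f"H(p)/p = {H / p:.0f}")
print(f"independent tests genome-wide = {tests_per_region / g:.0f}")
```

In this toy setting the tests really are independent, so H(p)/p should land close to the total number of tests genome-wide. With linkage disequilibrium the ratio would presumably be smaller than the raw SNP count, which seems to be the point of estimating it by simulation.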

What is not so clear to me is why they set p to be the gN-th element of the list of top single hits from each simulation, sorted from the smallest (most significant) to the largest. Less clear still is how they extrapolate to estimate the number of independent tests (1 million for all ENCODE SNPs in the CEU HapMap population), which is used to set a fixed genome-wide significance threshold of P = 0.05 / 1 million = 5·10⁻⁸.
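One possible reading of the gN-th element choice, offered as my own guess rather than the paper's stated reasoning: if H(p) = n(p)/(g·N) is required to equal 1, then n(p) = g·N, so p must be the value that exactly g·N simulations reach, which is the g·N-th entry of the sorted list. The sketch below checks this on placeholder numbers; the uniform random "best p-values" and the small N are assumptions for illustration only, not the simulated HapMap data.

```python
import random

g = 1 / 600      # fraction of the genome covered by the region (from the text)
n_sims = 6_000   # N, placeholder value kept small for the sketch

rng = random.Random(1)
# Placeholder "best p-value per simulation" (assumption); real values would
# come from the resampled case/control data.
best = sorted(rng.random() for _ in range(n_sims))

rank = round(g * n_sims)   # gN
p = best[rank - 1]         # gN-th smallest (most significant first) best p-value

n_p = sum(1 for b in best if b <= p)   # n(p): simulations at least as significant as p
H = n_p / (g * n_sims)                 # H(p) = n(p) / (g * N)
tb = 1 / p                             # testing burden once H(p) = 1

print(f"gN = {rank}, n(p) = {n_p}, H(p) = {H:.3f}, tb = {tb:.1f}")
```

Whatever the simulated values are, picking the gN-th element forces n(p) = gN and hence H(p) = 1 (barring ties). This does not address the extrapolation to 1 million independent tests, which remains unclear to me.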
