May 12, 2011

Converting PLINK to GenABEL files

There are two R functions to convert from PLINK to GenABEL files: convert.snp.ped() (which takes PLINK's *.ped and *.map files as input) and convert.snp.tped() (the same for *.tped and *.tfam). When I tried convert.snp.ped(), an error occurred and R was forced to close. I had no problems with convert.snp.tped(). I know of one person who experienced the same issue, although I do not know whether the same error message was shown (or what caused the problem).

Be aware, however, that both functions produce only one of the two input files (*.raw) needed by load.gwaa.data(), the GenABEL function that loads data, even though their own input files already contain the phenotypic information. The second input file (*.dat) has to be produced separately, for instance from the *.tfam file. Two things need attention: (1) *.tfam has no header whereas *.dat must have one (at least the 'id' and 'sex' fields are required), and (2) PLINK codes sex as 1/2 (male/female) by default, whereas load.gwaa.data() uses 1/0. Then again, one would have to create the *.dat file anyway when using an alternative phenotype file in PLINK (to allow for more than one phenotype or for covariates). It is not the end of the world, but it makes the conversion a bit less direct than expected...
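
As a rough illustration of that extra step, here is a minimal base-R sketch that turns a *.tfam file into a *.dat-style phenotype file. The helper name tfam_to_dat(), the toy individuals and the choice to keep only a single 'pheno' column are my own assumptions, not part of GenABEL or PLINK. It also assumes that the individual ID (second *.tfam column) is what should go into 'id', and that -9 is the missing-phenotype code (PLINK's default).

```r
# Sketch: build a GenABEL-style phenotype table from a PLINK *.tfam file.
# tfam_to_dat() is a hypothetical helper, not a GenABEL or PLINK function.
tfam_to_dat <- function(tfamfile, datfile) {
  # A *.tfam file has six whitespace-separated columns and no header.
  fam <- read.table(tfamfile, header = FALSE, stringsAsFactors = FALSE,
                    col.names = c("fid", "id", "pat", "mat", "sex", "pheno"))

  # PLINK codes sex as 1 = male, 2 = female; GenABEL expects 1 = male, 0 = female.
  # Any other PLINK value (e.g. 0 = unknown) becomes NA here.
  sex <- ifelse(fam$sex == 1, 1L, ifelse(fam$sex == 2, 0L, NA_integer_))

  # Assumption: -9 marks a missing phenotype (PLINK's default code).
  pheno <- fam$pheno
  pheno[pheno == -9] <- NA

  dat <- data.frame(id = fam$id, sex = sex, pheno = pheno,
                    stringsAsFactors = FALSE)

  # Write a tab-separated file with a header line, as *.dat requires.
  write.table(dat, file = datfile, quote = FALSE, row.names = FALSE, sep = "\t")
  invisible(dat)
}

# Toy example with made-up individuals.
tfam <- tempfile(fileext = ".tfam")
writeLines(c("F1 ind1 0 0 1 2.5",
             "F1 ind2 0 0 2 -9",
             "F2 ind3 0 0 0 1.7"), tfam)
datfile <- tempfile(fileext = ".dat")
dat <- tfam_to_dat(tfam, datfile)
print(dat)
```

The resulting file could then be passed as phenofile to load.gwaa.data(), together with the *.raw file written by convert.snp.tped(). Individuals whose sex ends up as NA would probably need to be fixed or removed first.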

Maybe somebody can add his/her experience?

4 comments:

  1. I use

    convert.snp.illumina(inf="allgenotypes.txt",out="allgenotypes.raw",strand="file")

    where allgenotypes.txt contains
    SNPID CHROMOSOME POSITION STRAND SNP1 SNP2 ... SNPn

    and then

    mydata<-load.gwaa.data(phenofile="mydata.txt",genofile="allgenotypes.raw",force=TRUE,makemap=FALSE,sort=TRUE)

    where mydata.txt contains id, sex and other variables such as traits and covariates (see GenABEL manual)
    jules

  2. Thanks Jules. I managed to convert the files, but I just wanted to point out that it was not as straightforward as I expected because (1) one of the functions described as taking PLINK's files as input does not seem to work well and (2) convert.snp.tped() does not take advantage of the fact that the *.tfam already has information on the phenotype, and the sex status still has to be updated to the 0/1 coding.

  3. Hello
    I am very new to both R and PLINK. My question might seem naive to you guys.
    If I have a .bed file instead of a .ped file, how do I convert it? I searched Google but could not find a way.

    Thanks in advance :)

    Replies
    1. plink --bfile filename --recode --out filename
      Input: filename.bed, .bim, .fam; output: filename.ped, .map
