Merging Files

library(BinaryDosage)

Quite often subjects have their genotypes imputed in batches. The files returned by these imputation can be converted into binary dosage files. These binary files can be merged into a single file if they have the same SNPs and different subjects using the bdmerge routine.

bdmerge

The bdmerge routine takes the following parameters

mergefiles - A character vector of the binary dosage file, family file, and map file names
format - Integer value indicating which format of the binary dosage file should be used for the merged files
subformat - Integer value indicating which subformat should be used for the merged files
bdfiles - A character vector of the binary dosage files to merge
famfiles - Character vector of the family files associated with the binary dosage files to merge
mapfiles - Character vector of the map files associated with the binary dosage files to merge
onegroup - Logical value indicating if the binary dosage saves SNP summary information about each merged file
bdoptions - Character vector indicating on which SNP information should be evaluated for the merged files. This cannot be used if onegroup is set to FALSE
snpjoin - Character value indicating if an inner or outer join is done for the SNPs

The following code merges vcf1a.bdose and vcf1b.bdose into one binary dosage file. It then displays the number of subjects in each file.

bd1afile <- system.file("extdata", "vcf1a.bdose", package = "BinaryDosage")
bd1bfile <- system.file("extdata", "vcf1b.bdose", package = "BinaryDosage")
bd1file <- tempfile()

bdmerge(mergefiles = bd1file, bdfiles = c(bd1afile, bd1bfile))

bd1ainfo <- getbdinfo(bd1afile)
bd1binfo <- getbdinfo(bd1bfile)
bd1info <- getbdinfo(bd1file)

nrow(bd1ainfo$samples)
#> [1] 60
nrow(bd1binfo$samples)
#> [1] 40
nrow(bd1info$samples)
#> [1] 100

- bdmerge