After creating a ckmr object with create_ckmr, the next step is to simulate multilocus genotype pairs under each relationship of interest and to compute the likelihood of each simulated pair under different relationship hypotheses. This function does both.

simulate_Qij(
  C,
  sim_relats,
  calc_relats,
  reps = 10^4,
  unlinked = TRUE,
  forceLinkagePO = FALSE,
  pedigree_list = NULL,
  miss_mask_mat = NULL,
  rando_miss_wts = NULL,
  rando_miss_n = 0,
  froms,
  tos
)

Arguments

C

the ckmr object upon which to base the simulations.

sim_relats

a vector of relationship IDs to simulate from (these were the rownames of the kappa_matrix argument to create_ckmr). For each relationship ID in sim_relats, genotype values are simulated from the Y_l_true values in C.

calc_relats

a vector of relationship IDs under which to calculate the genotype log probabilities of the simulated genotypes. Genotype log probabilities are calculated using the Y_l matrices.

reps

the number of genotype pairs to simulate for each relationship in sim_relats. By default this is 10^4.

unlinked

A logical indicating whether to simulate the markers as unlinked. By default this is TRUE. If FALSE, genotypes at linked markers are simulated using the program MENDEL and genotyping errors are applied to them, but the Q_ij values themselves are still computed under the assumption of no linkage. Even when unlinked is FALSE, genotypes for the relationships "U", "PO", and "MZ" are simulated under the no-linkage model, because, in the absence of LD, pairs with those relationships are not affected by physical linkage.

forceLinkagePO

Pass TRUE to this, while unlinked is FALSE, if you really want to force simulation under physical linkage for the PO case (perhaps to verify that you get the same result as with unlinked simulation).

pedigree_list

If you specify unlinked == FALSE, then you have to supply a pedigree_list.

miss_mask_mat

A logical matrix with length(YL) columns and reps rows. The (r,c)-th entry is TRUE if the c-th locus should be considered missing in the r-th simulated sample. This type of specification lets the user either simulate a specific pattern of missingness, if desired, or simulate patterns of missing data given missing data rates, etc.
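
As an illustration of the second use, the following sketch builds a miss_mask_mat from per-locus missing data rates using base R only. The number of loci (n_loci) and the rates (miss_rate) are hypothetical values chosen for the example; in practice the column count must match the number of loci in your ckmr object and the row count must match reps.

```r
set.seed(1)

reps <- 1000L    # must equal the `reps` value passed to simulate_Qij()
n_loci <- 96L    # hypothetical: number of loci in the ckmr object

# Hypothetical per-locus missing data rates, recycled across loci.
miss_rate <- rep_len(c(0.01, 0.05, 0.10), n_loci)

# One column per locus; each entry is TRUE ("missing") with that
# locus's rate, independently across simulated samples (rows).
miss_mask_mat <- vapply(
  miss_rate,
  function(p) runif(reps) < p,
  logical(reps)
)

dim(miss_mask_mat)  # reps rows, n_loci columns
```

The resulting matrix could then be supplied as the miss_mask_mat argument. Independence of missingness across loci and samples is an assumption of this sketch, not a requirement of the function.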

rando_miss_wts

weights given to the loci that influence whether each will be one of the rando_miss_n missing loci in any iteration. They are recycled (or truncated) to have length equal to the number of loci and normalized to sum to one, so you can provide them in unnormalized form. The idea is to be able to use observed rates of missingness among loci to mask some loci as missing. The weights are reported as a comma-delimited string in the column "rando_miss_wts" of the output.

rando_miss_n

a single number less than the number of loci. In each iteration, rando_miss_n loci will be considered missing, chosen according to rando_miss_wts. This lets you get a sense of how well you will do, on average, with a certain number of missing loci.
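
To make the interplay of rando_miss_wts and rando_miss_n concrete, here is a minimal base R sketch of one iteration: weights are recycled to the number of loci, normalized, and then used to pick which loci are masked. It is only an illustration of the behavior described above; the package's internal implementation may differ, and sampling without replacement is an assumption of this sketch. The values of n_loci, rando_miss_wts, and rando_miss_n are hypothetical.

```r
set.seed(42)

n_loci <- 10L                # hypothetical number of loci
rando_miss_wts <- c(1, 4)    # unnormalized weights, shorter than n_loci
rando_miss_n <- 3L           # loci to mask in each iteration

# Recycle (or truncate) to one weight per locus, then normalize.
wts <- rep_len(rando_miss_wts, n_loci)
wts <- wts / sum(wts)

# Choose rando_miss_n distinct loci, favoring those with larger weights.
# (Assumption: sampling is without replacement.)
missing_idx <- sample.int(n_loci, size = rando_miss_n,
                          replace = FALSE, prob = wts)

# Logical mask for this iteration: TRUE means the locus is treated as missing.
mask <- seq_len(n_loci) %in% missing_idx
```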

froms

a synonym for sim_relats, kept for compatibility with an earlier version of CKMRsim.

tos

a synonym for calc_relats, kept for compatibility with an earlier version of CKMRsim.
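
Examples

A minimal sketch of a typical call, using only the arguments documented above. The wrapper function run_simulate_Qij and its argument ck are hypothetical names introduced for illustration; ck is assumed to be a ckmr object already returned by create_ckmr, and the relationship IDs "PO" and "U" are assumed to have been rownames of the kappa_matrix used to build it.

```r
# Hypothetical helper: simulate pairs under PO and U, and compute
# genotype log probabilities of each simulated pair under both hypotheses.
run_simulate_Qij <- function(ck, n_reps = 1000) {
  CKMRsim::simulate_Qij(
    C = ck,                        # ckmr object from create_ckmr()
    sim_relats = c("PO", "U"),     # true relationships to simulate from
    calc_relats = c("PO", "U"),    # hypotheses to evaluate
    reps = n_reps                  # simulated pairs per relationship
  )
}

# Usage, once a ckmr object `ck` exists:
# Qs <- run_simulate_Qij(ck)
```

Using fewer reps than the default of 10^4 keeps a first trial run quick; increase it once the call works as expected.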