This is used for identifying duplicate individuals/genotypes in large data sets. I've specified this in terms of the max number of mismatching loci because I think everyone should already have tossed out individuals with a lot of missing data. A cap on mismatches then makes it easy to toss out a pair without even looking at all of its loci, so the full set of comparisons runs faster.
find_close_matching_genotypes(LG, CK, max_mismatch)
LG: a long genotypes data frame.
CK: a ckmr object created from the allele frequencies computed from LG.
max_mismatch: maximum allowable number of mismatching genotypes between the members of a pair.
a data frame with columns:
the id (from the rownames in S) of the first member of the pair
the id (from the rownames in S) of the second member of the pair
the number of loci at which the pair have mismatching genotypes
the total number of loci at which neither individual has missing data
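
The early-exit idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helpers count_mismatches_capped and close_pairs_sketch, the output column names (first, second, num_mismatch, num_loc), and the input layout (an integer genotype matrix S with one row per individual, one column per locus, and NA for missing data) are all hypothetical.

```r
# Hypothetical helper (not part of the package): count mismatching loci
# between two genotype vectors, giving up as soon as the count exceeds
# max_mismatch so the remaining loci are never inspected.
count_mismatches_capped <- function(g1, g2, max_mismatch) {
  num_mismatch <- 0L
  num_loc <- 0L
  for (k in seq_along(g1)) {
    # Skip loci that are missing in either individual.
    if (is.na(g1[k]) || is.na(g2[k])) next
    num_loc <- num_loc + 1L
    if (g1[k] != g2[k]) {
      num_mismatch <- num_mismatch + 1L
      # Early exit: this pair can no longer qualify.
      if (num_mismatch > max_mismatch) return(NULL)
    }
  }
  list(num_mismatch = num_mismatch, num_loc = num_loc)
}

# Hypothetical driver: compare every pair of rows of the matrix S and keep
# the pairs with at most max_mismatch mismatching loci.
close_pairs_sketch <- function(S, max_mismatch) {
  ids <- rownames(S)
  n <- nrow(S)
  out <- list()
  if (n >= 2) {
    for (i in 1:(n - 1)) {
      for (j in (i + 1):n) {
        cmp <- count_mismatches_capped(S[i, ], S[j, ], max_mismatch)
        if (!is.null(cmp)) {
          out[[length(out) + 1L]] <- data.frame(
            first = ids[i],
            second = ids[j],
            num_mismatch = cmp$num_mismatch,
            num_loc = cmp$num_loc,
            stringsAsFactors = FALSE
          )
        }
      }
    }
  }
  if (length(out) == 0L) {
    return(data.frame(
      first = character(0), second = character(0),
      num_mismatch = integer(0), num_loc = integer(0),
      stringsAsFactors = FALSE
    ))
  }
  do.call(rbind, out)
}

# Toy data (made up for illustration): three individuals, five loci,
# genotypes coded as integers with NA for missing.
S <- rbind(
  a = c(0L, 1L, 2L, 1L, NA),
  b = c(0L, 1L, 2L, 1L, 0L),
  c = c(2L, 0L, 0L, 1L, 0L)
)
pairs <- close_pairs_sketch(S, max_mismatch = 1)
print(pairs)
```

With max_mismatch = 1, only the pair a/b is kept in this toy example. Both comparisons involving c are abandoned at the second locus, which is where the speed gain comes from when there are many loci and most pairs are not duplicates.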