This is a convenience function that converts a PLINK map file to the format used in the 'gscramble' RecRates object. By default, it uses the physical positions of the markers and assumes a recombination rate of 1 cM per megabase. If the marker positions are also available in Morgans in the PLINK map file, these can be used instead by setting use_morgans to TRUE.

plink_map2rec_rates(
  map,
  use_morgans = FALSE,
  cM_per_Mb = 1,
  chrom_lengths = NULL
)

Arguments

map

path to the plink .map file holding information about the markers. This file can be gzipped.

use_morgans

logical. If TRUE, the third column in the PLINK map file (assumed to hold the position of the markers in Morgans) will be used to calculate the rec_probs in the bins of the RecRates object.

cM_per_Mb

numeric. If use_morgans is FALSE, physical positions are converted to recombination fractions at a rate of cM_per_Mb centiMorgans per megabase. Default is 1. This rate is also used to determine the recombination probability on the last segment of the chromosome (beyond the last marker) if chrom_lengths is supplied.

chrom_lengths

If you know the full length of each chromosome, you can supply those lengths in a tibble with columns chrom and bp, where chrom must be a character vector (do not leave chromosome names as numerics) and bp must be a numeric vector giving the length of each chromosome in base pairs.
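As an illustration of the shape expected for chrom_lengths, here is a minimal sketch that builds such a tibble by hand. The chromosome names and lengths are made-up values for illustration only, and the sketch assumes the 'tibble' package is installed.

```r
library(tibble)

# Hypothetical chromosome names and lengths, for illustration only.
# chrom is converted to character explicitly; bp stays numeric.
my_chrom_lengths <- tibble(
  chrom = as.character(c(1, 2, 3)),
  bp = c(5000000, 4200000, 3100000)
)

# Quick sanity checks on the column types before passing the
# tibble to plink_map2rec_rates(..., chrom_lengths = my_chrom_lengths)
stopifnot(is.character(my_chrom_lengths$chrom))
stopifnot(is.numeric(my_chrom_lengths$bp))
```

If the chromosome names come from a file in which they were parsed as numbers, converting with as.character() as above avoids a type mismatch with the character chrom column.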

Value

A tibble that provides the recombination rates for the segments of the genome.

Details

For simplicity, this function assumes by default that each chromosome ends just one base pair beyond its last marker. That is typically not correct but has no effect, since there are no markers beyond that point. However, if you know the lengths of the chromosomes and want to include them, pass them in via the chrom_lengths argument.
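The arithmetic behind these options can be sketched in base R. This is not the package's internal code: the marker positions, the chromosome length, and the helper bp_to_morgans() are all hypothetical, and the sketch simply treats the genetic length of an interval (in Morgans) as its recombination fraction, which is an assumption about how the conversion is applied.

```r
# Hypothetical values, for illustration only
bp <- c(1000000, 1500000, 4000000)   # marker positions on one chromosome
morgans <- c(0.010, 0.016, 0.046)    # cumulative Morgans at the same markers
chrom_length_bp <- 5000000           # known full length of the chromosome
cM_per_Mb <- 1.2

# Hypothetical helper: physical distance (bp) -> genetic length (Morgans)
# bp / 1e6 gives megabases; times cM_per_Mb gives cM; / 100 gives Morgans
bp_to_morgans <- function(d_bp, cM_per_Mb) {
  d_bp / 1e6 * cM_per_Mb / 100
}

# Without Morgans in the map file: intervals between adjacent markers
# are scaled from their physical length
between_markers_from_bp <- bp_to_morgans(diff(bp), cM_per_Mb)

# With cumulative Morgans available: take differences between
# adjacent markers instead
between_markers_from_morgans <- diff(morgans)

# Last segment: one base pair past the last marker when no lengths
# are given, or out to the full chromosome length when they are
last_segment_default <- bp_to_morgans(1, cM_per_Mb)
last_segment_full <- bp_to_morgans(chrom_length_bp - max(bp), cM_per_Mb)
```

Under these assumptions, the one-base-pair default contributes a negligible amount of recombination, which is why it has no practical effect when no markers lie beyond the last one.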

Examples

mapfile <- system.file(
    "extdata/example-plink-with-morgans.map.gz",
    package = "gscramble"
 )

# get a rec-rates tibble from the positions of the markers,
# assuming 1 cM per megabase.
rec_rates_from_positions <- plink_map2rec_rates(mapfile)
#> Rows: 100 Columns: 4
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "\t"
#> chr (1): id
#> dbl (3): chrom, morgans, bp
#> 
#>  Use `spec()` to retrieve the full column specification for this data.
#>  Specify the column types or set `show_col_types = FALSE` to quiet this message.

# get a rec-rates tibble from the positions of the markers,
# assuming 1.5 cM per megabase.
rec_rates_from_positions_1.5 <- plink_map2rec_rates(
    mapfile,
    cM_per_Mb = 1.5
)
#> Rows: 100 Columns: 4
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "\t"
#> chr (1): id
#> dbl (3): chrom, morgans, bp
#> 
#>  Use `spec()` to retrieve the full column specification for this data.
#>  Specify the column types or set `show_col_types = FALSE` to quiet this message.

# get a rec-rates tibble from the cumulative Morgans position
# in the plink map file
rec_rates_from_positions_Morg <- plink_map2rec_rates(
    mapfile,
    use_morgans = TRUE
)
#> Rows: 100 Columns: 4
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "\t"
#> chr (1): id
#> dbl (3): chrom, morgans, bp
#> 
#>  Use `spec()` to retrieve the full column specification for this data.
#>  Specify the column types or set `show_col_types = FALSE` to quiet this message.

# get a rec-rates tibble from the cumulative Morgans position
# in the plink map file, and extend it out to the full length
# of the chromosome (assuming 1.2 cM per megabase for that last
# part of the chromosome).
rec_rates_from_positions_Morg_fl <- plink_map2rec_rates(
    mapfile,
    use_morgans = TRUE,
    cM_per_Mb = 1.2,
    chrom_lengths = example_chrom_lengths
)
#> Rows: 100 Columns: 4
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "\t"
#> chr (1): id
#> dbl (3): chrom, morgans, bp
#> 
#>  Use `spec()` to retrieve the full column specification for this data.
#>  Specify the column types or set `show_col_types = FALSE` to quiet this message.