R/plink_map2rec_rates.R
plink_map2rec_rates.Rd
This is a convenience function to convert a PLINK map file to the format used in the 'gscramble' RecRates object. By default, this function will use the physical positions of the markers and assume a recombination rate of 1 cM per megabase. If the marker positions are also available in Morgans in the PLINK map file, then these can be used by setting use_morgans to TRUE.
plink_map2rec_rates(
  map,
  use_morgans = FALSE,
  cM_per_Mb = 1,
  chrom_lengths = NULL
)
map
path to the PLINK .map file holding information about the markers. This file can be gzipped.
use_morgans
logical. If TRUE, the third column in the PLINK map file (assumed to hold the positions of the markers in Morgans) will be used to calculate the rec_probs in the bins of the RecRates object.
cM_per_Mb
numeric. If use_morgans is FALSE, physical positions will be converted to recombination fractions at a rate of cM_per_Mb centiMorgans per megabase. Default is 1. This is also used to determine the recombination probability on the last segment of the chromosome (beyond the last marker) if chrom_lengths is used.
chrom_lengths
if you know the full length of each chromosome, you can supply those lengths in a tibble with columns chrom and bp, where chrom must be a character vector (do not leave the chromosome names as numerics) and bp must be a numeric vector giving the length of each chromosome in base pairs.
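As an illustration of the chrom_lengths input described above, here is a minimal sketch of building such a tibble by hand. The chromosome names and lengths are made up for the example, and the object name my_chrom_lengths is hypothetical; only the column names chrom and bp and their types come from the argument description.

```r
library(tibble)

# Hypothetical chromosome names and lengths, for illustration only.
my_chrom_lengths <- tibble(
  # chrom must be character, so convert any numeric chromosome names
  chrom = as.character(c(1, 2, 3)),
  # bp is the full length of each chromosome in base pairs (numeric)
  bp = c(250e6, 240e6, 200e6)
)
```

A tibble shaped like this could then be passed as chrom_lengths = my_chrom_lengths, provided the chrom values match the chromosome names in the map file.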
A tibble that provides the recombination rates for the segments of the genome.
For simplicity, this function will assume that the length of the chromosome is just one base pair beyond the last marker. That is typically not correct but will have no effect, since there are no markers to be typed out beyond that point. However, if you know the lengths of the chromosomes and want to include them, pass them in via the chrom_lengths option.
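To make the arithmetic behind cM_per_Mb and the final chromosome segment concrete, the sketch below converts physical interval lengths to genetic map lengths in Morgans. It is only an illustration under stated assumptions: the helper bp_to_morgans, the marker positions, and the chromosome length are all hypothetical, and the step by which the package turns map distance into the rec_probs of the RecRates object is not shown here.

```r
# Hypothetical helper: genetic length (in Morgans) of a physical
# interval, at a constant rate of cM_per_Mb centiMorgans per megabase.
bp_to_morgans <- function(bp, cM_per_Mb = 1) {
  (bp / 1e6) * cM_per_Mb / 100
}

# Made-up marker positions and chromosome length, in base pairs.
marker_bp <- c(1e6, 3e6, 4.5e6)
chrom_len_bp <- 6e6

# Intervals between successive markers, plus the final segment from
# the last marker to the end of the chromosome.
interval_bp <- diff(c(marker_bp, chrom_len_bp))

# Map length of each interval at 1.2 cM per megabase.
morgans <- bp_to_morgans(interval_bp, cM_per_Mb = 1.2)
```

Without a known chromosome length, the final segment would be a single base pair, so its map length would be negligible, which is why the default assumption has no practical effect.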
mapfile <- system.file(
  "extdata/example-plink-with-morgans.map.gz",
  package = "gscramble"
)
# get a rec-rates tibble from the positions of the markers,
# assuming 1 cM per megabase.
rec_rates_from_positions <- plink_map2rec_rates(mapfile)
#> Rows: 100 Columns: 4
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "\t"
#> chr (1): id
#> dbl (3): chrom, morgans, bp
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# get a rec-rates tibble from the positions of the markers,
# assuming 1.5 cM per megabase.
rec_rates_from_positions_1.5 <- plink_map2rec_rates(
  mapfile,
  cM_per_Mb = 1.5
)
#> Rows: 100 Columns: 4
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "\t"
#> chr (1): id
#> dbl (3): chrom, morgans, bp
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# get a rec-rates tibble from the cumulative Morgans position
# in the plink map file
rec_rates_from_positions_Morg <- plink_map2rec_rates(
  mapfile,
  use_morgans = TRUE
)
#> Rows: 100 Columns: 4
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "\t"
#> chr (1): id
#> dbl (3): chrom, morgans, bp
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# get a rec-rates tibble from the cumulative Morgans position
# in the plink map file, and extend it out to the full length
# of the chromosome (assuming, for that last part of the chromosome,
# a rate of 1.2 cM per megabase)
rec_rates_from_positions_Morg_fl <- plink_map2rec_rates(
  mapfile,
  use_morgans = TRUE,
  cM_per_Mb = 1.2,
  chrom_lengths = example_chrom_lengths
)
#> Rows: 100 Columns: 4
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: "\t"
#> chr (1): id
#> dbl (3): chrom, morgans, bp
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.