Tabulate occurrences of all observed alleles in reference genetic data
Source: R/data_conversion.R
Takes the first output of tcf2long, along with two columns named "collection" and "sample_type",
and returns a data frame of allele counts for each locus within each reference population.
Alleles to be counted are identified from both reference and mixture populations.
Arguments
- D
A data frame containing, at minimum, a column of sample group identifiers named "collection", a column named "sample_type" designating each row as "reference" or "mixture", and (from tcf2long output) locus, gene copy, and observed alleles. If higher-level reporting unit counts are desired, D must also have a column of reporting unit identifiers named "repunit".
- pop_level
A character vector expressing the population level at which allele counts should be tabulated. Set to "collection" for collection/underlying sample groups (default), or "repunit" for reporting unit/overlying sample groups.
Value
reference_allele_counts returns a long-format data frame, with count data for
each collection, locus, and allele. Counts are only drawn from "reference" samples; alleles
unique to the "mixture" samples will still appear in the list, but will have 0s for all groups.
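The counting rule described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the toy data, the column names (sample_type, collection, locus, gene_copy, allele), and the output column name "counts" are assumptions made for illustration only.

```r
# Toy long-format data (hypothetical; column names are assumed, not taken
# from the actual tcf2long output).
toy <- data.frame(
  sample_type = c("reference", "reference", "reference", "reference",
                  "mixture", "mixture"),
  collection  = c("A", "A", "B", "B", "mix1", "mix1"),
  locus       = "Loc1",
  gene_copy   = c("a", "b", "a", "b", "a", "b"),
  allele      = c("101", "103", "103", "103", "101", "107"),
  stringsAsFactors = FALSE
)

# Allele levels are collected from reference AND mixture rows
all_alleles <- sort(unique(toy$allele))

# Counts themselves are drawn from reference rows only
ref <- toy[toy$sample_type == "reference", ]
ref$allele <- factor(ref$allele, levels = all_alleles)

# Cross-tabulate and flatten to long format; alleles never seen in a
# reference group keep a row with a count of 0
counts <- as.data.frame(
  table(collection = ref$collection, locus = ref$locus, allele = ref$allele),
  responseName = "counts",
  stringsAsFactors = FALSE
)
print(counts)
```

In this sketch, allele "107" occurs only in the mixture sample, so it appears once per reference collection with a count of 0.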
Details
The "collection" column should be a key assigning samples to the desired groups, e.g. collection site, run time, year. The "sample_type" column must contain either "reference" or "mixture" for each sample.
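As a rough sketch of how these two columns might be prepared before conversion, the snippet below builds a "collection" key from site and year and assigns "sample_type" from a known-origin flag. The sample sheet and its columns (site, year, known_origin) are hypothetical and only serve the illustration.

```r
# Hypothetical sample sheet; 'site', 'year', and 'known_origin' are
# invented columns, not part of any package data set.
samples <- data.frame(
  site         = c("RiverA", "RiverA", "RiverB", "Estuary"),
  year         = c(2012, 2013, 2012, 2013),
  known_origin = c(TRUE, TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Key each sample to its group, here site plus year, and store it as a
# factor whose levels follow order of first appearance
samples$collection <- paste(samples$site, samples$year, sep = "_")
samples$collection <- factor(samples$collection,
                             levels = unique(samples$collection))

# Every row must be labelled either "reference" or "mixture"
samples$sample_type <- ifelse(samples$known_origin, "reference", "mixture")
stopifnot(all(samples$sample_type %in% c("reference", "mixture")))
```

Any grouping variable could stand in for site and year here; the only requirement taken from the text above is that each sample ends up with one "collection" value and one valid "sample_type" value.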
Examples
## count alleles in alewife reference populations
example(tcf2long) # gets variable ale_long
#>
#> tcf2ln> ## Convert the alewife dataset for further processing
#> tcf2ln> # the data frame passed into this function must have had
#> tcf2ln> # character collections and repunits converted to factors
#> tcf2ln> reference <- alewife
#>
#> tcf2ln> reference$repunit <- factor(reference$repunit, levels = unique(reference$repunit))
#>
#> tcf2ln> reference$collection <- factor(reference$collection, levels = unique(reference$collection))
#>
#> tcf2ln> ale_long <- tcf2long(reference, 17)
ale_rac <- reference_allele_counts(ale_long$long)