Scale segmented participation counts to total (if needed)

This scaling accounts for missing values in segments by scaling up all segment counts so that their sum matches the total count. It expects two tables as input, both produced by est_part. If no scaling is needed (i.e., sum(part_segment$part) == sum(part_total$part)), the function simply returns the input data frame unchanged.

scaleup_part(part_segment, part_total, test_threshold = 10,
  show_test_stat = FALSE, outvar = "participants")

scaleup_recruit(part_segment, part_total, test_threshold = 10,
  show_test_stat = FALSE, outvar = "recruits")

Arguments

part_segment	data frame: A segmented participation table produced by `est_part` (e.g., with segment argument set to "res")
part_total	data frame: An overall participation table produced by `est_part`
test_threshold	numeric: threshold, in whole-number percentage points, that defines the upper limit of the acceptable proportion of missing values for the segment. The function stops with an error if this threshold is exceeded. Raising the threshold can allow the check to pass, but use this with caution, since a high percentage of missing values might suggest that the breakouts aren't representative (e.g., if values are not missing at random).
show_test_stat	logical: If TRUE, the output table will include a variable holding the test statistic for each row.
outvar	character: name of the variable that stores the metric (e.g., "participants" or "recruits")
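
The test_threshold check described above can be illustrated with a minimal sketch. The helpers pct_missing and check_threshold are hypothetical and are not part of the package, and the way the package actually computes its test statistic is an assumption here: the sketch treats it as the percentage of the total count not accounted for by the segment rows. The input values are the 2008 counts from the Examples section.

```r
# Hypothetical helper (not part of the package): percentage of the total
# count that is not accounted for by the segment rows.
pct_missing <- function(seg_sum, total) {
  (total - seg_sum) / total * 100
}

# Hypothetical check mirroring the documented behaviour of test_threshold:
# stop with an error when the missing share exceeds the threshold.
check_threshold <- function(seg_sum, total, test_threshold = 10) {
  pct <- pct_missing(seg_sum, total)
  if (any(pct > test_threshold)) {
    stop("Missing share exceeds test_threshold of ", test_threshold, "%")
  }
  invisible(pct)
}

# 2008 values from the participation example: 6285 of 6393 accounted for.
# The missing share is well under the default threshold of 10, so this passes.
pct_2008 <- check_threshold(6285, 6393)

# A stricter (assumed) threshold of 1 percentage point triggers the error.
strict <- tryCatch(
  check_threshold(6285, 6393, test_threshold = 1),
  error = function(e) "error"
)
```

Under these assumptions the default threshold passes for the 2008 row, while a threshold of 1 would stop with an error.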

Examples

library(dplyr)
data(history)
history <- label_categories(history)

# demonstrate the need for scaling
part_total <- est_part(history)
part_segment <- est_part(history, "sex", test_threshold = 40)
left_join(
    select(part_total, year, part_tot = participants),
    group_by(part_segment, year) %>% summarise(part_seg = sum(participants))
)
#> Joining, by = "year"
#> # A tibble: 11 x 3
#>     year part_tot part_seg
#>    <int>    <int>    <int>
#>  1  2008     6393     6285
#>  2  2009     7591     7475
#>  3  2010     7775     7632
#>  4  2011     7819     7679
#>  5  2012     8243     8100
#>  6  2013     8273     8136
#>  7  2014     8970     8815
#>  8  2015     9086     8959
#>  9  2016     9161     9011
#> 10  2017     9389     9253
#> 11  2018     9206     9059

# perform scaling
part_segment <- scaleup_part(part_segment, part_total)
left_join(
    select(part_total, year, part_tot = participants),
    group_by(part_segment, year) %>% summarise(part_seg = sum(participants))
)
#> Joining, by = "year"
#> # A tibble: 11 x 3
#>     year part_tot part_seg
#>    <int>    <int>    <int>
#>  1  2008     6393     6393
#>  2  2009     7591     7591
#>  3  2010     7775     7775
#>  4  2011     7819     7819
#>  5  2012     8243     8243
#>  6  2013     8273     8273
#>  7  2014     8970     8970
#>  8  2015     9086     9086
#>  9  2016     9161     9161
#> 10  2017     9389     9389
#> 11  2018     9206     9206

# new recruits - unscaled
history_new <- filter(history, R3 == "Recruit")
part_total <- est_recruit(history_new, "tot")
part_segment <- est_recruit(history_new, "sex")
part_segment
#> # A tibble: 12 x 3
#>    sex     year recruits
#>    <fct>  <int>    <int>
#>  1 Male    2013     1383
#>  2 Male    2014     1653
#>  3 Male    2015     1675
#>  4 Male    2016     1670
#>  5 Male    2017     1638
#>  6 Male    2018     1460
#>  7 Female  2013      534
#>  8 Female  2014      631
#>  9 Female  2015      704
#> 10 Female  2016      713
#> 11 Female  2017      661
#> 12 Female  2018      586

# new recruits - scaled
scaleup_recruit(part_segment, part_total)
#> # A tibble: 12 x 3
#>    sex     year recruits
#>    <fct>  <int>    <int>
#>  1 Male    2013     1413
#>  2 Male    2014     1686
#>  3 Male    2015     1697
#>  4 Male    2016     1696
#>  5 Male    2017     1658
#>  6 Male    2018     1490
#>  7 Female  2013      546
#>  8 Female  2014      644
#>  9 Female  2015      713
#> 10 Female  2016      724
#> 11 Female  2017      669
#> 12 Female  2018      598
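
The scaled recruit counts above are consistent with simple proportional scaling: each segment count is multiplied by the ratio of the yearly total to the yearly sum of segments, then rounded. The following base R sketch reproduces the 2013 and 2014 rows under that assumption. It is not the package's implementation, and the yearly totals (1959 and 2330) are assumed values taken as the sums of the scaled rows shown above, since the overall table is not printed in this example.

```r
# Unscaled segment counts copied from the 2013 and 2014 rows above
part_segment <- data.frame(
  sex = c("Male", "Female", "Male", "Female"),
  year = c(2013L, 2013L, 2014L, 2014L),
  recruits = c(1383L, 534L, 1653L, 631L)
)

# Assumed yearly totals (sums of the scaled rows shown above)
part_total <- data.frame(
  year = c(2013L, 2014L),
  recruits_tot = c(1959L, 2330L)
)

# Sum of segment counts per year; this falls short of the total
# when some records have a missing segment value
seg_sum <- aggregate(recruits ~ year, data = part_segment, FUN = sum)
names(seg_sum)[2] <- "recruits_seg"

# Attach both sums to every segment row, then scale proportionally
scaled <- merge(merge(part_segment, seg_sum, by = "year"), part_total, by = "year")
scaled$recruits <- as.integer(
  round(scaled$recruits * scaled$recruits_tot / scaled$recruits_seg)
)

scaled[, c("sex", "year", "recruits")]
```

With these inputs the scaled segment counts sum to the assumed yearly totals. Rounding each row independently does not guarantee an exact match in general, so a real implementation may need to handle any remainder.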
