This scaling accounts for missing values in segments by scaling up all segment counts so that their sum matches the total count. It expects two tables as input, both produced by est_part. If no scaling is needed (i.e., sum(part_segment$part) == sum(part_total$part)), the function simply returns the input data frame unchanged.
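The idea of scaling segment counts up to a known total can be sketched in base R as follows. This is a minimal illustration, assuming the adjustment is proportional within each year and rounded to whole counts; it is not taken from the package source, and the input values are hypothetical.

```r
# Hypothetical inputs for one year; column names mirror the examples.
part_segment <- data.frame(
  year = c(2020, 2020),
  sex = c("Male", "Female"),
  participants = c(700, 250)  # 50 participants have no recorded sex
)
part_total <- data.frame(year = 2020, participants = 1000)

# Sum the segment counts per year and attach the overall total.
seg_sum <- aggregate(participants ~ year, data = part_segment, FUN = sum)
names(seg_sum)[2] <- "part_seg"
totals <- merge(seg_sum, part_total, by = "year")
names(totals)[names(totals) == "participants"] <- "part_tot"

# Assumption: each segment is multiplied by (total / segment sum) for its
# year, then rounded to a whole count.
scaled <- merge(part_segment, totals, by = "year")
scaled$participants <- round(
  scaled$participants * scaled$part_tot / scaled$part_seg
)
scaled <- scaled[, c("year", "sex", "participants")]
scaled
```

With these inputs the scaled segments add up to the yearly total. Because of rounding, a proportional adjustment like this one will not reproduce the total exactly for every set of counts.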

scaleup_part(part_segment, part_total, test_threshold = 10,
  show_test_stat = FALSE, outvar = "participants")

scaleup_recruit(part_segment, part_total, test_threshold = 10,
  show_test_stat = FALSE, outvar = "recruits")

Arguments

part_segment

data frame: A segmented participation table produced by est_part (e.g., with segment argument set to "res")

part_total

data frame: An overall participation table produced by est_part

test_threshold

numeric: threshold, in whole-number percentage points, that defines the upper limit on the acceptable proportion of missing values for the segment. The function stops with an error if this threshold is exceeded. Relaxing the threshold can allow the check to pass, but use this with caution, since a high percentage of missing values might suggest that the breakouts aren't representative (e.g., if values are not missing at random).

show_test_stat

logical: If TRUE, the output table will include a variable holding the test statistic for each row.

outvar

character: name of the variable that stores the metric (e.g., "participants" or "recruits")
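The test_threshold check can be pictured with a small sketch. It assumes the test statistic is the percentage of the total count that the segments do not cover; the package may define it differently. The helper names pct_missing and stop_if_over are hypothetical and are not part of salic. The numbers come from the 2008 row of the first example (total 6393, segment sum 6285).

```r
# Assumed definition: percent of the total not covered by the segments.
pct_missing <- function(seg_sum, total) {
  (total - seg_sum) / total * 100
}

# Hypothetical helper: stop with an error when the missing share exceeds
# the threshold, otherwise return the statistic invisibly.
stop_if_over <- function(seg_sum, total, test_threshold = 10) {
  stat <- pct_missing(seg_sum, total)
  if (any(stat > test_threshold)) {
    stop("Missing share exceeds test_threshold of ", test_threshold, "%")
  }
  invisible(stat)
}

# 2008 values from the first example: about 1.7% missing, so the check passes.
stat_2008 <- stop_if_over(seg_sum = 6285, total = 6393)
round(stat_2008, 1)
```

Under this assumed definition, a segment sum of 5000 against the same total would give roughly 22% missing and stop with an error at the default threshold of 10.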

See also

Salic Function Reference: salic

Other dashboard functions: check_threshold, est_churn, est_part, format_result

Examples

library(dplyr)
data(history)
history <- label_categories(history)

# demonstrate the need for scaling
part_total <- est_part(history)
part_segment <- est_part(history, "sex", test_threshold = 40)
left_join(
    select(part_total, year, part_tot = participants),
    group_by(part_segment, year) %>% summarise(part_seg = sum(participants))
)
#> Joining, by = "year"
#> # A tibble: 11 x 3
#>     year part_tot part_seg
#>    <int>    <int>    <int>
#>  1  2008     6393     6285
#>  2  2009     7591     7475
#>  3  2010     7775     7632
#>  4  2011     7819     7679
#>  5  2012     8243     8100
#>  6  2013     8273     8136
#>  7  2014     8970     8815
#>  8  2015     9086     8959
#>  9  2016     9161     9011
#> 10  2017     9389     9253
#> 11  2018     9206     9059

# perform scaling
part_segment <- scaleup_part(part_segment, part_total)
left_join(
    select(part_total, year, part_tot = participants),
    group_by(part_segment, year) %>% summarise(part_seg = sum(participants))
)
#> Joining, by = "year"
#> # A tibble: 11 x 3
#>     year part_tot part_seg
#>    <int>    <int>    <int>
#>  1  2008     6393     6393
#>  2  2009     7591     7591
#>  3  2010     7775     7775
#>  4  2011     7819     7819
#>  5  2012     8243     8243
#>  6  2013     8273     8273
#>  7  2014     8970     8970
#>  8  2015     9086     9086
#>  9  2016     9161     9161
#> 10  2017     9389     9389
#> 11  2018     9206     9206

# new recruits - unscaled
history_new <- filter(history, R3 == "Recruit")
part_total <- est_recruit(history_new, "tot")
part_segment <- est_recruit(history_new, "sex")
part_segment
#> # A tibble: 12 x 3
#>    sex     year recruits
#>    <fct>  <int>    <int>
#>  1 Male    2013     1383
#>  2 Male    2014     1653
#>  3 Male    2015     1675
#>  4 Male    2016     1670
#>  5 Male    2017     1638
#>  6 Male    2018     1460
#>  7 Female  2013      534
#>  8 Female  2014      631
#>  9 Female  2015      704
#> 10 Female  2016      713
#> 11 Female  2017      661
#> 12 Female  2018      586

# new recruits - scaled
scaleup_recruit(part_segment, part_total)
#> # A tibble: 12 x 3
#>    sex     year recruits
#>    <fct>  <int>    <int>
#>  1 Male    2013     1413
#>  2 Male    2014     1686
#>  3 Male    2015     1697
#>  4 Male    2016     1696
#>  5 Male    2017     1658
#>  6 Male    2018     1490
#>  7 Female  2013      546
#>  8 Female  2014      644
#>  9 Female  2015      713
#> 10 Female  2016      724
#> 11 Female  2017      669
#> 12 Female  2018      598