These single check functions are intended to be called from data_check_table().
Each prints a warning message on a failed check. Note that data_allowed_values() is a wrapper for variable_allowed_values().
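The wrapper relationship can be pictured with a minimal base-R sketch. The names `check_one_variable()` and `check_all_variables()` are hypothetical stand-ins, not the package's functions, and the real implementation may differ (for instance in how missing values are treated, which is ignored here as an assumption). The message format simply mirrors the warnings shown in this page's examples.

```r
# Hypothetical stand-in for a per-variable check (not the package's code):
# warn when a column holds non-missing values outside the allowed set.
check_one_variable <- function(x, x_name, allowed) {
  not_allowed <- setdiff(unique(x[!is.na(x)]), allowed)
  if (length(not_allowed) > 0) {
    warning(x_name, ": Contains values that aren't allowed: ",
            paste(not_allowed, collapse = ", "), call. = FALSE)
  }
  invisible(not_allowed)
}

# Hypothetical wrapper: run the per-variable check once for every entry
# in a named list of allowed values.
check_all_variables <- function(df, df_name, allowed_values) {
  for (var in names(allowed_values)) {
    check_one_variable(df[[var]], paste0(df_name, "$", var),
                       allowed_values[[var]])
  }
  invisible(df)
}

# Toy data (made up for illustration, not the package's lic table)
toy <- data.frame(type = c("hunt", "fish", "combo"), duration = c(1, 1, 3))
check_all_variables(toy, "lic", list(type = c("hunt", "fish"), duration = 1))
```

Keeping the per-variable logic in one function and looping over the named list means each variable produces its own warning, which matches the one-warning-per-variable output in the examples.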
data_primary_key(df, df_name, primary_key)
data_required_vars(df, df_name, required_vars, use_error = FALSE)
data_allowed_values(df, df_name, allowed_values)
Argument | Description |
---|---|
df | data frame: table to check |
df_name | character: name of relevant data table ("cust", "lic", or "sale") |
primary_key | character: name of variable that acts as primary key, which should be unique and non-missing. NULL indicates no primary key in table. |
required_vars | character: variables that should be included |
use_error | logical: If TRUE, stop with error instead of producing a warning |
allowed_values | list: named list with allowed values for specific variables |
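To make the argument semantics concrete, here is a rough base-R sketch of how a primary key check and a required-variables check could behave. All function names (`report_failure()`, `sketch_primary_key()`, `sketch_required_vars()`) are hypothetical and the logic is an assumption based on the argument descriptions above, not the package's actual code.

```r
# Hypothetical helper: report a failed check as a warning or, when
# use_error is TRUE, stop with an error instead.
report_failure <- function(msg, use_error = FALSE) {
  if (use_error) stop(msg, call. = FALSE) else warning(msg, call. = FALSE)
}

# Sketch of a primary key check: the key should be unique and non-missing.
# A NULL primary_key means the table has no key, so nothing is checked.
sketch_primary_key <- function(df, df_name, primary_key) {
  if (is.null(primary_key)) return(invisible(df))
  key <- df[[primary_key]]
  if (anyNA(key)) {
    report_failure(paste0(df_name, ": Primary key (", primary_key,
                          ") contains missing values"))
  }
  if (anyDuplicated(key) > 0) {
    report_failure(paste0(df_name, ": Primary key (", primary_key,
                          ") not unique: ", length(unique(key)),
                          " keys and ", nrow(df), " rows"))
  }
  invisible(df)
}

# Sketch of a required-variables check: list every required name that is
# absent from the table.
sketch_required_vars <- function(df, df_name, required_vars,
                                 use_error = FALSE) {
  missing_vars <- setdiff(required_vars, names(df))
  if (length(missing_vars) > 0) {
    report_failure(paste0(df_name, ": ", length(missing_vars),
                          " Missing variable(s): ",
                          paste(missing_vars, collapse = ", ")), use_error)
  }
  invisible(df)
}

# Toy data (made up for illustration)
toy <- data.frame(lic_id = c(1, 2, 2), type = c("hunt", "fish", "fish"))
sketch_primary_key(toy, "lic", "lic_id")
sketch_required_vars(toy, "lic", c("lic_id", "type", "duration"))
```

Passing `use_error = TRUE` in this sketch turns the same message into an error, which may be useful when a missing variable should halt a pipeline rather than be logged.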
Other functions to check data format: data_check_table, data_check, data_foreign_key, variable_allowed_values
library(dplyr)
data(lic)

# primary keys not unique
bind_rows(lic, lic) %>% data_primary_key("lic", "lic_id")
#> Warning: lic: Primary key (lic_id) not unique: 136 keys and 272 rows

# primary keys missing
lic$lic_id[1] <- NA
data_primary_key(lic, "lic", "lic_id")
#> Warning: lic: Primary key (lic_id) contains missing values

# missing required variables
select(lic, -duration) %>% data_required_vars("lic", c("lic_id", "type", "duration"))
#> Warning: lic: 1 Missing variable(s): duration

# includes values that aren't allowed
allowed_values <- list(type = c("hunt", "fish"), duration = 1)
data_allowed_values(lic, "lic", allowed_values)
#> Warning: lic$type: Contains values that aren't allowed: combo
#> Warning: lic$duration: Contains values that aren't allowed: 3, 99