These single-check functions are intended to be called from data_check_table(). Each issues a warning when its check fails; data_required_vars() can stop with an error instead (see use_error). Note that data_allowed_values() is a wrapper for variable_allowed_values().

data_primary_key(df, df_name, primary_key)

data_required_vars(df, df_name, required_vars, use_error = FALSE)

data_allowed_values(df, df_name, allowed_values)

Arguments

df

data frame: table to check

df_name

character: name of relevant data table ("cust", "lic", or "sale")

primary_key

character: name of variable that acts as primary key, which should be unique and non-missing. NULL indicates no primary key in table.

required_vars

character: names of variables that should be present in df

use_error

logical: If TRUE, stop with error instead of producing a warning

allowed_values

list: named list in which each name is a variable in df and each element holds the values allowed for that variable
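To make the argument descriptions concrete, the primary key and required variable checks can be approximated in base R. This is only a sketch under stated assumptions: sketch_primary_key() and sketch_required_vars() are hypothetical stand-ins, not the package's implementation, and the toy data frame is made up for illustration. The message wording imitates the warnings shown under Examples.

```r
# Hypothetical stand-ins for illustration only (not the package's code).

# Warn if the primary key has missing values or is not unique.
sketch_primary_key <- function(df, df_name, primary_key) {
  if (is.null(primary_key)) return(invisible(NULL))  # NULL: no primary key to check
  key <- df[[primary_key]]
  if (anyNA(key)) {
    warning(df_name, ": Primary key (", primary_key, ") contains missing values",
            call. = FALSE)
  }
  n_keys <- length(unique(key))
  if (n_keys < nrow(df)) {
    warning(df_name, ": Primary key (", primary_key, ") not unique: ",
            n_keys, " keys and ", nrow(df), " rows", call. = FALSE)
  }
  invisible(NULL)
}

# Warn (or stop, if use_error = TRUE) when required variables are absent.
sketch_required_vars <- function(df, df_name, required_vars, use_error = FALSE) {
  missing_vars <- setdiff(required_vars, names(df))
  if (length(missing_vars) > 0) {
    msg <- paste0(df_name, ": ", length(missing_vars), " Missing variable(s): ",
                  paste(missing_vars, collapse = ", "))
    if (use_error) stop(msg, call. = FALSE) else warning(msg, call. = FALSE)
  }
  invisible(NULL)
}

# Toy table (made up): lic_id 2 is duplicated and duration is absent.
toy <- data.frame(lic_id = c(1, 2, 2), type = c("hunt", "fish", "fish"))
sketch_primary_key(toy, "lic", "lic_id")
sketch_required_vars(toy, "lic", c("lic_id", "type", "duration"))
```

Both calls on the toy table raise a warning. Passing use_error = TRUE to sketch_required_vars() turns the second warning into an error, which is useful when later steps cannot proceed without the missing variable.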

See also

Other functions to check data format: data_check_table, data_check, data_foreign_key, variable_allowed_values

Examples

library(dplyr)
data(lic)

# primary keys not unique
bind_rows(lic, lic) %>% data_primary_key("lic", "lic_id")
#> Warning: lic: Primary key (lic_id) not unique: 136 keys and 272 rows

# primary keys missing
lic$lic_id[1] <- NA
data_primary_key(lic, "lic", "lic_id")
#> Warning: lic: Primary key (lic_id) contains missing values

# missing required variables
select(lic, -duration) %>% data_required_vars("lic", c("lic_id", "type", "duration"))
#> Warning: lic: 1 Missing variable(s): duration

# includes values that aren't allowed
allowed_values <- list(type = c("hunt", "fish"), duration = 1)
data_allowed_values(lic, "lic", allowed_values)
#> Warning: lic$type: Contains values that aren't allowed: combo
#> Warning: lic$duration: Contains values that aren't allowed: 3, 99
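
The allowed-values check can be approximated in base R by looping over the named list and comparing each variable's distinct values against its allowed set. This is a minimal sketch, not the package's code: sketch_allowed_values() is a hypothetical name, the toy data frame is made up, and skipping variables that are absent from the table is an assumption about the intended behavior.

```r
# Hypothetical stand-in for illustration only (not the package's code).
sketch_allowed_values <- function(df, df_name, allowed_values) {
  for (var in names(allowed_values)) {
    if (!var %in% names(df)) next  # assumption: ignore variables not in the table
    # Distinct values in the column that are not in the allowed set
    # (a missing value would be reported too, unless NA is listed as allowed)
    not_allowed <- setdiff(unique(df[[var]]), allowed_values[[var]])
    if (length(not_allowed) > 0) {
      warning(df_name, "$", var, ": Contains values that aren't allowed: ",
              paste(not_allowed, collapse = ", "), call. = FALSE)
    }
  }
  invisible(NULL)
}

# Toy table (made up): "combo" and 3 fall outside the allowed sets.
toy <- data.frame(type = c("hunt", "combo"), duration = c(1, 3))
sketch_allowed_values(toy, "lic", list(type = c("hunt", "fish"), duration = 1))
```

Each variable with offending values produces its own warning, so a single call reports every problem column rather than stopping at the first.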