The $preprocess() method runs the preprocessing pipeline that includes data standardization, filtering, imputation, and aggregation. See the More on data preparation vignette for more information about data processing. For usage examples, refer to the More examples of R6 classes vignette.
Usage
preprocess(
  data,
  is_timevar = FALSE,
  is_aggregated = FALSE,
  special_case = NULL,
  family = NULL,
  time_freq = NULL,
  freq_threshold = 0
)
Arguments
- data
An object of class data.frame (or one that can be coerced to that class) that satisfies the requirements specified in the More on data preparation vignette.
- is_timevar
Logical indicating whether the data contains time-varying components.
- is_aggregated
Logical indicating whether the data is already aggregated.
- special_case
Character string specifying special case handling. Options are NULL (the default), "covid", and "poll".
- family
Character string specifying the distribution family for the outcome variable. Options are "binomial" for binary outcome measures and "normal" for continuous outcome measures.
- time_freq
Character string specifying the time indexing frequency or time length for grouping dates (YYYY-MM-DD) in the data. Options are NULL (the default), "week", "month", and "year". This parameter must be NULL for cross-sectional data or time-varying data that already has time indices.
- freq_threshold
Numeric value specifying the minimum frequency threshold for including observations. Any row containing a value whose frequency falls below this threshold is removed. The default value is 0 (no filtering).
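
To make the time_freq and freq_threshold arguments more concrete, the base-R sketch below illustrates the two ideas they describe: grouping dates into time indices and dropping rows whose values occur too rarely. It is only an illustration under stated assumptions, not the package's implementation. The helper names group_dates() and filter_by_frequency(), the example data, and the column names date, county, and positive are all hypothetical, and the actual pipeline may apply these steps differently.

```r
# Illustrative sketch only; NOT the implementation behind $preprocess().
# Hypothetical example data with hypothetical column names.
dat <- data.frame(
  date = as.Date(c("2024-01-01", "2024-01-03", "2024-01-09",
                   "2024-02-15", "2024-02-16")),
  county = c("A", "A", "A", "B", "C"),
  positive = c(1, 0, 1, 1, 0)
)

# Hypothetical helper: map dates to integer time indices, in the spirit of
# time_freq = "week", "month", or "year". cut() on Date objects bins the
# dates into consecutive intervals; the factor codes serve as indices.
group_dates <- function(dates, time_freq = c("week", "month", "year")) {
  time_freq <- match.arg(time_freq)
  as.integer(cut(dates, breaks = time_freq))
}

# Hypothetical helper: keep only rows whose value in `column` occurs at
# least `freq_threshold` times. A threshold of 0 keeps every row.
filter_by_frequency <- function(data, column, freq_threshold = 0) {
  counts <- table(data[[column]])
  keep_values <- names(counts)[counts >= freq_threshold]
  data[as.character(data[[column]]) %in% keep_values, , drop = FALSE]
}

# Monthly time indices for each row.
dat$time <- group_dates(dat$date, "month")

# Drop rows whose county appears fewer than 2 times.
filtered <- filter_by_frequency(dat, "county", freq_threshold = 2)
```

With the default freq_threshold of 0, the filter in this sketch leaves the data unchanged, which matches the "no filtering" behavior described for the default.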