
03: The CAR Models
Overview
The *car() functions (rcar(),
car(), mcar(), mstcar()) build
the components needed to run RSTr and place them in the
specified directory. In this vignette, we will discuss each model in
detail, along with the arguments used in the *car() functions and how to
use them.
The *car() functions
The current version of RSTr features four models to choose from: the Besag-York-Mollié (BYM) CAR model (also known as just the CAR model), the Restricted CAR (RCAR) model, the Multivariate CAR (MCAR) model, and the Multivariate Spatiotemporal CAR (MSTCAR) model. We will now compare these models and their use cases.
The CAR model
The CAR model (function call car()) is the basis of all
models used in RSTr. The premise of the CAR is to spatially smooth
estimates across spatial regions using a random effects estimator
Z. The intensity of its smoothing in a region is based on
the event and population counts of that region and the counts in its
neighboring regions.
The CAR model only smooths across spatial regions and not across
sociodemographic groups or time periods. While datasets that include
these stratifications can be run with car(), the user would
be effectively running several concurrent CAR models. The CAR model is
recommended if the user only has data for one sociodemographic group
of interest and one time period.
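
To build intuition for what it means to borrow information from neighboring regions, here is a minimal base R sketch. It is not the CAR estimator used by RSTr: it simply pools each region's counts with those of its neighbors, and the toy counts and the adjacency list below are made up for illustration.

```r
# Toy data: four regions on a line (1-2-3-4), one group, one time period.
Y <- c(2, 30, 25, 1)            # event counts (illustrative)
n <- c(100, 1000, 1000, 100)    # population counts (illustrative)
adjacency <- list(2, c(1, 3), c(2, 4), 3)  # neighbors of each region

# Crude rate: each region stands alone.
crude_rate <- Y / n

# Pooled rate: each region's counts are combined with its neighbors' counts.
pooled_rate <- sapply(seq_along(Y), function(i) {
  idx <- c(i, adjacency[[i]])
  sum(Y[idx]) / sum(n[idx])
})

round(cbind(crude_rate, pooled_rate), 4)
```

The small regions at either end move the most, because their own counts carry little information compared with their larger neighbors.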
The RCAR model
The RCAR model (function call rcar()) is the most recent
BYM implementation in RSTr. The RCAR model follows the same general
paradigm as the CAR model, but prevents oversmoothing by capping the
spatial and non-spatial variance. Even though the RCAR only smooths
across spatial regions, the estimates generated by rcar()
are nuanced and strike a happy medium between the use of crude rate
estimates and the oversmoothing of the standard CAR model.
The MCAR model
The MCAR model (function call mcar()) is an extension of
the CAR model: whereas the CAR model can only smooth over spatial
regions, the MCAR model can smooth over spatial regions and
sociodemographic groups. The MCAR model is ideal for datasets that
include multiple sociodemographic groups. A restricted MCAR (RMCAR)
model is currently under development and will be implemented in RSTr
once its methodology is finalized.
The MSTCAR model
The MSTCAR model (function call mstcar()) is an
extension of the MCAR model, allowing for smoothing over spatial
regions, sociodemographic groups, and time periods. The MSTCAR model is
ideal for investigating trends in rate estimates over a specified time
period.
Arguments of the *car() function
All *car() functions provide several arguments:
- name: The name of the folder your model information lives in;
- data: The list object containing the event Y and population n data. For more information on data setup, read vignette("RSTr-event");
- adjacency: The adjacency structure for your event and population data. For more information on adjacency structure setup, read vignette("RSTr-adjacency");
- dir: The directory where the model folder lives. By default, this saves into your temporary directory, so the model information will be lost after the R session ends. Should you want to save your model to be analyzed at a later date or ensure that your samples are intact if R crashes during runtime, specify a different directory;
- seed: Allows the user to specify the random seed used for replication purposes;
- perc_ci: A number between 0 and 1 which specifies the desired credible interval to use when calculating the relative precision of estimates. By default, set to 0.95;
- iterations: The number of iterations to run the model for;
- show_plots: If set to FALSE, hides traceplots during model execution;
- verbose: If set to FALSE, hides the progress bar and messages in the console;
- ignore_checks: If set to TRUE, skips model validation;
- method: Chooses whether the event data is either Binomial ("binomial") or Poisson ("poisson") distributed. By default, RSTr uses Binomial updates for event data;
- impute_lb: Specifies a lower bound for imputed data for event information that is missing or suppressed;
- impute_ub: Specifies an upper bound for imputed data for event information that is missing or suppressed;
- inits: This is a list of initial values for each parameter. This can be specified by the user or generated by default. For more information on specification of inits, see vignette("RSTr-initialvalues"); and
- priors: This is a list of all prior information for each parameter. This can be specified by the user or generated by default. For more information on specification of priors, see vignette("RSTr-priors").
Most of these arguments are optional, as the model has defaults for
many of them. rcar() and mstcar() have
additional arguments used only by them:
- A: In the RCAR model, describes the limit of the smoothing intensity between regions;
- m0: In the RCAR model, specifies the baseline neighbor count; and
- update_rho: In the MSTCAR model, allows for updates of the temporal correlation parameter rho. By default, RSTr does not update rho.
If you run into errors when trying to initialize your model, read
vignette("RSTr-troubleshoot"). Below, we will go into
detail regarding what each argument does specifically and what to keep
in mind when setting these values.
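
As a rough sketch of how the arguments listed above fit together, the base R snippet below assembles an argument list for car(). The argument names come from this vignette, but the toy inputs, the folder name, and all values are hypothetical, and the exact shape RSTr expects for data and adjacency is an assumption here (see vignette("RSTr-event") and vignette("RSTr-adjacency") for the real requirements). The call itself is left commented out so the snippet runs without RSTr installed.

```r
# Hypothetical toy inputs; real inputs should follow the data and
# adjacency vignettes referenced above.
toy_data <- list(
  Y = c(2, 30, 25, 1),         # event counts per region
  n = c(100, 1000, 1000, 100)  # population counts per region
)
toy_adjacency <- list(2, c(1, 3), c(2, 4), 3)

# Argument names are taken from this vignette; values are illustrative only.
car_args <- list(
  name = "toy_car",
  data = toy_data,
  adjacency = toy_adjacency,
  dir = file.path(tempdir(), "rstr_models"),  # swap for a permanent path to keep results
  seed = 1234,
  perc_ci = 0.95,
  method = "binomial",
  show_plots = FALSE,
  verbose = FALSE
)

# Inspect the assembled arguments.
str(car_args, max.level = 1)

# With RSTr installed and inputs in the format the package expects,
# the call would look something like:
# do.call(RSTr::car, car_args)
```

Keeping the arguments in a named list makes it easy to reuse the same settings across car(), rcar(), mcar(), and mstcar() and to change only the arguments that differ.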
The inits argument
inits is a list specifying the starting
values for parameters in the model. Details around the initial value
parameters can be found in
vignette("RSTr-initialvalues").
The priors argument
priors behaves similarly to inits, except
that it contains all information related to parameter priors. Details
around the prior parameters can be found in
vignette("RSTr-priors").
The method argument
method offers two values: "binomial" and
"poisson". These values determine how the data is
transformed and how the lambda Metropolis update is
performed: "binomial" treats the event data as
Binomial-distributed and "poisson" treats the event data as
Poisson-distributed. Which one to choose depends on your use case: if you
are working with very small mortality rates, "poisson" will work well,
whereas if you are working with birth rates, "binomial" will
work better. Note that "binomial" works in most general use
cases, while "poisson" only works well for datasets with small
rates, under approximately 1%.
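
One way to see why the Poisson option is tied to small rates: for a population of size n and rate p, the Binomial variance is n * p * (1 - p) while the Poisson variance is n * p, so the two agree only when p is close to zero. The base R sketch below illustrates this. The helper poisson_reasonable() is hypothetical (it is not part of RSTr) and simply encodes the approximate 1% rule of thumb mentioned above.

```r
# Ratio of Binomial variance n*p*(1-p) to Poisson variance n*p; equals 1 - p.
variance_ratio <- function(p, n = 10000) {
  (n * p * (1 - p)) / (n * p)
}

variance_ratio(0.005)  # small rate: ratio is close to 1
variance_ratio(0.30)   # large rate: ratio is well below 1

# Hypothetical helper: is the overall rate under the ~1% rule of thumb?
poisson_reasonable <- function(Y, n, threshold = 0.01) {
  sum(Y) / sum(n) < threshold
}

# Illustrative counts only.
rare   <- poisson_reasonable(Y = c(3, 5, 2), n = c(1000, 2000, 1500))
common <- poisson_reasonable(Y = c(120, 300, 90), n = c(1000, 2000, 1500))

# "binomial" remains a safe choice in both cases; "poisson" only in the first.
method_rare   <- if (rare) "poisson" else "binomial"
method_common <- if (common) "poisson" else "binomial"
c(method_rare, method_common)
```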
m0 and A
m0 and A are two components that determine
the intensity of the smoothing of Restricted CAR models. m0
should be a positive scalar, and the size of A is dependent
on the group/time structure of your data: A will be a
positive scalar for region-only models, a vector of size
n_group for region-group models, and a matrix of size
n_group x n_time for region-group-time models.
Note, however, that these informativeness restriction measures are
currently only developed for the CAR model, and restrictions for more
complex models will be added to the RSTr package as their respective
methods are developed.
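
The shapes described above can be sketched in base R as follows. The dimensions n_group and n_time and the numeric values are placeholders chosen for illustration, not recommended settings or package defaults.

```r
n_group <- 3  # hypothetical number of sociodemographic groups
n_time  <- 5  # hypothetical number of time periods
A_value <- 6  # placeholder value, not a recommendation

m0 <- 3       # positive scalar (placeholder value)

# Region-only model: A is a positive scalar.
A_region <- A_value

# Region-group model: A is a vector of length n_group.
A_region_group <- rep(A_value, n_group)

# Region-group-time model: A is an n_group x n_time matrix.
A_region_group_time <- matrix(A_value, nrow = n_group, ncol = n_time)

length(A_region)
length(A_region_group)
dim(A_region_group_time)
```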
The update_rho argument
In the MSTCAR model, update_rho is a
logical that specifies whether to calculate estimates for
the temporal correlation rho. By default, it is set to
FALSE: in empirical testing, results were found not to be very
sensitive to rho when its value is specified prudently, and updating rho
increases runtime by an order of magnitude due to its complexity.
The seed argument
Because of the stochastic nature of Bayesian inference and the
inherent instability of the MSTCAR model, replicability is extremely
important. seed allows the user to specify a random seed so that
estimates can be replicated across runs.
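
The general idea behind seeding can be shown with base R's own random number generator. This sketch does not call RSTr and makes no claim about how the seed argument is handled internally; it only demonstrates that fixing a seed makes random draws repeatable.

```r
# Same seed, same draws.
set.seed(1234)
first_run <- rnorm(5)

set.seed(1234)
second_run <- rnorm(5)

identical(first_run, second_run)  # TRUE

# A different seed gives a different sequence of draws.
set.seed(4321)
third_run <- rnorm(5)

identical(first_run, third_run)
```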
The ignore_checks argument
As development continues on RSTr, there are occasions where the
checks performed on the inputs of *car() throw an error,
even though you may be certain that all of your inputs are behaving as
expected. To override the checks, you can use the
ignore_checks argument. Set this to TRUE to
skip this step.
Closing Thoughts
Initialization is one of the most important steps of running the
model, as it’s where virtually all choices regarding the model are made.
In this vignette, we explored each available type of CAR model in RSTr,
the arguments of the *car() functions, and how to
appropriately choose values for each argument.