Overview

shinymrp is a two-in-one package that offers both a graphical and a programmatic interface for applications of Multilevel Regression and Post-stratification (MRP). The two interfaces parallel each other: both implement the same data analysis workflow that underlies MRP.

We recommend the Shiny app (GUI) for new users. You can launch it by running:

shinymrp::run_app()

For experienced users, the programmatic interface (API) trades the GUI's visual aids for more flexibility. Users can lay out the entire workflow in an R script and run it from end to end in a single execution. This matters because fitting complex statistical models to large datasets can be memory-intensive: the API lets users run the workflow on high-performance computing clusters, which often offer limited support for graphics.
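As a rough illustration of running a workflow script without a graphical session, the sketch below wraps `Rscript` in a small shell helper that writes all output to a log file. The function names `log_path` and `run_headless`, the file name `mrp_analysis.R`, and the log naming scheme are assumptions made for this example and are not part of shinymrp.

```shell
#!/bin/sh
# log_path SCRIPT: derive a log file name from an R script name,
# e.g. mrp_analysis.R -> mrp_analysis.log (hypothetical naming scheme).
log_path() {
  echo "${1%.R}.log"
}

# run_headless SCRIPT: run an R script non-interactively and capture
# both standard output and standard error in the derived log file.
run_headless() {
  if ! command -v Rscript >/dev/null 2>&1; then
    echo "Rscript not found on PATH" >&2
    return 127
  fi
  Rscript "$1" > "$(log_path "$1")" 2>&1
}

# Usage (mrp_analysis.R is a placeholder for your own workflow script):
#   run_headless mrp_analysis.R
```

On a cluster, a helper like this would typically be called from whatever job script the scheduler expects; those details depend on the site and are left out here.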

An example of the API is shown below. For a more detailed walkthrough, see the workflow vignette.

Installation

Prerequisites: C++ Toolchain

shinymrp requires a modern C++ compiler and the GNU Make build utility (a.k.a. “gmake”) for compiling Stan programs. The installation steps vary by operating system.

Linux

On most systems, GNU Make is pre-installed and is the default make utility. A C++ compiler is usually pre-installed as well; however, it may not be recent enough. To check your machine, run the commands:

g++ --version
make --version

If g++ is version 4.9.3 or later and make is version 3.81 or later, no additional installation is necessary. It may still be desirable to update the C++ compiler g++ because later versions are faster.
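The two checks above can also be scripted. The following is a minimal sketch, not an official installer check: it assumes `sort -V` from GNU coreutils is available (true on most Linux distributions), and the helper names `version_ge` and `check_tool` are made up for this example. It uses the `g++ -dumpfullversion -dumpversion` idiom to obtain a full version string and compares the result against the minimum versions stated above.

```shell
#!/bin/sh
# Minimum versions quoted in the text above.
MIN_GXX="4.9.3"
MIN_MAKE="3.81"

# version_ge A B: succeeds when version A >= version B.
# Hypothetical helper; relies on `sort -V` (GNU coreutils).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_tool NAME FOUND_VERSION MIN_VERSION: print a one-line verdict.
check_tool() {
  if [ -z "$2" ]; then
    echo "$1: not found"
  elif version_ge "$2" "$3"; then
    echo "$1 $2: OK (>= $3)"
  else
    echo "$1 $2: too old (need >= $3)"
  fi
}

gxx_version=""
make_version=""
if command -v g++ >/dev/null 2>&1; then
  gxx_version="$(g++ -dumpfullversion -dumpversion)"
fi
if command -v make >/dev/null 2>&1; then
  # The first line looks like "GNU Make 4.3"; the third field is the version.
  make_version="$(make --version | head -n 1 | awk '{print $3}')"
fi

check_tool "g++" "$gxx_version" "$MIN_GXX"
check_tool "make" "$make_version" "$MIN_MAKE"
```

If either tool is reported as missing or too old, install or update the toolchain before compiling Stan programs.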

On Debian- and Ubuntu-based distributions, a modern C++ compiler and GNU Make are bundled in the meta-package build-essential, which can be installed via the command:

sudo apt-get install build-essential

# then rerun checks
g++ --version
make --version

Mac

On Mac, the C++ compiler and GNU Make are included with Xcode, the Apple toolset for software developers. To check if you have the Clang C++ compiler:

clang --version

If this command fails, install the Xcode command line tools via the following command:

xcode-select --install

Windows

For Windows, Rtools is a bundle that includes the C++ toolchain needed for compiling Stan programs. Install the Rtools version that matches the version of R on your machine.

shinymrp Workflow

shinymrp is designed to streamline the data analysis workflow common in applications of Multilevel Regression and Post-stratification. The package handles routine but essential tasks such as data cleaning and generating visualizations of descriptive statistics, allowing users to focus on model building.

We chose an object-oriented design for the programmatic interface rather than the functional programming approach more familiar to R users. This design encapsulates data that needs to be used throughout the entire workflow, eliminating the need for users to manually track multiple objects—similar to how the graphical user interface operates.

Below is an annotated example of the general workflow. For more details about each step of the process, refer to the workflow vignette.

library(shinymrp)

# Load example data
sample_data <- example_sample_data()

# Initialize the MRP workflow
workflow <- mrp_workflow()


### DATA PREPARATION

# Preprocess sample data
workflow$preprocess(
  sample_data,
  is_timevar = TRUE,
  is_aggregated = FALSE,
  special_case = NULL,
  family = "binomial"
)

# Link data to the ACS
# and obtain poststratification data
workflow$link_acs(
  link_geo = "zip",
  acs_year = 2021
)


### DESCRIPTIVE STATISTICS

# Visualize demographic distribution of data
workflow$demo_bars(demo = "sex")

# Visualize geographic distribution of data
workflow$sample_size_map()

# Visualize outcome measure
workflow$outcome_plot()


### MODEL BUILDING

# Create a new model object
model <- workflow$create_model(
  model_spec = list(
    Intercept = list(
      Intercept = ""
    ),
    fixed = list(
      sex = "",
      race = ""
    ),
    varying = list(
      age = "",
      time = ""
    )
  )
)

# Run MCMC
model$fit(n_iter = 1000, n_chains = 2, seed = 123)

# Estimates summary and diagnostics
model$summary()

# Sampling diagnostics
model$diagnostics()

# Posterior predictive check
workflow$pp_check(model)


### VISUALIZE RESULTS

# Overall estimates and estimates for demographic groups
workflow$estimate_plot(model, group = "sex")

# Geographic estimates
workflow$estimate_map(model, geo = "county")