The goal of nestedmodels is to make it possible to model nested data. Some models only accept certain predictors, and for panel data it is often desirable to fit a separate model to each panel. nestedmodels extends the ‘tidymodels’ set of packages by letting the user mark a model as ‘nested’.

Installation

# Install the released version on CRAN
install.packages("nestedmodels")

# Or install the development version from GitHub:
# install.packages("devtools")
devtools::install_github("ashbythorpe/nestedmodels")

Example

Nested models are often best used on panel data.

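library(nestedmodels)
library(parsnip) # provides fit() and the %>% pipe used below

data <- example_nested_data

# Nest every column except `id`, giving one row (and one tibble) per panel
nested_data <- tidyr::nest(data, data = -id)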

nested_data
#> # A tibble: 20 × 2
#>       id data             
#>    <int> <list>           
#>  1     1 <tibble [50 × 6]>
#>  2     2 <tibble [50 × 6]>
#>  3     3 <tibble [50 × 6]>
#>  4     4 <tibble [50 × 6]>
#>  5     5 <tibble [50 × 6]>
#>  6     6 <tibble [50 × 6]>
#>  7     7 <tibble [50 × 6]>
#>  8     8 <tibble [50 × 6]>
#>  9     9 <tibble [50 × 6]>
#> 10    10 <tibble [50 × 6]>
#> 11    11 <tibble [50 × 6]>
#> 12    12 <tibble [50 × 6]>
#> 13    13 <tibble [50 × 6]>
#> 14    14 <tibble [50 × 6]>
#> 15    15 <tibble [50 × 6]>
#> 16    16 <tibble [50 × 6]>
#> 17    17 <tibble [50 × 6]>
#> 18    18 <tibble [50 × 6]>
#> 19    19 <tibble [50 × 6]>
#> 20    20 <tibble [50 × 6]>

The nested_resamples() function makes sure that both the training and the testing data contain every unique value of ‘id’, so that each panel's model can be fitted and then evaluated.

split <- nested_resamples(nested_data, rsample::initial_split())

data_tr <- rsample::training(split)
data_tst <- rsample::testing(split)
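
To illustrate why splitting within each panel matters, here is a minimal base R sketch of the idea. It is not how nested_resamples() is implemented; the toy data, the split_one() helper, and the 75% proportion are assumptions made for this example only.

```r
# Toy panel data: 3 panels ("id") with 10 rows each (hypothetical data).
set.seed(1)
toy <- data.frame(
  id = rep(1:3, each = 10),
  x  = rnorm(30)
)

# Hypothetical helper: split a single panel so that roughly `prop` of its
# rows go to the training set and the rest go to the testing set.
split_one <- function(panel, prop = 0.75) {
  n <- nrow(panel)
  in_train <- seq_len(n) %in% sample(n, floor(prop * n))
  list(train = panel[in_train, ], test = panel[!in_train, ])
}

# Split every panel separately, then stack the pieces back together.
pieces  <- lapply(split(toy, toy$id), split_one)
toy_tr  <- do.call(rbind, lapply(pieces, `[[`, "train"))
toy_tst <- do.call(rbind, lapply(pieces, `[[`, "test"))

# Both sets contain every unique value of `id`.
setequal(toy_tr$id, toy$id)
setequal(toy_tst$id, toy$id)
```

A plain random split over all rows gives no such guarantee: a small panel could end up entirely in one set, leaving its model with nothing to fit or nothing to be tested on.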

To fit a nested model, wrap a parsnip model specification in nested() and fit it to the nested training data.

model <- parsnip::linear_reg() %>%
  nested()

fit <- fit(model, z ~ x + y + a + b, 
           tidyr::nest(data_tr, data = -id))

predict(fit, data_tst)
#> # A tibble: 260 × 1
#>    .pred
#>    <dbl>
#>  1  35.0
#>  2  27.7
#>  3  35.0
#>  4  39.4
#>  5  30.4
#>  6  29.5
#>  7  33.8
#>  8  33.1
#>  9  26.3
#> 10  18.9
#> # ℹ 250 more rows
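
Conceptually, a nested model amounts to one model per panel, with each new row predicted by the model for its own panel. The base R sketch below shows that idea on made-up data; it is a rough analogue under stated assumptions (toy data, a single predictor x, plain lm()), not the package's actual implementation.

```r
# Toy data: two panels whose relationship between x and z differs
# (slope 2 in panel 1, slope -3 in panel 2). Hypothetical data.
set.seed(42)
toy <- data.frame(
  id = rep(1:2, each = 20),
  x  = rnorm(40)
)
toy$z <- ifelse(toy$id == 1, 2, -3) * toy$x + rnorm(40, sd = 0.1)

# Fit one linear model per panel; the list is named by panel id.
fits <- lapply(split(toy, toy$id), function(panel) lm(z ~ x, data = panel))

# Predict each new row with the model that belongs to its panel.
new_data <- data.frame(id = c(1, 2), x = c(1, 1))
preds <- vapply(
  seq_len(nrow(new_data)),
  function(i) {
    panel_fit <- fits[[as.character(new_data$id[i])]]
    unname(predict(panel_fit, new_data[i, , drop = FALSE]))
  },
  numeric(1)
)

preds
```

The two predictions differ sharply even though both rows have x = 1, because each one comes from a different panel's model. A single pooled model would have to compromise between the two slopes.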

If you don’t want to nest your data manually, use step_nest() inside a workflow:

recipe <- recipes::recipe(data_tr, z ~ x + y + a + b + id) %>%
  step_nest(id)

wf <- workflows::workflow() %>%
  workflows::add_model(model) %>%
  workflows::add_recipe(recipe)

wf_fit <- fit(wf, data_tr)

predict(wf_fit, data_tst)
#> # A tibble: 260 × 1
#>    .pred
#>    <dbl>
#>  1  35.0
#>  2  27.7
#>  3  35.0
#>  4  39.4
#>  5  30.4
#>  6  29.5
#>  7  33.8
#>  8  33.1
#>  9  26.3
#> 10  18.9
#> # ℹ 250 more rows

Please note that the nestedmodels project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.