Stepwise covariate selection

The goal of NMproject’s covariate modelling functions is to provide a stepwise covariate method with manual decision making. This important to ensure that the full model selection/evaluation criteria (should be defined in statistical analysis plans) can be applied at every step rather than just log likelihood ratio testing, where the most significant model may be unstable, may worsen model predictions or may only be slightly more significant than a more physiologically plausible covariate relationship.

The functions test_relations(), covariate_step_tibble(), bind_covariate_results() together comprise NMproject stepwise covariate method with manual decision. The goal is to be part way between PsN’s SCM and completely manual process at each forward and backward elimination step. The syntax of how covariates are included is the same as PsN’s SCM routine - See PsN documentation for more information.

Relationships to tests are defined using the test_relations() function. Below we test BWT and AGE on all parameters in both linear and power fashion and all parameters are test with the categorical covariate SEX.

dtest <- test_relations(param = c("KA", "K", "V"),
                        cov = c("BWT", "AGE"),
                        state = c("linear", "power"),
                        continuous = TRUE)
dtest <- dtest %>% test_relations(param = c("KA", "K", "V"),
                                  cov = "SEX",
                                  state = "linear",
                                  continuous = FALSE)

#> # A tibble: 15 × 4
#>    param cov   state  continuous
#>    <chr> <chr> <chr>  <lgl>     
#>  1 KA    BWT   linear TRUE      
#>  2 K     BWT   linear TRUE      
#>  3 V     BWT   linear TRUE      
#>  4 KA    AGE   linear TRUE      
#>  5 K     AGE   linear TRUE      
#>  6 V     AGE   linear TRUE      
#>  7 KA    BWT   power  TRUE      
#>  8 K     BWT   power  TRUE      
#>  9 V     BWT   power  TRUE      
#> 10 KA    AGE   power  TRUE      
#> 11 K     AGE   power  TRUE      
#> 12 V     AGE   power  TRUE      
#> 13 KA    SEX   linear FALSE     
#> 14 K     SEX   linear FALSE     
#> 15 V     SEX   linear FALSE

The workflow for forward (and backward) steps then becomes

  1. create a tibble of runs for the next step with covariate_step_tibble()
  2. run all with run_nm()
  3. collect results with bind_covariate_results()
  4. evaluate top performing runs
  5. Either select a run to take to the next step (go back to step 1) OR stop.

Here are steps 1-3 in action:

## create tibble of covariate step with model objects as column m
dsm1 <- m1 %>% covariate_step_tibble(
  run_id = "m1_f1",
  dtest = dtest,
  direction = "forward"

## run all models greedily and wait for them to finish
dsm1$m %>% run_nm() %>% wait_finish()

## extract results and put into tibble
dsm1 <- dsm1 %>% bind_covariate_results()

Here are steps 4-5: evaluate and select

## sort by BIC (for example) and view
dsm1 <- dsm1 %>% arrange(BIC)

## check condition number, covariance,...

## run diagnostic reports on the top three
dsm1$m[1:3] %>% nm_render("Scripts/basic_gof.Rmd")
## diagnostic reports will not be in "Results" like other runs
## to see the location, look at the "results_dir" field of the object

## In this case we selec the first: dsm1$m[1]
m1_f1 <- dsm1$m[1]   ## select most significant BIC

Do next forward step

dsm2 <- m1_f1 %>% covariate_step_tibble(
  run_id = "m1_f2",
  dtest = dtest,
  direction = "forward"

continue for as many steps as needed.

Covariate modelling PsN