Advanced Topics • NMproject

Vectorisation

The basic nm object is fully vectorised. To see this compare the following two outputs


m1
#> List of 1
#>  $ execute.Models/m1
#>   ..$ type        : chr "execute"
#>   ..$ run_id      : chr "m1"
#>   ..$ run_in      : chr "Models"
#>   ..$ executed    : logi TRUE
#>   ..$ ctl_contents: chr collapsed - view with ctl_contents()
#>   ..$ ctl_orig    : chr collapsed - view with ctl_orig()
#>   ..$ data_path   : chr "DerivedData/data.csv"
#>   ..$ cmd         : chr "execute runm1.mod -dir=m1"
#>   ..$ cores       : int 1
#>   ..$ parafile    : chr "path/to/parafile.pnm"
#>   ..$ run_dir     : chr "m1"
#>   ..$ ctl_name    : chr "runm1.mod"
#>   ..$ results_dir : chr "Results"
#>   ..$ unique_id   : chr "execute.Models/m1"
#>   ..$ lst_path    : chr "m1/NM_run1/psn.lst"
#>   ..- attr(*, "class")= chr [1:3] "nm_execute" "nm_generic" "list"
#>  - attr(*, "class")= chr [1:2] "nm_list" "list"
c(m1, m2)
#> List of 2
#>  $ execute.Models/m1
#>   ..$ type        : chr "execute"
#>   ..$ run_id      : chr "m1"
#>   ..$ run_in      : chr "Models"
#>   ..$ executed    : logi TRUE
#>   ..$ ctl_contents: chr collapsed - view with ctl_contents()
#>   ..$ ctl_orig    : chr collapsed - view with ctl_orig()
#>   ..$ data_path   : chr "DerivedData/data.csv"
#>   ..$ cmd         : chr "execute runm1.mod -dir=m1"
#>   ..$ cores       : int 1
#>   ..$ parafile    : chr "path/to/parafile.pnm"
#>   ..$ run_dir     : chr "m1"
#>   ..$ ctl_name    : chr "runm1.mod"
#>   ..$ results_dir : chr "Results"
#>   ..$ unique_id   : chr "execute.Models/m1"
#>   ..$ lst_path    : chr "m1/NM_run1/psn.lst"
#>   ..- attr(*, "class")= chr [1:3] "nm_execute" "nm_generic" "list"
#>  $ execute.Models/m2
#>   ..$ type              : chr "execute"
#>   ..$ run_id            : chr "m2"
#>   ..$ run_in            : chr "Models"
#>   ..$ parent_run_id     : chr "m1"
#>   ..$ parent_run_in     : chr "Models"
#>   ..$ parent_ctl_name   : chr "runm1.mod"
#>   ..$ parent_results_dir: chr "Results"
#>   ..$ executed          : logi TRUE
#>   ..$ ctl_contents      : chr collapsed - view with ctl_contents()
#>   ..$ ctl_orig          : chr collapsed - view with ctl_orig()
#>   ..$ data_path         : chr "DerivedData/data.csv"
#>   ..$ cmd               : chr "execute runm2.mod -dir=m2"
#>   ..$ cores             : int 1
#>   ..$ parafile          : chr "path/to/parafile.pnm"
#>   ..$ run_dir           : chr "m2"
#>   ..$ ctl_name          : chr "runm2.mod"
#>   ..$ results_dir       : chr "Results"
#>   ..$ unique_id         : chr "execute.Models/m2"
#>   ..$ lst_path          : chr "m2/NM_run1/psn.lst"
#>   ..- attr(*, "class")= chr [1:3] "nm_execute" "nm_generic" "list"
#>  - attr(*, "class")= chr [1:2] "nm_list" "list"

Both are nm objects with length 1 and 2, respectively. We used the c(m1, m2) above when we wanted to use nm_render() on both runs. We didn’t need to write any loops or special code to handle this because nm objects and the functions that operate on them have been designed with parallelisation in mind. This is because in pharmacometrics we are often dealing with multiple models, perhaps moreso than other statistical modelling disciplines.

The vectorised nature of the nm object allows groups of runs to be created and run. To demonstrate, lets repeat the the previous initial perturbation exercise, but create 5 runs each with their own perturbed initial estimates.

The basic idea is to start with the parent object m1 and then supplying a vector rather than a scalar to the child() function. Since the rep column is length 5, this will make the nm_object length 5.

m1rep <- m1 %>% ## start with parent - m1 is length 1
  child(run_id = 1:5) %>% ## now object is length = length(1:5) = 5
  init_theta(init = rnorm(init, mean = init, sd = 0.3)) %>% ## vectorised operation
  run_in("Models/m1rep") %>% ## for tidiness, run these in their own sub-directory
  run_nm() ## run all 5 runs

Notice how the functions init_theta(), run_in() and run_nm() all worked on the nm object vector without needing loops.

To see all the $THETAS we can just run

m1rep %>% dollar("THETA")
#> $`execute.Models/m1rep/1`
#>   1| $THETA
#>   2| -0.076487   ; KA ; h-1 ; LOG
#>   3| -1.8508   ; K ; h-1 ; LOG
#>   4| -0.2257   ; V ; L ; LOG
#> 
#> $`execute.Models/m1rep/2`
#>   1| $THETA
#>   2| 0.43859   ; KA ; h-1 ; LOG
#>   3| -2.9739   ; K ; h-1 ; LOG
#>   4| -0.29088   ; V ; L ; LOG
#> 
#> $`execute.Models/m1rep/3`
#>   1| $THETA
#>   2| 0.93967   ; KA ; h-1 ; LOG
#>   3| -2.6674   ; K ; h-1 ; LOG
#>   4| -0.6201   ; V ; L ; LOG
#> 
#> $`execute.Models/m1rep/4`
#>   1| $THETA
#>   2| 0.2535   ; KA ; h-1 ; LOG
#>   3| -2.4012   ; K ; h-1 ; LOG
#>   4| -0.09186   ; V ; L ; LOG
#> 
#> $`execute.Models/m1rep/5`
#>   1| $THETA
#>   2| 0.40514   ; KA ; h-1 ; LOG
#>   3| -2.0764   ; K ; h-1 ; LOG
#>   4| -0.15977   ; V ; L ; LOG

We can also use standard dplyr to embed nm objects in data.frames. This is a useful way to organise groups of runs. E.g. the following would be an alternative way of running the previous lines which isn’t that useful here, but will be useful in the following section.


d1rep <- tibble(rep = 1:5) %>%
  mutate(
    m = m1 %>%
      child(run_id = rep) %>%
      init_theta(init = rnorm(init, mean = init, sd = 0.3)) %>% 
      run_in("Models/m1rep")
  )

d1rep$m %>% run_nm()

The parallel efficiency test in the NONMEM Execution Article is also an example of vectorized nm object use.

Automating model checking

Following on from the previous section we will now

Lets expand the earlier subroutine example to build a vector of runs that test multiple different ADVANs and TRANS at the same time.

We’ll use a similar structure to the previous section using .available_advans to list all available ADVAN/TRANS options, filter to isolate specific ADVAN/TRANS options, and mutate and subroutine() to perform the control file manipulation function.

Then we’ll display all the $PK subroutines to view the changes.


.available_advans ## display available advans
#> # A tibble: 20 × 6
#>    advan trans   cmt oral  params                  label
#>    <dbl> <dbl> <dbl> <lgl> <chr>                   <chr>
#>  1     1     1     1 FALSE K,V                     a1t1 
#>  2     1     2     1 FALSE CL,V                    a1t2 
#>  3     2     1     1 TRUE  KA,K,V                  a2t1 
#>  4     2     2     1 TRUE  KA,CL,V                 a2t2 
#>  5     3     1     2 FALSE K,K12,K21,V             a3t1 
#>  6     3     3     2 FALSE CL,Q,V,VSS              a3t3 
#>  7     3     4     2 FALSE CL,Q,V1,V2              a3t4 
#>  8     4     1     2 TRUE  KA,K,K23,K32,V2         a4t1 
#>  9     4     3     2 TRUE  KA,CL,Q,V,VSS           a4t3 
#> 10     4     4     2 TRUE  KA,CL,Q,V2,V3           a4t4 
#> 11     5     1    NA NA    KXY,VX                  a5t1 
#> 12     6     1    NA NA    KXY,VX                  a6t1 
#> 13     7     1    NA NA    KXY,VX                  a7t1 
#> 14     8     1    NA NA    KXY,VX                  a8t1 
#> 15     9     1    NA NA    KXY,VX                  a9t1 
#> 16    11     1     3 FALSE K,K12,K21,K13,K31,V     a11t1
#> 17    11     4     3 FALSE CL,Q2,V1,V2,Q3,V3       a11t4
#> 18    12     1     3 TRUE  KA,K,K23,K32,K24,K42,V2 a12t1
#> 19    12     4     3 TRUE  KA,CL,Q3,V2,V3,Q4,V4    a12t4
#> 20    13     1    NA NA    KXY,VX                  a13t1

dt <- .available_advans %>%
  ## filter only for oral dosing advans
  filter(oral %in% TRUE) %>%
  ## mutate state create a column vector m of nm objects
  ## first step is to create children runs from the parent object m1
  ## this is done by supplying a vector of run_ids to the child() function
  mutate(m = m1 %>% ## start with parent m1 again
           child(run_id = label) %>% ## create multiple children using label column
           subroutine(advan = advan, trans = trans) ## set subroutine using advan and trans columns
         )

## view the $PK blocks of each
dt$m %>% dollar("PK")
#> $`execute.Models/a2t1`
#>   1| $PK
#>   2| 
#>   3| TVKA=EXP(THETA(1))
#>   4| MU_1=LOG(TVKA)
#>   5| KA = EXP(MU_1+ETA(1))
#>   6| 
#>   7| TVK=EXP(THETA(2))
#>   8| MU_2=LOG(TVK)
#>   9| K = EXP(MU_2+ETA(2))
#>  10| 
#>  11| TVV=EXP(THETA(3))
#>  12| MU_3=LOG(TVV)
#>  13| V = EXP(MU_3+ETA(3))
#>  14| 
#>  15| CL = K*V
#>  16| 
#>  17| S2 = V
#>  18| 
#> 
#> $`execute.Models/a2t2`
#>   1| $PK
#>   2| 
#>   3| TVKA=EXP(THETA(1))
#>   4| MU_1=LOG(TVKA)
#>   5| KA = EXP(MU_1+ETA(1))
#>   6| 
#>   7| TVCL=EXP(THETA(2))
#>   8| MU_2=LOG(TVCL)
#>   9| CL = EXP(MU_2+ETA(2))
#>  10| 
#>  11| TVV=EXP(THETA(3))
#>  12| MU_3=LOG(TVV)
#>  13| V = EXP(MU_3+ETA(3))
#>  14| 
#>  15| ; CL = K*V
#>  16| 
#>  17| S2 = V
#>  18| 
#> 
#> $`execute.Models/a4t1`
#>   1| $PK
#>   2| 
#>   3| TVKA=EXP(THETA(1))
#>   4| MU_1=LOG(TVKA)
#>   5| KA = EXP(MU_1+ETA(1))
#>   6| 
#>   7| TVK=EXP(THETA(2))
#>   8| MU_2=LOG(TVK)
#>   9| K = EXP(MU_2+ETA(2))
#>  10| 
#>  11| TVV2=EXP(THETA(3))
#>  12| MU_3=LOG(TVV2)
#>  13| V2 = EXP(MU_3+ETA(3))
#>  14| 
#>  15| CL = K*V2
#>  16| 
#>  17| S2 = V2
#>  18| 
#>  19| TVK23=EXP(THETA(4))
#>  20| MU_4=LOG(TVK23)
#>  21| K23 = EXP(MU_4+ETA(4))
#>  22| TVK32=EXP(THETA(5))
#>  23| MU_5=LOG(TVK32)
#>  24| K32 = EXP(MU_5+ETA(5))
#> 
#> $`execute.Models/a4t3`
#>   1| $PK
#>   2| 
#>   3| TVKA=EXP(THETA(1))
#>   4| MU_1=LOG(TVKA)
#>   5| KA = EXP(MU_1+ETA(1))
#>   6| 
#>   7| TVCL=EXP(THETA(2))
#>   8| MU_2=LOG(TVCL)
#>   9| CL = EXP(MU_2+ETA(2))
#>  10| 
#>  11| TVV=EXP(THETA(3))
#>  12| MU_3=LOG(TVV)
#>  13| V = EXP(MU_3+ETA(3))
#>  14| 
#>  15| ; CL = K*V
#>  16| 
#>  17| S2 = V
#>  18| 
#>  19| TVQ=EXP(THETA(4))
#>  20| MU_4=LOG(TVQ)
#>  21| Q = EXP(MU_4+ETA(4))
#>  22| TVVSS=EXP(THETA(5))
#>  23| MU_5=LOG(TVVSS)
#>  24| VSS = EXP(MU_5+ETA(5))
#> 
#> $`execute.Models/a4t4`
#>   1| $PK
#>   2| 
#>   3| TVKA=EXP(THETA(1))
#>   4| MU_1=LOG(TVKA)
#>   5| KA = EXP(MU_1+ETA(1))
#>   6| 
#>   7| TVCL=EXP(THETA(2))
#>   8| MU_2=LOG(TVCL)
#>   9| CL = EXP(MU_2+ETA(2))
#>  10| 
#>  11| TVV2=EXP(THETA(3))
#>  12| MU_3=LOG(TVV2)
#>  13| V2 = EXP(MU_3+ETA(3))
#>  14| 
#>  15| ; CL = K*V
#>  16| 
#>  17| S2 = V2
#>  18| 
#>  19| TVQ=EXP(THETA(4))
#>  20| MU_4=LOG(TVQ)
#>  21| Q = EXP(MU_4+ETA(4))
#>  22| TVV3=EXP(THETA(5))
#>  23| MU_5=LOG(TVV3)
#>  24| V3 = EXP(MU_5+ETA(5))
#> 
#> $`execute.Models/a12t1`
#>   1| $PK
#>   2| 
#>   3| TVKA=EXP(THETA(1))
#>   4| MU_1=LOG(TVKA)
#>   5| KA = EXP(MU_1+ETA(1))
#>   6| 
#>   7| TVK=EXP(THETA(2))
#>   8| MU_2=LOG(TVK)
#>   9| K = EXP(MU_2+ETA(2))
#>  10| 
#>  11| TVV2=EXP(THETA(3))
#>  12| MU_3=LOG(TVV2)
#>  13| V2 = EXP(MU_3+ETA(3))
#>  14| 
#>  15| CL = K*V2
#>  16| 
#>  17| S2 = V2
#>  18| 
#>  19| TVK23=EXP(THETA(4))
#>  20| MU_4=LOG(TVK23)
#>  21| K23 = EXP(MU_4+ETA(4))
#>  22| TVK32=EXP(THETA(5))
#>  23| MU_5=LOG(TVK32)
#>  24| K32 = EXP(MU_5+ETA(5))
#>  25| TVK24=EXP(THETA(6))
#>  26| MU_6=LOG(TVK24)
#>  27| K24 = EXP(MU_6+ETA(6))
#>  28| TVK42=EXP(THETA(7))
#>  29| MU_7=LOG(TVK42)
#>  30| K42 = EXP(MU_7+ETA(7))
#> 
#> $`execute.Models/a12t4`
#>   1| $PK
#>   2| 
#>   3| TVKA=EXP(THETA(1))
#>   4| MU_1=LOG(TVKA)
#>   5| KA = EXP(MU_1+ETA(1))
#>   6| 
#>   7| TVCL=EXP(THETA(2))
#>   8| MU_2=LOG(TVCL)
#>   9| CL = EXP(MU_2+ETA(2))
#>  10| 
#>  11| TVV2=EXP(THETA(3))
#>  12| MU_3=LOG(TVV2)
#>  13| V2 = EXP(MU_3+ETA(3))
#>  14| 
#>  15| ; CL = K*V
#>  16| 
#>  17| S2 = V2
#>  18| 
#>  19| TVQ3=EXP(THETA(4))
#>  20| MU_4=LOG(TVQ3)
#>  21| Q3 = EXP(MU_4+ETA(4))
#>  22| TVV3=EXP(THETA(5))
#>  23| MU_5=LOG(TVV3)
#>  24| V3 = EXP(MU_5+ETA(5))
#>  25| TVQ4=EXP(THETA(6))
#>  26| MU_6=LOG(TVQ4)
#>  27| Q4 = EXP(MU_6+ETA(6))
#>  28| TVV4=EXP(THETA(7))
#>  29| MU_7=LOG(TVV4)
#>  30| V4 = EXP(MU_7+ETA(7))

Let’s run these, summarise the results, and generate goodness of fit diagnostics for the ones that gave somewhat reasonable outputs


## run them all and wait for them to finish
dt$m %>% run_nm() %>% wait_finish()

## summarise all runs in a table
summary_wide(dt$m)

## plot goodness of fits for all runs with ofv < 120
dt$m %>% 
  subset(ofv(.) < 120) %>%  ## subsetting is a powerful way of isolating functions to particular model objects
  nm_render("Scripts/basic_gof.Rmd")

Simulation re-estimation

There will soon a be a simple wrapper for the code below, as with the bootstrap and step wise covariate functionality above, but for now the code below is a good example of how the flexibility of the vectorised object can be used to create complex workflows whilst still providing granular control of runs.

It’s good advice to start with 1 or 2 replicates and scale up only when you’ve confirmed your code is working (here we’re just using 3 for demo purposes). You will not waste time because run_nm() will skip over runs that have already completed.


n_sims <- 3  ## start small, scale up later

dsr <- tibble(sim = 1:n_sims) %>%
  mutate(
    msim = m1 %>%              ## start with single parent run, m1, an nm object of length 1
      update_parameters() %>%  ## update inits to finals, here nm object is still length 1
      child(run_id = sim, parent = m1) %>%  ## at this point it becomes length n_sims
      run_in("Models/m1_simest") %>% ## this applies the run_in modification to all n_sims runs
      convert_to_simulation(subpr = 1, seed = sim) ## converts all to simulation
  )

## run, wait, read results and then write to run_dir paths of simulations
dsr$msim %>% run_nm() %>% 
  wait_finish() %>%
  output_table(only_append = "DV_OUT") %>%
  write_derived_data(file.path(run_dir_path(dsr$msim), "simdata.csv"))

## Now create mest column
dsr <- dsr %>%
  mutate(
    mest = m1 %>% child(run_id = sim) %>%  ## estimations derived from m1
      run_in("Models/m1_simest/est") %>%   ## run in a new subdirectory
      data_path(file.path(run_dir_path(msim), "simdata.csv")) %>%    ## set new data_paths
      ## refill $INPUT. Rename DV to be DV_OUT column. Run nm_diff() command below to see
      ## what has changed
      fill_input(rename = list("DV_OBS" = "DV", "DV" = "DV_OUT"))
  )

# nm_diff(dsr$mest[1])

dsr$mest %>% run_nm() %>% 
  wait_finish() %>%
  summary_wide(parameters = "all")