diff --git a/404.html b/404.html
index f2967906..91e437db 100644
--- a/404.html
+++ b/404.html
@@ -1,66 +1,27 @@
diff --git a/CONDUCT.html b/CONDUCT.html
index 39edbb3c..8dd58094 100644
--- a/CONDUCT.html
+++ b/CONDUCT.html
@@ -1,66 +1,12 @@
diff --git a/LICENSE-text.html b/LICENSE-text.html
index 66316b9b..c1d9e4d8 100644
--- a/LICENSE-text.html
+++ b/LICENSE-text.html
@@ -1,66 +1,12 @@
diff --git a/articles/competingRisk.html b/articles/competingRisk.html
index a5a2e568..b44e1564 100644
--- a/articles/competingRisk.html
+++ b/articles/competingRisk.html
@@ -19,6 +19,8 @@

-

From this last plot, we can see that there is no censoring during the first 10 months. Moreover, we see that the last competing event occurs around 20 months. Putting all this information together, we have evidence of two types of patients: very sick patients who either relapse or have a competing event early on, and healthier patients who are eventually lost to follow-up.

+

From this last plot, we can see that there is no censoring during the +first 10 months. Moreover, we see that the last competing event occurs +around 20 months. Putting all this information together, we have +evidence of two types of patients: very sick patients who either relapse +or have a competing event early on, and healthier patients who are +eventually lost to follow-up.

-
-

-Analysis

-

We now turn to the analysis of this dataset. The population-time plots above give evidence of non-constant hazard; therefore, we will explicitly include time in the model. Note that we also include all other variables as possible confounders. First, we include time as a linear term:

+
+

Analysis +

+

We now turn to the analysis of this dataset. The population-time +plots above give evidence of non-constant hazard; therefore, we will +explicitly include time in the model. Note that we also include all +other variables as possible confounders. First, we include time as a +linear term:

-model1 <- fitSmoothHazard(Status ~ ftime + Sex + D + Phase + Source + Age, 
-                          data = bmtcrr, 
-                          ratio = 100,
-                          time = "ftime")
-summary(model1)
-
## 
-## Call:
-## fitSmoothHazard(formula = Status ~ ftime + Sex + D + Phase + 
-##     Source + Age, data = bmtcrr, time = "ftime", ratio = 100)
-## 
-## Coefficients: 
-##                 Estimate Std. Error z value Pr(>|z|)    
-## (Intercept):1  -3.527146   0.685168  -5.148 2.63e-07 ***
-## (Intercept):2  -2.648451   0.463012  -5.720 1.06e-08 ***
-## ftime:1        -0.070927   0.014929  -4.751 2.02e-06 ***
-## ftime:2        -0.105177   0.018349  -5.732 9.93e-09 ***
-## SexM:1         -0.289067   0.283217  -1.021 0.307418    
-## SexM:2         -0.382981   0.236935  -1.616 0.106008    
-## DAML:1         -0.575749   0.299617  -1.922 0.054654 .  
-## DAML:2         -0.100149   0.274099  -0.365 0.714833    
-## PhaseCR2:1      0.186766   0.467042   0.400 0.689237    
-## PhaseCR2:2      0.286425   0.332270   0.862 0.388673    
-## PhaseCR3:1      0.586630   0.696521   0.842 0.399660    
-## PhaseCR3:2      0.310781   0.530986   0.585 0.558353    
-## PhaseRelapse:1  1.448907   0.391878   3.697 0.000218 ***
-## PhaseRelapse:2  0.792938   0.307933   2.575 0.010023 *  
-## SourcePB:1      0.456442   0.571108   0.799 0.424162    
-## SourcePB:2     -1.013983   0.355666  -2.851 0.004359 ** 
-## Age:1          -0.005242   0.011917  -0.440 0.660007    
-## Age:2           0.028597   0.009929   2.880 0.003976 ** 
-## ---
-## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
-## 
-## Names of linear predictors: log(mu[,2]/mu[,1]), log(mu[,3]/mu[,1])
-## 
-## Residual deviance: 1409.076 on 26444 degrees of freedom
-## 
-## Log-likelihood: -704.5378 on 26444 degrees of freedom
-## 
-## Number of Fisher scoring iterations: 10 
-## 
-## Warning: Hauck-Donner effect detected in the following estimate(s):
-## '(Intercept):1', '(Intercept):2', 'ftime:1', 'ftime:2'
-## 
-## 
-## Reference group is level  1  of the response
-

Because of the results in Turgeon et al (n.d.), the standard errors we obtain from the multinomial logit fit are asymptotically correct, and therefore can be used to construct asymptotic confidence intervals.

-

From this summary, we see that time is indeed significant, as is Phase (only relapse vs. CR1). Interestingly, we see that the type of disease is only significant for the event of interest, whereas the type of transplant and the age of the patient are only significant for the competing event.

-

Next, we include the logarithm of time in the model (which leads to a Weibull hazard):

+model1 <- fitSmoothHazard(Status ~ ftime + Sex + D + Phase + Source + Age, + data = bmtcrr, + ratio = 100, + time = "ftime") +summary(model1)
+
## 
+## Call:
+## fitSmoothHazard(formula = Status ~ ftime + Sex + D + Phase + 
+##     Source + Age, data = bmtcrr, time = "ftime", ratio = 100)
+## 
+## Coefficients: 
+##                 Estimate Std. Error z value Pr(>|z|)    
+## (Intercept):1  -3.527146   0.685168  -5.148 2.63e-07 ***
+## (Intercept):2  -2.648451   0.463012  -5.720 1.06e-08 ***
+## ftime:1        -0.070927   0.014929  -4.751 2.02e-06 ***
+## ftime:2        -0.105177   0.018349  -5.732 9.93e-09 ***
+## SexM:1         -0.289067   0.283217  -1.021 0.307418    
+## SexM:2         -0.382981   0.236935  -1.616 0.106008    
+## DAML:1         -0.575749   0.299617  -1.922 0.054654 .  
+## DAML:2         -0.100149   0.274099  -0.365 0.714833    
+## PhaseCR2:1      0.186766   0.467042   0.400 0.689237    
+## PhaseCR2:2      0.286425   0.332270   0.862 0.388673    
+## PhaseCR3:1      0.586630   0.696521   0.842 0.399660    
+## PhaseCR3:2      0.310781   0.530986   0.585 0.558353    
+## PhaseRelapse:1  1.448907   0.391878   3.697 0.000218 ***
+## PhaseRelapse:2  0.792938   0.307933   2.575 0.010023 *  
+## SourcePB:1      0.456442   0.571108   0.799 0.424162    
+## SourcePB:2     -1.013983   0.355666  -2.851 0.004359 ** 
+## Age:1          -0.005242   0.011917  -0.440 0.660007    
+## Age:2           0.028597   0.009929   2.880 0.003976 ** 
+## ---
+## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
+## 
+## Names of linear predictors: log(mu[,2]/mu[,1]), log(mu[,3]/mu[,1])
+## 
+## Residual deviance: 1409.076 on 26444 degrees of freedom
+## 
+## Log-likelihood: -704.5378 on 26444 degrees of freedom
+## 
+## Number of Fisher scoring iterations: 10 
+## 
+## Warning: Hauck-Donner effect detected in the following estimate(s):
+## '(Intercept):1', '(Intercept):2', 'ftime:1', 'ftime:2'
+## 
+## 
+## Reference group is level  1  of the response
+

Because of the results in Turgeon et al (In Preparation), the standard errors we obtain +from the multinomial logit fit are asymptotically correct, and therefore +can be used to construct asymptotic confidence intervals.
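For instance, a Wald-type 95% confidence interval can be assembled directly from the fit. This is an editor's sketch rather than vignette code; it assumes the object returned by fitSmoothHazard supports the usual coef() and vcov() extractors (as VGAM-style model objects do):

# Sketch: Wald 95% confidence intervals for the cause-specific log-hazard ratios.
# Assumes coef() and vcov() methods exist for the fitted object (VGAM-style).
est <- coef(model1)
se  <- sqrt(diag(vcov(model1)))
ci  <- cbind(estimate = est,
             lower = est - qnorm(0.975) * se,
             upper = est + qnorm(0.975) * se)
round(exp(ci), 2)  # hazard-ratio scale (intercept rows are not hazard ratios)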

+

From this summary, we see that time is indeed significant, as is +Phase (only relapse vs. CR1). Interestingly, we see that the type of +disease is only significant for the event of interest, whereas the type +of transplant and the age of the patient are only significant for the +competing event.

+

Next, we include the logarithm of time in the model (which leads to a +Weibull hazard):

-model2 <- fitSmoothHazard(Status ~ log(ftime) + Sex + D + Phase + Source + Age, 
-                          data = bmtcrr, 
-                          ratio = 100, 
-                          time = "ftime")
-summary(model2)
-
## 
-## Call:
-## fitSmoothHazard(formula = Status ~ log(ftime) + Sex + D + Phase + 
-##     Source + Age, data = bmtcrr, time = "ftime", ratio = 100)
-## 
-## Coefficients: 
-##                 Estimate Std. Error z value Pr(>|z|)    
-## (Intercept):1  -3.976762   0.699660  -5.684 1.32e-08 ***
-## (Intercept):2  -3.069308   0.465495  -6.594 4.29e-11 ***
-## log(ftime):1   -0.327063   0.069777  -4.687 2.77e-06 ***
-## log(ftime):2   -0.403220   0.056786  -7.101 1.24e-12 ***
-## SexM:1         -0.413731   0.291497  -1.419  0.15580    
-## SexM:2         -0.521801   0.240157  -2.173  0.02980 *  
-## DAML:1         -0.695303   0.306421  -2.269  0.02326 *  
-## DAML:2         -0.180805   0.287170  -0.630  0.52895    
-## PhaseCR2:1      0.252923   0.468205   0.540  0.58906    
-## PhaseCR2:2      0.365004   0.332952   1.096  0.27296    
-## PhaseCR3:1      0.441402   0.710580   0.621  0.53448    
-## PhaseCR3:2      0.118189   0.535142   0.221  0.82521    
-## PhaseRelapse:1  1.447889   0.394226   3.673  0.00024 ***
-## PhaseRelapse:2  0.821988   0.309721   2.654  0.00796 ** 
-## SourcePB:1      0.662955   0.598582   1.108  0.26806    
-## SourcePB:2     -0.920304   0.370576  -2.483  0.01301 *  
-## Age:1          -0.003153   0.011766  -0.268  0.78873    
-## Age:2           0.028695   0.009869   2.908  0.00364 ** 
-## ---
-## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
-## 
-## Names of linear predictors: log(mu[,2]/mu[,1]), log(mu[,3]/mu[,1])
-## 
-## Residual deviance: 1508.083 on 26444 degrees of freedom
-## 
-## Log-likelihood: -754.0415 on 26444 degrees of freedom
-## 
-## Number of Fisher scoring iterations: 8 
-## 
-## Warning: Hauck-Donner effect detected in the following estimate(s):
-## '(Intercept):1', '(Intercept):2'
-## 
-## 
-## Reference group is level  1  of the response
-

As we can see, the results are similar to the ones with a Gompertz hazard, although Sex is now significant for the competing event.

-

Finally, using splines, we can be quite flexible about the way the hazard depends on time:

+model2 <- fitSmoothHazard(Status ~ log(ftime) + Sex + D + Phase + Source + Age, + data = bmtcrr, + ratio = 100, + time = "ftime") +summary(model2)
+
## 
+## Call:
+## fitSmoothHazard(formula = Status ~ log(ftime) + Sex + D + Phase + 
+##     Source + Age, data = bmtcrr, time = "ftime", ratio = 100)
+## 
+## Coefficients: 
+##                 Estimate Std. Error z value Pr(>|z|)    
+## (Intercept):1  -3.976762   0.699660  -5.684 1.32e-08 ***
+## (Intercept):2  -3.069308   0.465495  -6.594 4.29e-11 ***
+## log(ftime):1   -0.327063   0.069777  -4.687 2.77e-06 ***
+## log(ftime):2   -0.403220   0.056786  -7.101 1.24e-12 ***
+## SexM:1         -0.413731   0.291497  -1.419  0.15580    
+## SexM:2         -0.521801   0.240157  -2.173  0.02980 *  
+## DAML:1         -0.695303   0.306421  -2.269  0.02326 *  
+## DAML:2         -0.180805   0.287170  -0.630  0.52895    
+## PhaseCR2:1      0.252923   0.468205   0.540  0.58906    
+## PhaseCR2:2      0.365004   0.332952   1.096  0.27296    
+## PhaseCR3:1      0.441402   0.710580   0.621  0.53448    
+## PhaseCR3:2      0.118189   0.535142   0.221  0.82521    
+## PhaseRelapse:1  1.447889   0.394226   3.673  0.00024 ***
+## PhaseRelapse:2  0.821988   0.309721   2.654  0.00796 ** 
+## SourcePB:1      0.662955   0.598582   1.108  0.26806    
+## SourcePB:2     -0.920304   0.370576  -2.483  0.01301 *  
+## Age:1          -0.003153   0.011766  -0.268  0.78873    
+## Age:2           0.028695   0.009869   2.908  0.00364 ** 
+## ---
+## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
+## 
+## Names of linear predictors: log(mu[,2]/mu[,1]), log(mu[,3]/mu[,1])
+## 
+## Residual deviance: 1508.083 on 26444 degrees of freedom
+## 
+## Log-likelihood: -754.0415 on 26444 degrees of freedom
+## 
+## Number of Fisher scoring iterations: 8 
+## 
+## Warning: Hauck-Donner effect detected in the following estimate(s):
+## '(Intercept):1', '(Intercept):2'
+## 
+## 
+## Reference group is level  1  of the response
+

As we can see, the results are similar to the ones with a Gompertz +hazard, although Sex is now significant for the competing event.
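One way to see this similarity at a glance (an editor's sketch, not part of the vignette) is to line the two coefficient vectors up side by side; both fits have the same number of coefficients, so a simple cbind() works, keeping in mind that the two time terms are on different scales:

# Compare the linear-time (model1) and log-time (model2) estimates by position;
# row names are taken from model1, and the two time terms are not directly comparable.
round(cbind(linear_time = coef(model1),
            log_time    = coef(model2)), 3)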

+

Finally, using splines, we can be quite flexible about the way the +hazard depends on time:

-model3 <- fitSmoothHazard(
-    Status ~ splines::bs(ftime) + Sex + D + Phase + Source + Age, 
-    data = bmtcrr, 
-    ratio = 100, 
-    time = "ftime")
-summary(model3)
-
## 
-## Call:
-## fitSmoothHazard(formula = Status ~ splines::bs(ftime) + Sex + 
-##     D + Phase + Source + Age, data = bmtcrr, time = "ftime", 
-##     ratio = 100)
-## 
-## Coefficients: 
-##                         Estimate Std. Error z value Pr(>|z|)    
-## (Intercept):1          -3.714285   0.697993  -5.321 1.03e-07 ***
-## (Intercept):2          -3.168984   0.498239  -6.360 2.01e-10 ***
-## splines::bs(ftime)1:1  -0.212237   2.256878  -0.094 0.925077    
-## splines::bs(ftime)1:2   6.902278   3.669973   1.881 0.060007 .  
-## splines::bs(ftime)2:1 -15.567038   8.068389      NA       NA    
-## splines::bs(ftime)2:2 -76.712396  25.661616      NA       NA    
-## splines::bs(ftime)3:1  -2.723383  10.472710      NA       NA    
-## splines::bs(ftime)3:2  -2.864418  22.204096      NA       NA    
-## SexM:1                 -0.283588   0.282655  -1.003 0.315715    
-## SexM:2                 -0.420961   0.236815  -1.778 0.075470 .  
-## DAML:1                 -0.623451   0.301696  -2.066 0.038782 *  
-## DAML:2                 -0.127162   0.275996  -0.461 0.644985    
-## PhaseCR2:1              0.120167   0.464896   0.258 0.796035    
-## PhaseCR2:2              0.215708   0.330313   0.653 0.513729    
-## PhaseCR3:1              0.494530   0.692452   0.714 0.475121    
-## PhaseCR3:2              0.228801   0.525830   0.435 0.663473    
-## PhaseRelapse:1          1.451148   0.392110   3.701 0.000215 ***
-## PhaseRelapse:2          0.821627   0.310515   2.646 0.008145 ** 
-## SourcePB:1              0.444775   0.571636   0.778 0.436526    
-## SourcePB:2             -1.127417   0.358987  -3.141 0.001686 ** 
-## Age:1                  -0.005600   0.011888  -0.471 0.637619    
-## Age:2                   0.028070   0.009914   2.831 0.004634 ** 
-## ---
-## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
-## 
-## Names of linear predictors: log(mu[,2]/mu[,1]), log(mu[,3]/mu[,1])
-## 
-## Residual deviance: 1402.024 on 26440 degrees of freedom
-## 
-## Log-likelihood: -701.0122 on 26440 degrees of freedom
-## 
-## Number of Fisher scoring iterations: 16 
-## 
-## Warning: Hauck-Donner effect detected in the following estimate(s):
-## '(Intercept):1', '(Intercept):2', 'splines::bs(ftime)2:1', 'splines::bs(ftime)2:2', 'splines::bs(ftime)3:1', 'splines::bs(ftime)3:2'
-## 
-## 
-## Reference group is level  1  of the response
-

Again, we see that the results are quite similar for this third model.

-
-

-Absolute risk

+model3 <- fitSmoothHazard( + Status ~ splines::bs(ftime) + Sex + D + Phase + Source + Age, + data = bmtcrr, + ratio = 100, + time = "ftime") +summary(model3)
+
## 
+## Call:
+## fitSmoothHazard(formula = Status ~ splines::bs(ftime) + Sex + 
+##     D + Phase + Source + Age, data = bmtcrr, time = "ftime", 
+##     ratio = 100)
+## 
+## Coefficients: 
+##                         Estimate Std. Error z value Pr(>|z|)    
+## (Intercept):1          -3.714285   0.697993  -5.321 1.03e-07 ***
+## (Intercept):2          -3.168984   0.498239  -6.360 2.01e-10 ***
+## splines::bs(ftime)1:1  -0.212237   2.256878  -0.094 0.925077    
+## splines::bs(ftime)1:2   6.902278   3.669973   1.881 0.060007 .  
+## splines::bs(ftime)2:1 -15.567038   8.068389      NA       NA    
+## splines::bs(ftime)2:2 -76.712396  25.661616      NA       NA    
+## splines::bs(ftime)3:1  -2.723383  10.472710      NA       NA    
+## splines::bs(ftime)3:2  -2.864418  22.204096      NA       NA    
+## SexM:1                 -0.283588   0.282655  -1.003 0.315715    
+## SexM:2                 -0.420961   0.236815  -1.778 0.075470 .  
+## DAML:1                 -0.623451   0.301696  -2.066 0.038782 *  
+## DAML:2                 -0.127162   0.275996  -0.461 0.644985    
+## PhaseCR2:1              0.120167   0.464896   0.258 0.796035    
+## PhaseCR2:2              0.215708   0.330313   0.653 0.513729    
+## PhaseCR3:1              0.494530   0.692452   0.714 0.475121    
+## PhaseCR3:2              0.228801   0.525830   0.435 0.663473    
+## PhaseRelapse:1          1.451148   0.392110   3.701 0.000215 ***
+## PhaseRelapse:2          0.821627   0.310515   2.646 0.008145 ** 
+## SourcePB:1              0.444775   0.571636   0.778 0.436526    
+## SourcePB:2             -1.127417   0.358987  -3.141 0.001686 ** 
+## Age:1                  -0.005600   0.011888  -0.471 0.637619    
+## Age:2                   0.028070   0.009914   2.831 0.004634 ** 
+## ---
+## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
+## 
+## Names of linear predictors: log(mu[,2]/mu[,1]), log(mu[,3]/mu[,1])
+## 
+## Residual deviance: 1402.024 on 26440 degrees of freedom
+## 
+## Log-likelihood: -701.0122 on 26440 degrees of freedom
+## 
+## Number of Fisher scoring iterations: 16 
+## 
+## Warning: Hauck-Donner effect detected in the following estimate(s):
+## '(Intercept):1', '(Intercept):2', 'splines::bs(ftime)2:1', 'splines::bs(ftime)2:2', 'splines::bs(ftime)3:1', 'splines::bs(ftime)3:2'
+## 
+## 
+## Reference group is level  1  of the response
+

Again, we see that the results are quite similar for this third +model.

+
+

Absolute risk +

We now look at the 2-year risk of relapse:

-linearRisk <- absoluteRisk(object = model1, time = 24, newdata = bmtcrr[1:10,])
-logRisk <- absoluteRisk(object = model2, time = 24, newdata = bmtcrr[1:10,])
-splineRisk <- absoluteRisk(object = model3, time = 24, newdata = bmtcrr[1:10,])
+linearRisk <- absoluteRisk(object = model1, time = 24, newdata = bmtcrr[1:10,]) +logRisk <- absoluteRisk(object = model2, time = 24, newdata = bmtcrr[1:10,]) +splineRisk <- absoluteRisk(object = model3, time = 24, newdata = bmtcrr[1:10,])
-plot(linearRisk, logRisk,
-     xlab = "Linear", ylab = "Log/Spline", pch = 19,
-     xlim = c(0,1), ylim = c(0,1), col = 'red')
-points(linearRisk, splineRisk,
-       col = 'blue', pch = 19)
-abline(a = 0, b = 1, lty = 2, lwd = 2)
-legend("topleft", legend = c("Log", "Spline"),
-       pch = 19, col = c("red", "blue"))
+plot(linearRisk, logRisk, + xlab = "Linear", ylab = "Log/Spline", pch = 19, + xlim = c(0,1), ylim = c(0,1), col = 'red') +points(linearRisk, splineRisk, + col = 'blue', pch = 19) +abline(a = 0, b = 1, lty = 2, lwd = 2) +legend("topleft", legend = c("Log", "Spline"), + pch = 19, col = c("red", "blue"))

-
-

-Session information

-
## R version 4.0.2 (2020-06-22)
-## Platform: x86_64-pc-linux-gnu (64-bit)
-## Running under: Ubuntu 16.04.6 LTS
-## 
-## Matrix products: default
-## BLAS:   /usr/lib/openblas-base/libblas.so.3
-## LAPACK: /usr/lib/libopenblasp-r0.2.18.so
-## 
-## attached base packages:
-## [1] stats     graphics  grDevices utils     datasets  methods   base     
-## 
-## other attached packages:
-## [1] casebase_0.9.1.9999
-## 
-## loaded via a namespace (and not attached):
-##  [1] highr_0.8         compiler_4.0.2    pillar_1.4.7      tools_4.0.2      
-##  [5] digest_0.6.27     lattice_0.20-41   nlme_3.1-148      evaluate_0.14    
-##  [9] memoise_2.0.0     lifecycle_0.2.0   tibble_3.0.6      gtable_0.3.0     
-## [13] mgcv_1.8-31       pkgconfig_2.0.3   rlang_0.4.10      Matrix_1.2-18    
-## [17] yaml_2.2.1        pkgdown_1.6.1     xfun_0.20         fastmap_1.1.0    
-## [21] stringr_1.4.0     knitr_1.31        desc_1.2.0        fs_1.5.0         
-## [25] vctrs_0.3.6       systemfonts_1.0.0 stats4_4.0.2      rprojroot_2.0.2  
-## [29] grid_4.0.2        glue_1.4.2        data.table_1.13.6 R6_2.5.0         
-## [33] textshaping_0.2.1 survival_3.1-12   VGAM_1.1-5        rmarkdown_2.6    
-## [37] ggplot2_3.3.3     magrittr_2.0.1    scales_1.1.1      htmltools_0.5.1.1
-## [41] ellipsis_0.3.1    splines_4.0.2     assertthat_0.2.1  colorspace_2.0-0 
-## [45] ragg_0.4.1        stringi_1.5.3     munsell_0.5.0     cachem_1.0.3     
-## [49] crayon_1.4.0
+
+

Session information +

+
## R version 4.3.1 (2023-06-16)
+## Platform: x86_64-pc-linux-gnu (64-bit)
+## Running under: Ubuntu 22.04.2 LTS
+## 
+## Matrix products: default
+## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
+## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
+## 
+## attached base packages:
+## [1] stats     graphics  grDevices utils     datasets  methods   base     
+## 
+## other attached packages:
+## [1] casebase_0.10.2.9999
+## 
+## loaded via a namespace (and not attached):
+##  [1] sass_0.4.7        utf8_1.2.3        generics_0.1.3    stringi_1.7.12   
+##  [5] lattice_0.21-8    digest_0.6.33     magrittr_2.0.3    evaluate_0.21    
+##  [9] grid_4.3.1        fastmap_1.1.1     rprojroot_2.0.3   jsonlite_1.8.7   
+## [13] Matrix_1.5-4.1    survival_3.5-5    mgcv_1.8-42       purrr_1.0.1      
+## [17] fansi_1.0.4       scales_1.2.1      textshaping_0.3.6 jquerylib_0.1.4  
+## [21] cli_3.6.1         rlang_1.1.1       munsell_0.5.0     splines_4.3.1    
+## [25] cachem_1.0.8      yaml_2.3.7        tools_4.3.1       memoise_2.0.1    
+## [29] dplyr_1.1.2       colorspace_2.1-0  ggplot2_3.4.2     VGAM_1.1-8       
+## [33] vctrs_0.6.3       R6_2.5.1          stats4_4.3.1      lifecycle_1.0.3  
+## [37] stringr_1.5.0     fs_1.6.3          ragg_1.2.5        pkgconfig_2.0.3  
+## [41] desc_1.4.2        pkgdown_2.0.7     bslib_0.5.0       pillar_1.9.0     
+## [45] gtable_0.3.3      data.table_1.14.8 glue_1.6.2        systemfonts_1.0.4
+## [49] xfun_0.39         tibble_3.2.1      tidyselect_1.2.0  highr_0.10       
+## [53] knitr_1.43        htmltools_0.5.5   nlme_3.1-162      rmarkdown_2.23   
+## [57] compiler_4.3.1
-
-

-References

+
+

References +

  1. -Efron, Bradley. 1977. “The Efficiency of Cox’s Likelihood Function for Censored Data.” Journal of the American Statistical Association 72 (359). Taylor & Francis Group: 557–65. +Efron, Bradley. 1977. “The Efficiency of Cox’s Likelihood Function for +Censored Data.” Journal of the American Statistical Association +72 (359). Taylor & Francis Group: 557–65.

  2. -Hanley, James A, and Olli S Miettinen. 2009. “Fitting Smooth-in-Time Prognostic Risk Functions via Logistic Regression.” The International Journal of Biostatistics 5 (1). +Hanley, James A, and Olli S Miettinen. 2009. “Fitting Smooth-in-Time +Prognostic Risk Functions via Logistic Regression.” The +International Journal of Biostatistics 5 (1).

  3. -Mantel, Nathan. 1973. “Synthetic Retrospective Studies and Related Topics.” Biometrics. JSTOR, 479–86. +Mantel, Nathan. 1973. “Synthetic Retrospective Studies and Related +Topics.” Biometrics. JSTOR, 479–86.

  4. -Saarela, Olli. 2015. “A Case-Base Sampling Method for Estimating Recurrent Event Intensities.” Lifetime Data Analysis. Springer, 1–17. +Saarela, Olli. 2015. “A Case-Base Sampling Method for Estimating +Recurrent Event Intensities.” Lifetime Data Analysis. Springer, +1–17.

  5. -Saarela, Olli, and Elja Arjas. 2015. “Non-Parametric Bayesian Hazard Regression for Chronic Disease Risk Assessment.” Scandinavian Journal of Statistics 42 (2). Wiley Online Library: 609–26. +Saarela, Olli, and Elja Arjas. 2015. “Non-Parametric Bayesian Hazard +Regression for Chronic Disease Risk Assessment.” Scandinavian +Journal of Statistics 42 (2). Wiley Online Library: 609–26.

  6. -Scrucca, L, A Santucci, and F Aversa. 2010. “Regression Modeling of Competing Risk Using R: An in Depth Guide for Clinicians.” Bone Marrow Transplantation 45 (9). Nature Publishing Group: 1388–95. +Scrucca, L, A Santucci, and F Aversa. 2010. “Regression Modeling of +Competing Risk Using R: An in Depth Guide for Clinicians.” Bone +Marrow Transplantation 45 (9). Nature Publishing Group: 1388–95.

  7. -Kalbfleisch, John D., and Ross L. Prentice. The statistical analysis of failure time data. Vol. 360. John Wiley & Sons, 2011. +Kalbfleisch, John D., and Ross L. Prentice. The statistical analysis of +failure time data. Vol. 360. John Wiley & Sons, 2011.

  8. -Cox, D. R. “Regression models and life tables.” Journal of the Royal Statistical Society 34 (1972): 187-220. +Cox, D. R. “Regression models and life tables.” Journal of the Royal +Statistical Society 34 (1972): 187-220.

-
-
-

Scrucca, L., A. Santucci, and F. Aversa. 2010. “Regression Modeling of Competing Risk Using R: An in Depth Guide for Clinicians.” Bone Marrow Transplantation 45 (9). Nature Publishing Group: 1388–95.

+
+
+Scrucca, L., A. Santucci, and F. Aversa. 2010. “Regression Modeling of Competing Risk Using R: An in Depth Guide for Clinicians.” Bone Marrow Transplantation 45 (9): 1388–95.
-
-

Turgeon, M., S. Bhatnagar, and O. Saarela. n.d. “A Novel Approach to Competing Risk Analysis Using Case-Base Sampling.”

+
+Turgeon, M., S. Bhatnagar, and O. Saarela. In Preparation. “A +Novel Approach to Competing Risk Analysis Using Case-Base +Sampling.”
@@ -542,11 +614,13 @@

@@ -555,5 +629,7 @@

diff --git a/articles/competingRisk_files/figure-html/absRiskPlot-1.png b/articles/competingRisk_files/figure-html/absRiskPlot-1.png
index b93b27a1..6433f459 100644
Binary files a/articles/competingRisk_files/figure-html/absRiskPlot-1.png and b/articles/competingRisk_files/figure-html/absRiskPlot-1.png differ
diff --git a/articles/competingRisk_files/figure-html/poptime1-1.png b/articles/competingRisk_files/figure-html/poptime1-1.png
index 32c92d36..39788949 100644
Binary files a/articles/competingRisk_files/figure-html/poptime1-1.png and b/articles/competingRisk_files/figure-html/poptime1-1.png differ
diff --git a/articles/competingRisk_files/figure-html/poptime2-1.png b/articles/competingRisk_files/figure-html/poptime2-1.png
index 0d708081..5fa04e5e 100644
Binary files a/articles/competingRisk_files/figure-html/poptime2-1.png and b/articles/competingRisk_files/figure-html/poptime2-1.png differ
diff --git a/articles/competingRisk_files/figure-html/poptime3-1.png b/articles/competingRisk_files/figure-html/poptime3-1.png
index 23039178..6f33614b 100644
Binary files a/articles/competingRisk_files/figure-html/poptime3-1.png and b/articles/competingRisk_files/figure-html/poptime3-1.png differ
diff --git a/articles/customizingpopTime.html b/articles/customizingpopTime.html
index bed34663..b4b81cdc 100644
--- a/articles/customizingpopTime.html
+++ b/articles/customizingpopTime.html
@@ -19,6 +19,8 @@

-
-

-Change the Facet Labels

-

The default arguments to the facet.params argument is given by:

+
+

Change the Facet Labels +

+

The default value of the facet.params argument is given by:

-exposure_variable <- attr(x, "exposure")
-default_facet_params <- list(facets = exposure_variable, ncol = 1)
-

The population time area stratified by treatment arm is then plotted using the following code

+exposure_variable <- attr(x, "exposure") +default_facet_params <- list(facets = exposure_variable, ncol = 1)
+

The population time area stratified by treatment arm is then plotted using the following code:

-ggplot() + 
-    base::do.call("geom_ribbon", new_ribbon_params) + 
-    base::do.call("facet_wrap", default_facet_params) 
+ggplot() + + base::do.call("geom_ribbon", new_ribbon_params) + + base::do.call("facet_wrap", default_facet_params)

-# this is equivalent to
-# plot(x, add.case.series = FALSE)
-

We can modify the facet labels by either changing the factor labels in the data or specifying the labeller argument. See this blog post for further details. Here is an example of how we can change the facet labels using the plot method provided by the casebase package:

+# this is equivalent to +# plot(x, add.case.series = FALSE)

+

We can modify the facet labels by either changing the factor labels +in the data or specifying the labeller argument. See this +blog post for further details. Here is an example of how we can +change the facet labels using the plot method provided by +the casebase package:

-# Use character vectors as lookup tables:
-group_status <- c(
-  `0` = "Control Arm",
-  `1` = "Screening Arm"
-)
-
-plot(x, 
-     add.case.series = FALSE, # do not plot the case series
-     facet.params = list(labeller = labeller(ScrArm = group_status), # change labels
-                         strip.position = "right") # change facet position
-     ) 
+# Use character vectors as lookup tables:
+group_status <- c(
+  `0` = "Control Arm",
+  `1` = "Screening Arm"
+)
+
+plot(x, 
+     add.case.series = FALSE, # do not plot the case series
+     facet.params = list(labeller = labeller(ScrArm = group_status), # change labels
+                         strip.position = "right") # change facet position
+     )

-
-

-Changing the Plot Aesthetics

-

Suppose we want to change the color of the points and the legend labels. We use the bmtcrr dataset as the example in this section.

-

The reason there are both fill.params and color.params arguments, is because by default, we use shape = 21 which is a filled circle (see the pch argument of the graphics::points function for details). Shapes from 21 to 25 can be colored and filled with different colors: color.params gives the border color and fill.params gives the fill color (sometimes referred to as the background color).

-

The default fill colors for the case series, base series and competing event are given by the qualitative palette from the colorspace R package:

+
+

Changing the Plot Aesthetics +

+

Suppose we want to change the color of the points and the legend +labels. We use the bmtcrr dataset as the example in this +section.

+

There are both fill.params and color.params arguments because, by default, we use shape = 21, which is a filled circle (see the pch argument of the graphics::points function for details). Shapes 21 through 25 can be colored and filled with different colors: color.params gives the border color and fill.params gives the fill color (sometimes referred to as the background color).
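A minimal base-graphics sketch (added here for illustration, not taken from the vignette) makes the border/fill distinction concrete: for the filled shapes, col sets the border and bg sets the fill.

# Filled plotting characters 21 to 25: 'col' is the border colour, 'bg' the fill.
plot(1:5, rep(1, 5), pch = 21:25, cex = 3,
     col = "darkblue",   # border colour (what color.params controls)
     bg  = "lightblue",  # fill colour (what fill.params controls)
     axes = FALSE, xlab = "", ylab = "", ylim = c(0.5, 1.5))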

+

The default fill colors for the case series, base series and +competing event are given by the qualitative palette from the colorspace +R package:

-fill_cols <- colorspace::qualitative_hcl(n = 3, palette = "Dark3")
-
-(fill_colors <- c("Case series" = fill_cols[1],
-                 "Competing event" = fill_cols[3],
-                 "Base series" = fill_cols[2]))
-
##     Case series Competing event     Base series 
-##       "#E16A86"       "#009ADE"       "#50A315"
-

The corresponding default border colors are given by the colorspace::darken function applied to the fill colors above:

+fill_cols <- colorspace::qualitative_hcl(n = 3, palette = "Dark3") + +(fill_colors <- c("Case series" = fill_cols[1], + "Competing event" = fill_cols[3], + "Base series" = fill_cols[2]))
+
##     Case series Competing event     Base series 
+##       "#E16A86"       "#009ADE"       "#50A315"
+

The corresponding default border colors are given by the +colorspace::darken function applied to the fill colors +above:

-color_cols <- colorspace::darken(col = fill_cols, amount = 0.3)
-
-(color_colors <- c("Case series" = color_cols[1],
-                   "Competing event" = color_cols[3],
-                   "Base series" = color_cols[2]))
-
##     Case series Competing event     Base series 
-##       "#AB3A59"       "#026A9A"       "#347004"
+color_cols <- colorspace::darken(col = fill_cols, amount = 0.3) + +(color_colors <- c("Case series" = color_cols[1], + "Competing event" = color_cols[3], + "Base series" = color_cols[2]))
+
##     Case series Competing event     Base series 
+##       "#AB3A59"       "#026A9A"       "#347004"

This is what the points look like:
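In case the rendered figure is not available, the following ggplot2 sketch (an editor's addition, not code from the vignette) draws one point per series using the fill_colors and color_colors vectors defined above:

# Preview the default case-base point styles from the palettes defined above.
library(ggplot2)
pts <- data.frame(type = names(fill_colors), x = seq_along(fill_colors), y = 1)
ggplot(pts, aes(x, y, colour = type, fill = type)) +
  geom_point(shape = 21, size = 5, alpha = 0.5) +
  scale_fill_manual(values = fill_colors) +
  scale_colour_manual(values = color_colors) +
  theme_void()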

-
-

-Change only the point colors

-

If you only want to change the color points, you must specify a named vector exactly as specified in the fill_colors object created above. Note that the names Case series, Base Series and Competing event must remain the same, otherwise the function won’t know how to map the colors to the corresponding points. This is because the colour and fill aesthetic mappings in the geom_point functions have been set to Case series, Base Series and Competing event. For example, the default call to geom_point for the case series is given by:

+
+

Change only the point colors +

+

If you only want to change the point colors, you must supply a named vector structured exactly like the fill_colors object created above. Note that the names Case series, Base series and Competing event must remain the same; otherwise the function won’t know how to map the colors to the corresponding points. This is because the colour and fill aesthetic mappings in the geom_point functions have been set to Case series, Base series and Competing event. For example, the default call to geom_point for the case series is given by:

-ggplot() + do.call("geom_point", list(data = x[event == 1],
-                     mapping = aes(x = time, y = yc, 
-                                   colour = "Case series", fill = "Case series"),
-                     size = 1.5,
-                     alpha = 0.5,
-                     shape = 21))
+ggplot() + do.call("geom_point", list(data = x[event == 1], + mapping = aes(x = time, y = yc, + colour = "Case series", fill = "Case series"), + size = 1.5, + alpha = 0.5, + shape = 21))

-

We define a new set of colors using a sequential (multi-hue) palette:

+

We define a new set of colors using a sequential (multi-hue) +palette:

-fill_cols <- colorspace::sequential_hcl(n = 3, palette = "Viridis")
-
-(fill_colors <- c("Case series" = fill_cols[1],
-                 "Competing event" = fill_cols[3],
-                 "Base series" = fill_cols[2]))
-
##     Case series Competing event     Base series 
-##       "#4B0055"       "#FDE333"       "#009B95"
+fill_cols <- colorspace::sequential_hcl(n = 3, palette = "Viridis") + +(fill_colors <- c("Case series" = fill_cols[1], + "Competing event" = fill_cols[3], + "Base series" = fill_cols[2]))
+
##     Case series Competing event     Base series 
+##       "#4B0055"       "#FDE333"       "#009B95"
-color_cols <- colorspace::darken(col = fill_cols, amount = 0.3)
-
-(color_colors <- c("Case series" = color_cols[1],
-                   "Competing event" = color_cols[3],
-                   "Base series" = color_cols[2]))
-
##     Case series Competing event     Base series 
-##       "#3A0142"       "#AC9900"       "#0A6B66"
-

We then pass fill_cols and color_cols to the fill.params and color.params arguments, respectively. Internally, this gets passed to the ggplot2::scale_fill_manual and ggplot2::scale_color_manual functions, respectively:

+color_cols <- colorspace::darken(col = fill_cols, amount = 0.3) + +(color_colors <- c("Case series" = color_cols[1], + "Competing event" = color_cols[3], + "Base series" = color_cols[2]))
+
##     Case series Competing event     Base series 
+##       "#3A0142"       "#AC9900"       "#0A6B66"
+

We then pass fill_colors and color_colors to the fill.params and color.params arguments, respectively. Internally, these get passed to the ggplot2::scale_fill_manual and ggplot2::scale_color_manual functions, respectively:

-do.call("scale_fill_manual", utils::modifyList(
-  list(name = element_blank(),
-       breaks = c("Case series", "Competing event", "Base series"),
-       values = old_cols), list(values = fill_colors))
-)
-
-do.call("scale_colour_manual", utils::modifyList(
-  list(name = element_blank(),
-       breaks = c("Case series", "Competing event", "Base series"),
-       values = old_cols), list(values = color_colors))
-)
+do.call("scale_fill_manual", utils::modifyList( + list(name = element_blank(), + breaks = c("Case series", "Competing event", "Base series"), + values = old_cols), list(values = fill_colors)) +) + +do.call("scale_colour_manual", utils::modifyList( + list(name = element_blank(), + breaks = c("Case series", "Competing event", "Base series"), + values = old_cols), list(values = color_colors)) +)

Here is the code to only change the colors:

-# this data ships with the casebase package
-data("bmtcrr")
-
-popTimeData <- popTime(data = bmtcrr, time = "ftime", exposure = "D")
-
## 'Status' will be used as the event variable
+# this data ships with the casebase package +data("bmtcrr") + +popTimeData <- popTime(data = bmtcrr, time = "ftime", exposure = "D") +
## 'Status' will be used as the event variable
-plot(popTimeData,
-     add.case.series = TRUE, 
-     add.base.series = TRUE,
-     add.competing.event = TRUE,
-     comprisk = TRUE,
-     fill.params = list(values = fill_colors),
-     color.params = list(value = color_colors))
+plot(popTimeData,
+     add.case.series = TRUE, 
+     add.base.series = TRUE,
+     add.competing.event = TRUE,
+     comprisk = TRUE,
+     fill.params = list(values = fill_colors),
+     color.params = list(values = color_colors))

-

Note that if you only specify one of the fill.params or color.params arguments, the plot method will automatically set one equal to the other and return a warning message:

+

Note that if you only specify one of the fill.params or +color.params arguments, the plot method will automatically +set one equal to the other and return a warning message:

-plot(popTimeData,
-     add.case.series = TRUE, 
-     add.base.series = TRUE,
-     add.competing.event = TRUE,
-     ratio = 1,
-     comprisk = TRUE,
-     legend = TRUE,
-     fill.params = list(values = fill_colors))
-
## Warning in plot.popTime(popTimeData, add.case.series = TRUE, add.base.series
-## = TRUE, : fill.params has been specified by the user but color.params has not.
-## Setting color.params to be equal to fill.params.
+plot(popTimeData, + add.case.series = TRUE, + add.base.series = TRUE, + add.competing.event = TRUE, + ratio = 1, + comprisk = TRUE, + legend = TRUE, + fill.params = list(values = fill_colors)) +
## Warning in plot.popTime(popTimeData, add.case.series = TRUE, add.base.series =
+## TRUE, : fill.params has been specified by the user but color.params has not.
+## Setting color.params to be equal to fill.params.

-
-

-Change Point Color and Legend Labels

-

In order to change both the point colors and legend labels, we must modify the aesthetic mapping of the geom_point calls as follows:

+
+

Change Point Color and Legend Labels +

+

In order to change both the point colors and legend labels, we must +modify the aesthetic mapping of the geom_point calls as +follows:

-# this data ships with the casebase package
-data("bmtcrr")
-
-popTimeData <- popTime(data = bmtcrr, time = "ftime", exposure = "D")
-
## 'Status' will be used as the event variable
+# this data ships with the casebase package +data("bmtcrr") + +popTimeData <- popTime(data = bmtcrr, time = "ftime", exposure = "D")
+
## 'Status' will be used as the event variable
-plot(popTimeData,
-     add.case.series = TRUE, 
-     add.base.series = TRUE,
-     add.competing.event = TRUE,
-     comprisk = TRUE,
-     case.params = list(mapping = aes(x = time, y = yc, fill = "Relapse", colour = "Relapse")),
-     base.params = list(mapping = aes(x = time, y = ycoord, fill = "Base series", colour = "Base series")),
-     competing.params = list(mapping = aes(x = time, y = yc, fill = "Competing event", colour = "Competing event")),
-     fill.params = list(name = "Legend Name",
-                          breaks = c("Relapse", "Base series", "Competing event"),
-                          values = c("Relapse" = "blue", "Competing event" = "hotpink", "Base series" = "orange")))
-
## Warning in plot.popTime(popTimeData, add.case.series = TRUE, add.base.series
-## = TRUE, : fill.params has been specified by the user but color.params has not.
-## Setting color.params to be equal to fill.params.
+plot(popTimeData, + add.case.series = TRUE, + add.base.series = TRUE, + add.competing.event = TRUE, + comprisk = TRUE, + case.params = list(mapping = aes(x = time, y = yc, fill = "Relapse", colour = "Relapse")), + base.params = list(mapping = aes(x = time, y = ycoord, fill = "Base series", colour = "Base series")), + competing.params = list(mapping = aes(x = time, y = yc, fill = "Competing event", colour = "Competing event")), + fill.params = list(name = "Legend Name", + breaks = c("Relapse", "Base series", "Competing event"), + values = c("Relapse" = "blue", "Competing event" = "hotpink", "Base series" = "orange")))
+
## Warning in plot.popTime(popTimeData, add.case.series = TRUE, add.base.series =
+## TRUE, : fill.params has been specified by the user but color.params has not.
+## Setting color.params to be equal to fill.params.

-

NOTE: the lists being passed to the .params arguments must be named arguments, otherwise they will give unexpected behavior. For example

+

NOTE: the elements of the lists passed to the .params arguments must be named; otherwise they will give unexpected behavior. For example

-# this will work because mapping is the name of the 
-# argument of the list 
-case.params = list(mapping = aes(x = time, y = yc, colour = "Relapse", fill = "Relapse"))
+# this will work because mapping is the name of the +# argument of the list +case.params = list(mapping = aes(x = time, y = yc, colour = "Relapse", fill = "Relapse"))
-# this will NOT work because the argument of the list has no name
-# and therefore utils::modifyList, will not override the defaults. 
-case.params = list(aes(x = time, y = yc, colour = "Relapse", fill = "Relapse"))
+# this will NOT work because the argument of the list has no name +# and therefore utils::modifyList, will not override the defaults. +case.params = list(aes(x = time, y = yc, colour = "Relapse", fill = "Relapse")) -
-

-Session information

-
## R version 4.0.2 (2020-06-22)
-## Platform: x86_64-pc-linux-gnu (64-bit)
-## Running under: Ubuntu 16.04.6 LTS
-## 
-## Matrix products: default
-## BLAS:   /usr/lib/openblas-base/libblas.so.3
-## LAPACK: /usr/lib/libopenblasp-r0.2.18.so
-## 
-## attached base packages:
-## [1] stats     graphics  grDevices utils     datasets  methods   base     
-## 
-## other attached packages:
-## [1] colorspace_2.0-0    data.table_1.13.6   ggplot2_3.3.3      
-## [4] casebase_0.9.1.9999 survival_3.1-12    
-## 
-## loaded via a namespace (and not attached):
-##  [1] highr_0.8         pillar_1.4.7      compiler_4.0.2    tools_4.0.2      
-##  [5] digest_0.6.27     nlme_3.1-148      evaluate_0.14     memoise_2.0.0    
-##  [9] lifecycle_0.2.0   tibble_3.0.6      gtable_0.3.0      lattice_0.20-41  
-## [13] mgcv_1.8-31       pkgconfig_2.0.3   rlang_0.4.10      Matrix_1.2-18    
-## [17] yaml_2.2.1        pkgdown_1.6.1     xfun_0.20         fastmap_1.1.0    
-## [21] withr_2.4.1       stringr_1.4.0     knitr_1.31        vctrs_0.3.6      
-## [25] desc_1.2.0        fs_1.5.0          systemfonts_1.0.0 stats4_4.0.2     
-## [29] rprojroot_2.0.2   grid_4.0.2        glue_1.4.2        R6_2.5.0         
-## [33] textshaping_0.2.1 VGAM_1.1-5        rmarkdown_2.6     farver_2.0.3     
-## [37] magrittr_2.0.1    scales_1.1.1      htmltools_0.5.1.1 ellipsis_0.3.1   
-## [41] splines_4.0.2     assertthat_0.2.1  labeling_0.4.2    ragg_0.4.1       
-## [45] stringi_1.5.3     munsell_0.5.0     cachem_1.0.3      crayon_1.4.0
+
+

Session information +

+
## R version 4.3.1 (2023-06-16)
+## Platform: x86_64-pc-linux-gnu (64-bit)
+## Running under: Ubuntu 22.04.2 LTS
+## 
+## Matrix products: default
+## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
+## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
+## 
+## attached base packages:
+## [1] stats     graphics  grDevices utils     datasets  methods   base     
+## 
+## other attached packages:
+## [1] colorspace_2.1-0     data.table_1.14.8    ggplot2_3.4.2       
+## [4] casebase_0.10.2.9999 survival_3.5-5      
+## 
+## loaded via a namespace (and not attached):
+##  [1] sass_0.4.7        utf8_1.2.3        generics_0.1.3    stringi_1.7.12   
+##  [5] lattice_0.21-8    digest_0.6.33     magrittr_2.0.3    evaluate_0.21    
+##  [9] grid_4.3.1        fastmap_1.1.1     rprojroot_2.0.3   jsonlite_1.8.7   
+## [13] Matrix_1.5-4.1    mgcv_1.8-42       purrr_1.0.1       fansi_1.0.4      
+## [17] scales_1.2.1      textshaping_0.3.6 jquerylib_0.1.4   cli_3.6.1        
+## [21] rlang_1.1.1       munsell_0.5.0     splines_4.3.1     withr_2.5.0      
+## [25] cachem_1.0.8      yaml_2.3.7        tools_4.3.1       memoise_2.0.1    
+## [29] dplyr_1.1.2       VGAM_1.1-8        vctrs_0.6.3       R6_2.5.1         
+## [33] stats4_4.3.1      lifecycle_1.0.3   stringr_1.5.0     fs_1.6.3         
+## [37] ragg_1.2.5        pkgconfig_2.0.3   desc_1.4.2        pkgdown_2.0.7    
+## [41] bslib_0.5.0       pillar_1.9.0      gtable_0.3.3      glue_1.6.2       
+## [45] systemfonts_1.0.4 highr_0.10        xfun_0.39         tibble_3.2.1     
+## [49] tidyselect_1.2.0  knitr_1.43        farver_2.1.1      nlme_3.1-162     
+## [53] htmltools_0.5.5   labeling_0.4.2    rmarkdown_2.23    compiler_4.3.1
@@ -480,11 +543,13 @@

@@ -493,5 +558,7 @@

diff --git a/articles/customizingpopTime_files/figure-html/unnamed-chunk-11-1.png b/articles/customizingpopTime_files/figure-html/unnamed-chunk-11-1.png
index cc4f1cb6..e10899d8 100644
Binary files a/articles/customizingpopTime_files/figure-html/unnamed-chunk-11-1.png and b/articles/customizingpopTime_files/figure-html/unnamed-chunk-11-1.png differ
diff --git a/articles/customizingpopTime_files/figure-html/unnamed-chunk-12-1.png b/articles/customizingpopTime_files/figure-html/unnamed-chunk-12-1.png
index 0d55f4bf..a8fa48b3 100644
Binary files a/articles/customizingpopTime_files/figure-html/unnamed-chunk-12-1.png and b/articles/customizingpopTime_files/figure-html/unnamed-chunk-12-1.png differ
diff --git a/articles/customizingpopTime_files/figure-html/unnamed-chunk-15-1.png b/articles/customizingpopTime_files/figure-html/unnamed-chunk-15-1.png
index 4167aeb6..1ca78465 100644
Binary files a/articles/customizingpopTime_files/figure-html/unnamed-chunk-15-1.png and b/articles/customizingpopTime_files/figure-html/unnamed-chunk-15-1.png differ
diff --git a/articles/customizingpopTime_files/figure-html/unnamed-chunk-16-1.png b/articles/customizingpopTime_files/figure-html/unnamed-chunk-16-1.png
index 23a4eedb..c0afeb4f 100644
Binary files a/articles/customizingpopTime_files/figure-html/unnamed-chunk-16-1.png and b/articles/customizingpopTime_files/figure-html/unnamed-chunk-16-1.png differ
diff --git a/articles/customizingpopTime_files/figure-html/unnamed-chunk-17-1.png b/articles/customizingpopTime_files/figure-html/unnamed-chunk-17-1.png
index 16d965b0..b6e64161 100644
Binary files a/articles/customizingpopTime_files/figure-html/unnamed-chunk-17-1.png and b/articles/customizingpopTime_files/figure-html/unnamed-chunk-17-1.png differ
diff --git a/articles/customizingpopTime_files/figure-html/unnamed-chunk-5-1.png b/articles/customizingpopTime_files/figure-html/unnamed-chunk-5-1.png
index 2d8e31a2..92ca988a 100644
Binary files a/articles/customizingpopTime_files/figure-html/unnamed-chunk-5-1.png and b/articles/customizingpopTime_files/figure-html/unnamed-chunk-5-1.png differ
diff --git a/articles/customizingpopTime_files/figure-html/unnamed-chunk-7-1.png b/articles/customizingpopTime_files/figure-html/unnamed-chunk-7-1.png
index 81f95327..2b19ea99 100644
Binary files a/articles/customizingpopTime_files/figure-html/unnamed-chunk-7-1.png and b/articles/customizingpopTime_files/figure-html/unnamed-chunk-7-1.png differ
diff --git a/articles/customizingpopTime_files/figure-html/unnamed-chunk-8-1.png b/articles/customizingpopTime_files/figure-html/unnamed-chunk-8-1.png
index 3eeab808..1b39f520 100644
Binary files a/articles/customizingpopTime_files/figure-html/unnamed-chunk-8-1.png and b/articles/customizingpopTime_files/figure-html/unnamed-chunk-8-1.png differ
diff --git a/articles/index.html b/articles/index.html
index 967de9f8..43f74eb3 100644
--- a/articles/index.html
+++ b/articles/index.html
@@ -1,66 +1,12 @@
diff --git a/articles/plotabsRisk.html b/articles/plotabsRisk.html
index 131d4324..732bcb7f 100644
--- a/articles/plotabsRisk.html
+++ b/articles/plotabsRisk.html
@@ -19,6 +19,8 @@
-
-

-Introduction

-

In this short vignette, we will introduce the plot method for absoluteRisk objects. This method allows you to plot cumulative incidence (CI) or survival curves as a function of time and a given covariate profile. More specifically, the cumulative incidence is given by:

-

\[ CI(x, t) = 1 - exp\left[ - \int_0^t h(x, u) \textrm{d}u \right] \] where \( h(x, t) \) is the hazard function, \( t \) denotes the numerical value (number of units) of a point in prognostic/prospective time and \( x \) is the realization of the vector \( X \) of variates based on the patient’s profile and intervention (if any). And the survival function is given by \[ S(x, t) = 1 - CI(x,t) = exp\left[ - \int_0^t h(x, u) \textrm{d}u \right] \]

+
+

Introduction +

+

In this short vignette, we will introduce the plot +method for absoluteRisk objects. This method allows you to +plot cumulative incidence (CI) or survival curves as a function of time +and a given covariate profile. More specifically, the cumulative +incidence is given by:

+

\[ CI(x, t) = 1 - \exp\left[ - \int_0^t h(x, u) \,\textrm{d}u \right] \] where \( h(x, t) \) is the hazard function, \( t \) denotes the numerical value (number of units) of a point in prognostic/prospective time, and \( x \) is the realization of the vector \( X \) of variates based on the patient’s profile and intervention (if any). The survival function is then given by \[ S(x, t) = 1 - CI(x,t) = \exp\left[ - \int_0^t h(x, u) \,\textrm{d}u \right] \]
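As a quick numerical check of this complement relationship (an editor's addition, using an arbitrary constant hazard purely for illustration):

# Under a constant hazard h, CI(t) = 1 - exp(-h * t) and S(t) = exp(-h * t),
# so S(t) equals 1 - CI(t) at every time point.
h <- 0.1
t <- seq(0, 10, by = 0.5)
CI <- 1 - exp(-h * t)
S  <- exp(-h * t)
all.equal(S, 1 - CI)  # TRUE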

-
-

-Analysis of the brcancer dataset

-

To illustrate hazard function plots, we will use the breast cancer dataset which contains the observations of 686 women taken from the TH.data package. This dataset is also available from the casebase package. In the following, we will show the CI curve for several covariate profiles.

+
+

Analysis of the brcancer dataset +

+

To illustrate the plot method, we will use the breast cancer dataset, which contains the observations of 686 women and is taken from the TH.data package. This dataset is also available from the casebase package. In the following, we will show the CI curve for several covariate profiles.

-library(casebase)
-#> See example usage at http://sahirbhatnagar.com/casebase/
-library(survival)
-library(ggplot2)
-
-
-data("brcancer")
-mod_cb_glm <- fitSmoothHazard(cens ~ estrec*log(time) + 
-                                  horTh + 
-                                  age + 
-                                  menostat + 
-                                  tsize + 
-                                  tgrade + 
-                                  pnodes + 
-                                  progrec,
-                              data = brcancer,
-                              time = "time", ratio = 10)
-
-summary(mod_cb_glm)
-
-

-Plotting Cumulative Incidence Curves

-

We can use the plot method for objects of class absRiskCB, which is returned by the absoluteRisk function, to plot cumulative incidence curves. For example, suppose we want to compare the cumulative incidence curves of the 1st and 50th individuals in the brcancer dataset. We first call the absoluteRisk function and specify the newdata argument. Note that since time is missing, the risk estimate is calculated at the observed failure times.

+library(casebase) +#> See example usage at http://sahirbhatnagar.com/casebase/ +library(survival) +library(ggplot2) + + +data("brcancer") +mod_cb_glm <- fitSmoothHazard(cens ~ estrec*log(time) + + horTh + + age + + menostat + + tsize + + tgrade + + pnodes + + progrec, + data = brcancer, + time = "time", ratio = 10) + +summary(mod_cb_glm)
+
+

Plotting Cumulative Incidence Curves +

+

We can use the plot method for objects of class absRiskCB, which is returned by the absoluteRisk function, to plot cumulative incidence curves. For example, suppose we want to compare the cumulative incidence curves of the 1st and 50th individuals in the brcancer dataset. We first call the absoluteRisk function and specify the newdata argument. Note that because the time argument is not supplied, the risk estimate is calculated at the observed failure times.

-smooth_risk_brcancer <- absoluteRisk(object = mod_cb_glm, 
-                                     newdata = brcancer[c(1,50),])
-
-class(smooth_risk_brcancer)
-plot(smooth_risk_brcancer)
+smooth_risk_brcancer <- absoluteRisk(object = mod_cb_glm, + newdata = brcancer[c(1,50),]) + +class(smooth_risk_brcancer) +plot(smooth_risk_brcancer)

-

These curves can be further customized. For example, suppose we want to change the legend title and legend keys:

+

These curves can be further customized. For example, suppose we want +to change the legend title and legend keys:

-plot(smooth_risk_brcancer, 
-     id.names = c("Covariate Profile 1","Covariate Profile 50"), 
-     legend.title = "Type", 
-     xlab = "time (days)", 
-     ylab = "Cumulative Incidence (%)") 
+plot(smooth_risk_brcancer, + id.names = c("Covariate Profile 1","Covariate Profile 50"), + legend.title = "Type", + xlab = "time (days)", + ylab = "Cumulative Incidence (%)")

-

The call to plot on a absRiskCB object returns a ggplot2 object, and therefore can be used downstream with other ggplot2 functions. For example, suppose we want to change the theme:

+

The call to plot on an absRiskCB object returns a ggplot2 object, and can therefore be used downstream with other ggplot2 functions. For example, suppose we want to change the theme:

-plot(smooth_risk_brcancer, 
-     id.names = c("Covariate Profile 1","Covariate Profile 50"), 
-     legend.title = "Type", 
-     xlab = "time (days)", 
-     ylab = "Cumulative Incidence (%)") + ggplot2::theme_linedraw() 
+plot(smooth_risk_brcancer, + id.names = c("Covariate Profile 1","Covariate Profile 50"), + legend.title = "Type", + xlab = "time (days)", + ylab = "Cumulative Incidence (%)") + ggplot2::theme_linedraw()

-
-

-Using graphics::matplot -

-

By default, the plot method uses ggplot2 to produce the curves. Alternatively, you can use graphics::matplot by specifying gg = FALSE. This option is particularly useful if you want to add the cumulative incidence curve to an existing plot, e.g., adding the adjusted smooth curve to a Kaplan-Meier curve. In this example, we calculate the cumulative incidence for a typical individual in the dataset:

+
+

Using graphics::matplot +

+

By default, the plot method uses ggplot2 to +produce the curves. Alternatively, you can use +graphics::matplot by specifying gg = FALSE. +This option is particularly useful if you want to add the cumulative +incidence curve to an existing plot, e.g., adding the adjusted smooth +curve to a Kaplan-Meier curve. In this example, we calculate the +cumulative incidence for a typical individual in the +dataset:

-cols <- c("#8E063B","#023FA5")
-
-smooth_risk_typical <- absoluteRisk(object = mod_cb_glm, newdata = "typical")
-y <- with(brcancer, survival::Surv(time, cens))
-plot(y, fun = "event", conf.int = F, col = cols[1], lwd = 2)
-plot(smooth_risk_typical, add = TRUE, col = cols[2], lwd = 2, gg = FALSE)
-legend("bottomright", 
-       legend = c("Kaplan-Meier", "casebase"), 
-       col = cols,
-       lty = 1,
-       lwd = 2,
-       bg = "gray90")
+cols <- c("#8E063B","#023FA5") + +smooth_risk_typical <- absoluteRisk(object = mod_cb_glm, newdata = "typical") +y <- with(brcancer, survival::Surv(time, cens)) +plot(y, fun = "event", conf.int = F, col = cols[1], lwd = 2) +plot(smooth_risk_typical, add = TRUE, col = cols[2], lwd = 2, gg = FALSE) +legend("bottomright", + legend = c("Kaplan-Meier", "casebase"), + col = cols, + lty = 1, + lwd = 2, + bg = "gray90")

-
-

-Survival Curves

-

We can also easily calculate and plot survival curves by specifying type = 'survival' in the call to absoluteRisk. The corresponding call to plot is the same as with cumulative incidence curves:

+
+

Survival Curves +

+

We can also easily calculate and plot survival curves by specifying +type = 'survival' in the call to absoluteRisk. +The corresponding call to plot is the same as with +cumulative incidence curves:

-smooth_surv_brcancer <- absoluteRisk(object = mod_cb_glm, 
-                                     newdata = brcancer[c(1,50),],
-                                     type = "survival")
-
-plot(smooth_surv_brcancer)
+smooth_surv_brcancer <- absoluteRisk(object = mod_cb_glm, + newdata = brcancer[c(1,50),], + type = "survival") + +plot(smooth_surv_brcancer)

-
-

-Other families

-
-

-glmnet +
+

Other families

-

We can also plot cumulative incidence curves for other families. For example, using the family = "glmnet", we can plot the cumulative incidence curves for the first 10 individuals in the brcancer dataset, using the tuning parameter which minimizes the 10-fold cross-validation error (\(\lambda_{min}\)):

+
+

+glmnet +

+

We can also plot cumulative incidence curves for other families. For +example, using the family = "glmnet", we can plot the +cumulative incidence curves for the first 10 individuals in the +brcancer dataset, using the tuning parameter which +minimizes the 10-fold cross-validation error (\(\lambda_{min}\)):

-mod_cb_glmnet <- fitSmoothHazard(cens ~ estrec*time + 
-                                     horTh + 
-                                     age + 
-                                     menostat + 
-                                     tsize + 
-                                     tgrade + 
-                                     pnodes + 
-                                     progrec,
-                                 data = brcancer,
-                                 time = "time", 
-                                 ratio = 1, 
-                                 family = "glmnet")
-
-smooth_risk_glmnet <- absoluteRisk(object = mod_cb_glmnet, 
-                                   newdata = brcancer[1:10,], 
-                                   s = "lambda.min")
-plot(smooth_risk_glmnet)
+mod_cb_glmnet <- fitSmoothHazard(cens ~ estrec*time + + horTh + + age + + menostat + + tsize + + tgrade + + pnodes + + progrec, + data = brcancer, + time = "time", + ratio = 1, + family = "glmnet") + +smooth_risk_glmnet <- absoluteRisk(object = mod_cb_glmnet, + newdata = brcancer[1:10,], + s = "lambda.min") +plot(smooth_risk_glmnet)

-
-

-gam -

-

Here we produce the same plot, but with family = "gam" for generalised additive models.

+
+

+gam +

+

Here we produce the same plot, but with family = "gam" for +generalised additive models.

-mod_cb_gam <- fitSmoothHazard(cens ~ estrec + time + 
-                                     horTh + 
-                                     age + 
-                                     menostat + 
-                                     tsize + 
-                                     tgrade + 
-                                     pnodes + 
-                                     progrec,
-                                 data = brcancer,
-                                 time = "time", 
-                                 ratio = 1, 
-                                 family = "gam")
-
-smooth_risk_gam <- absoluteRisk(object = mod_cb_gam, 
-                                newdata = brcancer[1:10,])
-plot(smooth_risk_gam)
+mod_cb_gam <- fitSmoothHazard(cens ~ estrec + time + + horTh + + age + + menostat + + tsize + + tgrade + + pnodes + + progrec, + data = brcancer, + time = "time", + ratio = 1, + family = "gam") + +smooth_risk_gam <- absoluteRisk(object = mod_cb_gam, + newdata = brcancer[1:10,]) +plot(smooth_risk_gam)

-
-

-Session information

+
+

Session information +

@@ -270,11 +316,13 @@

@@ -283,5 +331,7 @@

[binary image diffs omitted: regenerated figure PNGs unnamed-chunk-2-1.png through unnamed-chunk-8-1.png under articles/plotabsRisk_files/figure-html/ differ]
diff --git a/articles/plotsmoothHazard.html b/articles/plotsmoothHazard.html
index 5d7cb09c..41fb50c4 100644
--- a/articles/plotsmoothHazard.html
+++ b/articles/plotsmoothHazard.html
@@ -19,6 +19,8 @@
-
-

-One binary predictor with interaction

-

Next, we fit an interaction model with a time-varying covariate; that is, we test the hypothesis that the effect of hormonal therapy on the hazard varies with time.

+
+

One binary predictor with interaction +

+

Next, we fit an interaction model with a time-varying covariate; +that is, we test the hypothesis that the effect of hormonal therapy on the +hazard varies with time.

-mod_cb_tvc <- fitSmoothHazard(cens ~ hormon * ns(log(time), df = 3),
-                              data = brcancer,
-                              time = "time")
-#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
-summary(mod_cb_tvc)
-#> Fitting smooth hazards with case-base sampling
-#> 
-#> Sample size: 686 
-#> Number of events: 299 
-#> Number of base moments: 29900 
-#> ----
-#> 
-#> Call:
-#> fitSmoothHazard(formula = cens ~ hormon * ns(log(time), df = 3), 
-#>     data = brcancer, time = "time")
-#> 
-#> Deviance Residuals: 
-#>     Min       1Q   Median       3Q      Max  
-#> -0.1818  -0.1601  -0.1457  -0.1261   3.7916  
-#> 
-#> Coefficients:
-#>                               Estimate Std. Error z value Pr(>|z|)    
-#> (Intercept)                    -85.321     22.783  -3.745 0.000180 ***
-#> hormon                         -39.539     51.713  -0.765 0.444516    
-#> ns(log(time), df = 3)1          52.188     15.055   3.466 0.000527 ***
-#> ns(log(time), df = 3)2         149.400     44.293   3.373 0.000744 ***
-#> ns(log(time), df = 3)3          30.928      9.078   3.407 0.000657 ***
-#> hormon:ns(log(time), df = 3)1   26.200     34.384   0.762 0.446068    
-#> hormon:ns(log(time), df = 3)2   75.570    100.097   0.755 0.450267    
-#> hormon:ns(log(time), df = 3)3   16.081     20.713   0.776 0.437525    
-#> ---
-#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
-#> 
-#> (Dispersion parameter for binomial family taken to be 1)
-#> 
-#>     Null deviance: 3354.9  on 30198  degrees of freedom
-#> Residual deviance: 3271.9  on 30191  degrees of freedom
-#> AIC: 3287.9
-#> 
-#> Number of Fisher Scoring iterations: 11
-

Now we can easily plot the hazard function over time for each hormon group:

+mod_cb_tvc <- fitSmoothHazard(cens ~ hormon * ns(log(time), df = 3), + data = brcancer, + time = "time") +#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred +summary(mod_cb_tvc) +#> Fitting smooth hazards with case-base sampling +#> +#> Sample size: 686 +#> Number of events: 299 +#> Number of base moments: 29900 +#> ---- +#> +#> Call: +#> fitSmoothHazard(formula = cens ~ hormon * ns(log(time), df = 3), +#> data = brcancer, time = "time") +#> +#> Coefficients: +#> Estimate Std. Error z value Pr(>|z|) +#> (Intercept) -74.617 19.978 -3.735 0.000188 *** +#> hormon -32.356 44.755 -0.723 0.469705 +#> ns(log(time), df = 3)1 44.845 13.132 3.415 0.000638 *** +#> ns(log(time), df = 3)2 128.876 38.912 3.312 0.000926 *** +#> ns(log(time), df = 3)3 26.617 7.971 3.339 0.000839 *** +#> hormon:ns(log(time), df = 3)1 21.414 29.609 0.723 0.469528 +#> hormon:ns(log(time), df = 3)2 61.676 86.779 0.711 0.477258 +#> hormon:ns(log(time), df = 3)3 13.310 17.942 0.742 0.458177 +#> --- +#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 +#> +#> (Dispersion parameter for binomial family taken to be 1) +#> +#> Null deviance: 3354.9 on 30198 degrees of freedom +#> Residual deviance: 3275.3 on 30191 degrees of freedom +#> AIC: 3291.3 +#> +#> Number of Fisher Scoring iterations: 11
+

Now we can easily plot the hazard function over time for each +hormon group:

-plot(mod_cb_tvc,
-     hazard.params = list(xvar = "time",
-                          by = "hormon",
-                          alpha = 0.05,
-                          ylab = "Hazard")) 
+plot(mod_cb_tvc, + hazard.params = list(xvar = "time", + by = "hormon", + alpha = 0.05, + ylab = "Hazard"))

-
-

-One continuous predictor with interaction

-

Now we fit a model with an interaction between a continuous variable, estrogen receptor (in fmol), and time.

+
+

One continuous predictor with interaction +

+

Now we fit a model with an interaction between a continuous variable, +estrogen receptor (in fmol), and time.

-mod_cb_tvc <- fitSmoothHazard(cens ~ estrec * ns(log(time), df = 3),
-                              data = brcancer,
-                              time = "time")
-#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
-summary(mod_cb_tvc)
-#> Fitting smooth hazards with case-base sampling
-#> 
-#> Sample size: 686 
-#> Number of events: 299 
-#> Number of base moments: 29900 
-#> ----
-#> 
-#> Call:
-#> fitSmoothHazard(formula = cens ~ estrec * ns(log(time), df = 3), 
-#>     data = brcancer, time = "time")
-#> 
-#> Deviance Residuals: 
-#>     Min       1Q   Median       3Q      Max  
-#> -0.1879  -0.1621  -0.1426  -0.1355   4.0680  
-#> 
-#> Coefficients:
-#>                               Estimate Std. Error z value Pr(>|z|)    
-#> (Intercept)                   -62.7884    16.8829  -3.719  0.00020 ***
-#> estrec                         -0.6145     0.3129  -1.964  0.04953 *  
-#> ns(log(time), df = 3)1         36.1663    11.0712   3.267  0.00109 ** 
-#> ns(log(time), df = 3)2        107.3742    32.9163   3.262  0.00111 ** 
-#> ns(log(time), df = 3)3         21.1891     6.7225   3.152  0.00162 ** 
-#> estrec:ns(log(time), df = 3)1   0.4205     0.2102   2.001  0.04538 *  
-#> estrec:ns(log(time), df = 3)2   1.1623     0.6003   1.936  0.05286 .  
-#> estrec:ns(log(time), df = 3)3   0.2596     0.1285   2.021  0.04329 *  
-#> ---
-#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
-#> 
-#> (Dispersion parameter for binomial family taken to be 1)
-#> 
-#>     Null deviance: 3354.9  on 30198  degrees of freedom
-#> Residual deviance: 3259.0  on 30191  degrees of freedom
-#> AIC: 3275
-#> 
-#> Number of Fisher Scoring iterations: 13
-

There are now many ways to plot the time-varying effect of estrogen receptor on the hazard function. The default is to plot the 10th, 50th and 90th quantiles of the by variable:

+mod_cb_tvc <- fitSmoothHazard(cens ~ estrec * ns(log(time), df = 3), + data = brcancer, + time = "time") +#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred +summary(mod_cb_tvc) +#> Fitting smooth hazards with case-base sampling +#> +#> Sample size: 686 +#> Number of events: 299 +#> Number of base moments: 29900 +#> ---- +#> +#> Call: +#> fitSmoothHazard(formula = cens ~ estrec * ns(log(time), df = 3), +#> data = brcancer, time = "time") +#> +#> Coefficients: +#> Estimate Std. Error z value Pr(>|z|) +#> (Intercept) -109.0806 32.0380 -3.405 0.000662 *** +#> estrec -1.0734 0.5681 -1.889 0.058830 . +#> ns(log(time), df = 3)1 66.9677 21.1446 3.167 0.001540 ** +#> ns(log(time), df = 3)2 196.2522 62.0330 3.164 0.001558 ** +#> ns(log(time), df = 3)3 40.4393 12.9916 3.113 0.001854 ** +#> estrec:ns(log(time), df = 3)1 0.7285 0.3808 1.913 0.055739 . +#> estrec:ns(log(time), df = 3)2 2.0345 1.0877 1.870 0.061416 . +#> estrec:ns(log(time), df = 3)3 0.4561 0.2360 1.932 0.053300 . +#> --- +#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 +#> +#> (Dispersion parameter for binomial family taken to be 1) +#> +#> Null deviance: 3354.9 on 30198 degrees of freedom +#> Residual deviance: 3264.9 on 30191 degrees of freedom +#> AIC: 3280.9 +#> +#> Number of Fisher Scoring iterations: 14
+

There are now many ways to plot the time-varying effect of estrogen +receptor on the hazard function. The default is to plot the 10th, 50th +and 90th quantiles of the by variable:

-# computed at the 10th, 50th and 90th quantiles of estrec
-plot(mod_cb_tvc,
-     hazard.params = list(xvar = "time",
-                          by = "estrec",
-                          alpha = 1,
-                          ylab = "Hazard")) 
+# computed at the 10th, 50th and 90th quantiles of estrec +plot(mod_cb_tvc, + hazard.params = list(xvar = "time", + by = "estrec", + alpha = 1, + ylab = "Hazard"))

-

We can also show the quartiles of estrec by specifying the breaks argument. If breaks is a single number, it will be used as the number of breaks:

+

We can also show the quartiles of estrec by specifying +the breaks argument. If breaks is a single +number, it will be used as the number of breaks:

-# computed at quartiles of estrec
-plot(mod_cb_tvc,
-     hazard.params = list(xvar = c("time"),
-                          by = "estrec",
-                          alpha = 1,
-                          breaks = 4,
-                          ylab = "Hazard")) 
+# computed at quartiles of estrec +plot(mod_cb_tvc, + hazard.params = list(xvar = c("time"), + by = "estrec", + alpha = 1, + breaks = 4, + ylab = "Hazard"))

-

Alternatively, if breaks is a vector, its entries will be used as the actual break points:

+

Alternatively, if breaks is a vector, its entries will be used as +the actual break points:

-# computed where I want
-plot(mod_cb_tvc,
-     hazard.params = list(xvar = c("time"),
-                          by = "estrec",
-                          alpha = 1,
-                          breaks = c(3,2200),
-                          ylab = "Hazard")) 
+# computed where I want +plot(mod_cb_tvc, + hazard.params = list(xvar = c("time"), + by = "estrec", + alpha = 1, + breaks = c(3,2200), + ylab = "Hazard"))

-visreg2d(mod_cb_tvc, 
-         xvar = "time",
-         yvar = "estrec",
-         trans = exp,
-         print.cond = TRUE,
-         zlab = "Hazard",
-         plot.type = "image")
-
-visreg2d(mod_cb_tvc, 
-         xvar = "time",
-         yvar = "estrec",
-         trans = exp,
-         print.cond = TRUE,
-         zlab = "Hazard",
-         plot.type = "persp")
-
-# this can also work if 'rgl' is installed
-# visreg2d(mod_cb_tvc, 
-#          xvar = "time",
-#          yvar = "estrec",
-#          trans = exp,
-#          print.cond = TRUE,
-#          zlab = "Hazard",
-#          plot.type = "rgl")
+visreg2d(mod_cb_tvc, + xvar = "time", + yvar = "estrec", + trans = exp, + print.cond = TRUE, + zlab = "Hazard", + plot.type = "image") + +visreg2d(mod_cb_tvc, + xvar = "time", + yvar = "estrec", + trans = exp, + print.cond = TRUE, + zlab = "Hazard", + plot.type = "persp") + +# this can also work if 'rgl' is installed +# visreg2d(mod_cb_tvc, +# xvar = "time", +# yvar = "estrec", +# trans = exp, +# print.cond = TRUE, +# zlab = "Hazard", +# plot.type = "rgl")
-
-

-One continuous predictor with interaction and several other predictors

-

All the examples so far have only included two predictors in the regression equation. In this example, we fit a smooth hazard model with several predictors:

+
+

One continuous predictor with interaction and several other +predictors +

+

All the examples so far have only included two predictors in the +regression equation. In this example, we fit a smooth hazard model with +several predictors:

-mod_cb_tvc <- fitSmoothHazard(cens ~ estrec * ns(log(time), df = 3) + 
-                                horTh + 
-                                age + 
-                                menostat + 
-                                tsize + 
-                                tgrade + 
-                                pnodes + 
-                                progrec,
-                              data = brcancer,
-                              time = "time")
-#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
-summary(mod_cb_tvc)
-#> Fitting smooth hazards with case-base sampling
-#> 
-#> Sample size: 686 
-#> Number of events: 299 
-#> Number of base moments: 29900 
-#> ----
-#> 
-#> Call:
-#> fitSmoothHazard(formula = cens ~ estrec * ns(log(time), df = 3) + 
-#>     horTh + age + menostat + tsize + tgrade + pnodes + progrec, 
-#>     data = brcancer, time = "time")
-#> 
-#> Deviance Residuals: 
-#>     Min       1Q   Median       3Q      Max  
-#> -0.6718  -0.1616  -0.1345  -0.0956   4.0786  
-#> 
-#> Coefficients:
-#>                                 Estimate Std. Error z value Pr(>|z|)    
-#> (Intercept)                   -6.747e+01  1.841e+01  -3.665 0.000247 ***
-#> estrec                        -5.177e-01  3.279e-01  -1.579 0.114405    
-#> ns(log(time), df = 3)1         3.944e+01  1.206e+01   3.270 0.001074 ** 
-#> ns(log(time), df = 3)2         1.159e+02  3.584e+01   3.233 0.001225 ** 
-#> ns(log(time), df = 3)3         2.383e+01  7.369e+00   3.234 0.001222 ** 
-#> horThyes                      -3.352e-01  1.297e-01  -2.585 0.009734 ** 
-#> age                           -9.951e-03  9.220e-03  -1.079 0.280478    
-#> menostatPost                   3.056e-01  1.833e-01   1.667 0.095557 .  
-#> tsize                          7.209e-03  3.953e-03   1.823 0.068238 .  
-#> tgrade.L                       5.434e-01  1.907e-01   2.849 0.004387 ** 
-#> tgrade.Q                      -1.996e-01  1.227e-01  -1.627 0.103813    
-#> pnodes                         5.240e-02  7.984e-03   6.564 5.25e-11 ***
-#> progrec                       -2.140e-03  5.717e-04  -3.743 0.000182 ***
-#> estrec:ns(log(time), df = 3)1  3.555e-01  2.198e-01   1.617 0.105835    
-#> estrec:ns(log(time), df = 3)2  9.786e-01  6.289e-01   1.556 0.119699    
-#> estrec:ns(log(time), df = 3)3  2.214e-01  1.353e-01   1.636 0.101753    
-#> ---
-#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
-#> 
-#> (Dispersion parameter for binomial family taken to be 1)
-#> 
-#>     Null deviance: 3354.9  on 30198  degrees of freedom
-#> Residual deviance: 3164.1  on 30183  degrees of freedom
-#> AIC: 3196.1
-#> 
-#> Number of Fisher Scoring iterations: 13
-

In the following plot, we show the time-varying effect of estrec while controlling for all other variables. By default, each of the other terms in the model is set to its median if it is numeric, or to its most common category if it is a factor. The values of the other variables are shown in the output:

+mod_cb_tvc <- fitSmoothHazard(cens ~ estrec * ns(log(time), df = 3) + + horTh + + age + + menostat + + tsize + + tgrade + + pnodes + + progrec, + data = brcancer, + time = "time") +#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred +summary(mod_cb_tvc) +#> Fitting smooth hazards with case-base sampling +#> +#> Sample size: 686 +#> Number of events: 299 +#> Number of base moments: 29900 +#> ---- +#> +#> Call: +#> fitSmoothHazard(formula = cens ~ estrec * ns(log(time), df = 3) + +#> horTh + age + menostat + tsize + tgrade + pnodes + progrec, +#> data = brcancer, time = "time") +#> +#> Coefficients: +#> Estimate Std. Error z value Pr(>|z|) +#> (Intercept) -91.212702 24.493162 -3.724 0.000196 *** +#> estrec -0.618100 0.412486 -1.498 0.134009 +#> ns(log(time), df = 3)1 55.570819 16.218020 3.426 0.000611 *** +#> ns(log(time), df = 3)2 161.589843 47.523250 3.400 0.000673 *** +#> ns(log(time), df = 3)3 33.263824 9.783363 3.400 0.000674 *** +#> horThyes -0.352159 0.130173 -2.705 0.006824 ** +#> age -0.010362 0.009372 -1.106 0.268884 +#> menostatPost 0.271752 0.183668 1.480 0.138985 +#> tsize 0.007826 0.004045 1.935 0.053008 . +#> tgrade.L 0.534837 0.190911 2.801 0.005087 ** +#> tgrade.Q -0.224313 0.122553 -1.830 0.067200 . +#> pnodes 0.053247 0.007986 6.667 2.60e-11 *** +#> progrec -0.002255 0.000577 -3.909 9.27e-05 *** +#> estrec:ns(log(time), df = 3)1 0.425991 0.278114 1.532 0.125594 +#> estrec:ns(log(time), df = 3)2 1.167845 0.790237 1.478 0.139450 +#> estrec:ns(log(time), df = 3)3 0.262861 0.169300 1.553 0.120511 +#> --- +#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 +#> +#> (Dispersion parameter for binomial family taken to be 1) +#> +#> Null deviance: 3354.9 on 30198 degrees of freedom +#> Residual deviance: 3157.7 on 30183 degrees of freedom +#> AIC: 3189.7 +#> +#> Number of Fisher Scoring iterations: 13
+

In the following plot, we show the time-varying effect of +estrec while controlling for all other variables. By +default, each of the other terms in the model is set to its median if it +is numeric, or to its most common category if it is a factor. The +values of the other variables are shown in the output:

-plot(mod_cb_tvc,
-     hazard.params = list(xvar = "time",
-                          by = "estrec",
-                          alpha = 1,
-                          breaks = 2,
-                          ylab = "Hazard"))
-#> Conditions used in construction of plot
-#> estrec: 8 / 173
-#> horTh: no
-#> age: 53
-#> menostat: Post
-#> tsize: 25
-#> tgrade: II
-#> pnodes: 3
-#> progrec: 48
-#> offset: 0
+plot(mod_cb_tvc, + hazard.params = list(xvar = "time", + by = "estrec", + alpha = 1, + breaks = 2, + ylab = "Hazard")) +#> Conditions used in construction of plot +#> estrec: 8 / 175 +#> horTh: no +#> age: 53 +#> menostat: Post +#> tsize: 25 +#> tgrade: II +#> pnodes: 3 +#> progrec: 51 +#> offset: 0

-

You can of course set the values of the other covariates as before, i.e. by specifying the cond argument as a named list to the hazard.params argument:

+

You can of course set the values of the other covariates as before, +i.e. by specifying the cond argument as a named list to the +hazard.params argument:

-plot(mod_cb_tvc,
-     hazard.params = list(xvar = "time",
-                          by = "estrec",
-                          cond = list(tgrade = "III", age = 49),
-                          alpha = 1,
-                          breaks = 2,
-                          ylab = "Hazard"))
-#> Conditions used in construction of plot
-#> estrec: 8 / 173
-#> horTh: no
-#> age: 49
-#> menostat: Post
-#> tsize: 25
-#> tgrade: III
-#> pnodes: 3
-#> progrec: 48
-#> offset: 0
+plot(mod_cb_tvc, + hazard.params = list(xvar = "time", + by = "estrec", + cond = list(tgrade = "III", age = 49), + alpha = 1, + breaks = 2, + ylab = "Hazard")) +#> Conditions used in construction of plot +#> estrec: 8 / 175 +#> horTh: no +#> age: 49 +#> menostat: Post +#> tsize: 25 +#> tgrade: III +#> pnodes: 3 +#> progrec: 51 +#> offset: 0

-
-

-Hazard Ratio

-

In this section we illustrate how to plot hazard ratios using the plot method for objects of class singleEventCB, which is obtained by running the fitSmoothHazard function. Note that these functions have only been thoroughly tested with family = "glm".

+
+

Hazard Ratio +

+

In this section we illustrate how to plot hazard ratios using the +plot method for objects of class singleEventCB, +which is obtained by running the fitSmoothHazard +function. Note that these functions have only been thoroughly tested with +family = "glm".

In what follows, the hazard ratio for a variable \(X\) is defined as

\[
\frac{h\left(t | X=x_1, \mathbf{Z}=\mathbf{z_1} ; \hat{\beta}\right)}{h(t | X=x_0, \mathbf{Z}=\mathbf{z_0} ; \hat{\beta})}
\]
where \(h(t|\cdot;\hat{\beta})\) is the hazard rate as a function of the variable \(t\) (which is usually time, but can be any other continuous variable), \(x_1\) is the value of \(X\) for the exposed group, \(x_0\) is the value of \(X\) for the unexposed group, \(\mathbf{Z}\) are other covariates in the model which are equal to \(\mathbf{z_1}\) in the exposed and \(\mathbf{z_0}\) in the unexposed group, and \(\hat{\beta}\) are the estimated regression coefficients.

-

As indicated by the formula above, it is most instructive to plot the hazard ratio as a function of a variable \(t\) only if there is an interaction between \(t\) and \(X\). Otherwise, the resulting plot will simply be a horizontal line across time.

-
-

-Manson Trial (eprchd)

-

We use data from the Manson trial (NEJM 2003), which is included in the casebase package. This randomized clinical trial investigated the effect of estrogen plus progestin (estPro) on coronary heart disease (CHD) risk in 16,608 postmenopausal women who were 50 to 79 years of age at baseline. Participants were randomly assigned to receive estPro or placebo. The primary efficacy outcome of the trial was CHD (nonfatal myocardial infarction or death due to CHD).

-

We fit a model with the interaction between time and treatment arm. We are therefore interested in visualizing the hazard ratio of the treatment over time.

+\[
+\frac{h\left(t | X=x_1, \mathbf{Z}=\mathbf{z_1} ; \hat{\beta}\right)}{h(t | X=x_0, \mathbf{Z}=\mathbf{z_0} ; \hat{\beta})}
+\]
+where \(h(t|\cdot;\hat{\beta})\) is the hazard rate as a function of the variable \(t\) (which is usually time, but can be any other continuous variable), \(x_1\) is the value of \(X\) for the exposed group, \(x_0\) is the value of \(X\) for the unexposed group, \(\mathbf{Z}\) are other covariates in the model which are equal to \(\mathbf{z_1}\) in the exposed and \(\mathbf{z_0}\) in the unexposed group, and \(\hat{\beta}\) are the estimated regression coefficients.

+

As indicated by the formula above, it is most instructive to plot the +hazard ratio as a function of a variable \(t\) only if there is an interaction between +\(t\) and \(X\). Otherwise, the resulting plot will +simply be a horizontal line across time.
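As a concrete illustration, a sketch of the algebra for the Manson trial model fitted below (status ~ treatment*time), using the coefficient names that appear in its summary: the log hazard ratio for treatment is linear in time,

\[
\log \frac{h(t \mid \text{estPro})}{h(t \mid \text{placebo})} = \beta_{\text{treatmentestPro}} + \beta_{\text{treatmentestPro:time}} \, t,
\]

so the hazard ratio changes with \(t\) only through the interaction coefficient; if that coefficient were zero (no interaction), the ratio would reduce to the constant \(\exp(\beta_{\text{treatmentestPro}})\).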

+
+

Manson Trial (eprchd) +

+

We use data from the Manson trial (NEJM 2003), which is included in +the casebase package. This randomized clinical trial +investigated the effect of estrogen plus progestin (estPro) +on coronary heart disease (CHD) risk in 16,608 postmenopausal women who +were 50 to 79 years of age at baseline. Participants were randomly +assigned to receive estPro or placebo. The +primary efficacy outcome of the trial was CHD (nonfatal myocardial +infarction or death due to CHD).

+

We fit a model with the interaction between time and treatment arm. +We are therefore interested in visualizing the hazard ratio of the +treatment over time.

-data("eprchd")
-eprchd <- transform(eprchd, 
-                    treatment = factor(treatment, levels = c("placebo","estPro")))
-str(eprchd)
-#> 'data.frame':    16608 obs. of  3 variables:
-#>  $ time     : num  0.0833 0.0833 0.0833 0.0833 0.0833 ...
-#>  $ status   : num  0 0 0 0 0 0 0 0 0 0 ...
-#>  $ treatment: Factor w/ 2 levels "placebo","estPro": 1 1 1 1 1 1 1 1 1 1 ...
-
-fit_mason <- fitSmoothHazard(status ~ treatment*time,
-                             data = eprchd,
-                             time = "time")
-summary(fit_mason)
-#> Fitting smooth hazards with case-base sampling
-#> 
-#> Sample size: 16608 
-#> Number of events: 324 
-#> Number of base moments: 32400 
-#> ----
-#> 
-#> Call:
-#> fitSmoothHazard(formula = status ~ treatment * time, data = eprchd, 
-#>     time = "time")
-#> 
-#> Deviance Residuals: 
-#>     Min       1Q   Median       3Q      Max  
-#> -0.1686  -0.1495  -0.1461  -0.1308   3.1884  
-#> 
-#> Coefficients:
-#>                      Estimate Std. Error z value Pr(>|z|)    
-#> (Intercept)          -6.11336    0.17532 -34.870  < 2e-16 ***
-#> treatmentestPro       0.63431    0.22485   2.821  0.00479 ** 
-#> time                  0.12002    0.04767   2.518  0.01182 *  
-#> treatmentestPro:time -0.13776    0.06367  -2.164  0.03048 *  
-#> ---
-#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
-#> 
-#> (Dispersion parameter for binomial family taken to be 1)
-#> 
-#>     Null deviance: 3635.4  on 32723  degrees of freedom
-#> Residual deviance: 3625.1  on 32720  degrees of freedom
-#> AIC: 3633.1
-#> 
-#> Number of Fisher Scoring iterations: 7
-

To plot the hazard ratio, we must specify the newdata argument with a covariate pattern for the reference group. In this example, we treat the placebo as the reference group. Because we have fit an interaction with time, we also provide a sequence of times at which we would like to calculate the hazard ratio.

+data("eprchd") +eprchd <- transform(eprchd, + treatment = factor(treatment, levels = c("placebo","estPro"))) +str(eprchd) +#> 'data.frame': 16608 obs. of 3 variables: +#> $ time : num 0.0833 0.0833 0.0833 0.0833 0.0833 ... +#> $ status : num 0 0 0 0 0 0 0 0 0 0 ... +#> $ treatment: Factor w/ 2 levels "placebo","estPro": 1 1 1 1 1 1 1 1 1 1 ... + +fit_mason <- fitSmoothHazard(status ~ treatment*time, + data = eprchd, + time = "time") +summary(fit_mason) +#> Fitting smooth hazards with case-base sampling +#> +#> Sample size: 16608 +#> Number of events: 324 +#> Number of base moments: 32400 +#> ---- +#> +#> Call: +#> fitSmoothHazard(formula = status ~ treatment * time, data = eprchd, +#> time = "time") +#> +#> Coefficients: +#> Estimate Std. Error z value Pr(>|z|) +#> (Intercept) -6.08972 0.17545 -34.709 < 2e-16 *** +#> treatmentestPro 0.58336 0.22419 2.602 0.00927 ** +#> time 0.11465 0.04772 2.403 0.01627 * +#> treatmentestPro:time -0.12567 0.06336 -1.983 0.04733 * +#> --- +#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 +#> +#> (Dispersion parameter for binomial family taken to be 1) +#> +#> Null deviance: 3635.4 on 32723 degrees of freedom +#> Residual deviance: 3626.3 on 32720 degrees of freedom +#> AIC: 3634.3 +#> +#> Number of Fisher Scoring iterations: 7
+

To plot the hazard ratio, we must specify the newdata +argument with a covariate pattern for the reference group. In this +example, we treat the placebo as the reference group. +Because we have fit an interaction with time, we also provide a sequence +of times at which we would like to calculate the hazard ratio.

-newtime <- quantile(fit_mason[["originalData"]][[fit_mason[["timeVar"]]]], 
-                    probs = seq(0.01, 0.99, 0.01))
-
-# reference category
-newdata <- data.frame(treatment = factor("placebo", 
-                                         levels = c("placebo", "estPro")), 
-                      time = newtime)
-str(newdata)
-#> 'data.frame':    99 obs. of  2 variables:
-#>  $ treatment: Factor w/ 2 levels "placebo","estPro": 1 1 1 1 1 1 1 1 1 1 ...
-#>  $ time     : num  0.917 1.75 2.5 3.167 3.417 ...
-
-plot(fit_mason, 
-     type = "hr", 
-     newdata = newdata,
-     var = "treatment",
-     increment = 1,
-     xvar = "time",
-     ci = T,
-     rug = T)
+newtime <- quantile(fit_mason[["originalData"]][[fit_mason[["timeVar"]]]], + probs = seq(0.01, 0.99, 0.01)) + +# reference category +newdata <- data.frame(treatment = factor("placebo", + levels = c("placebo", "estPro")), + time = newtime) +str(newdata) +#> 'data.frame': 99 obs. of 2 variables: +#> $ treatment: Factor w/ 2 levels "placebo","estPro": 1 1 1 1 1 1 1 1 1 1 ... +#> $ time : num 0.917 1.75 2.5 3.167 3.417 ... + +plot(fit_mason, + type = "hr", + newdata = newdata, + var = "treatment", + increment = 1, + xvar = "time", + ci = T, + rug = T)

-

In the call to plot, we specify xvar, the variable plotted on the x-axis, and the var argument, which specifies the variable for which we want the hazard ratio. Setting increment = 1 indicates that we want to increment var by 1 level, which in this case gives estPro. Alternatively, we can specify the exposed argument, which should be a function that takes newdata and returns the exposed dataset. The following call is equivalent to the one above:

+

In the call to plot, we specify xvar, +which is the variable plotted on the x-axis, and the var +argument, which specifies the variable for which we want the hazard +ratio. Setting increment = 1 indicates that we want to +increment var by 1 level, which in this case gives +estPro. Alternatively, we can specify the +exposed argument, which should be a function that takes +newdata and returns the exposed dataset. The following call +is equivalent to the one above:

-plot(fit_mason, 
-     type = "hr", 
-     newdata = newdata,
-     exposed = function(data) transform(data, treatment = "estPro"),
-     xvar = "time",
-     ci = T,
-     rug = T)
+plot(fit_mason, + type = "hr", + newdata = newdata, + exposed = function(data) transform(data, treatment = "estPro"), + xvar = "time", + ci = T, + rug = T)

-

Alternatively, if we want the placebo group to be the exposed group, we can change the newdata argument to the following:

+

Alternatively, if we want the placebo group to be the +exposed group, we can change the newdata argument to the +following:

-newdata <- data.frame(treatment = factor("estPro", 
-                                         levels = c("placebo", "estPro")), 
-                      time = newtime)
-str(newdata)
-#> 'data.frame':    99 obs. of  2 variables:
-#>  $ treatment: Factor w/ 2 levels "placebo","estPro": 2 2 2 2 2 2 2 2 2 2 ...
-#>  $ time     : num  0.917 1.75 2.5 3.167 3.417 ...
-
-levels(newdata$treatment)
-#> [1] "placebo" "estPro"
-

Note that the reference category in newdata is still placebo. Therefore we must set increment = -1 in order to get the exposed dataset:

+newdata <- data.frame(treatment = factor("estPro", + levels = c("placebo", "estPro")), + time = newtime) +str(newdata) +#> 'data.frame': 99 obs. of 2 variables: +#> $ treatment: Factor w/ 2 levels "placebo","estPro": 2 2 2 2 2 2 2 2 2 2 ... +#> $ time : num 0.917 1.75 2.5 3.167 3.417 ... + +levels(newdata$treatment) +#> [1] "placebo" "estPro"
+

Note that the reference category in newdata is still +placebo. Therefore we must set increment = -1 +in order to get the exposed dataset:

-plot(fit_mason, 
-     type = "hr", 
-     newdata = newdata,
-     var = "treatment",
-     increment = -1,
-     xvar = "time",
-     ci = TRUE,
-     rug = TRUE)
+plot(fit_mason, + type = "hr", + newdata = newdata, + var = "treatment", + increment = -1, + xvar = "time", + ci = TRUE, + rug = TRUE)

-

If the \(X\) variable has more than two levels, then increment works the same way, e.g. increment = 2 will give an exposed group two levels above the value in newdata.

+

If the \(X\) variable has more than +two levels, then increment works the same way, +e.g. increment = 2 will give an exposed +group two levels above the value in newdata, as +sketched below.
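For instance, here is a hypothetical sketch: the three-level dose factor and the fit_dose model are not part of the eprchd data or this vignette, and only illustrate how increment moves across factor levels.

# hypothetical sketch: 'dose' is not a variable in eprchd, and 'fit_dose'
# stands in for a model fitted with fitSmoothHazard(status ~ dose * time, ...)
newdata_dose <- data.frame(
  dose = factor("placebo", levels = c("placebo", "lowDose", "highDose")),
  time = newtime)

plot(fit_dose,
     type = "hr",
     newdata = newdata_dose,
     var = "dose",
     increment = 2,   # two levels above "placebo", i.e. "highDose"
     xvar = "time",
     ci = TRUE,
     rug = TRUE)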

-
-

-Save results

-

In order to save the data used to make the plot, you simply have to assign the call to plot to a variable. This is particularly useful if you want to really customize the plot aesthetics:

+
+

Save results +

+

In order to save the data used to make the plot, you simply have to +assign the call to plot to a variable. This is particularly +useful if you want to really customize the plot aesthetics:

-result <- plot(fit_mason, 
-               type = "hr", 
-               newdata = newdata,
-               var = "treatment",
-               increment = -1,
-               xvar = "time",
-               ci = TRUE,
-               rug = TRUE)
+result <- plot(fit_mason, + type = "hr", + newdata = newdata, + var = "treatment", + increment = -1, + xvar = "time", + ci = TRUE, + rug = TRUE)

-head(result)
-#>    treatment      time log_hazard_ratio standarderror hazard_ratio lowerbound
-#> 1%    estPro 0.9166667      -0.50803484     0.1768734    0.6016768  0.4254107
-#> 2%    estPro 1.7500000      -0.39323869     0.1402646    0.6748676  0.5126549
-#> 3%    estPro 2.5000000      -0.28992214     0.1184836    0.7483218  0.5932462
-#> 4%    estPro 3.1666667      -0.19808522     0.1133880    0.8202999  0.6568355
-#> 5%    estPro 3.4166667      -0.16364637     0.1155103    0.8490422  0.6770282
-#> 6%    estPro 3.9166667      -0.09476868     0.1258340    0.9095833  0.7107754
-#>    upperbound
-#> 1%  0.8509777
-#> 2%  0.8884072
-#> 3%  0.9439345
-#> 4%  1.0244452
-#> 5%  1.0647603
-#> 6%  1.1639989
+head(result) +#> treatment time log_hazard_ratio standarderror hazard_ratio lowerbound +#> 1% estPro 0.9166667 -0.4681589 0.1764989 0.6261540 0.4430421 +#> 2% estPro 1.7500000 -0.3634299 0.1401219 0.6952874 0.5283144 +#> 3% estPro 2.5000000 -0.2691739 0.1184741 0.7640104 0.6056949 +#> 4% estPro 3.1666667 -0.1853907 0.1133671 0.8307796 0.6652541 +#> 5% estPro 3.4166667 -0.1539721 0.1154480 0.8572960 0.6836932 +#> 6% estPro 3.9166667 -0.0911347 0.1256429 0.9128947 0.7136302 +#> upperbound +#> 1% 0.8849471 +#> 2% 0.9150321 +#> 3% 0.9637061 +#> 4% 1.0374905 +#> 5% 1.0749797 +#> 6% 1.1677992
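For example, one possible way to rebuild the plot from the saved data with ggplot2 (a sketch: the column names are those shown by head(result) above, while the labels and theme are arbitrary choices):

library(ggplot2)

ggplot(result, aes(x = time, y = hazard_ratio)) +
  geom_ribbon(aes(ymin = lowerbound, ymax = upperbound), alpha = 0.2) +
  geom_line() +
  geom_hline(yintercept = 1, linetype = "dashed") +
  labs(x = "Time", y = "Hazard ratio") +
  theme_minimal()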
-
-

-Session information

+
+

Session information +

@@ -666,11 +748,13 @@

@@ -679,5 +763,7 @@

[binary image diffs omitted: regenerated figure PNGs under articles/plotsmoothHazard_files/figure-html/ (many-predictors-plot, plot-mason, and several unnamed-chunk images) differ]
diff --git a/articles/popTime.html b/articles/popTime.html
index feb08986..294a78cf 100644
--- a/articles/popTime.html
+++ b/articles/popTime.html
@@ -19,6 +19,8 @@

-

One benefit of these plots is that they allow you to see the incidence density. This can be seen from the distribution of the red dots in the above plot. We can see that more events are observed later on in time. Therefore a constant hazard model would not be appropriate in this instance, as it would overestimate the cumulative incidence earlier on in time and underestimate it later on.

-

The unique ‘step shape’ of the population time plot is due to the randomization of the Finnish cohorts, which was carried out on January 1 of each of the 4 years 1996 to 1999. This, coupled with the uniform December 31, 2006 censoring date, led to large numbers of men with exactly 11, 10, 9 or 8 years of follow-up.

-

It is important to note that the red points are randomly distributed across the gray area for each time of event. That is, imagine we draw a vertical line at a specific event time. We then plot the red point at a randomly sampled y-coordinate along this vertical line. This is done to avoid having all red points along the upper edge of the plot (because the subjects with the least amount of observation time are plotted at the top of the y-axis). By randomly distributing them, we can get a better sense of the incidence density.

-
-

-Exposure Stratified Population Time Plot

-

Often the observations in a study are in specific groups such as treatment arms. It may be of interest to compare the population time plots between these two groups. Here we compare the control group and the screening group. We create exposure stratified plots by specifying the exposure argument in the popTime function:

+

One benefit of these plots is that they allow you to see the +incidence density. This can be seen from the distribution of the red +dots in the above plot. We can see that more events are observed later +on in time. Therefore a constant hazard model would not be appropriate +in this instance, as it would overestimate the cumulative incidence +earlier on in time and underestimate it later on.
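One way to act on this observation is to compare a constant-hazard fit, in which time is left out of the linear predictor, with a fit that is flexible in time. The following is only a sketch (it is not part of the original vignette), assumes the casebase and splines packages are available, and uses the ERSPC variable names that appear later in this article.

library(casebase)
library(splines)
data("ERSPC", package = "casebase")

# constant (exponential) hazard: time is deliberately omitted from the formula
fit_const <- fitSmoothHazard(DeadOfPrCa ~ ScrArm,
                             data = ERSPC,
                             time = "Follow.Up.Time")

# flexible hazard: a spline of log(time) can capture the late rise in incidence
fit_flex <- fitSmoothHazard(DeadOfPrCa ~ ScrArm + ns(log(Follow.Up.Time), df = 3),
                            data = ERSPC,
                            time = "Follow.Up.Time")

# compare the implied cumulative incidence for a typical covariate profile
risk_const <- absoluteRisk(fit_const, newdata = "typical")
risk_flex  <- absoluteRisk(fit_flex, newdata = "typical")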

+

The unique ‘step shape’ of the population time plot is due to the +randomization of the Finnish cohorts, which was carried out on January 1 +of each of the 4 years 1996 to 1999. This, coupled with the uniform +December 31, 2006 censoring date, led to large numbers of men with +exactly 11, 10, 9 or 8 years of follow-up.

+

It is important to note that the red points are randomly distributed +across the gray area for each time of event. That is, imagine we draw a +vertical line at a specific event time. We then plot the red point at a +randomly sampled y-coordinate along this vertical line. This is done to +avoid having all red points along the upper edge of the plot (because +the subjects with the least amount of observation time are plotted at +the top of the y-axis). By randomly distributing them, we can get a +better sense of the incidence density.

+
+

Exposure Stratified Population Time Plot +

+

Often the observations in a study are in specific groups such as +treatment arms. It may be of interest to compare the population time +plots between these two groups. Here we compare the control group and +the screening group. We create exposure stratified plots by specifying +the exposure argument in the popTime +function:

-# Appropriately label the factor variable so that these labels appear in the 
-# stratified population time plot
-ERSPC$ScrArm <- factor(ERSPC$ScrArm, 
-                       levels = c(0,1), 
-                       labels = c("Control group", "Screening group"))
-pt_object_strat <- casebase::popTime(ERSPC, 
-                                     event = "DeadOfPrCa", 
-                                     exposure = "ScrArm")
-
## 'Follow.Up.Time' will be used as the time variable
-

We can inspect its contents and class, and see that it has an exposure attribute:

+# stratified population time plot +pt_object_strat <- casebase::popTime(ERSPC, + event = "DeadOfPrCa", + exposure = "ScrArm")
+
## 'Follow.Up.Time' will be used as the time variable
+

We can inspect its contents and class, and see that it has an exposure +attribute:

-head(pt_object_strat)
-
##           ScrArm   time event original.time original.event event status ycoord
-## 1: Control group 0.0027     0        0.0027              0     censored  88232
-## 2: Control group 0.0027     0        0.0027              0     censored  88231
-## 3: Control group 0.0027     0        0.0027              0     censored  88230
-## 4: Control group 0.0027     0        0.0027              0     censored  88229
-## 5: Control group 0.0137     0        0.0137              0     censored  88228
-## 6: Control group 0.0137     0        0.0137              0     censored  88227
-##    yc n_available
-## 1:  0           0
-## 2:  0           0
-## 3:  0           0
-## 4:  0           0
-## 5:  0           0
-## 6:  0           0
+head(pt_object_strat)
+
##           ScrArm   time event original.time original.event event status ycoord
+## 1: Control group 0.0027     0        0.0027              0     censored  88232
+## 2: Control group 0.0027     0        0.0027              0     censored  88231
+## 3: Control group 0.0027     0        0.0027              0     censored  88230
+## 4: Control group 0.0027     0        0.0027              0     censored  88229
+## 5: Control group 0.0137     0        0.0137              0     censored  88228
+## 6: Control group 0.0137     0        0.0137              0     censored  88227
+##    yc n_available
+## 1:  0           0
+## 2:  0           0
+## 3:  0           0
+## 4:  0           0
+## 5:  0           0
+## 6:  0           0
-class(pt_object_strat)
-
## [1] "popTime"    "data.table" "data.frame"
-

We can also see that the pt_object_strat has an exposure attribute which contains the name of the exposure variable in the dataset:

+class(pt_object_strat)
+
## [1] "popTime"    "data.table" "data.frame"
+

We can also see that the pt_object_strat has an exposure +attribute which contains the name of the exposure variable in the +dataset:

-attr(pt_object_strat, "exposure")
-
## [1] "ScrArm"
-

The plot method for objects of class popTime will use this exposure attribute to create exposure stratified population time plots:

+attr(pt_object_strat, "exposure") +
## [1] "ScrArm"
+

The plot method for objects of class popTime will use +this exposure attribute to create exposure stratified population time +plots:

-plot(pt_object_strat)
+plot(pt_object_strat)

-

We can also plot them side-by-side using the facet.params argument, which is a list of arguments that are passed to the facet_wrap() function in the ggplot2 package:

+

We can also plot them side-by-side using the +facet.params argument, which is a list of arguments that +are passed to the facet_wrap() +function in the ggplot2 package:

-plot(pt_object_strat,
-     facet.params = list(ncol = 2))
+plot(pt_object_strat, + facet.params = list(ncol = 2))

-
-

-Plotting the base series

-

To illustrate the casebase sampling methodology, we can also plot the base series using the add.base.series function:

+
+

Plotting the base series +

+

To illustrate the casebase sampling methodology, we can also plot the +base series using the add.base.series function:

-plot(pt_object_strat,
-     add.base.series = TRUE)
+plot(pt_object_strat, + add.base.series = TRUE)

-

Note that the theme.params argument is a list of arguments passed to the ggplot2::theme() function.

+

Note that the theme.params argument is a list of +arguments passed to the ggplot2::theme() +function.
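For example, a small sketch (these particular theme settings are arbitrary and only show how the list is forwarded to ggplot2::theme()):

plot(pt_object_strat,
     add.base.series = TRUE,
     theme.params = list(legend.position = "bottom",
                         panel.grid.minor = ggplot2::element_blank()))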

-
-

-Stem Cell Data

-

Next we show population time plots when there is a competing event. The bmtcrr dataset contains information on patients who underwent haematopoietic stem cell transplantation for acute leukaemia. This data is included in the casebase package. See help("bmtcrr", package = "casebase") for more details. We will use this dataset to further illustrate the fundamental components of a population time plot.

-

Note that for this dataset, popTime fails to identify a time variable if you don't specify one:

+
+

Stem Cell Data +

+

Next we show population time plots when there is a competing event. +The bmtcrr dataset contains information on patients who +underwent haematopoietic stem cell transplantation for acute leukaemia. +This data is included in the casebase package. See +help("bmtcrr", package = "casebase") for more details. We +will use this dataset to further illustrate the fundamental components +of a population time plot.

+

Note that for this dataset, popTime fails to +identify a time variable if you don't specify one:

-# load data
-data(bmtcrr)
-str(bmtcrr)
-
## 'data.frame':    177 obs. of  7 variables:
-##  $ Sex   : Factor w/ 2 levels "F","M": 2 1 2 1 1 2 2 1 2 1 ...
-##  $ D     : Factor w/ 2 levels "ALL","AML": 1 2 1 1 1 1 1 1 1 1 ...
-##  $ Phase : Factor w/ 4 levels "CR1","CR2","CR3",..: 4 2 3 2 2 4 1 1 1 4 ...
-##  $ Age   : int  48 23 7 26 36 17 7 17 26 8 ...
-##  $ Status: int  2 1 0 2 2 2 0 2 0 1 ...
-##  $ Source: Factor w/ 2 levels "BM+PB","PB": 1 1 1 1 1 1 1 1 1 1 ...
-##  $ ftime : num  0.67 9.5 131.77 24.03 1.47 ...
+# load data +data(bmtcrr) +str(bmtcrr)
+
## 'data.frame':    177 obs. of  7 variables:
+##  $ Sex   : Factor w/ 2 levels "F","M": 2 1 2 1 1 2 2 1 2 1 ...
+##  $ D     : Factor w/ 2 levels "ALL","AML": 1 2 1 1 1 1 1 1 1 1 ...
+##  $ Phase : Factor w/ 4 levels "CR1","CR2","CR3",..: 4 2 3 2 2 4 1 1 1 4 ...
+##  $ Age   : int  48 23 7 26 36 17 7 17 26 8 ...
+##  $ Status: int  2 1 0 2 2 2 0 2 0 1 ...
+##  $ Source: Factor w/ 2 levels "BM+PB","PB": 1 1 1 1 1 1 1 1 1 1 ...
+##  $ ftime : num  0.67 9.5 131.77 24.03 1.47 ...
-# table of event status by exposure
-table(bmtcrr$Status, bmtcrr$D)
-
##    
-##     ALL AML
-##   0  17  29
-##   1  28  28
-##   2  28  47
+# table of event status by exposure +table(bmtcrr$Status, bmtcrr$D)
+
##    
+##     ALL AML
+##   0  17  29
+##   1  28  28
+##   2  28  47
-# error because it can't determine a time variable
-popTimeData <- popTime(data = bmtcrr)
-
## Error in checkArgsTimeEvent(data = data, time = time, event = event): data does not contain time variable
-

In this case, you must be explicit about what the time variable is:

+# error because it can't determine a time variable +popTimeData <- popTime(data = bmtcrr) +
## Error in checkArgsTimeEvent(data = data, time = time, event = event): data does not contain time variable
+

In this case, you must be explicit about what the time variable +is:

-popTimeData <- popTime(data = bmtcrr, time = "ftime")
-
## 'Status' will be used as the event variable
+popTimeData <- popTime(data = bmtcrr, time = "ftime") +
## 'Status' will be used as the event variable
-class(popTimeData)
-
## [1] "popTime"    "data.table" "data.frame"
-
-

-Plotting the follow-up times for each observation

-

We first plot that area on the graph representing the observed follow-up time. Fundamentally, this area is constructed by plotting a line for each individual, where the length of each line represents their follow-up time in the cohort. The follow-up times are plotted from top (shortest follow-up time) to bottom (longest follow-up time). In practice, we instead plot a polygon using the ggplot2::geom_ribbon() function. The following figure shows this area for the bmtcrr dataset. Note that we must specify add.case.series = FALSE because the default is to add the case series:

+class(popTimeData)
+
## [1] "popTime"    "data.table" "data.frame"
+
+

Plotting the follow-up times for each observation +

+

We first plot that area on the graph representing the observed +follow-up time. Fundamentally, this area is constructed by plotting a +line for each individual, where the length of each line represents their +follow-up time in the cohort. The follow-up times are plotted from top +(shortest follow-up time) to bottom (longest follow-up time). In +practice, we instead plot a polygon using the ggplot2::geom_ribbon() +function. The following figure shows this area for the +bmtcrr dataset. Note that we must specify +add.case.series = FALSE because the default is to add the +case series:

-plot(popTimeData,
-     add.case.series = FALSE)
+plot(popTimeData, + add.case.series = FALSE)

-

Note that we can change the aesthetics of the area by using the ribbon.params argument as follows. These arguments are passed to the ggplot2::geom_ribbon() function:

+

Note that we can change the aesthetics of the area by using the +ribbon.params argument as follows. These arguments are +passed to the ggplot2::geom_ribbon() +function:

-plot(popTimeData,
-     add.case.series = FALSE,
-     ribbon.params = list(color = "red", fill = "blue", size = 2, alpha = 0.2))
+plot(popTimeData, + add.case.series = FALSE, + ribbon.params = list(color = "red", fill = "blue", size = 2, alpha = 0.2))

-
-

-Plot the Case Series

-

Next we add the case series. Note that because the Status column has a competing event (coded as 2), we must specify comprisk = TRUE (even if you don’t want to plot the competing event):

+
+

Plot the Case Series +

+

Next we add the case series. Note that because the +Status column has a competing event (coded as 2), we must +specify comprisk = TRUE (even if you don’t want to plot the +competing event):

-plot(popTimeData,
-     add.case.series = TRUE,
-     comprisk = TRUE)
+plot(popTimeData, + add.case.series = TRUE, + comprisk = TRUE)

-

In the above plot we can clearly see that many of the deaths occur at the beginning, so in this case a constant hazard assumption isn't valid. This information is useful when deciding on the type of model to use.

+

In the above plot we can clearly see that many of the deaths occur at the +beginning, so in this case a constant hazard assumption isn't valid. +This information is useful when deciding on the type of model to +use.

-
-

-Plot the Base Series

-

We can now add the base series with the add.base.series argument. Internally, the plot method calls the casebase::sampleCaseBase function to sample the base series from the total person moments. This requires us to specify the ratio of base series to case series in the ratio argument which we will leave at its default of 1. A legend is also added by default:

+
+

Plot the Base Series +

+

We can now add the base series with the add.base.series +argument. Internally, the plot method calls the +casebase::sampleCaseBase function to sample the base series +from the total person moments. This requires us to specify the ratio of +base series to case series in the ratio argument which we +will leave at its default of 1. A legend is also added by default:

-plot(popTimeData,
-     add.case.series = TRUE,
-     add.base.series = TRUE,
-     comprisk = TRUE)
+plot(popTimeData, + add.case.series = TRUE, + add.base.series = TRUE, + comprisk = TRUE)

-
-

-Plot the Competing event

-

We specify add.competing.event = TRUE in order to also plot the competing event. Note that, like the case series, the competing event is sampled randomly on the vertical axis in order to see the incidence density.

+
+

Plot the Competing event +

+

We specify add.competing.event = TRUE in order to also plot the competing event. Note that, like the case series, the competing event is sampled randomly on the vertical axis in order to see the incidence density.

-plot(popTimeData,
-     add.case.series = TRUE,
-     add.base.series = TRUE,
-     add.competing.event = TRUE,
-     comprisk = TRUE)
+plot(popTimeData,
+     add.case.series = TRUE,
+     add.base.series = TRUE,
+     add.competing.event = TRUE,
+     comprisk = TRUE)

-

We can also plot only the case series and the competing event (or any combination):

+

We can also plot only the case series and the competing event (or any combination):

-plot(popTimeData,
-     add.case.series = TRUE,
-     add.base.series = FALSE,
-     add.competing.event = TRUE,
-     comprisk = TRUE)
+plot(popTimeData,
+     add.case.series = TRUE,
+     add.base.series = FALSE,
+     add.competing.event = TRUE,
+     comprisk = TRUE)

-
-

-Stratified by Disease

-

Next we stratify by disease: lymphoblastic or myeloblastic leukemia, abbreviated as ALL and AML, respectively. We must specify the exposure variable. Furthermore, it is important to properly label the factor variable corresponding to the exposure variable; this ensures proper labeling of the panels:

+
+

Stratified by Disease +

+

Next we stratify by disease: lymphoblastic or myeloblastic leukemia, abbreviated as ALL and AML, respectively. We must specify the exposure variable. Furthermore, it is important to properly label the factor variable corresponding to the exposure variable; this ensures proper labeling of the panels:

-# create 'popTime' object
-popTimeData <- popTime(data = bmtcrr, time = "ftime", exposure = "D")
-
## 'Status' will be used as the event variable
+# create 'popTime' object
+popTimeData <- popTime(data = bmtcrr, time = "ftime", exposure = "D")
+
## 'Status' will be used as the event variable
-attr(popTimeData, "exposure")
-
## [1] "D"
+attr(popTimeData, "exposure")
+
## [1] "D"
-plot(popTimeData,
-     add.case.series = TRUE,
-     add.base.series = TRUE,
-     add.competing.event = TRUE,
-     comprisk = TRUE)
+plot(popTimeData,
+     add.case.series = TRUE,
+     add.base.series = TRUE,
+     add.competing.event = TRUE,
+     comprisk = TRUE)
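In bmtcrr the exposure factor D already carries the informative labels ALL and AML. If it did not, a relabelling step along the following lines could be run before calling popTime(); this is a sketch only, and the expanded labels are made up for illustration:

# hypothetical relabelling of the exposure factor (labels for illustration only)
bmtcrr <- transform(bmtcrr,
                    D = factor(D, levels = c("ALL", "AML"),
                               labels = c("ALL (lymphoblastic)",
                                          "AML (myeloblastic)")))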

-
-

-Change color points and legend labels

-

Here is some code to change the point colors and legend labels. For a more thorough description, please see the Customizing Population Time Plots vignette.

+
+

Change color points and legend labels +

+

Here is some code to change the point colors and legend labels. For a more thorough description, please see the Customizing Population Time Plots vignette.

-plot(popTimeData,
-     add.case.series = TRUE,
-     add.base.series = TRUE,
-     add.competing.event = TRUE,
-     comprisk = TRUE,
-     case.params = list(mapping = aes(x = time, y = yc, colour = "Relapse", fill = "Relapse")),
-     base.params = list(mapping = aes(x = time, y = ycoord, colour = "Base series", fill = "Base series")),
-     competing.params = list(mapping = aes(x = time, y = yc, colour = "Competing event", fill = "Competing event")),
-     fill.params = list(name = "Legend Name",
-                        breaks = c("Relapse", "Base series", "Competing event"),
-                        values = c("Relapse" = "blue", 
-                                   "Competing event" = "red", 
-                                   "Base series" = "orange"))
-     )
-
## Warning in plot.popTime(popTimeData, add.case.series = TRUE, add.base.series
-## = TRUE, : fill.params has been specified by the user but color.params has not.
-## Setting color.params to be equal to fill.params.
+plot(popTimeData,
+     add.case.series = TRUE,
+     add.base.series = TRUE,
+     add.competing.event = TRUE,
+     comprisk = TRUE,
+     case.params = list(mapping = aes(x = time, y = yc, colour = "Relapse", fill = "Relapse")),
+     base.params = list(mapping = aes(x = time, y = ycoord, colour = "Base series", fill = "Base series")),
+     competing.params = list(mapping = aes(x = time, y = yc, colour = "Competing event", fill = "Competing event")),
+     fill.params = list(name = "Legend Name",
+                        breaks = c("Relapse", "Base series", "Competing event"),
+                        values = c("Relapse" = "blue", 
+                                   "Competing event" = "red", 
+                                   "Base series" = "orange"))
+     )
+
## Warning in plot.popTime(popTimeData, add.case.series = TRUE, add.base.series =
+## TRUE, : fill.params has been specified by the user but color.params has not.
+## Setting color.params to be equal to fill.params.

-
-

-Veteran Data

-

Below are the steps to create a population time plot for the Veterans’ Administration Lung Cancer study (see help("veteran", package = "survival") for more details on this dataset).

+
+

Veteran Data +

+

Below are the steps to create a population time plot for the +Veterans’ Administration Lung Cancer study (see +help("veteran", package = "survival") for more details on +this dataset).

-# veteran data in library(survival)
-data("veteran")
-str(veteran)
-
## 'data.frame':    137 obs. of  8 variables:
-##  $ trt     : num  1 1 1 1 1 1 1 1 1 1 ...
-##  $ celltype: Factor w/ 4 levels "squamous","smallcell",..: 1 1 1 1 1 1 1 1 1 1 ...
-##  $ time    : num  72 411 228 126 118 10 82 110 314 100 ...
-##  $ status  : num  1 1 1 1 1 1 1 1 1 0 ...
-##  $ karno   : num  60 70 60 60 70 20 40 80 50 70 ...
-##  $ diagtime: num  7 5 3 9 11 5 10 29 18 6 ...
-##  $ age     : num  69 64 38 63 65 49 69 68 43 70 ...
-##  $ prior   : num  0 10 0 10 10 0 10 0 0 0 ...
+# veteran data in library(survival)
+data("veteran")
+
## Warning in data("veteran"): data set 'veteran' not found
-popTimeData <- casebase::popTime(data = veteran)
-
## 'time' will be used as the time variable
-
## 'status' will be used as the event variable
-
-class(popTimeData)
-
## [1] "popTime"    "data.table" "data.frame"
+str(veteran)
+
## 'data.frame':    137 obs. of  8 variables:
+##  $ trt     : num  1 1 1 1 1 1 1 1 1 1 ...
+##  $ celltype: Factor w/ 4 levels "squamous","smallcell",..: 1 1 1 1 1 1 1 1 1 1 ...
+##  $ time    : num  72 411 228 126 118 10 82 110 314 100 ...
+##  $ status  : num  1 1 1 1 1 1 1 1 1 0 ...
+##  $ karno   : num  60 70 60 60 70 20 40 80 50 70 ...
+##  $ diagtime: num  7 5 3 9 11 5 10 29 18 6 ...
+##  $ age     : num  69 64 38 63 65 49 69 68 43 70 ...
+##  $ prior   : num  0 10 0 10 10 0 10 0 0 0 ...
+
+popTimeData <- casebase::popTime(data = veteran)
+
## 'time' will be used as the time variable
+
## 'status' will be used as the event variable
-plot(popTimeData)
+class(popTimeData)
+
## [1] "popTime"    "data.table" "data.frame"
+
+plot(popTimeData)

-

We can see in this example that the dots are fairly evenly spread out. That is, we don’t see any particular clusters of red dots, which suggests that a constant hazard assumption may be appropriate.

-
-

-Stratified by treatment population time plot

-

In this example we compare the standard and test treatment groups. A reminder that this is done by simply specifying the exposure argument in the casebase::popTime function:

-
-# Label the factor so that it appears in the plot
-veteran <- transform(veteran, trt = factor(trt, levels = 1:2,
-                                           labels = c("standard", "test")))
-
-# create 'popTime' object
-popTimeData <- popTime(data = veteran, exposure = "trt")
-
## 'time' will be used as the time variable
-
## 'status' will be used as the event variable
-
-# object of class 'popTime'
-class(popTimeData)
-
## [1] "popTime"    "data.table" "data.frame"
+

We can see in this example that the dots are fairly evenly spread out. That is, we don’t see any particular clusters of red dots, which suggests that a constant hazard assumption may be appropriate.
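One rough way to check this impression, sketched here under the assumption that an intercept-only parametric fit is a reasonable summary, is to compare exponential and Weibull models with survival::survreg(); the exponential model is the Weibull model with its scale fixed at 1, so a one-degree-of-freedom likelihood-ratio test applies:

library(survival)

# intercept-only exponential (constant hazard) and Weibull fits
fit_exp <- survreg(Surv(time, status) ~ 1, data = veteran, dist = "exponential")
fit_wei <- survreg(Surv(time, status) ~ 1, data = veteran, dist = "weibull")

# likelihood-ratio test of the constant-hazard restriction
lrt <- 2 * (fit_wei$loglik[2] - fit_exp$loglik[2])
pchisq(lrt, df = 1, lower.tail = FALSE)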

+
+

Stratified by treatment population time plot +

+

In this example we compare the standard and test treatment groups. A +reminder that this is done by simply specifying the +exposure argument in the casebase::popTime +function:

+
+# Label the factor so that it appears in the plot
+veteran <- transform(veteran, trt = factor(trt, levels = 1:2,
+                                           labels = c("standard", "test")))
+
+# create 'popTime' object
+popTimeData <- popTime(data = veteran, exposure = "trt")
+
## 'time' will be used as the time variable
+
## 'status' will be used as the event variable
-# has name of exposure variable as an attribute
-attr(popTimeData, "exposure")
-
## [1] "trt"
-

Again, we simply pass this object to the plot function to get an exposure stratified population time plot:

+# object of class 'popTime'
+class(popTimeData)
+
## [1] "popTime"    "data.table" "data.frame"
-# plot method for objects of class 'popTime'
-plot(popTimeData)
+# has name of exposure variable as an attribute
+attr(popTimeData, "exposure")
+
## [1] "trt"
+

Again, we simply pass this object to the plot function +to get an exposure stratified population time plot:

+
+# plot method for objects of class 'popTime'
+plot(popTimeData)

-
-

-Stanford Heart Transplant Data

-

Population time plots also allow you to explain patterns in the data. We use the Stanford Heart Transplant Data to demonstrate this. See help("heart", package = "survival") for details about this dataset. For this example, we must create a time variable, because we only have the start and stop times. This is a good example to show that population time plots are also valid for this type of data (i.e. subjects who have different entry times) because we are only plotting the time spent in the study on the x-axis.

-
-# data from library(survival)
-data("heart")
-str(heart)
-
## 'data.frame':    172 obs. of  8 variables:
-##  $ start     : num  0 0 0 1 0 36 0 0 0 51 ...
-##  $ stop      : num  50 6 1 16 36 39 18 3 51 675 ...
-##  $ event     : num  1 1 0 1 0 1 1 1 0 1 ...
-##  $ age       : num  -17.16 3.84 6.3 6.3 -7.74 ...
-##  $ year      : num  0.123 0.255 0.266 0.266 0.49 ...
-##  $ surgery   : num  0 0 0 0 0 0 0 0 0 0 ...
-##  $ transplant: Factor w/ 2 levels "0","1": 1 1 1 2 1 2 1 1 1 2 ...
-##  $ id        : num  1 2 3 3 4 4 5 6 7 7 ...
+
+

Stanford Heart Transplant Data +

+

Population time plots also allow you to explain patterns in the data. +We use the Stanford Heart Transplant Data to demonstrate this. See +help("heart", package = "survival") for details about this +dataset. For this example, we must create a time variable, because we +only have the start and stop times. This is a good example to show that +population time plots are also valid for this type of data +(i.e. subjects who have different entry times) because we are only +plotting the time spent in the study on the x-axis.

-# create time variable for time in study
-heart <- transform(heart,
-                   time = stop - start,
-                   transplant = factor(transplant,
-                                       labels = c("no transplant", "transplant")))
-
-# stratify by transplant indicator
-popTimeData <- popTime(data = heart, exposure = "transplant")
-
## 'time' will be used as the time variable
-
## 'event' will be used as the event variable
-
-plot(popTimeData)
+# data from library(survival)
+data("heart")
+str(heart)
+
## 'data.frame':    172 obs. of  8 variables:
+##  $ start     : num  0 0 0 1 0 36 0 0 0 51 ...
+##  $ stop      : num  50 6 1 16 36 39 18 3 51 675 ...
+##  $ event     : num  1 1 0 1 0 1 1 1 0 1 ...
+##  $ age       : num  -17.16 3.84 6.3 6.3 -7.74 ...
+##  $ year      : num  0.123 0.255 0.266 0.266 0.49 ...
+##  $ surgery   : num  0 0 0 0 0 0 0 0 0 0 ...
+##  $ transplant: Factor w/ 2 levels "0","1": 1 1 1 2 1 2 1 1 1 2 ...
+##  $ id        : num  1 2 3 3 4 4 5 6 7 7 ...
+
+# create time variable for time in study
+heart <- transform(heart,
+                   time = stop - start,
+                   transplant = factor(transplant,
+                                       labels = c("no transplant", "transplant")))
+
+# stratify by transplant indicator
+popTimeData <- popTime(data = heart, exposure = "transplant")
+
## 'time' will be used as the time variable
+
## 'event' will be used as the event variable
+
+plot(popTimeData)

-

In the plot above we see that those who didn’t receive a transplant died very early (many red dots at the start of the x-axis). Those who did receive a transplant have much better survival (as indicated by the gray area). Does this show clear evidence that getting a heart transplant increases survival? Not exactly. This is a classic case of confounding by indication. In this study, the doctors only gave a transplant to the healthier patients because they had a better chance of surviving surgery.

+

In the plot above we see that those who didn’t receive a transplant died very early (many red dots at the start of the x-axis). Those who did receive a transplant have much better survival (as indicated by the gray area). Does this show clear evidence that getting a heart transplant increases survival? Not exactly. This is a classic case of confounding by indication. In this study, the doctors only gave a transplant to the healthier patients because they had a better chance of surviving surgery.

-
-

-NCCTG Lung Cancer Data

-

The following example uses data on survival in patients with advanced lung cancer from the North Central Cancer Treatment Group. See help("cancer", package = "survival") for details about this dataset.

-
-# data from library(survival)
-data("cancer")
-str(cancer)
-
## 'data.frame':    228 obs. of  10 variables:
-##  $ inst     : num  3 3 3 5 1 12 7 11 1 7 ...
-##  $ time     : num  306 455 1010 210 883 ...
-##  $ status   : num  2 2 1 2 2 1 2 2 2 2 ...
-##  $ age      : num  74 68 56 57 60 74 68 71 53 61 ...
-##  $ sex      : num  1 1 1 1 1 1 2 2 1 1 ...
-##  $ ph.ecog  : num  1 0 0 1 0 1 2 2 1 2 ...
-##  $ ph.karno : num  90 90 90 90 100 50 70 60 70 70 ...
-##  $ pat.karno: num  100 90 90 60 90 80 60 80 80 70 ...
-##  $ meal.cal : num  1175 1225 NA 1150 NA ...
-##  $ wt.loss  : num  NA 15 15 11 0 0 10 1 16 34 ...
+
+

NCCTG Lung Cancer Data +

+

The following example uses data on survival in patients with advanced lung cancer from the North Central Cancer Treatment Group. See help("cancer", package = "survival") for details about this dataset.

-# since the event indicator 'status' is numeric, it must have
-# 0 for censored and 1 for event
-cancer <- transform(cancer,
-                    status = status - 1,
-                    sex = factor(sex, levels = 1:2,
-                                 labels = c("Male", "Female")))
-
-popTimeData <- popTime(data = cancer)
-
## 'time' will be used as the time variable
-
## 'status' will be used as the event variable
-
-plot(popTimeData)
+# data from library(survival)
+data("cancer")
+str(cancer)
+
## 'data.frame':    228 obs. of  10 variables:
+##  $ inst     : num  3 3 3 5 1 12 7 11 1 7 ...
+##  $ time     : num  306 455 1010 210 883 ...
+##  $ status   : num  2 2 1 2 2 1 2 2 2 2 ...
+##  $ age      : num  74 68 56 57 60 74 68 71 53 61 ...
+##  $ sex      : num  1 1 1 1 1 1 2 2 1 1 ...
+##  $ ph.ecog  : num  1 0 0 1 0 1 2 2 1 2 ...
+##  $ ph.karno : num  90 90 90 90 100 50 70 60 70 70 ...
+##  $ pat.karno: num  100 90 90 60 90 80 60 80 80 70 ...
+##  $ meal.cal : num  1175 1225 NA 1150 NA ...
+##  $ wt.loss  : num  NA 15 15 11 0 0 10 1 16 34 ...
+
+# since the event indicator 'status' is numeric, it must have
+# 0 for censored and 1 for event
+cancer <- transform(cancer,
+                    status = status - 1,
+                    sex = factor(sex, levels = 1:2,
+                                 labels = c("Male", "Female")))
+
+popTimeData <- popTime(data = cancer)
+
## 'time' will be used as the time variable
+
## 'status' will be used as the event variable
+
+plot(popTimeData)

-
-

-Stratified by gender

-

We can also switch back to the default ggplot2 theme by specifying casebase.theme = FALSE:

-
-popTimeData <- popTime(data = cancer, exposure = "sex")
-
## 'time' will be used as the time variable
-
## 'status' will be used as the event variable
-
-plot(popTimeData,
-     casebase.theme = FALSE)
+
+

Stratified by gender +

+

We can also switch back to the default ggplot2 theme by +specifying casebase.theme = FALSE:

+
+popTimeData <- popTime(data = cancer, exposure = "sex")
+
## 'time' will be used as the time variable
+
## 'status' will be used as the event variable
+
+plot(popTimeData,
+     casebase.theme = FALSE)

-
-

-Simulated Data Example

+
+

Simulated Data Example +

Below is an example based on simulated data.

-
-

-Simulate the data

-
-set.seed(1)
-nobs <- 500
-
-# simulation parameters
-a1 <- 1.0
-b1 <- 200
-a2 <- 1.0
-b2 <- 50
-c1 <- 0.0
-c2 <- 0.0
-
-# end of study time
-eost <- 10
-
-# e event type 0-censored, 1-event of interest, 2-competing event
-# t observed time/endpoint
-# z is a binary covariate
-DTsim <- data.table(ID = seq_len(nobs), z=rbinom(nobs, 1, 0.5))
-setkey(DTsim, ID)
-DTsim[,`:=` (event_time = rweibull(nobs, a1, b1 * exp(z * c1)^(-1/a1)),
-             competing_time = rweibull(nobs, a2, b2 * exp(z * c2)^(-1/a2)),
-             end_of_study_time = eost)]
-DTsim[,`:=`(event = 1 * (event_time < competing_time) +
-                2 * (event_time >= competing_time),
-            time = pmin(event_time, competing_time))]
-DTsim[time >= end_of_study_time, event := 0]
-DTsim[time >= end_of_study_time, time:=end_of_study_time]
+
+

Simulate the data +

+
+set.seed(1)
+nobs <- 500
+
+# simulation parameters
+a1 <- 1.0
+b1 <- 200
+a2 <- 1.0
+b2 <- 50
+c1 <- 0.0
+c2 <- 0.0
+
+# end of study time
+eost <- 10
+
+# e event type 0-censored, 1-event of interest, 2-competing event
+# t observed time/endpoint
+# z is a binary covariate
+DTsim <- data.table(ID = seq_len(nobs), z=rbinom(nobs, 1, 0.5))
+setkey(DTsim, ID)
+DTsim[,`:=` (event_time = rweibull(nobs, a1, b1 * exp(z * c1)^(-1/a1)),
+             competing_time = rweibull(nobs, a2, b2 * exp(z * c2)^(-1/a2)),
+             end_of_study_time = eost)]
+DTsim[,`:=`(event = 1 * (event_time < competing_time) +
+                2 * (event_time >= competing_time),
+            time = pmin(event_time, competing_time))]
+DTsim[time >= end_of_study_time, event := 0]
+DTsim[time >= end_of_study_time, time:=end_of_study_time]
-
-

-Population Time Plot

-
-# create 'popTime' object
-popTimeData <- popTime(data = DTsim, time = "time", event = "event")
-plot(popTimeData)
+
+

Population Time Plot +

+
+# create 'popTime' object
+popTimeData <- popTime(data = DTsim, time = "time", event = "event")
+plot(popTimeData)

-
-

-Stratified by Binary Covariate z

-
-# stratified by binary covariate z
-popTimeData <- popTime(data = DTsim, time = "time", event = "event", exposure = "z")
-
-# we can line up the plots side-by-side instead of one on top of the other
-# we can also change the theme by adding a ggplot2 theme to the plot object
-plot(popTimeData,
-     facet.params = list(ncol = 2)) + theme_linedraw()
+
+

Stratified by Binary Covariate z +

+
+# stratified by binary covariate z
+popTimeData <- popTime(data = DTsim, time = "time", event = "event", exposure = "z")
+
+# we can line up the plots side-by-side instead of one on top of the other
+# we can also change the theme by adding a ggplot2 theme to the plot object
+plot(popTimeData,
+     facet.params = list(ncol = 2)) + theme_linedraw()

-
-

-Session information

-
## R version 4.0.2 (2020-06-22)
-## Platform: x86_64-pc-linux-gnu (64-bit)
-## Running under: Ubuntu 16.04.6 LTS
-## 
-## Matrix products: default
-## BLAS:   /usr/lib/openblas-base/libblas.so.3
-## LAPACK: /usr/lib/libopenblasp-r0.2.18.so
-## 
-## attached base packages:
-## [1] stats     graphics  grDevices utils     datasets  methods   base     
-## 
-## other attached packages:
-## [1] data.table_1.13.6   ggplot2_3.3.3       casebase_0.9.1.9999
-## [4] survival_3.1-12    
-## 
-## loaded via a namespace (and not attached):
-##  [1] highr_0.8         pillar_1.4.7      compiler_4.0.2    tools_4.0.2      
-##  [5] digest_0.6.27     nlme_3.1-148      evaluate_0.14     memoise_2.0.0    
-##  [9] lifecycle_0.2.0   tibble_3.0.6      gtable_0.3.0      lattice_0.20-41  
-## [13] mgcv_1.8-31       pkgconfig_2.0.3   rlang_0.4.10      Matrix_1.2-18    
-## [17] yaml_2.2.1        pkgdown_1.6.1     xfun_0.20         fastmap_1.1.0    
-## [21] withr_2.4.1       stringr_1.4.0     knitr_1.31        vctrs_0.3.6      
-## [25] desc_1.2.0        fs_1.5.0          systemfonts_1.0.0 stats4_4.0.2     
-## [29] rprojroot_2.0.2   grid_4.0.2        glue_1.4.2        R6_2.5.0         
-## [33] textshaping_0.2.1 VGAM_1.1-5        rmarkdown_2.6     farver_2.0.3     
-## [37] magrittr_2.0.1    scales_1.1.1      htmltools_0.5.1.1 ellipsis_0.3.1   
-## [41] splines_4.0.2     assertthat_0.2.1  colorspace_2.0-0  labeling_0.4.2   
-## [45] ragg_0.4.1        stringi_1.5.3     munsell_0.5.0     cachem_1.0.3     
-## [49] crayon_1.4.0
+
+

Session information +

+
## R version 4.3.1 (2023-06-16)
+## Platform: x86_64-pc-linux-gnu (64-bit)
+## Running under: Ubuntu 22.04.2 LTS
+## 
+## Matrix products: default
+## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
+## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
+## 
+## attached base packages:
+## [1] stats     graphics  grDevices utils     datasets  methods   base     
+## 
+## other attached packages:
+## [1] data.table_1.14.8    ggplot2_3.4.2        casebase_0.10.2.9999
+## [4] survival_3.5-5      
+## 
+## loaded via a namespace (and not attached):
+##  [1] sass_0.4.7        utf8_1.2.3        generics_0.1.3    stringi_1.7.12   
+##  [5] lattice_0.21-8    digest_0.6.33     magrittr_2.0.3    evaluate_0.21    
+##  [9] grid_4.3.1        fastmap_1.1.1     rprojroot_2.0.3   jsonlite_1.8.7   
+## [13] Matrix_1.5-4.1    mgcv_1.8-42       purrr_1.0.1       fansi_1.0.4      
+## [17] scales_1.2.1      textshaping_0.3.6 jquerylib_0.1.4   cli_3.6.1        
+## [21] rlang_1.1.1       munsell_0.5.0     splines_4.3.1     withr_2.5.0      
+## [25] cachem_1.0.8      yaml_2.3.7        tools_4.3.1       memoise_2.0.1    
+## [29] dplyr_1.1.2       colorspace_2.1-0  VGAM_1.1-8        vctrs_0.6.3      
+## [33] R6_2.5.1          stats4_4.3.1      lifecycle_1.0.3   stringr_1.5.0    
+## [37] fs_1.6.3          ragg_1.2.5        pkgconfig_2.0.3   desc_1.4.2       
+## [41] pkgdown_2.0.7     bslib_0.5.0       pillar_1.9.0      gtable_0.3.3     
+## [45] glue_1.6.2        systemfonts_1.0.4 highr_0.10        xfun_0.39        
+## [49] tibble_3.2.1      tidyselect_1.2.0  knitr_1.43        farver_2.1.1     
+## [53] htmltools_0.5.5   nlme_3.1-162      labeling_0.4.2    rmarkdown_2.23   
+## [57] compiler_4.3.1
@@ -619,11 +749,13 @@

@@ -632,5 +764,7 @@

+ + diff --git a/articles/popTime_files/figure-html/unnamed-chunk-10-1.png b/articles/popTime_files/figure-html/unnamed-chunk-10-1.png index 3fccce97..6b564956 100644 Binary files a/articles/popTime_files/figure-html/unnamed-chunk-10-1.png and b/articles/popTime_files/figure-html/unnamed-chunk-10-1.png differ diff --git a/articles/popTime_files/figure-html/unnamed-chunk-13-1.png b/articles/popTime_files/figure-html/unnamed-chunk-13-1.png index b891f010..50636fc4 100644 Binary files a/articles/popTime_files/figure-html/unnamed-chunk-13-1.png and b/articles/popTime_files/figure-html/unnamed-chunk-13-1.png differ diff --git a/articles/popTime_files/figure-html/unnamed-chunk-14-1.png b/articles/popTime_files/figure-html/unnamed-chunk-14-1.png index 2ffffb36..52216553 100644 Binary files a/articles/popTime_files/figure-html/unnamed-chunk-14-1.png and b/articles/popTime_files/figure-html/unnamed-chunk-14-1.png differ diff --git a/articles/popTime_files/figure-html/unnamed-chunk-15-1.png b/articles/popTime_files/figure-html/unnamed-chunk-15-1.png index 84e16845..97b224d0 100644 Binary files a/articles/popTime_files/figure-html/unnamed-chunk-15-1.png and b/articles/popTime_files/figure-html/unnamed-chunk-15-1.png differ diff --git a/articles/popTime_files/figure-html/unnamed-chunk-16-1.png b/articles/popTime_files/figure-html/unnamed-chunk-16-1.png index 1602d595..f0857fe6 100644 Binary files a/articles/popTime_files/figure-html/unnamed-chunk-16-1.png and b/articles/popTime_files/figure-html/unnamed-chunk-16-1.png differ diff --git a/articles/popTime_files/figure-html/unnamed-chunk-17-1.png b/articles/popTime_files/figure-html/unnamed-chunk-17-1.png index 97a02e70..c3f9a030 100644 Binary files a/articles/popTime_files/figure-html/unnamed-chunk-17-1.png and b/articles/popTime_files/figure-html/unnamed-chunk-17-1.png differ diff --git a/articles/popTime_files/figure-html/unnamed-chunk-18-1.png b/articles/popTime_files/figure-html/unnamed-chunk-18-1.png index bdf0a4b5..0d930a49 100644 Binary files a/articles/popTime_files/figure-html/unnamed-chunk-18-1.png and b/articles/popTime_files/figure-html/unnamed-chunk-18-1.png differ diff --git a/articles/popTime_files/figure-html/unnamed-chunk-19-1.png b/articles/popTime_files/figure-html/unnamed-chunk-19-1.png index 2c4b07c5..0f6ff40d 100644 Binary files a/articles/popTime_files/figure-html/unnamed-chunk-19-1.png and b/articles/popTime_files/figure-html/unnamed-chunk-19-1.png differ diff --git a/articles/popTime_files/figure-html/unnamed-chunk-20-1.png b/articles/popTime_files/figure-html/unnamed-chunk-20-1.png index 08b3ddb6..cc333397 100644 Binary files a/articles/popTime_files/figure-html/unnamed-chunk-20-1.png and b/articles/popTime_files/figure-html/unnamed-chunk-20-1.png differ diff --git a/articles/popTime_files/figure-html/unnamed-chunk-21-1.png b/articles/popTime_files/figure-html/unnamed-chunk-21-1.png index acfb8fb8..4812722d 100644 Binary files a/articles/popTime_files/figure-html/unnamed-chunk-21-1.png and b/articles/popTime_files/figure-html/unnamed-chunk-21-1.png differ diff --git a/articles/popTime_files/figure-html/unnamed-chunk-23-1.png b/articles/popTime_files/figure-html/unnamed-chunk-23-1.png index 09bf76e0..0cf05758 100644 Binary files a/articles/popTime_files/figure-html/unnamed-chunk-23-1.png and b/articles/popTime_files/figure-html/unnamed-chunk-23-1.png differ diff --git a/articles/popTime_files/figure-html/unnamed-chunk-24-1.png b/articles/popTime_files/figure-html/unnamed-chunk-24-1.png index 2066a09a..6ba649cd 100644 Binary 
files a/articles/popTime_files/figure-html/unnamed-chunk-24-1.png and b/articles/popTime_files/figure-html/unnamed-chunk-24-1.png differ diff --git a/articles/popTime_files/figure-html/unnamed-chunk-25-1.png b/articles/popTime_files/figure-html/unnamed-chunk-25-1.png index 1fda6d80..f3c4c834 100644 Binary files a/articles/popTime_files/figure-html/unnamed-chunk-25-1.png and b/articles/popTime_files/figure-html/unnamed-chunk-25-1.png differ diff --git a/articles/popTime_files/figure-html/unnamed-chunk-26-1.png b/articles/popTime_files/figure-html/unnamed-chunk-26-1.png index 0a79794f..2010b50f 100644 Binary files a/articles/popTime_files/figure-html/unnamed-chunk-26-1.png and b/articles/popTime_files/figure-html/unnamed-chunk-26-1.png differ diff --git a/articles/popTime_files/figure-html/unnamed-chunk-28-1.png b/articles/popTime_files/figure-html/unnamed-chunk-28-1.png index be3a7b79..54e5636b 100644 Binary files a/articles/popTime_files/figure-html/unnamed-chunk-28-1.png and b/articles/popTime_files/figure-html/unnamed-chunk-28-1.png differ diff --git a/articles/popTime_files/figure-html/unnamed-chunk-29-1.png b/articles/popTime_files/figure-html/unnamed-chunk-29-1.png index e8e9a2a2..7e7df86e 100644 Binary files a/articles/popTime_files/figure-html/unnamed-chunk-29-1.png and b/articles/popTime_files/figure-html/unnamed-chunk-29-1.png differ diff --git a/articles/popTime_files/figure-html/unnamed-chunk-4-1.png b/articles/popTime_files/figure-html/unnamed-chunk-4-1.png index af1f3795..050e262e 100644 Binary files a/articles/popTime_files/figure-html/unnamed-chunk-4-1.png and b/articles/popTime_files/figure-html/unnamed-chunk-4-1.png differ diff --git a/articles/popTime_files/figure-html/unnamed-chunk-8-1.png b/articles/popTime_files/figure-html/unnamed-chunk-8-1.png index 24e31e96..3460358e 100644 Binary files a/articles/popTime_files/figure-html/unnamed-chunk-8-1.png and b/articles/popTime_files/figure-html/unnamed-chunk-8-1.png differ diff --git a/articles/popTime_files/figure-html/unnamed-chunk-9-1.png b/articles/popTime_files/figure-html/unnamed-chunk-9-1.png index b692eced..3ee1df3e 100644 Binary files a/articles/popTime_files/figure-html/unnamed-chunk-9-1.png and b/articles/popTime_files/figure-html/unnamed-chunk-9-1.png differ diff --git a/articles/smoothHazard.html b/articles/smoothHazard.html index 51198e9c..966f9bd5 100644 --- a/articles/smoothHazard.html +++ b/articles/smoothHazard.html @@ -19,6 +19,8 @@ + +
+

+

As we can see, the empirical survival function resembles that of an exponential distribution.

+

We will first try to estimate the hazard function parametrically using some well-known regression routines. Before doing so, we reformat the data slightly.

+
+veteran$prior <- factor(veteran$prior, levels = c(0, 10), labels = c("no","yes"))
+veteran$celltype <- factor(veteran$celltype, 
+                           levels = c('large', 'squamous', 'smallcell', 'adeno'))
+veteran$trt <- factor(veteran$trt, levels = c(1, 2), labels = c("standard", "test"))
+

Using the eha package, we can fit a Weibull model with different values of the shape parameter. For shape = 1, we get an exponential distribution:

-model2 <- weibreg(y ~ karno + diagtime + age + prior + celltype + trt, 
-                  data = veteran, shape = 0)
-summary(model2)
-
## Call:
-## weibreg(formula = y ~ karno + diagtime + age + prior + celltype + 
-##     trt, data = veteran, shape = 0)
-## 
-## Covariate           Mean       Coef Exp(Coef)  se(Coef)    Wald p
-## karno              68.419    -0.032     0.968     0.005     0.000 
-## diagtime            8.139     0.001     1.001     0.009     0.955 
-## age                57.379    -0.007     0.993     0.009     0.476 
-## prior 
-##               no    0.653     0         1           (reference)
-##              yes    0.347     0.047     1.048     0.229     0.836 
-## celltype 
-##            large    0.269     0         1           (reference)
-##         squamous    0.421    -0.428     0.651     0.278     0.123 
-##        smallcell    0.206     0.462     1.587     0.262     0.078 
-##            adeno    0.104     0.792     2.208     0.300     0.008 
-## trt 
-##         standard    0.477     0         1           (reference)
-##             test    0.523     0.246     1.279     0.203     0.224 
-## 
-## log(scale)                    2.864    17.537     0.671     0.000 
-## log(shape)                    0.075     1.077     0.066     0.261 
-## 
-## Events                    
-## Total time at risk         16663 
-## Max. log. likelihood      -715.55 
-## LR test statistic         65.1 
-## Degrees of freedom        8 
-## Overall p-value           4.65393e-11
-

Finally, we can also fit a Cox proportional hazard:

+library(eha)
+y <- with(veteran, Surv(time, status))
+
+model1 <- weibreg(y ~ karno + diagtime + age + prior + celltype + trt,
+                  data = veteran, shape = 1)
+summary(model1)
+
## Call:
+## weibreg(formula = y ~ karno + diagtime + age + prior + celltype + 
+##     trt, data = veteran, shape = 1)
+## 
+## Covariate           Mean       Coef Exp(Coef)  se(Coef)    Wald p
+## karno              68.419    -0.031     0.970     0.005     0.000 
+## diagtime            8.139     0.000     1.000     0.009     0.974 
+## age                57.379    -0.006     0.994     0.009     0.505 
+## prior 
+##               no    0.653     0         1           (reference)
+##              yes    0.347     0.049     1.051     0.227     0.827 
+## celltype 
+##            large    0.269     0         1           (reference)
+##         squamous    0.421    -0.377     0.686     0.273     0.166 
+##        smallcell    0.206     0.443     1.557     0.261     0.090 
+##            adeno    0.104     0.736     2.087     0.294     0.012 
+## trt 
+##         standard    0.477     0         1           (reference)
+##             test    0.523     0.220     1.246     0.199     0.269 
+## 
+## log(scale)                    2.811    16.633     0.713     0.000 
+## 
+##  Shape is fixed at  1 
+## 
+## Events                    
+## Total time at risk         16663 
+## Max. log. likelihood      -716.16 
+## LR test statistic         70.1 
+## Degrees of freedom        8 
+## Overall p-value           4.64229e-12
+

If we take shape = 0, the shape parameter is estimated +along with the regression coefficients:

-model3 <- coxph(y ~ karno + diagtime + age + prior + celltype + trt, 
-                data = veteran)
-summary(model3)
-
## Call:
-## coxph(formula = y ~ karno + diagtime + age + prior + celltype + 
-##     trt, data = veteran)
-## 
-##   n= 137, number of events= 128 
-## 
-##                         coef  exp(coef)   se(coef)      z Pr(>|z|)    
-## karno             -3.282e-02  9.677e-01  5.508e-03 -5.958 2.55e-09 ***
-## diagtime           8.132e-05  1.000e+00  9.136e-03  0.009  0.99290    
-## age               -8.706e-03  9.913e-01  9.300e-03 -0.936  0.34920    
-## prioryes           7.159e-02  1.074e+00  2.323e-01  0.308  0.75794    
-## celltypesquamous  -4.013e-01  6.695e-01  2.827e-01 -1.420  0.15574    
-## celltypesmallcell  4.603e-01  1.584e+00  2.662e-01  1.729  0.08383 .  
-## celltypeadeno      7.948e-01  2.214e+00  3.029e-01  2.624  0.00869 ** 
-## trttest            2.946e-01  1.343e+00  2.075e-01  1.419  0.15577    
-## ---
-## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
-## 
-##                   exp(coef) exp(-coef) lower .95 upper .95
-## karno                0.9677     1.0334    0.9573    0.9782
-## diagtime             1.0001     0.9999    0.9823    1.0182
-## age                  0.9913     1.0087    0.9734    1.0096
-## prioryes             1.0742     0.9309    0.6813    1.6937
-## celltypesquamous     0.6695     1.4938    0.3847    1.1651
-## celltypesmallcell    1.5845     0.6311    0.9403    2.6699
-## celltypeadeno        2.2139     0.4517    1.2228    4.0084
-## trttest              1.3426     0.7448    0.8939    2.0166
-## 
-## Concordance= 0.736  (se = 0.021 )
-## Likelihood ratio test= 62.1  on 8 df,   p=2e-10
-## Wald test            = 62.37  on 8 df,   p=2e-10
-## Score (logrank) test = 66.74  on 8 df,   p=2e-11
-

As we can see, all three models are significant, and they give similar information: karno and celltype are significant predictors, but treatment is not.

-

The method available in this package makes use of case-base sampling. That is, person-moments are randomly sampled across the entire follow-up time, with some moments corresponding to cases and others to controls. By sampling person-moments instead of individuals, we can then use logistic regression to fit smooth-in-time parametric hazard functions. See the previous section for more details.

-

First, we will look at the follow-up time by using population-time plots:

+model2 <- weibreg(y ~ karno + diagtime + age + prior + celltype + trt,
+                  data = veteran, shape = 0)
+summary(model2)

+
## Call:
+## weibreg(formula = y ~ karno + diagtime + age + prior + celltype + 
+##     trt, data = veteran, shape = 0)
+## 
+## Covariate           Mean       Coef Exp(Coef)  se(Coef)    Wald p
+## karno              68.419    -0.032     0.968     0.005     0.000 
+## diagtime            8.139     0.001     1.001     0.009     0.955 
+## age                57.379    -0.007     0.993     0.009     0.476 
+## prior 
+##               no    0.653     0         1           (reference)
+##              yes    0.347     0.047     1.048     0.229     0.836 
+## celltype 
+##            large    0.269     0         1           (reference)
+##         squamous    0.421    -0.428     0.651     0.278     0.123 
+##        smallcell    0.206     0.462     1.587     0.262     0.078 
+##            adeno    0.104     0.792     2.208     0.300     0.008 
+## trt 
+##         standard    0.477     0         1           (reference)
+##             test    0.523     0.246     1.279     0.203     0.224 
+## 
+## log(scale)                    2.864    17.537     0.671     0.000 
+## log(shape)                    0.075     1.077     0.066     0.261 
+## 
+## Events                    
+## Total time at risk         16663 
+## Max. log. likelihood      -715.55 
+## LR test statistic         65.1 
+## Degrees of freedom        8 
+## Overall p-value           4.65393e-11
+

Finally, we can also fit a Cox proportional hazard:

-# create popTime object
-pt_veteran <- popTime(data = veteran)
-
## 'time' will be used as the time variable
-
## 'status' will be used as the event variable
-
-class(pt_veteran)
-
## [1] "popTime"    "data.table" "data.frame"
+model3 <- coxph(y ~ karno + diagtime + age + prior + celltype + trt,
+                data = veteran)
+summary(model3)
+
## Call:
+## coxph(formula = y ~ karno + diagtime + age + prior + celltype + 
+##     trt, data = veteran)
+## 
+##   n= 137, number of events= 128 
+## 
+##                         coef  exp(coef)   se(coef)      z Pr(>|z|)    
+## karno             -3.282e-02  9.677e-01  5.508e-03 -5.958 2.55e-09 ***
+## diagtime           8.132e-05  1.000e+00  9.136e-03  0.009  0.99290    
+## age               -8.706e-03  9.913e-01  9.300e-03 -0.936  0.34920    
+## prioryes           7.159e-02  1.074e+00  2.323e-01  0.308  0.75794    
+## celltypesquamous  -4.013e-01  6.695e-01  2.827e-01 -1.420  0.15574    
+## celltypesmallcell  4.603e-01  1.584e+00  2.662e-01  1.729  0.08383 .  
+## celltypeadeno      7.948e-01  2.214e+00  3.029e-01  2.624  0.00869 ** 
+## trttest            2.946e-01  1.343e+00  2.075e-01  1.419  0.15577    
+## ---
+## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
+## 
+##                   exp(coef) exp(-coef) lower .95 upper .95
+## karno                0.9677     1.0334    0.9573    0.9782
+## diagtime             1.0001     0.9999    0.9823    1.0182
+## age                  0.9913     1.0087    0.9734    1.0096
+## prioryes             1.0742     0.9309    0.6813    1.6937
+## celltypesquamous     0.6695     1.4938    0.3847    1.1651
+## celltypesmallcell    1.5845     0.6311    0.9403    2.6699
+## celltypeadeno        2.2139     0.4517    1.2228    4.0084
+## trttest              1.3426     0.7448    0.8939    2.0166
+## 
+## Concordance= 0.736  (se = 0.021 )
+## Likelihood ratio test= 62.1  on 8 df,   p=2e-10
+## Wald test            = 62.37  on 8 df,   p=2e-10
+## Score (logrank) test = 66.74  on 8 df,   p=2e-11
+

As we can see, all three models are significant, and they give similar information: karno and celltype are significant predictors, but treatment is not.

+

The method available in this package makes use of case-base +sampling. That is, person-moments are randomly sampled across the +entire follow-up time, with some moments corresponding to cases and +others to controls. By sampling person-moments instead of individuals, +we can then use logistic regression to fit smooth-in-time parametric +hazard functions. See the previous section for more details.
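To make this concrete, here is a sketch of the sampling-plus-logistic-regression idea done by hand with sampleCaseBase() and glm(). It assumes, as described in the casebase documentation, that the sampled data keep the original columns and gain an offset column that accounts for the sampling ratio; the covariates used here are chosen for illustration, and fitSmoothHazard() below wraps this whole workflow for you.

library(casebase)

# sample person-moments: cases are the observed events, the base series is
# drawn uniformly over the follow-up experience of the whole cohort
cb_data <- sampleCaseBase(data = veteran, time = "time",
                          event = "status", ratio = 100)

# logistic regression on the case-base sample; 'status' now distinguishes
# case (1) from base-series (0) person-moments, and the offset column
# (name as documented in ?sampleCaseBase) adjusts for the sampling ratio
fit_manual <- glm(status ~ time + karno + celltype, data = cb_data,
                  family = binomial(link = "logit"), offset = offset)
coef(fit_manual)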

+

First, we will look at the follow-up time by using population-time +plots:

+
+# create popTime object
+pt_veteran <- popTime(data = veteran)
+
## 'time' will be used as the time variable
+
## 'status' will be used as the event variable
-# plot method for objects of class 'popTime'
-plot(pt_veteran)
+class(pt_veteran)
+
## [1] "popTime"    "data.table" "data.frame"
+
+# plot method for objects of class 'popTime'
+plot(pt_veteran)

-

Population-time plots are a useful way of visualizing the total follow-up experience, where individuals appear on the y-axis and follow-up time on the x-axis; each individual’s follow-up time is represented by a gray line segment. For convenience, we have ordered the patients according to their time-to-event, and each event is represented by a red dot. The censored observations (of which there are only a few) correspond to the gray lines that do not end with a red dot.

-

Next, we use case-base sampling to fit a parametric hazard function via logistic regression. First, we will include time as a linear term; as noted above, this corresponds to a Gompertz hazard.

-
-model4 <- fitSmoothHazard(status ~ time + karno + diagtime + age + prior +
-             celltype + trt, data = veteran, ratio = 100)
-
## 'time' will be used as the time variable
+

Population-time plots are a useful way of visualizing the total follow-up experience, where individuals appear on the y-axis and follow-up time on the x-axis; each individual’s follow-up time is represented by a gray line segment. For convenience, we have ordered the patients according to their time-to-event, and each event is represented by a red dot. The censored observations (of which there are only a few) correspond to the gray lines that do not end with a red dot.

+

Next, we use case-base sampling to fit a parametric hazard function via logistic regression. First, we will include time as a linear term; as noted above, this corresponds to a Gompertz hazard.

-summary(model4)
-
## Fitting smooth hazards with case-base sampling
-## 
-## Sample size: 137 
-## Number of events: 128 
-## Number of base moments: 12800 
-## ----
-## 
-## Call:
-## fitSmoothHazard(formula = status ~ time + karno + diagtime + 
-##     age + prior + celltype + trt, data = veteran, ratio = 100)
-## 
-## Deviance Residuals: 
-##     Min       1Q   Median       3Q      Max  
-## -0.4285  -0.1501  -0.1199  -0.1002   3.4177  
-## 
-## Coefficients:
-##                     Estimate Std. Error z value Pr(>|z|)    
-## (Intercept)       -2.6888658  0.7213316  -3.728 0.000193 ***
-## time               0.0003337  0.0006450   0.517 0.604898    
-## karno             -0.0324322  0.0052911  -6.130 8.81e-10 ***
-## diagtime           0.0035611  0.0093088   0.383 0.702056    
-## age               -0.0066025  0.0092944  -0.710 0.477471    
-## prioryes           0.0089338  0.2313008   0.039 0.969190    
-## celltypesquamous  -0.4319329  0.2844682  -1.518 0.128917    
-## celltypesmallcell  0.3941374  0.2626738   1.500 0.133489    
-## celltypeadeno      0.7020877  0.2987435   2.350 0.018767 *  
-## trttest            0.2109057  0.2018479   1.045 0.296081    
-## ---
-## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
-## 
-## (Dispersion parameter for binomial family taken to be 1)
-## 
-##     Null deviance: 1436.2  on 12927  degrees of freedom
-## Residual deviance: 1365.7  on 12918  degrees of freedom
-## AIC: 1385.7
-## 
-## Number of Fisher Scoring iterations: 8
-

Since the output object from fitSmoothHazard inherits from the glm class, we see a familiar result when using the function summary. We can quickly visualize the conditional association between each predictor and the hazard function using the plot method for objects that are fit with fitSmoothHazard. Specifically, if \(x\) is the predictor of interest, \(h\) is the hazard function, and \(\mathbf{x_{-j}}\) the other predictors in the model, the conditional association plot represents the relationship \(f(x) = \mathbb{E}(h|x, \mathbf{x_{-j}})\). By default, the other terms in the model (\(\mathbf{x_{-j}}\)) are set to their median if the term is numeric or the most common category if the term is a factor. Further details of customizing these plots are given in the Plot Hazards and Hazard Ratios vignette.

+model4 <- fitSmoothHazard(status ~ time + karno + diagtime + age + prior +
+             celltype + trt, data = veteran, ratio = 100)
+
## 'time' will be used as the time variable
-library(visreg)
-plot(model4, 
-     hazard.params = list(alpha = 0.05))
-
## Conditions used in construction of plot
-## karno: 70
-## diagtime: 5
-## age: 60
-## prior: no
-## celltype: squamous
-## trt: test
-## offset: 0
+summary(model4)
+
## Fitting smooth hazards with case-base sampling
+## 
+## Sample size: 137 
+## Number of events: 128 
+## Number of base moments: 12800 
+## ----
+## 
+## Call:
+## fitSmoothHazard(formula = status ~ time + karno + diagtime + 
+##     age + prior + celltype + trt, data = veteran, ratio = 100)
+## 
+## Coefficients:
+##                     Estimate Std. Error z value Pr(>|z|)    
+## (Intercept)       -2.6888658  0.7213316  -3.728 0.000193 ***
+## time               0.0003337  0.0006450   0.517 0.604898    
+## karno             -0.0324322  0.0052911  -6.130 8.81e-10 ***
+## diagtime           0.0035611  0.0093088   0.383 0.702056    
+## age               -0.0066025  0.0092944  -0.710 0.477471    
+## prioryes           0.0089338  0.2313008   0.039 0.969190    
+## celltypesquamous  -0.4319329  0.2844682  -1.518 0.128917    
+## celltypesmallcell  0.3941374  0.2626738   1.500 0.133489    
+## celltypeadeno      0.7020877  0.2987435   2.350 0.018767 *  
+## trttest            0.2109057  0.2018479   1.045 0.296081    
+## ---
+## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
+## 
+## (Dispersion parameter for binomial family taken to be 1)
+## 
+##     Null deviance: 1436.2  on 12927  degrees of freedom
+## Residual deviance: 1365.7  on 12918  degrees of freedom
+## AIC: 1385.7
+## 
+## Number of Fisher Scoring iterations: 8
+

Since the output object from fitSmoothHazard inherits +from the glm class, we see a familiar result when using the +function summary. We can quickly visualize the conditional +association between each predictor and the hazard function using the +plot method for objects that are fit with +fitSmoothHazard. Specifically, if \(x\) is the predictor of interest, \(h\) is the hazard function, and \(\mathbf{x_{-j}}\) the other predictors in +the model, the conditional association plot represents the relationship +\(f(x) = \mathbb{E}(h|x, +\mathbf{x_{-j}})\). By default, the other terms in the model +(\(\mathbf{x_{-j}}\)) are set to their +median if the term is numeric or the most common category if the term is +a factor. Further details of customizing these plots are given in the +Plot Hazards and Hazard Ratios vignette.

+
+library(visreg)
+plot(model4, 
+     hazard.params = list(alpha = 0.05))
+
## Conditions used in construction of plot
+## karno: 70
+## diagtime: 5
+## age: 60
+## prior: no
+## celltype: squamous
+## trt: test
+## offset: 0

-
## Conditions used in construction of plot
-## time: 93.41294
-## diagtime: 5
-## age: 60
-## prior: no
-## celltype: squamous
-## trt: test
-## offset: 0
+
## Conditions used in construction of plot
+## time: 93.41294
+## diagtime: 5
+## age: 60
+## prior: no
+## celltype: squamous
+## trt: test
+## offset: 0

-
## Conditions used in construction of plot
-## time: 93.41294
-## karno: 70
-## age: 60
-## prior: no
-## celltype: squamous
-## trt: test
-## offset: 0
+
## Conditions used in construction of plot
+## time: 93.41294
+## karno: 70
+## age: 60
+## prior: no
+## celltype: squamous
+## trt: test
+## offset: 0

-
## Conditions used in construction of plot
-## time: 93.41294
-## karno: 70
-## diagtime: 5
-## prior: no
-## celltype: squamous
-## trt: test
-## offset: 0
+
## Conditions used in construction of plot
+## time: 93.41294
+## karno: 70
+## diagtime: 5
+## prior: no
+## celltype: squamous
+## trt: test
+## offset: 0

-
## Conditions used in construction of plot
-## time: 93.41294
-## karno: 70
-## diagtime: 5
-## age: 60
-## celltype: squamous
-## trt: test
-## offset: 0
+
## Conditions used in construction of plot
+## time: 93.41294
+## karno: 70
+## diagtime: 5
+## age: 60
+## celltype: squamous
+## trt: test
+## offset: 0

-
## Conditions used in construction of plot
-## time: 93.41294
-## karno: 70
-## diagtime: 5
-## age: 60
-## prior: no
-## trt: test
-## offset: 0
+
## Conditions used in construction of plot
+## time: 93.41294
+## karno: 70
+## diagtime: 5
+## age: 60
+## prior: no
+## trt: test
+## offset: 0

-
## Conditions used in construction of plot
-## time: 93.41294
-## karno: 70
-## diagtime: 5
-## age: 60
-## prior: no
-## celltype: squamous
-## offset: 0
+
## Conditions used in construction of plot
+## time: 93.41294
+## karno: 70
+## diagtime: 5
+## age: 60
+## prior: no
+## celltype: squamous
+## offset: 0

-

The main advantage of fitting smooth hazard functions is that it then becomes relatively easy to compute absolute risks. For example, we can use the function absoluteRisk to compute the mean absolute risk at 90 days, which can then be compared to the empirical proportion.

-
-absRisk4 <- absoluteRisk(object = model4, time = 90)
-mean(absRisk4)
-
## [1] 0.6124916
+

The main advantage of fitting smooth hazard functions is that it then becomes relatively easy to compute absolute risks. For example, we can use the function absoluteRisk to compute the mean absolute risk at 90 days, which can then be compared to the empirical proportion.

-ftime <- veteran$time
-mean(ftime <= 90)
-
## [1] 0.5547445
-

We can also fit a Weibull hazard by using a logarithmic term for time:

+absRisk4 <- absoluteRisk(object = model4, time = 90)
+mean(absRisk4)
+
## [1] 0.6124916
-model5 <- fitSmoothHazard(status ~ log(time) + karno + diagtime + age + prior +
-             celltype + trt, data = veteran, ratio = 100)
-
## 'time' will be used as the time variable
+ftime <- veteran$time
+mean(ftime <= 90)
+
## [1] 0.5547445
+

We can also fit a Weibull hazard by using a logarithmic term for +time:

-summary(model5)
-
## Fitting smooth hazards with case-base sampling
-## 
-## Sample size: 137 
-## Number of events: 128 
-## Number of base moments: 12800 
-## ----
-## 
-## Call:
-## fitSmoothHazard(formula = status ~ log(time) + karno + diagtime + 
-##     age + prior + celltype + trt, data = veteran, ratio = 100)
-## 
-## Deviance Residuals: 
-##     Min       1Q   Median       3Q      Max  
-## -0.4637  -0.1518  -0.1190  -0.0967   3.4205  
-## 
-## Coefficients:
-##                     Estimate Std. Error z value Pr(>|z|)    
-## (Intercept)       -3.0686578  0.7552335  -4.063 4.84e-05 ***
-## log(time)          0.0721936  0.0718563   1.005   0.3150    
-## karno             -0.0328926  0.0055067  -5.973 2.33e-09 ***
-## diagtime          -0.0007996  0.0091842  -0.087   0.9306    
-## age               -0.0045541  0.0092918  -0.490   0.6240    
-## prioryes           0.0286398  0.2292875   0.125   0.9006    
-## celltypesquamous  -0.4170963  0.2799527  -1.490   0.1363    
-## celltypesmallcell  0.4495348  0.2634379   1.706   0.0879 .  
-## celltypeadeno      0.7776537  0.3036331   2.561   0.0104 *  
-## trttest            0.2740183  0.2046518   1.339   0.1806    
-## ---
-## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
-## 
-## (Dispersion parameter for binomial family taken to be 1)
-## 
-##     Null deviance: 1436.2  on 12927  degrees of freedom
-## Residual deviance: 1365.1  on 12918  degrees of freedom
-## AIC: 1385.1
-## 
-## Number of Fisher Scoring iterations: 8
-

With case-base sampling, it is straightforward to fit a semi-parametric hazard function using splines, which can then be used to estimate the mean absolute risk.

+model5 <- fitSmoothHazard(status ~ log(time) + karno + diagtime + age + prior +
+             celltype + trt, data = veteran, ratio = 100)
+
## 'time' will be used as the time variable
-# Fit a spline for time
-library(splines)
-model6 <- fitSmoothHazard(status ~ bs(time) + karno + diagtime + age + prior +
-             celltype + trt, data = veteran, ratio = 100)
-
## 'time' will be used as the time variable
+summary(model5)
+
## Fitting smooth hazards with case-base sampling
+## 
+## Sample size: 137 
+## Number of events: 128 
+## Number of base moments: 12800 
+## ----
+## 
+## Call:
+## fitSmoothHazard(formula = status ~ log(time) + karno + diagtime + 
+##     age + prior + celltype + trt, data = veteran, ratio = 100)
+## 
+## Coefficients:
+##                     Estimate Std. Error z value Pr(>|z|)    
+## (Intercept)       -3.0686578  0.7552335  -4.063 4.84e-05 ***
+## log(time)          0.0721936  0.0718563   1.005   0.3150    
+## karno             -0.0328926  0.0055067  -5.973 2.33e-09 ***
+## diagtime          -0.0007996  0.0091842  -0.087   0.9306    
+## age               -0.0045541  0.0092918  -0.490   0.6240    
+## prioryes           0.0286398  0.2292875   0.125   0.9006    
+## celltypesquamous  -0.4170963  0.2799527  -1.490   0.1363    
+## celltypesmallcell  0.4495348  0.2634379   1.706   0.0879 .  
+## celltypeadeno      0.7776537  0.3036331   2.561   0.0104 *  
+## trttest            0.2740183  0.2046518   1.339   0.1806    
+## ---
+## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
+## 
+## (Dispersion parameter for binomial family taken to be 1)
+## 
+##     Null deviance: 1436.2  on 12927  degrees of freedom
+## Residual deviance: 1365.1  on 12918  degrees of freedom
+## AIC: 1385.1
+## 
+## Number of Fisher Scoring iterations: 8
+

With case-base sampling, it is straightforward to fit a +semi-parametric hazard function using splines, which can then be used to +estimate the mean absolute risk.

-summary(model6)
-
## Fitting smooth hazards with case-base sampling
-## 
-## Sample size: 137 
-## Number of events: 128 
-## Number of base moments: 12800 
-## ----
-## 
-## Call:
-## fitSmoothHazard(formula = status ~ bs(time) + karno + diagtime + 
-##     age + prior + celltype + trt, data = veteran, ratio = 100)
-## 
-## Deviance Residuals: 
-##     Min       1Q   Median       3Q      Max  
-## -0.4551  -0.1535  -0.1198  -0.0952   3.5218  
-## 
-## Coefficients:
-##                     Estimate Std. Error z value Pr(>|z|)    
-## (Intercept)       -2.9343444  0.7277306  -4.032 5.53e-05 ***
-## bs(time)1          1.6365070  1.0324136   1.585  0.11294    
-## bs(time)2         -2.5135557  1.7558703  -1.432  0.15228    
-## bs(time)3          1.6976154  0.9897559   1.715  0.08631 .  
-## karno             -0.0322573  0.0053904  -5.984 2.17e-09 ***
-## diagtime           0.0003886  0.0091607   0.042  0.96617    
-## age               -0.0065366  0.0093554  -0.699  0.48474    
-## prioryes           0.0162450  0.2356069   0.069  0.94503    
-## celltypesquamous  -0.4172317  0.2837950  -1.470  0.14151    
-## celltypesmallcell  0.4518275  0.2651004   1.704  0.08831 .  
-## celltypeadeno      0.8527251  0.3040353   2.805  0.00504 ** 
-## trttest            0.2622058  0.2073319   1.265  0.20599    
-## ---
-## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
-## 
-## (Dispersion parameter for binomial family taken to be 1)
-## 
-##     Null deviance: 1436.2  on 12927  degrees of freedom
-## Residual deviance: 1363.0  on 12916  degrees of freedom
-## AIC: 1387
-## 
-## Number of Fisher Scoring iterations: 8
+# Fit a spline for time
+library(splines)
+model6 <- fitSmoothHazard(status ~ bs(time) + karno + diagtime + age + prior +
+             celltype + trt, data = veteran, ratio = 100)
+
## 'time' will be used as the time variable
-str(absoluteRisk(object = model6, time = 90))
-
##  'absRiskCB' num [1:2, 1:138] 0 90 0 0.294 0 ...
-##  - attr(*, "dimnames")=List of 2
-##   ..$ : chr [1:2] "" ""
-##   ..$ : chr [1:138] "time" "" "" "" ...
-##  - attr(*, "type")= chr "CI"
-

As we can see from the summary, there is little evidence that splines actually improve the fit. Moreover, we can see that estimated individual absolute risks are essentially the same when using either a linear term or splines:

+summary(model6)
+
## Fitting smooth hazards with case-base sampling
+## 
+## Sample size: 137 
+## Number of events: 128 
+## Number of base moments: 12800 
+## ----
+## 
+## Call:
+## fitSmoothHazard(formula = status ~ bs(time) + karno + diagtime + 
+##     age + prior + celltype + trt, data = veteran, ratio = 100)
+## 
+## Coefficients:
+##                     Estimate Std. Error z value Pr(>|z|)    
+## (Intercept)       -2.9343444  0.7277306  -4.032 5.53e-05 ***
+## bs(time)1          1.6365070  1.0324136   1.585  0.11294    
+## bs(time)2         -2.5135557  1.7558703  -1.432  0.15228    
+## bs(time)3          1.6976154  0.9897559   1.715  0.08631 .  
+## karno             -0.0322573  0.0053904  -5.984 2.17e-09 ***
+## diagtime           0.0003886  0.0091607   0.042  0.96617    
+## age               -0.0065366  0.0093554  -0.699  0.48474    
+## prioryes           0.0162450  0.2356069   0.069  0.94503    
+## celltypesquamous  -0.4172317  0.2837950  -1.470  0.14151    
+## celltypesmallcell  0.4518275  0.2651004   1.704  0.08831 .  
+## celltypeadeno      0.8527251  0.3040353   2.805  0.00504 ** 
+## trttest            0.2622058  0.2073319   1.265  0.20599    
+## ---
+## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
+## 
+## (Dispersion parameter for binomial family taken to be 1)
+## 
+##     Null deviance: 1436.2  on 12927  degrees of freedom
+## Residual deviance: 1363.0  on 12916  degrees of freedom
+## AIC: 1387
+## 
+## Number of Fisher Scoring iterations: 8
-linearRisk <- absoluteRisk(object = model4, time = 90, newdata = veteran)
-splineRisk <- absoluteRisk(object = model6, time = 90, newdata = veteran)
-
-plot.default(linearRisk, splineRisk,
-     xlab = "Linear", ylab = "Splines", pch = 19)
-abline(a = 0, b = 1, lty = 2, lwd = 2, col = 'red')
+str(absoluteRisk(object = model6, time = 90)) +
##  'absRiskCB' num [1:2, 1:138] 0 90 0 0.294 0 ...
+##  - attr(*, "dimnames")=List of 2
+##   ..$ : chr [1:2] "" ""
+##   ..$ : chr [1:138] "time" "" "" "" ...
+##  - attr(*, "type")= chr "CI"
+##  - attr(*, "newdata")='data.frame':  137 obs. of  7 variables:
+##   ..$ trt     : Factor w/ 2 levels "standard","test": 1 1 1 1 1 1 1 1 1 1 ...
+##   ..$ celltype: Factor w/ 4 levels "large","squamous",..: 2 2 2 2 2 2 2 2 2 2 ...
+##   ..$ status  : num [1:137] 1 1 1 1 1 1 1 1 1 0 ...
+##   ..$ karno   : num [1:137] 60 70 60 60 70 20 40 80 50 70 ...
+##   ..$ diagtime: num [1:137] 7 5 3 9 11 5 10 29 18 6 ...
+##   ..$ age     : num [1:137] 69 64 38 63 65 49 69 68 43 70 ...
+##   ..$ prior   : Factor w/ 2 levels "no","yes": 1 2 1 2 2 1 2 1 1 1 ...
+

As we can see from the summary, there is little evidence that splines actually improve the fit. Moreover, the estimated individual absolute risks are essentially the same whether we use a linear term or splines:

+
+linearRisk <- absoluteRisk(object = model4, time = 90, newdata = veteran)
+splineRisk <- absoluteRisk(object = model6, time = 90, newdata = veteran)
+
+plot.default(linearRisk, splineRisk,
+     xlab = "Linear", ylab = "Splines", pch = 19)
+abline(a = 0, b = 1, lty = 2, lwd = 2, col = 'red')

-

These last three models give similar information as the first three, i.e. the main predictors for the hazard are karno and celltype, with treatment being non-significant. Moreover, by explicitly including the time variable in the formula, we see that it is not significant; this is evidence that the true hazard is exponential.

-

Finally, we can look at the estimates of the coefficients for the Cox model, as well as the last three models (CB stands for “case-base”):

+

These last three models convey the same information as the first three: the main predictors for the hazard are karno and celltype, with treatment being non-significant. Moreover, by explicitly including the time variable in the formula, we see that it is not significant; this is evidence that the true hazard is exponential, i.e. constant over time.
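As a quick check of this claim, one could drop the time term entirely (a constant hazard corresponds to an exponential model) and compare the resulting fit with model4, the linear-in-time fit used above. The sketch below is not part of the original vignette; the object name model_exp is ours, and it assumes fitSmoothHazard accepts a formula with no time term (the hazard is then constant in time):

# Sketch: constant-hazard (exponential) fit -- no function of time in the formula
model_exp <- fitSmoothHazard(status ~ karno + diagtime + age + prior +
                               celltype + trt,
                             data = veteran, ratio = 100, time = "time")
# Similar AIC values would support a constant (exponential) hazard
c("constant hazard" = AIC(model_exp), "linear time" = AIC(model4))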

+

Finally, we can compare the coefficient estimates from the Cox model with those of the last three models (CB stands for “case-base”):

@@ -664,141 +733,167 @@

-
-

-Cumulative Incidence Curves

-

Here we show how to calculate the cumulative incidence curves for a specific risk profile using the following equation:

-

\[ CI(x, t) = 1 - exp\left[ - \int_0^t h(x, u) \textrm{d}u \right] \] where \( h(x, t) \) is the hazard function, \( t \) denotes the numerical value (number of units) of a point in prognostic/prospective time and \( x \) is the realization of the vector \( X \) of variates based on the patient’s profile and intervention (if any).

-

We compare the cumulative incidence functions from the fully-parametric fit using case base sampling, with those from the Cox model:

-
-# define a specific covariate profile
-new_data <- data.frame(trt = "test", 
-                       celltype = "adeno", 
-                       karno = median(veteran$karno), 
-                       diagtime = median(veteran$diagtime),
-                       age = median(veteran$age),
-                       prior = "no")
-
-# calculate cumulative incidence using casebase model
-smooth_risk <- absoluteRisk(object = model4, 
-                            time = seq(0,300, 1), 
-                            newdata = new_data)
-
-cols <- c("#8E063B","#023FA5")
-
-# cumulative incidence function for the Cox model
-plot(survfit(model3, newdata = new_data),
-     xlab = "Days", ylab = "Cumulative Incidence (%)", fun = "event",
-     xlim = c(0,300), conf.int = F, col = cols[1], 
-     main = sprintf("Estimated Cumulative Incidence (risk) of Lung Cancer\ntrt = test, celltype = adeno, karno = %g,\ndiagtime = %g, age = %g, prior = no", median(veteran$karno), median(veteran$diagtime), 
-                    median(veteran$age)))
-
-# add casebase curve with legend
-plot(smooth_risk, add = TRUE, col = cols[2], gg = FALSE)
-legend("bottomright", 
-       legend = c("semi-parametric (Cox)", "parametric (casebase)"), 
-       col = cols,
-       lty = c(1, 1), 
-       bg = "gray90")
+
+

Cumulative Incidence Curves +

+

Here we show how to calculate the cumulative incidence curves for a +specific risk profile using the following equation:

+

\[ CI(x, t) = 1 - \exp\left[ - \int_0^t h(x, u) \,\textrm{d}u \right] \] where \( h(x, t) \) is the hazard function, \( t \) denotes the numerical value (number of units) of a point in prognostic/prospective time, and \( x \) is the realization of the vector \( X \) of variates based on the patient’s profile and intervention (if any).

+

We compare the cumulative incidence functions from the fully parametric fit using case-base sampling with those from the Cox model:

+
+# define a specific covariate profile
+new_data <- data.frame(trt = "test", 
+                       celltype = "adeno", 
+                       karno = median(veteran$karno), 
+                       diagtime = median(veteran$diagtime),
+                       age = median(veteran$age),
+                       prior = "no")
+
+# calculate cumulative incidence using casebase model
+smooth_risk <- absoluteRisk(object = model4, 
+                            time = seq(0,300, 1), 
+                            newdata = new_data)
+
+cols <- c("#8E063B","#023FA5")
+
+# cumulative incidence function for the Cox model
+plot(survfit(model3, newdata = new_data),
+     xlab = "Days", ylab = "Cumulative Incidence (%)", fun = "event",
+     xlim = c(0,300), conf.int = F, col = cols[1], 
+     main = sprintf("Estimated Cumulative Incidence (risk) of Lung Cancer\ntrt = test, celltype = adeno, karno = %g,\ndiagtime = %g, age = %g, prior = no", median(veteran$karno), median(veteran$diagtime), 
+                    median(veteran$age)))
+
+# add casebase curve with legend
+plot(smooth_risk, add = TRUE, col = cols[2], gg = FALSE)
+legend("bottomright", 
+       legend = c("semi-parametric (Cox)", "parametric (casebase)"), 
+       col = cols,
+       lty = c(1, 1), 
+       bg = "gray90")
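The equation above can also be checked numerically. The following sketch is not part of the original vignette (the objects times, pred_data, haz and cum_haz are ours); it integrates the fitted hazard of model4 with the trapezoidal rule for the covariate profile new_data, reusing the predict/offset pattern shown elsewhere in these vignettes, and plots the resulting cumulative incidence:

# Sketch: recover CI(x, t) by integrating the fitted hazard of model4
times <- seq(0, 300, 1)
pred_data <- data.frame(new_data, time = times, offset = 0)
haz <- exp(predict(model4, newdata = pred_data, type = "link"))
cum_haz <- cumsum(c(0, diff(times) * (head(haz, -1) + tail(haz, -1)) / 2))
plot(times, 1 - exp(-cum_haz), type = "l",
     xlab = "Days", ylab = "Cumulative Incidence",
     main = "Trapezoidal integration of the model4 hazard")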

-

Note that by default, absoulteRisk calculated the cumulative incidence. Alternatively, you can calculate the survival curve by specifying type = 'survival' in the call to absoulteRisk:

-
-smooth_risk <- absoluteRisk(object = model4, 
-                            time = seq(0,300, 1), 
-                            newdata = new_data, 
-                            type = "survival")
-
-plot(survfit(model3, newdata = new_data),
-     xlab = "Days", ylab = "Survival Probability (%)", 
-     xlim = c(0,300), conf.int = F, col = cols[1], 
-     main = sprintf("Estimated Survival Probability of Lung Cancer\ntrt = test, celltype = adeno, karno = %g,\ndiagtime = %g, age = %g, prior = no", median(veteran$karno), median(veteran$diagtime), 
-                    median(veteran$age)))
-
-# add casebase curve with legend
-plot(smooth_risk, add = TRUE, col = cols[2], gg = FALSE)
-legend("topright", 
-       legend = c("semi-parametric (Cox)", "parametric (casebase)"), 
-       col = cols,
-       lty = c(1, 1), 
-       bg = "gray90")
+

Note that by default, absoluteRisk calculates the cumulative incidence. Alternatively, you can compute the survival curve by specifying type = 'survival' in the call to absoluteRisk:

+
+smooth_risk <- absoluteRisk(object = model4, 
+                            time = seq(0,300, 1), 
+                            newdata = new_data, 
+                            type = "survival")
+
+plot(survfit(model3, newdata = new_data),
+     xlab = "Days", ylab = "Survival Probability (%)", 
+     xlim = c(0,300), conf.int = F, col = cols[1], 
+     main = sprintf("Estimated Survival Probability of Lung Cancer\ntrt = test, celltype = adeno, karno = %g,\ndiagtime = %g, age = %g, prior = no", median(veteran$karno), median(veteran$diagtime), 
+                    median(veteran$age)))
+
+# add casebase curve with legend
+plot(smooth_risk, add = TRUE, col = cols[2], gg = FALSE)
+legend("topright", 
+       legend = c("semi-parametric (Cox)", "parametric (casebase)"), 
+       col = cols,
+       lty = c(1, 1), 
+       bg = "gray90")

-
-

-Session information

-
## R version 4.0.2 (2020-06-22)
-## Platform: x86_64-pc-linux-gnu (64-bit)
-## Running under: Ubuntu 16.04.6 LTS
-## 
-## Matrix products: default
-## BLAS:   /usr/lib/openblas-base/libblas.so.3
-## LAPACK: /usr/lib/libopenblasp-r0.2.18.so
-## 
-## attached base packages:
-## [1] splines   stats     graphics  grDevices utils     datasets  methods  
-## [8] base     
-## 
-## other attached packages:
-## [1] visreg_2.7.0        eha_2.8.5           survival_3.1-12    
-## [4] casebase_0.9.1.9999
-## 
-## loaded via a namespace (and not attached):
-##  [1] highr_0.8         pillar_1.4.7      compiler_4.0.2    tools_4.0.2      
-##  [5] digest_0.6.27     nlme_3.1-148      tibble_3.0.6      evaluate_0.14    
-##  [9] memoise_2.0.0     lifecycle_0.2.0   gtable_0.3.0      lattice_0.20-41  
-## [13] mgcv_1.8-31       pkgconfig_2.0.3   rlang_0.4.10      Matrix_1.2-18    
-## [17] yaml_2.2.1        pkgdown_1.6.1     xfun_0.20         fastmap_1.1.0    
-## [21] stringr_1.4.0     knitr_1.31        vctrs_0.3.6       desc_1.2.0       
-## [25] fs_1.5.0          systemfonts_1.0.0 stats4_4.0.2      rprojroot_2.0.2  
-## [29] grid_4.0.2        glue_1.4.2        data.table_1.13.6 R6_2.5.0         
-## [33] textshaping_0.2.1 VGAM_1.1-5        rmarkdown_2.6     farver_2.0.3     
-## [37] ggplot2_3.3.3     magrittr_2.0.1    ellipsis_0.3.1    scales_1.1.1     
-## [41] htmltools_0.5.1.1 assertthat_0.2.1  colorspace_2.0-0  labeling_0.4.2   
-## [45] ragg_0.4.1        stringi_1.5.3     munsell_0.5.0     cachem_1.0.3     
-## [49] crayon_1.4.0
+
+

Session information +

+
## R version 4.3.1 (2023-06-16)
+## Platform: x86_64-pc-linux-gnu (64-bit)
+## Running under: Ubuntu 22.04.2 LTS
+## 
+## Matrix products: default
+## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
+## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
+## 
+## attached base packages:
+## [1] splines   stats     graphics  grDevices utils     datasets  methods  
+## [8] base     
+## 
+## other attached packages:
+## [1] visreg_2.7.0         eha_2.10.3           survival_3.5-5      
+## [4] casebase_0.10.2.9999
+## 
+## loaded via a namespace (and not attached):
+##  [1] sass_0.4.7        utf8_1.2.3        generics_0.1.3    stringi_1.7.12   
+##  [5] lattice_0.21-8    digest_0.6.33     magrittr_2.0.3    evaluate_0.21    
+##  [9] grid_4.3.1        fastmap_1.1.1     rprojroot_2.0.3   jsonlite_1.8.7   
+## [13] Matrix_1.5-4.1    mgcv_1.8-42       purrr_1.0.1       fansi_1.0.4      
+## [17] scales_1.2.1      textshaping_0.3.6 jquerylib_0.1.4   cli_3.6.1        
+## [21] rlang_1.1.1       munsell_0.5.0     withr_2.5.0       cachem_1.0.8     
+## [25] yaml_2.3.7        tools_4.3.1       memoise_2.0.1     dplyr_1.1.2      
+## [29] colorspace_2.1-0  ggplot2_3.4.2     VGAM_1.1-8        vctrs_0.6.3      
+## [33] R6_2.5.1          stats4_4.3.1      lifecycle_1.0.3   stringr_1.5.0    
+## [37] fs_1.6.3          ragg_1.2.5        pkgconfig_2.0.3   desc_1.4.2       
+## [41] pkgdown_2.0.7     bslib_0.5.0       pillar_1.9.0      gtable_0.3.3     
+## [45] glue_1.6.2        data.table_1.14.8 systemfonts_1.0.4 highr_0.10       
+## [49] tidyselect_1.2.0  xfun_0.39         tibble_3.2.1      knitr_1.43       
+## [53] farver_2.1.1      nlme_3.1-162      htmltools_0.5.5   labeling_0.4.2   
+## [57] rmarkdown_2.23    compiler_4.3.1
-
-

-References

+
+

References +

  1. -Efron, Bradley. 1977. “The Efficiency of Cox’s Likelihood Function for Censored Data.” Journal of the American Statistical Association 72 (359). Taylor & Francis Group: 557–65. +Efron, Bradley. 1977. “The Efficiency of Cox’s Likelihood Function for +Censored Data.” Journal of the American Statistical Association +72 (359). Taylor & Francis Group: 557–65.

  2. -Hanley, James A, and Olli S Miettinen. 2009. “Fitting Smooth-in-Time Prognostic Risk Functions via Logistic Regression.” The International Journal of Biostatistics 5 (1). +Hanley, James A, and Olli S Miettinen. 2009. “Fitting Smooth-in-Time +Prognostic Risk Functions via Logistic Regression.” The +International Journal of Biostatistics 5 (1).

  3. -Mantel, Nathan. 1973. “Synthetic Retrospective Studies and Related Topics.” Biometrics. JSTOR, 479–86. +Mantel, Nathan. 1973. “Synthetic Retrospective Studies and Related +Topics.” Biometrics. JSTOR, 479–86.

  4. -Saarela, Olli. 2015. “A Case-Base Sampling Method for Estimating Recurrent Event Intensities.” Lifetime Data Analysis. Springer, 1–17. +Saarela, Olli. 2015. “A Case-Base Sampling Method for Estimating +Recurrent Event Intensities.” Lifetime Data Analysis. Springer, +1–17.

  5. -Saarela, Olli, and Elja Arjas. 2015. “Non-Parametric Bayesian Hazard Regression for Chronic Disease Risk Assessment.” Scandinavian Journal of Statistics 42 (2). Wiley Online Library: 609–26. +Saarela, Olli, and Elja Arjas. 2015. “Non-Parametric Bayesian Hazard +Regression for Chronic Disease Risk Assessment.” Scandinavian +Journal of Statistics 42 (2). Wiley Online Library: 609–26.

  6. -Scrucca, L, A Santucci, and F Aversa. 2010. “Regression Modeling of Competing Risk Using R: An in Depth Guide for Clinicians.” Bone Marrow Transplantation 45 (9). Nature Publishing Group: 1388–95. +Scrucca, L, A Santucci, and F Aversa. 2010. “Regression Modeling of +Competing Risk Using R: An in Depth Guide for Clinicians.” Bone +Marrow Transplantation 45 (9). Nature Publishing Group: 1388–95.

  7. -Kalbfleisch, John D., and Ross L. Prentice. The statistical analysis of failure time data. Vol. 360. John Wiley & Sons, 2011. +Kalbfleisch, John D., and Ross L. Prentice. The statistical analysis of +failure time data. Vol. 360. John Wiley & Sons, 2011.

  8. -Cox, D. R. “Regression models and life tables.” Journal of the Royal Statistical Society 34 (1972): 187-220. +Cox, D. R. “Regression models and life tables.” Journal of the Royal +Statistical Society 34 (1972): 187-220.

@@ -816,11 +911,13 @@

@@ -829,5 +926,7 @@

+ + diff --git a/articles/smoothHazard_files/figure-html/unnamed-chunk-13-1.png b/articles/smoothHazard_files/figure-html/unnamed-chunk-13-1.png index 9172ae3a..5fd17277 100644 Binary files a/articles/smoothHazard_files/figure-html/unnamed-chunk-13-1.png and b/articles/smoothHazard_files/figure-html/unnamed-chunk-13-1.png differ diff --git a/articles/smoothHazard_files/figure-html/unnamed-chunk-15-1.png b/articles/smoothHazard_files/figure-html/unnamed-chunk-15-1.png index c4b53e00..792f235d 100644 Binary files a/articles/smoothHazard_files/figure-html/unnamed-chunk-15-1.png and b/articles/smoothHazard_files/figure-html/unnamed-chunk-15-1.png differ diff --git a/articles/smoothHazard_files/figure-html/unnamed-chunk-16-1.png b/articles/smoothHazard_files/figure-html/unnamed-chunk-16-1.png index d3976e4f..2143a426 100644 Binary files a/articles/smoothHazard_files/figure-html/unnamed-chunk-16-1.png and b/articles/smoothHazard_files/figure-html/unnamed-chunk-16-1.png differ diff --git a/articles/smoothHazard_files/figure-html/unnamed-chunk-2-1.png b/articles/smoothHazard_files/figure-html/unnamed-chunk-2-1.png index 57a87860..2fa55387 100644 Binary files a/articles/smoothHazard_files/figure-html/unnamed-chunk-2-1.png and b/articles/smoothHazard_files/figure-html/unnamed-chunk-2-1.png differ diff --git a/articles/smoothHazard_files/figure-html/unnamed-chunk-7-1.png b/articles/smoothHazard_files/figure-html/unnamed-chunk-7-1.png index 4bf571d3..77fdffb5 100644 Binary files a/articles/smoothHazard_files/figure-html/unnamed-chunk-7-1.png and b/articles/smoothHazard_files/figure-html/unnamed-chunk-7-1.png differ diff --git a/articles/smoothHazard_files/figure-html/unnamed-chunk-9-1.png b/articles/smoothHazard_files/figure-html/unnamed-chunk-9-1.png index 8b66d755..5d931a8c 100644 Binary files a/articles/smoothHazard_files/figure-html/unnamed-chunk-9-1.png and b/articles/smoothHazard_files/figure-html/unnamed-chunk-9-1.png differ diff --git a/articles/smoothHazard_files/figure-html/unnamed-chunk-9-2.png b/articles/smoothHazard_files/figure-html/unnamed-chunk-9-2.png index bd106f1c..e177860d 100644 Binary files a/articles/smoothHazard_files/figure-html/unnamed-chunk-9-2.png and b/articles/smoothHazard_files/figure-html/unnamed-chunk-9-2.png differ diff --git a/articles/smoothHazard_files/figure-html/unnamed-chunk-9-3.png b/articles/smoothHazard_files/figure-html/unnamed-chunk-9-3.png index 455f2308..13feb945 100644 Binary files a/articles/smoothHazard_files/figure-html/unnamed-chunk-9-3.png and b/articles/smoothHazard_files/figure-html/unnamed-chunk-9-3.png differ diff --git a/articles/smoothHazard_files/figure-html/unnamed-chunk-9-4.png b/articles/smoothHazard_files/figure-html/unnamed-chunk-9-4.png index a96db779..7b67fdd7 100644 Binary files a/articles/smoothHazard_files/figure-html/unnamed-chunk-9-4.png and b/articles/smoothHazard_files/figure-html/unnamed-chunk-9-4.png differ diff --git a/articles/smoothHazard_files/figure-html/unnamed-chunk-9-5.png b/articles/smoothHazard_files/figure-html/unnamed-chunk-9-5.png index 80a21e94..9c542e6b 100644 Binary files a/articles/smoothHazard_files/figure-html/unnamed-chunk-9-5.png and b/articles/smoothHazard_files/figure-html/unnamed-chunk-9-5.png differ diff --git a/articles/smoothHazard_files/figure-html/unnamed-chunk-9-6.png b/articles/smoothHazard_files/figure-html/unnamed-chunk-9-6.png index ce65650c..d8b4514a 100644 Binary files a/articles/smoothHazard_files/figure-html/unnamed-chunk-9-6.png and 
b/articles/smoothHazard_files/figure-html/unnamed-chunk-9-6.png differ diff --git a/articles/smoothHazard_files/figure-html/unnamed-chunk-9-7.png b/articles/smoothHazard_files/figure-html/unnamed-chunk-9-7.png index cce7e950..579f2e6b 100644 Binary files a/articles/smoothHazard_files/figure-html/unnamed-chunk-9-7.png and b/articles/smoothHazard_files/figure-html/unnamed-chunk-9-7.png differ diff --git a/articles/time-varying-covariates.html b/articles/time-varying-covariates.html new file mode 100644 index 00000000..d3213f50 --- /dev/null +++ b/articles/time-varying-covariates.html @@ -0,0 +1,286 @@ + + + + + + + +Time-Varying Covariates • casebase + + + + + + + + + + + + +
+
+ + + + +
+
+ + + + +

In the previous case studies, we only considered covariates that were fixed at baseline. In this next case study, we will use the Stanford Heart Transplant data (Clark et al. 1971) (Crowley and Hu 1977) to show how case-base sampling can also be used in the context of time-varying covariates. As an example that has already appeared in the literature, case-base sampling was used to study vaccination safety, where the exposure period was defined as the week following vaccination (Saarela 2015). Hence, the main covariate of interest, i.e. exposure to the vaccine, changed over time. In this context, case-base sampling offers an efficient alternative to nested case-control designs or self-matching.

+

Recall the setting of the Stanford Heart Transplant study: patients were admitted to the Stanford program after meeting with their physician and determining that they were unlikely to respond to other forms of treatment. After enrollment, the program searched for a suitable donor for the patient, which could take anywhere from a few days to almost a year. We are interested in the effect of a heart transplant on survival; therefore, a patient is considered exposed only after the transplant has occurred.
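Before sampling, it can be helpful to glance at the variables used below, namely the acceptance date, the transplant date (missing when no transplant took place), the follow-up time and the vital status. This is only an exploratory sketch, not part of the original vignette:

library(survival)
# Acceptance date, transplant date, follow-up time and status in jasa
head(jasa[, c("accept.dt", "tx.date", "futime", "fustat")])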

+

As before, we can look at the population-time plot for a graphical +summary of the event incidence. As we can see, most events occur early +during the follow-up period, and therefore we do not expect the hazard +to be constant.

+
+library(survival)
+library(casebase)
+
+stanford_popTime <- popTime(jasa, time = "futime", 
+                            event = "fustat")
+plot(stanford_popTime)
+

+

Since the exposure is time-dependent, we need to manually define the +exposure variable after case-base sampling and before +fitting the hazard function. For this reason, we will use the +sampleCaseBase function directly.

+
+library(dplyr)
+library(lubridate)
+
+cb_data <- sampleCaseBase(jasa, time = "futime", 
+                          event = "fustat", ratio = 10)
+

Next, we compute the number of days from acceptance into the program to transplant, and use this variable to determine whether each population-moment is exposed or not.

+
+# Define exposure variable
+cb_data <- mutate(cb_data,
+                  txtime = time_length(accept.dt %--% tx.date, 
+                                       unit = "days"),
+                  exposure = case_when(
+                    is.na(txtime) ~ 0L,
+                    txtime > futime ~ 0L,
+                    txtime <= futime ~ 1L
+                  ))
+

Finally, we can fit the hazard using various linear predictors.

+
+library(splines)
+# Fit several models
+fit1 <- fitSmoothHazard(fustat ~ exposure,
+                        data = cb_data, time = "futime")
+fit2 <- fitSmoothHazard(fustat ~ exposure + futime,
+                        data = cb_data, time = "futime")
+fit3 <- fitSmoothHazard(fustat ~ exposure + bs(futime),
+                        data = cb_data, time = "futime")
+fit4 <- fitSmoothHazard(fustat ~ exposure*bs(futime),
+                        data = cb_data, time = "futime")
+

Note that the fourth model (i.e. fit4) includes an interaction term between exposure and follow-up time. In other words, this model no longer exhibits proportional hazards. The evidence of non-proportional hazards in the Stanford Heart Transplant data has been widely discussed (Arjas 1988).

+

We can then compare the goodness of fit of these four models using +the Akaike Information Criterion (AIC).

+
+# Compute AIC
+c("Model1" = AIC(fit1),
+  "Model2" = AIC(fit2),
+  "Model3" = AIC(fit3),
+  "Model4" = AIC(fit4))
+#>   Model1   Model2   Model3   Model4 
+#> 493.2688 454.6240 441.4930 445.1944
+

As we can see, the third model provides the best fit.
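Since fit3 is nested within fit4, one could also test the exposure-by-time interaction, i.e. the departure from proportional hazards, with a likelihood-ratio test. This sketch is not part of the original vignette and assumes the usual anova method for glm objects applies to these fits:

# Likelihood-ratio test of the interaction term (fit4 vs. fit3)
anova(fit3, fit4, test = "LRT")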

+

By plotting the hazard functions for both exposed and unexposed individuals, we can visualize the non-proportionality of the fourth model.

+
+# Compute hazards---
+# First, create a list of time points for both exposure status
+hazard_data <- expand.grid(exposure = c(0, 1),
+                           futime = seq(0, 1000,
+                                        length.out = 100))
+# Set the offset to zero
+hazard_data$offset <- 0 
+# Use predict to get the fitted values, and exponentiate to 
+# transform to the right scale
+hazard_data$hazard = exp(predict(fit4, newdata = hazard_data,
+                                 type = "link"))
+# Add labels for plots
+hazard_data$Status = factor(hazard_data$exposure,
+                            labels = c("NoTrans", "Trans"))
+
+library(ggplot2)
+ggplot(hazard_data, aes(futime, hazard, colour = Status)) +
+    geom_line() +
+    theme_minimal() +
+    theme(legend.position = 'top') +
+    ylab('Hazard') + xlab('Follow-up time')
+

+

The non-proportionality seems to be more pronounced at the beginning of follow-up than at the end.
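One way to quantify this is to compute the hazard ratio implied by fit4 over follow-up time from the fitted hazards above. This is a small sketch, not part of the original vignette; the object hr is ours, and hazard_data is the data frame created in the previous chunk:

# Hazard ratio (transplant vs. no transplant) implied by fit4
hr <- hazard_data$hazard[hazard_data$exposure == 1] /
  hazard_data$hazard[hazard_data$exposure == 0]
plot(unique(hazard_data$futime), hr, type = "l",
     xlab = "Follow-up time", ylab = "Hazard ratio")
abline(h = 1, lty = 2)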

+
+

References +

+
+
+Arjas, Elja. 1988. “A Graphical Method for Assessing Goodness of +Fit in Cox’s Proportional Hazards Model.” Journal of the +American Statistical Association 83 (401): 204–12. +
+
+Clark, David A, Edward B Stinson, Randall B Griepp, John S Schroeder, +Norman E Shumway, and Donald Harrison. 1971. “Cardiac +Transplantation in Man.” Annals of Internal Medicine 75 +(1): 15–21. +
+
+Crowley, John, and Marie Hu. 1977. “Covariance Analysis of Heart +Transplant Survival Data.” Journal of the American +Statistical Association 72 (357): 27–36. +
+
+Saarela, Olli. 2015. “A Case-Base Sampling Method for Estimating +Recurrent Event Intensities.” Lifetime Data Analysis, +1–17. +
+
+
+
+ + + +
+ + + + +
+ + + + + + + + diff --git a/articles/time-varying-covariates_files/figure-html/stanford-hazard-1.png b/articles/time-varying-covariates_files/figure-html/stanford-hazard-1.png new file mode 100644 index 00000000..95303ad4 Binary files /dev/null and b/articles/time-varying-covariates_files/figure-html/stanford-hazard-1.png differ diff --git a/articles/time-varying-covariates_files/figure-html/stanford-poptime-1.png b/articles/time-varying-covariates_files/figure-html/stanford-poptime-1.png new file mode 100644 index 00000000..ec155802 Binary files /dev/null and b/articles/time-varying-covariates_files/figure-html/stanford-poptime-1.png differ diff --git a/authors.html b/authors.html index c9e7d49c..3548e99b 100644 --- a/authors.html +++ b/authors.html @@ -1,66 +1,12 @@ - - - - - - - -Citation and Authors • casebase - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Authors and Citation • casebase - - - - + + -
-
- -
- -
+
-
-
- - + + diff --git a/index.html b/index.html index 8c997fcf..842a8720 100644 --- a/index.html +++ b/index.html @@ -27,6 +27,8 @@ + +
-
- +
+

casebase is an R package for fitting flexible and fully parametric hazard regression models to survival data with a single event type or multiple competing causes via logistic and multinomial regression. Our formulation allows for arbitrary functional forms of time and its interactions with other predictors for time-dependent hazards and hazard ratios. From the fitted hazard model, we provide functions to readily calculate and plot cumulative incidence and survival curves for a given covariate profile. This approach accommodates any log-linear hazard function of prognostic time, treatment, and covariates, and readily allows for non-proportionality. We also provide a plot method for visualizing incidence density via population time plots.

-
-

-Installation

-

You can install the released version of casebase from CRAN with:

+
+

Installation +

+

You can install the released version of casebase from CRAN with:

-install.packages("casebase")
-

And the development version from GitHub with:

+install.packages("casebase")
+

And the development version from GitHub with:

-# install.packages("devtools")
-devtools::install_github("sahirbhatnagar/casebase")
+# install.packages("devtools") +devtools::install_github("sahirbhatnagar/casebase")
-
-

-Vignettes

-

See the package website for example usage of the functions. This includes

-
    -
  1. Fitting Smooth Hazard Functions
  2. -
  3. Competing Risks Analysis
  4. -
  5. Population Time Plots
  6. -
  7. Customizing Population Time Plots
  8. -
  9. Plot Hazards and Hazard Ratios
  10. -
  11. Plot Cumulative Incidence and Survival Curves
  12. + -
    -

    -useR! 2019 Toulouse - Presentation

    -

    Jesse useR

    +
    +

    useR! 2019 Toulouse - Presentation +

    +

    Jesse useR

    -
    -

    -Quickstart

    -

    This is a basic example which shows you some of the main functionalities of the casebase package. We use data from the estrogen plus progestin trial from the Women’s Health Initiative (included in the casebase package). This randomized clinical trial investigated the effect of estrogen plus progestin (estPro) on coronary heart disease (CHD) risk in 16,608 postmenopausal women who were 50 to 79 years of age at base line. Participants were randomly assigned to receive estPro or placebo. The primary efficacy outcome of the trial was CHD (nonfatal myocardial infarction or death due to CHD).

    +
    +

    Quickstart +

    +

This is a basic example that shows some of the main functionalities of the casebase package. We use data from the estrogen plus progestin trial of the Women’s Health Initiative (included in the casebase package). This randomized clinical trial investigated the effect of estrogen plus progestin (estPro) on coronary heart disease (CHD) risk in 16,608 postmenopausal women who were 50 to 79 years of age at baseline. Participants were randomly assigned to receive estPro or placebo. The primary efficacy outcome of the trial was CHD (nonfatal myocardial infarction or death due to CHD).

    -library(casebase)
    -#> See example usage at http://sahirbhatnagar.com/casebase/
    -library(visreg)
    -library(splines)
    -data("eprchd")
    -
    -

    -Population Time Plots

+library(casebase)
+#> See example usage at http://sahirbhatnagar.com/casebase/
+library(visreg)
+library(splines)
+data("eprchd")
    +
    +

    Population Time Plots +

    We first visualize the data with a population time plot. For each treatment arm, we plot the observed person time in gray, and the case series as colored dots. It gives us a good visual representation of the incidence density:

    -plot(popTime(eprchd, exposure = "treatment"))
    +plot(popTime(eprchd, exposure = "treatment"))

    -
    -

    -Fit a Smooth Hazard Model

    +
    +

    Fit a Smooth Hazard Model +

    We model the hazard as a function of time, treatment arm and their interaction:

    -eprchd <- transform(eprchd, 
    -                    treatment = factor(treatment, levels = c("placebo","estPro")))
    -
    -fit <- fitSmoothHazard(status ~ treatment*ns(time, df = 3),
    -                       data = eprchd,
    -                       time = "time")
    -summary(fit)
    -#> Fitting smooth hazards with case-base sampling
    -#> 
    -#> Sample size: 16608 
    -#> Number of events: 324 
    -#> Number of base moments: 32400 
    -#> ----
    -#> 
    -#> Call:
    -#> fitSmoothHazard(formula = status ~ treatment * ns(time, df = 3), 
    -#>     data = eprchd, time = "time")
    -#> 
    -#> Deviance Residuals: 
    -#>     Min       1Q   Median       3Q      Max  
    -#> -0.2479  -0.1482  -0.1373  -0.1268   3.1490  
    -#> 
    -#> Coefficients:
    -#>                                   Estimate Std. Error z value Pr(>|z|)    
    -#> (Intercept)                       -5.85621    0.30013 -19.512  < 2e-16 ***
    -#> treatmentestPro                    0.66414    0.37783   1.758   0.0788 .  
    -#> ns(time, df = 3)1                 -0.41110    0.35944  -1.144   0.2527    
    -#> ns(time, df = 3)2                  0.77792    0.73254   1.062   0.2883    
    -#> ns(time, df = 3)3                  1.37771    0.34131   4.036 5.43e-05 ***
    -#> treatmentestPro:ns(time, df = 3)1  0.07468    0.48613   0.154   0.8779    
    -#> treatmentestPro:ns(time, df = 3)2 -1.40359    0.94189  -1.490   0.1362    
    -#> treatmentestPro:ns(time, df = 3)3 -1.06823    0.48634  -2.196   0.0281 *  
    -#> ---
    -#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    -#> 
    -#> (Dispersion parameter for binomial family taken to be 1)
    -#> 
    -#>     Null deviance: 3635.4  on 32723  degrees of freedom
    -#> Residual deviance: 3613.4  on 32716  degrees of freedom
    -#> AIC: 3629.4
    -#> 
    -#> Number of Fisher Scoring iterations: 7
+eprchd <- transform(eprchd, 
+                    treatment = factor(treatment, levels = c("placebo","estPro")))
+
+fit <- fitSmoothHazard(status ~ treatment*ns(time, df = 3),
+                       data = eprchd,
+                       time = "time")
+summary(fit)
+#> Fitting smooth hazards with case-base sampling
+#> 
+#> Sample size: 16608 
+#> Number of events: 324 
+#> Number of base moments: 32400 
+#> ----
+#> 
+#> Call:
+#> fitSmoothHazard(formula = status ~ treatment * ns(time, df = 3), 
+#>     data = eprchd, time = "time")
+#> 
+#> Deviance Residuals: 
+#>     Min       1Q   Median       3Q      Max  
+#> -0.2344  -0.1483  -0.1380  -0.1267   3.1466  
+#> 
+#> Coefficients:
+#>                                   Estimate Std. Error z value Pr(>|z|)    
+#> (Intercept)                       -5.80408    0.30090 -19.289  < 2e-16 ***
+#> treatmentestPro                    0.59882    0.37743   1.587 0.112608    
+#> ns(time, df = 3)1                 -0.36224    0.35878  -1.010 0.312654    
+#> ns(time, df = 3)2                  0.58913    0.73465   0.802 0.422601    
+#> ns(time, df = 3)3                  1.26391    0.34033   3.714 0.000204 ***
+#> treatmentestPro:ns(time, df = 3)1  0.05593    0.48546   0.115 0.908274    
+#> treatmentestPro:ns(time, df = 3)2 -1.19576    0.94034  -1.272 0.203506    
+#> treatmentestPro:ns(time, df = 3)3 -0.97627    0.48228  -2.024 0.042942 *  
+#> ---
+#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
+#> 
+#> (Dispersion parameter for binomial family taken to be 1)
+#> 
+#>     Null deviance: 3635.4  on 32723  degrees of freedom
+#> Residual deviance: 3615.2  on 32716  degrees of freedom
+#> AIC: 3631.2
+#> 
+#> Number of Fisher Scoring iterations: 7

    Since the output object from fitSmoothHazard inherits from the glm class, we see a familiar result when using the function summary.
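It also means the usual glm machinery applies directly to the fitted object. A small illustration, not part of the original README; the coefficient name is taken from the summary output above:

coef(fit)["treatmentestPro"]            # treatment coefficient (log scale)
confint.default(fit, "treatmentestPro") # Wald confidence interval
AIC(fit)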

    -
    -

    -Time-Dependent Hazard Function

    +
    +

    Time-Dependent Hazard Function +

    The treatment effect on the hazard is somewhat difficult to interpret because of its interaction with the spline term on time. In these situations, it is often more instructive to visualize the relationship. For example, we can easily plot the hazard function for each treatment arm:

    -plot(fit, hazard.params = list(xvar = "time", by = "treatment"))
    -#> Conditions used in construction of plot
    -#> treatment: placebo / estPro
    -#> offset: 0
    +plot(fit, hazard.params = list(xvar = "time", by = "treatment")) +#> Conditions used in construction of plot +#> treatment: placebo / estPro +#> offset: 0

    -
    -

    -Time-Dependent Hazard Ratio

    +
    +

    Time-Dependent Hazard Ratio +

    We can also plot the time-dependent hazard ratio and 95% confidence band:

    -newtime <- quantile(eprchd$time, 
    -                    probs = seq(0.01, 0.99, 0.01))
    -
    -# reference category
    -newdata <- data.frame(treatment = factor("placebo", 
    -                                         levels = c("placebo", "estPro")), 
    -                      time = newtime)
    -
    -plot(fit, 
    -     type = "hr", 
    -     newdata = newdata,
    -     var = "treatment",
    -     increment = 1,
    -     xvar = "time",
    -     ci = T,
    -     rug = T)
+newtime <- quantile(eprchd$time, 
+                    probs = seq(0.01, 0.99, 0.01))
+
+# reference category
+newdata <- data.frame(treatment = factor("placebo", 
+                                         levels = c("placebo", "estPro")), 
+                      time = newtime)
+
+plot(fit, 
+     type = "hr", 
+     newdata = newdata,
+     var = "treatment",
+     increment = 1,
+     xvar = "time",
+     ci = T,
+     rug = T)

    -
    -

    -Cumulative Incidence Function (CIF)

    +
    +

    Cumulative Incidence Function (CIF) +

    We can also calculate and plot the cumulative incidence function:

    -smooth_risk <- absoluteRisk(object = fit, 
    -                            newdata = data.frame(treatment = c("placebo", "estPro")))
    -
    -plot(smooth_risk, id.names = c("placebo", "estPro"))
+smooth_risk <- absoluteRisk(object = fit, 
+                            newdata = data.frame(treatment = c("placebo", "estPro")))
+
+plot(smooth_risk, id.names = c("placebo", "estPro"))

    -
    -

    -Class structure

    +
    +

    Class structure +

    The casebase package uses the following hierarchy of classes for the output of fitSmoothHazard:

    - +
    casebase:
    +  singleEventCB:
    +    - glm
    +    - gam
    +    - cv.glmnet
    +  CompRisk:
    +    - vglm

The class singleEventCB is an S3 class; the fitted object also retains the class of the underlying fit shown below it in the hierarchy (glm, gam, or cv.glmnet). The class CompRisk is an S4 class that inherits from vglm.
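For instance, one can verify this on the fit object from the Quickstart above (a small illustration, not part of the original README):

class(fit)           # should include "singleEventCB" and "glm"
inherits(fit, "glm") # TRUE, so glm methods dispatch on it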

    -
    -

    -Credit

    +
    +

    Credit +

This package makes use of several existing packages, including:

    • -VGAM for fitting multinomial logistic regression models
    • +VGAM for fitting multinomial logistic regression models
    • -survival for survival models
    • +survival for survival models
    • -ggplot2 for plotting the population time plots
    • +ggplot2 for plotting the population time plots
    • -data.table for efficient handling of large datasets
    • +data.table for efficient handling of large datasets

    Other packages with similar objectives but different parametric forms:

    -
    -

    -Citation

    +
    +

    Citation +

    -citation('casebase')
    -#> 
    -#> To cite casebase in publications use:
    -#> 
    -#> Bhatnagar S, Turgeon M, Islam J, Saarela O, Hanley J (2020). _casebase:
    -#> Fitting Flexible Smooth-in-Time Hazards and Risk Functions via Logistic
    -#> and Multinomial Regression_. R package version 0.9.0, <URL:
    -#> https://CRAN.R-project.org/package=casebase>.
    -#> 
    -#>   Hanley, James A., and Olli S. Miettinen. Fitting smooth-in-time
    -#>   prognostic risk functions via logistic regression. International
    -#>   Journal of Biostatistics 5.1 (2009): 1125-1125.
    -#> 
    -#>   Saarela, Olli. A case-base sampling method for estimating recurrent
    -#>   event intensities. Lifetime data analysis 22.4 (2016): 589-605.
    -#> 
    -#> If competing risks analyis is used, please also cite
    -#> 
    -#>   Saarela, Olli, and Elja Arjas. Non-parametric Bayesian Hazard
    -#>   Regression for Chronic Disease Risk Assessment. Scandinavian Journal
    -#>   of Statistics 42.2 (2015): 609-626.
    -#> 
    -#> To see these entries in BibTeX format, use 'print(<citation>,
    -#> bibtex=TRUE)', 'toBibtex(.)', or set
    -#> 'options(citation.bibtex.max=999)'.
+citation('casebase')
+#> 
+#> To cite casebase in publications use:
+#> 
+#> Bhatnagar S, Turgeon M, Islam J, Saarela O, Hanley J (2020).
+#> _casebase: Fitting Flexible Smooth-in-Time Hazards and Risk Functions
+#> via Logistic and Multinomial Regression_. R package version 0.9.0,
+#> <https://CRAN.R-project.org/package=casebase>.
+#> 
+#> Hanley, James A., and Olli S. Miettinen. Fitting smooth-in-time
+#> prognostic risk functions via logistic regression. International
+#> Journal of Biostatistics 5.1 (2009): 1125-1125.
+#> 
+#> Saarela, Olli. A case-base sampling method for estimating recurrent
+#> event intensities. Lifetime data analysis 22.4 (2016): 589-605.
+#> 
+#> If competing risks analyis is used, please also cite
+#> 
+#> Saarela, Olli, and Elja Arjas. Non-parametric Bayesian Hazard
+#> Regression for Chronic Disease Risk Assessment. Scandinavian Journal
+#> of Statistics 42.2 (2015): 609-626.
+#> 
+#> To see these entries in BibTeX format, use 'print(<citation>,
+#> bibtex=TRUE)', 'toBibtex(.)', or set
+#> 'options(citation.bibtex.max=999)'.
    -
    -

    -Contact

    + -
    -

    -Latest news

    +
    +

    Latest news +

    You can see the most recent changes to the package in the NEWS file

    -
    -

    -Code of Conduct

    +
    +

    Code of Conduct +

    Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

    @@ -344,71 +348,62 @@

    + + +

    @@ -417,5 +412,7 @@

    Dev status

    + + diff --git a/news/index.html b/news/index.html index a6df3de2..4615495d 100644 --- a/news/index.html +++ b/news/index.html @@ -1,66 +1,12 @@ - - - - - - - -Changelog • casebase - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Changelog • casebase - - + + - - -
    -
    - -
    - -
    +
    -
    -

    -casebase 0.9.1.9999 Unreleased -

    - -
    -
    -

    -casebase 0.9.1 2021-02-07 -

    -
      -
    • Fixed issue with plot.singleEventCB() when visreg package is not loaded.
    • +
      + +
      • Fixed noLD checks issue as reported by CRAN (Issue 156).
      • +
      +
      + +
      • Fixed issue 143 and return the data invisibly with plot.singleEventCB() when type = "hr".
      • +
      • Removed family = "gbm" as it wasn’t properly tested.
      • +
      • Added confint.singleEventCB to compute confidence bands for the risk (or survival) function.
      • +
      • Updated ERSPC data so that the exposure variable is categorical. This may break previous code explicitly making this conversion, or somehow relying on the numerical coding.
      • +
      +
      + +
      • Fixed issue with plot.singleEventCB() when visreg package is not loaded.
      • Improved error message when using family = "glmnet" with a single covariate.
      • Introduced summary method for objects of class singleEventCB, and improved the output of print by displaying the appropriate function call.
      • -
      -
      -
      -

      -casebase 0.9.0 2020-07-03 -

      +
    +
    +

    This is a Major new release

    -
    -

    -Breaking changes

    -
      -
    • The output of absoluteRisk() now always contains the time variable in the first column, regardless of the length of time. This will break earlier code that depended on the previous behaviour.
    • -
    • Population time plots now use ggplot2::geom_ribbon() instead of ggplot2::geom_segment().
    • +
      +

      Breaking changes

      +
      • The output of absoluteRisk() now always contains the time variable in the first column, regardless of the length of time. This will break earlier code that depended on the previous behaviour.
      • +
      • Population time plots now use ggplot2::geom_ribbon() instead of ggplot2::geom_segment().
      • Population time functions now allow for more flexible plots with user defined arguments including sequentially adding base, case, and competing event series. These are now passed as a list to the *.params arguments. Several arguments are now deprecated.
      • -
      • Removed popTimeExposure class and the corresponding plot method. popTime() now returns an exposure attribute which contains the name of the exposure variable in the dataset. The plot method for objects of class popTime will use this exposure attribute to create exposure stratified population time plots.
      • -
      -
      -
      -

      -New features

      -
        -
      • Major refactoring of absoluteRisk(). Trapezoidal rule to perform numerical integration for absolute risk estimation, providing significant speed up.
      • -
      • Users now have further control on the output of absoluteRisk() using the arguments type and addZero.
      • -
      • New plotting method for time-dependent hazard functions and hazard ratios. These include confidence intervals. See plot.singleEventCB(). The hazard function plot requires the visreg package to be installed.
      • -
      • New plotting method for cumulative incidence and survival curves. See plot.absRiskCB().
      • -
      • When time is unspecified, absoluteRisk() now computes the cumulative incidence at ntimes equidistant points between 0 and the max failure time.
      • +
      • Removed popTimeExposure class and the corresponding plot method. popTime() now returns an exposure attribute which contains the name of the exposure variable in the dataset. The plot method for objects of class popTime will use this exposure attribute to create exposure stratified population time plots.
      • +
      +
      +

      New features

      +
      • Major refactoring of absoluteRisk(). Trapezoidal rule to perform numerical integration for absolute risk estimation, providing significant speed up.
      • +
      • Users now have further control on the output of absoluteRisk() using the arguments type and addZero.
      • +
      • New plotting method for time-dependent hazard functions and hazard ratios. These include confidence intervals. See plot.singleEventCB(). The hazard function plot requires the visreg package to be installed.
      • +
      • New plotting method for cumulative incidence and survival curves. See plot.absRiskCB().
      • +
      • When time is unspecified, absoluteRisk() now computes the cumulative incidence at ntimes equidistant points between 0 and the max failure time.
      • -absoluteRisk() can now compute the cumulative incidence for a "typical" covariate profile with newdata = "typical". “Typical” corresponds to the median for continuous variables and the mode for factors (each variable is summarised independently).
      • +absoluteRisk() can now compute the cumulative incidence for a "typical" covariate profile with newdata = "typical". “Typical” corresponds to the median for continuous variables and the mode for factors (each variable is summarised independently).
      • Added eprchd, brcancer, support and simdat datasets to the package.
      • Implemented riskRegression::predictRisk() method for singleEventCB objects.
      • -
      -
      -
      -

      -Minor improvements and fixes

      -
        -
      • No longer importing the entire namespace of data.table and ggplot2.
      • +
      +
      +

      Minor improvements and fixes

      +
      • No longer importing the entire namespace of data.table and ggplot2.
      • Moved from make the docs to pkgdown for package website.
      • A warning is given when family="gbm" and nonlinear functions of time or interactions are specified.
      • Add singleEventCB class to object returned by fitSmoothHazard()
      • -
      • Add absRiskCB class to object returned by absoluteRisk() +
      • Add absRiskCB class to object returned by absoluteRisk()
      • -
      • Use glmnet::prepareX to convert factors into indicator variables
      • -
      +
    • Use glmnet::prepareX to convert factors into indicator variables
    • +
    -
    -
    -

    -casebase 0.1.0 2017-04-28 -

    -
      -
    • Added a NEWS.md file to track changes to the package.
    • +
      + +
      • Added a NEWS.md file to track changes to the package.
      • First release of the casebase package
      • -
      -
      +
    +
    -
    - - + + diff --git a/pkgdown.css b/pkgdown.css index 1273238d..80ea5b83 100644 --- a/pkgdown.css +++ b/pkgdown.css @@ -56,8 +56,10 @@ img.icon { float: right; } -img { +/* Ensure in-page images don't run outside their container */ +.contents img { max-width: 100%; + height: auto; } /* Fix bug in bootstrap (only seen in firefox) */ @@ -78,11 +80,10 @@ dd { /* Section anchors ---------------------------------*/ a.anchor { - margin-left: -30px; - display:inline-block; - width: 30px; - height: 30px; - visibility: hidden; + display: none; + margin-left: 5px; + width: 20px; + height: 20px; background-image: url(./link.svg); background-repeat: no-repeat; @@ -90,17 +91,15 @@ a.anchor { background-position: center center; } -.hasAnchor:hover a.anchor { - visibility: visible; -} - -@media (max-width: 767px) { - .hasAnchor:hover a.anchor { - visibility: hidden; - } +h1:hover .anchor, +h2:hover .anchor, +h3:hover .anchor, +h4:hover .anchor, +h5:hover .anchor, +h6:hover .anchor { + display: inline-block; } - /* Fixes for fixed navbar --------------------------*/ .contents h1, .contents h2, .contents h3, .contents h4 { @@ -264,31 +263,26 @@ table { /* Syntax highlighting ---------------------------------------------------- */ -pre { - word-wrap: normal; - word-break: normal; - border: 1px solid #eee; -} - -pre, code { +pre, code, pre code { background-color: #f8f8f8; color: #333; } +pre, pre code { + white-space: pre-wrap; + word-break: break-all; + overflow-wrap: break-word; +} -pre code { - overflow: auto; - word-wrap: normal; - white-space: pre; +pre { + border: 1px solid #eee; } -pre .img { +pre .img, pre .r-plt { margin: 5px 0; } -pre .img img { +pre .img img, pre .r-plt img { background-color: #fff; - display: block; - height: auto; } code a, pre a { @@ -305,9 +299,8 @@ a.sourceLine:hover { .kw {color: #264D66;} /* keyword */ .co {color: #888888;} /* comment */ -.message { color: black; font-weight: bolder;} -.error { color: orange; font-weight: bolder;} -.warning { color: #6A0366; font-weight: bolder;} +.error {font-weight: bolder;} +.warning {font-weight: bolder;} /* Clipboard --------------------------*/ @@ -365,3 +358,27 @@ mark { content: ""; } } + +/* Section anchors --------------------------------- + Added in pandoc 2.11: https://github.com/jgm/pandoc-templates/commit/9904bf71 +*/ + +div.csl-bib-body { } +div.csl-entry { + clear: both; +} +.hanging-indent div.csl-entry { + margin-left:2em; + text-indent:-2em; +} +div.csl-left-margin { + min-width:2em; + float:left; +} +div.csl-right-inline { + margin-left:2em; + padding-left:1em; +} +div.csl-indent { + margin-left: 2em; +} diff --git a/pkgdown.js b/pkgdown.js index 7e7048fa..6f0eee40 100644 --- a/pkgdown.js +++ b/pkgdown.js @@ -80,7 +80,7 @@ $(document).ready(function() { var copyButton = ""; - $(".examples, div.sourceCode").addClass("hasCopyButton"); + $("div.sourceCode").addClass("hasCopyButton"); // Insert copy buttons: $(copyButton).prependTo(".hasCopyButton"); @@ -91,7 +91,7 @@ // Initialize clipboard: var clipboardBtnCopies = new ClipboardJS('[data-clipboard-copy]', { text: function(trigger) { - return trigger.parentNode.textContent; + return trigger.parentNode.textContent.replace(/\n#>[^\n]*/g, ""); } }); diff --git a/pkgdown.yml b/pkgdown.yml index a1da0c30..081e22ca 100644 --- a/pkgdown.yml +++ b/pkgdown.yml @@ -1,5 +1,5 @@ -pandoc: '2.2' -pkgdown: 1.6.1 +pandoc: 2.19.2 +pkgdown: 2.0.7 pkgdown_sha: ~ articles: competingRisk: competingRisk.html @@ -8,8 +8,9 @@ articles: plotsmoothHazard: plotsmoothHazard.html popTime: 
popTime.html smoothHazard: smoothHazard.html -last_built: 2021-02-08T17:41Z + time-varying-covariates: time-varying-covariates.html +last_built: 2023-08-03T16:00Z urls: - reference: http://sahirbhatnagar.com/casebase/reference - article: http://sahirbhatnagar.com/casebase/articles + reference: https://sahirbhatnagar.com/casebase/reference + article: https://sahirbhatnagar.com/casebase/articles diff --git a/reference/CompRisk-class.html b/reference/CompRisk-class.html index c43dee31..e679e070 100644 --- a/reference/CompRisk-class.html +++ b/reference/CompRisk-class.html @@ -1,67 +1,12 @@ - - - - - - - -An S4 class to store the output of fitSmoothHazard — CompRisk-class • casebase - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -An S4 class to store the output of fitSmoothHazard — CompRisk-class • casebase - - - - + + -
    -
    - -
    - -
    +
    @@ -147,69 +86,71 @@

    An S4 class to store the output of fitSmoothHazard

    This class inherits from vglm-class.

    -
    summary(object, ...)
    +    
    +
    summary(object, ...)
    +
    +# S4 method for CompRisk
    +summary(object)
    +
    -# S4 method for CompRisk -summary(object)
    +
    +

    Arguments

    +
    object
    +

    Object of class CompRisk

    -

    Arguments

    - - - - - - - - - - -
    object

    Object of class CompRisk

    ...

    Extra parameters

    -

    Slots

    +
    ...
    +

    Extra parameters

    +
    +
    +

    Slots

    -
    -
    originalData

    Data.frame containing the original data (i.e. before -case-base sampling). This is used by the absoluteRisk +

    originalData
    +

    Data.frame containing the original data (i.e. before +case-base sampling). This is used by the absoluteRisk function.

    -
    typeEvents

    Numeric factor which encodes the type of events being + +

    typeEvents
    +

    Numeric factor which encodes the type of events being considered (including censoring).

    -
    timeVar

    Character string giving the name of the time variable, as + +

    timeVar
    +

    Character string giving the name of the time variable, as appearing in originalData

    -
    eventVar

    Character string giving the name of the event variable, as + +

    eventVar
    +

    Character string giving the name of the event variable, as appearing in originalData

    -
    + +
    +
    -
    - - + + diff --git a/reference/ERSPC-1.png b/reference/ERSPC-1.png index 117d2ebf..95f3169f 100644 Binary files a/reference/ERSPC-1.png and b/reference/ERSPC-1.png differ diff --git a/reference/ERSPC.html b/reference/ERSPC.html index 733953c6..bea80d68 100644 --- a/reference/ERSPC.html +++ b/reference/ERSPC.html @@ -1,68 +1,13 @@ - - - - - - - -Data on the men in the European Randomized Study of Prostate Cancer Screening — ERSPC • casebase - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Data on the men in the European Randomized Study of Prostate Cancer Screening — ERSPC • casebase - + + - - - -
    -
    - -
    - -
    +
    @@ -149,21 +88,26 @@

Data on the men in the European Randomized Study of Prostate Cancer Screening, in the core age group between the ages of 55 and 69 years at entry.

    -
    ERSPC
    - - -

    Format

    +
    +
    ERSPC
    +
    -

    A data frame with 159,893 observations on the following 3 variables:

    -
    ScrArm

    Whether in Screening Arm (1) or non-Screening arm -(0) (numeric)

    Follow.Up.Time

    The time, measured in years +

    +

    Format

    +

    A data frame with 159,893 observations on the following 3 variables:

    ScrArm
    +

    Whether in Screening Arm (1) or non-Screening arm +(0) (numeric)

    +
    Follow.Up.Time
    +

    The time, measured in years from randomization, at which follow-up was terminated

    -
    DeadOfPrCa

    Whether follow-up was terminated by Death from Prostate -Cancer (1) or by death from other causes, or administratively (0)

    -
    - -

    Source

    +
    DeadOfPrCa
    +

    Whether follow-up was terminated by Death from Prostate +Cancer (1) or by death from other causes, or administratively (0)

    + +
    +
    +

    Source

The individual censored values were recovered by James Hanley from the PostScript code that the NEJM article (Schroder et al., 2009) used to render Figure 2 (see Liu et al., 2014, for details). The uncensored values @@ -181,8 +125,9 @@

    Source -

    Details

    - +
    +
    +

    Details

    The men were recruited from seven European countries (centers). Each centre began recruitment at a different time, ranging from 1991 to 1998. The last entry was in December 2003. The uniform censoring date was @@ -205,55 +150,54 @@

    Details recruitment - in December 2003 - the minimum potential follow-up is three years. Tracked further forwards in time (i.e. after year 3) the attrition is a combination of deaths and staggered entries.

    -

    References

    - +
    +
    +

    References

    Liu Z, Rich B, Hanley JA. Recovering the raw data behind a non-parametric survival curve. Systematic Reviews 2014; 3:151. -doi: 10.1186/2046-4053-3-151 +doi:10.1186/2046-4053-3-151 .

    Schroder FH, et al., for the ERSPC Investigators. Screening and Prostate-Cancer Mortality in a Randomized European Study. N Engl J Med -2009; 360:1320-8. doi: 10.1056/NEJMoa0810084 +2009; 360:1320-8. doi:10.1056/NEJMoa0810084 .

    +
    -

    Examples

    -
    data("ERSPC") -ERSPC$ScrArm <- factor(ERSPC$ScrArm, - levels = c(0,1), - labels = c("Control group", "Screening group")) -set.seed(12345) -pt_object_strat <- casebase::popTime(ERSPC[sample(1:nrow(ERSPC), 10000),], - event = "DeadOfPrCa", - exposure = "ScrArm") -
    #> 'Follow.Up.Time' will be used as the time variable
    -plot(pt_object_strat, - facet.params = list(ncol = 2)) -
    +
    +

    Examples

    +
    data("ERSPC")
    +set.seed(12345)
    +pt_object_strat <- casebase::popTime(ERSPC[sample(1:nrow(ERSPC), 10000),],
    +                                     event = "DeadOfPrCa",
    +                                     exposure = "ScrArm")
    +#> 'Follow.Up.Time' will be used as the time variable
    +
    +plot(pt_object_strat,
    +     facet.params = list(ncol = 2))
    +
    +
    +
    +
    -
- - + + diff --git a/reference/absoluteRisk-1.png b/reference/absoluteRisk-1.png index 0a916139..8c493ccb 100644 Binary files a/reference/absoluteRisk-1.png and b/reference/absoluteRisk-1.png differ diff --git a/reference/absoluteRisk.html b/reference/absoluteRisk.html index 5afd8adc..02d7f41e 100644 --- a/reference/absoluteRisk.html +++ b/reference/absoluteRisk.html @@ -1,72 +1,17 @@ - - - - - - - -Compute absolute risks using the fitted hazard function. — absoluteRisk.CompRisk • casebase - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Compute absolute risks using the fitted hazard function. — absoluteRisk.CompRisk • casebase - - - - - - - - - + + - - - - -
-
- -
- -
+
@@ -157,153 +96,151 @@

Compute absolute risks using the fitted hazard function.

functions.

-
absoluteRisk.CompRisk(
-  object,
-  time,
-  newdata,
-  method = c("numerical", "montecarlo"),
-  nsamp = 100,
-  onlyMain = TRUE,
-  type = c("CI", "survival"),
-  addZero = TRUE
-)
-
-absoluteRisk(
-  object,
-  time,
-  newdata,
-  method = c("numerical", "montecarlo"),
-  nsamp = 100,
-  s = c("lambda.1se", "lambda.min"),
-  n.trees,
-  onlyMain = TRUE,
-  type = c("CI", "survival"),
-  addZero = TRUE,
-  ntimes = 100,
-  ...
-)
-
-# S3 method for absRiskCB
-print(x, ...)
-
-# S3 method for absRiskCB
-plot(
-  x,
-  ...,
-  xlab = "time",
-  ylab = ifelse(attr(x, "type") == "CI", "cumulative incidence",
-    "survival probability"),
-  type = "l",
-  gg = TRUE,
-  id.names,
-  legend.title
-)
- -

Arguments

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
object

Output of function fitSmoothHazard.

time

A vector of time points at which we should compute the absolute -risks.

newdata

Optionally, a data frame in which to look for variables with +

+
absoluteRisk.CompRisk(
+  object,
+  time,
+  newdata,
+  method = c("numerical", "montecarlo"),
+  nsamp = 100,
+  onlyMain = TRUE,
+  type = c("CI", "survival"),
+  addZero = TRUE
+)
+
+absoluteRisk(
+  object,
+  time,
+  newdata,
+  method = c("numerical", "montecarlo"),
+  nsamp = 100,
+  s = c("lambda.1se", "lambda.min"),
+  onlyMain = TRUE,
+  type = c("CI", "survival"),
+  addZero = TRUE,
+  ntimes = 100,
+  ...
+)
+
+# S3 method for absRiskCB
+print(x, ...)
+
+# S3 method for absRiskCB
+plot(
+  x,
+  ...,
+  xlab = "time",
+  ylab = ifelse(attr(x, "type") == "CI", "cumulative incidence", "survival probability"),
+  type = "l",
+  gg = TRUE,
+  id.names,
+  legend.title
+)
+
+ +
+

Arguments

+
object
+

Output of function fitSmoothHazard.

+ + +
time
+

A vector of time points at which we should compute the absolute +risks.

+ + +
newdata
+

Optionally, a data frame in which to look for variables with which to predict. If omitted, the mean absolute risk is returned. Alternatively, if newdata = "typical", the absolute risk will be -computed at a "typical" covariate profile (see Details).

method

Method used for integration. Defaults to "numerical", +computed at a "typical" covariate profile (see Details).

+ + +
method
+

Method used for integration. Defaults to "numerical", which uses the trapezoidal rule to integrate over all time points together. The only other option is "montecarlo", which implements Monte-Carlo -integration.

nsamp

Maximal number of subdivisions (if method = "numerical") -or number of sampled points (if method = "montecarlo").

onlyMain

Logical. For competing risks, should we return absolute risks -only for the main event of interest? Defaults to TRUE.

type

Line type. Only used if gg = FALSE. This argument gets passed -to graphics::matplot(). Default: 'l'

addZero

Logical. Should we add time = 0 at the beginning of the -output? Defaults to TRUE.

s

Value of the penalty parameter lambda at which predictions are -required (for class cv.glmnet).

n.trees

Number of trees used in the prediction (for class gbm).

ntimes

Number of time points (only used if time is missing).

...

further arguments passed to matplot. Only used if -gg=FALSE.

x

Fitted object of class absRiskCB. This is the result from the -absoluteRisk() function.

xlab

xaxis label, Default: 'time'

ylab

yaxis label. By default, this will use the "type" attribute of -the absRiskCB object

gg

Logical for whether the ggplot2 package should be used for -plotting. Default: TRUE

id.names

Optional character vector used as legend key when gg=TRUE. -If missing, defaults to V1, V2, ...

legend.title

Optional character vector of the legend title. Only used -if gg = FALSE. Default is 'ID'

- -

Value

- -

If time was provided, returns the estimated absolute risk for +integration.

+ + +
nsamp
+

Maximal number of subdivisions (if method = "numerical") +or number of sampled points (if method = "montecarlo").

+ + +
onlyMain
+

Logical. For competing risks, should we return absolute risks +only for the main event of interest? Defaults to TRUE.

+ + +
type
+

Line type. Only used if gg = FALSE. This argument gets passed +to graphics::matplot(). Default: 'l'

+ + +
addZero
+

Logical. Should we add time = 0 at the beginning of the +output? Defaults to TRUE.

+ + +
s
+

Value of the penalty parameter lambda at which predictions are +required (for class cv.glmnet).

+ + +
ntimes
+

Number of time points (only used if time is missing).

+ + +
...
+

further arguments passed to matplot. Only used if +gg=FALSE.

+ + +
x
+

Fitted object of class absRiskCB. This is the result from the +absoluteRisk() function.

+ + +
xlab
+

xaxis label, Default: 'time'

+ + +
ylab
+

yaxis label. By default, this will use the "type" attribute of +the absRiskCB object

+ + +
gg
+

Logical for whether the ggplot2 package should be used for +plotting. Default: TRUE

+ + +
id.names
+

Optional character vector used as legend key when gg=TRUE. +If missing, defaults to V1, V2, ...

+ + +
legend.title
+

Optional character vector of the legend title. Only used +if gg = FALSE. Default is 'ID'

+ +
+
+

Value

+ + +

If time was provided, returns the estimated absolute risk for the user-supplied covariate profiles. This will be stored in a matrix or a higher dimensional array, depending on the input (see details). If both time and newdata are missing, returns the original data with a new column containing the risk estimate at failure times.

-

A plot of the cumulative incidence or survival curve

-

Details

+ +

A plot of the cumulative incidence or survival curve

+
+
+

Details

If newdata = "typical", we create a typical covariate profile for the absolute risk computation. This means that we take the median for numerical and date variables, and we take the most common level for factor variables.

@@ -320,115 +257,121 @@

Details

The numerical method should be good enough in most situations, but Monte Carlo integration can give more accurate results when the estimated hazard function is not smooth (e.g. when modeling with time-varying covariates).
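As a rough sketch of both options (reusing the brcancer fit mod_cb_tvc from the Examples below; the time point of 500 days is arbitrary):

risk_typical <- absoluteRisk(object = mod_cb_tvc, time = 500,
                             newdata = "typical")
risk_mc <- absoluteRisk(object = mod_cb_tvc, time = 500,
                        newdata = brcancer[1, ],
                        method = "montecarlo", nsamp = 1000)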

-

See also

- - +
+

See also

+ - -

Examples

-
+as.data.table, setattr, +melt.data.table

+
+ +
+

Examples

+
# Simulate censored survival data for two outcome types
+library(data.table)
+set.seed(12345)
+nobs <- 1000
+tlim <- 20
+
+# simulation parameters
+b1 <- 200
+b2 <- 50
+
+# event type 0-censored, 1-event of interest, 2-competing event
+# t observed time/endpoint
+# z is a binary covariate
+DT <- data.table(z = rbinom(nobs, 1, 0.5))
+DT[,`:=` ("t_event" = rweibull(nobs, 1, b1),
+          "t_comp" = rweibull(nobs, 1, b2))]
+#>       z   t_event      t_comp
+#>    1: 1 510.83410   2.3923947
+#>    2: 1  33.98842  23.7578470
+#>    3: 1 997.76445  31.5864062
+#>    4: 1 209.28888   5.7092667
+#>    5: 0  75.35774  81.5124801
+#>   ---                        
+#>  996: 1 111.80274   0.1186062
+#>  997: 0 238.05336  60.3685477
+#>  998: 0 142.60033   3.6318489
+#>  999: 0 103.37601  24.5722384
+#> 1000: 0 255.84352 113.5144522
+DT[,`:=`("event" = 1 * (t_event < t_comp) + 2 * (t_event >= t_comp),
+         "time" = pmin(t_event, t_comp))]
+#>       z   t_event      t_comp event        time
+#>    1: 1 510.83410   2.3923947     2   2.3923947
+#>    2: 1  33.98842  23.7578470     2  23.7578470
+#>    3: 1 997.76445  31.5864062     2  31.5864062
+#>    4: 1 209.28888   5.7092667     2   5.7092667
+#>    5: 0  75.35774  81.5124801     1  75.3577374
+#>   ---                                          
+#>  996: 1 111.80274   0.1186062     2   0.1186062
+#>  997: 0 238.05336  60.3685477     2  60.3685477
+#>  998: 0 142.60033   3.6318489     2   3.6318489
+#>  999: 0 103.37601  24.5722384     2  24.5722384
+#> 1000: 0 255.84352 113.5144522     2 113.5144522
+DT[time >= tlim, `:=`("event" = 0, "time" = tlim)]
+#>       z   t_event      t_comp event       time
+#>    1: 1 510.83410   2.3923947     2  2.3923947
+#>    2: 1  33.98842  23.7578470     0 20.0000000
+#>    3: 1 997.76445  31.5864062     0 20.0000000
+#>    4: 1 209.28888   5.7092667     2  5.7092667
+#>    5: 0  75.35774  81.5124801     0 20.0000000
+#>   ---                                         
+#>  996: 1 111.80274   0.1186062     2  0.1186062
+#>  997: 0 238.05336  60.3685477     0 20.0000000
+#>  998: 0 142.60033   3.6318489     2  3.6318489
+#>  999: 0 103.37601  24.5722384     0 20.0000000
+#> 1000: 0 255.84352 113.5144522     0 20.0000000
+
+out_linear <- fitSmoothHazard(event ~ time + z, DT, ratio = 10)
+#> 'time' will be used as the time variable
+
+linear_risk <- absoluteRisk(out_linear, time = 10,
+                            newdata = data.table("z" = c(0,1)))
+# Plot CI curves----
+library(ggplot2)
+data("brcancer")
+mod_cb_tvc <- fitSmoothHazard(cens ~ estrec*log(time) +
+                                horTh +
+                                age +
+                                menostat +
+                                tsize +
+                                tgrade +
+                                pnodes +
+                                progrec,
+                              data = brcancer,
+                              time = "time", ratio = 1)
+smooth_risk_brcancer <- absoluteRisk(object = mod_cb_tvc,
+                                     newdata = brcancer[c(1,50),])
+
+class(smooth_risk_brcancer)
+#> [1] "absRiskCB" "matrix"    "array"    
+plot(smooth_risk_brcancer)
+
+
+
+
-
- - + + diff --git a/reference/bmtcrr.html b/reference/bmtcrr.html index 35068a03..5bf91f38 100644 --- a/reference/bmtcrr.html +++ b/reference/bmtcrr.html @@ -1,68 +1,13 @@ - - - - - - - -Data on transplant patients — bmtcrr • casebase - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Data on transplant patients — bmtcrr • casebase + + - - - - -
-
- -
- -
+
@@ -149,53 +88,63 @@

Data on transplant patients

acute leukemia.

-
bmtcrr
- - -

Format

+
+
bmtcrr
+
-

A dataframe with 177 observations and 7 variables:

-
Sex

Gender of the individual

D

Disease: lymphoblastic or +

+

Format

+

A dataframe with 177 observations and 7 variables:

Sex
+

Gender of the individual

+
D
+

Disease: lymphoblastic or myeloblastic leukemia, abbreviated as ALL and AML, respectively

-
Phase

Phase at transplant (Relapse, CR1, CR2, CR3)

Age

Age -at the beginning of follow-up

Status

Status indicator: 0=censored, -1=relapse, 2=competing event

Source

Source of stem cells: bone -marrow and peripheral blood, coded as BM+PB, or peripheral blood only, -coded as PB

ftime

Failure time in months

-
- -

References

+
Phase
+

Phase at transplant (Relapse, CR1, CR2, CR3)

+
Age
+

Age +at the beginning of follow-up

+
Status
+

Status indicator: 0=censored, +1=relapse, 2=competing event

+
Source
+

Source of stem cells: bone +marrow and peripheral blood, coded as BM+PB, or peripheral blood only, +coded as PB

+
ftime
+

Failure time in months

+ +
+
+

References

Scrucca L, Santucci A, Aversa F. Competing risk analysis using R: an easy guide for clinicians. Bone Marrow Transplant. 2007 Aug;40(4):381-7. doi:10.1038/sj.bmt.1705727.
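A minimal usage sketch (the population-time plot shown here is illustrative, using the popTime() interface documented in this package):

data("bmtcrr")
pt_bmt <- popTime(bmtcrr, time = "ftime", event = "Status")
plot(pt_bmt)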

+
+
-
- - + + diff --git a/reference/brcancer.html b/reference/brcancer.html index eb48e505..2e14d017 100644 --- a/reference/brcancer.html +++ b/reference/brcancer.html @@ -1,68 +1,13 @@ - - - - - - - -German Breast Cancer Study Group 2 — brcancer • casebase - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -German Breast Cancer Study Group 2 — brcancer • casebase + + - - - - -
-
- -
- -
+
@@ -149,62 +88,83 @@

German Breast Cancer Study Group 2

This is taken almost verbatim from the TH.data package.

-
brcancer
- - -

Format

+
+
brcancer
+
-

This data frame contains the observations of 686 women:

-
horTh

hormonal therapy, a factor at two levels no and -yes.

hormon

numeric version of horTh

age

of the -patients in years.

menostat

menopausal status, a factor at two +

+

Format

+

This data frame contains the observations of 686 women:

horTh
+

hormonal therapy, a factor at two levels no and +yes.

+
hormon
+

numeric version of horTh

+
age
+

of the +patients in years.

+
menostat
+

menopausal status, a factor at two levels pre (premenopausal) and post (postmenopausal).

-
meno

Numeric version of menostat

tsize

tumor size (in -mm).

tgrade

tumor grade, a ordered factor at levels I < II < - III.

pnodes

number of positive nodes.

progrec

progesterone -receptor (in fmol).

estrec

estrogen receptor (in fmol).

-
time

recurrence free survival time (in days).

cens

censoring -indicator (0- censored, 1- event).

-
- -

Source

+
meno
+

Numeric version of menostat

+
tsize
+

tumor size (in +mm).

+
tgrade
+

tumor grade, a ordered factor at levels I < II < + III.

+
pnodes
+

number of positive nodes.

+
progrec
+

progesterone +receptor (in fmol).

+
estrec
+

estrogen receptor (in fmol).

+ +
time
+

recurrence free survival time (in days).

+
cens
+

censoring +indicator (0- censored, 1- event).

+ +
+
+

Source

Torsten Hothorn (2019). TH.data: TH's Data Archive. R package version 1.0-10. https://CRAN.R-project.org/package=TH.data

-

References

- +
+
+

References

M. Schumacher, G. Basert, H. Bojar, K. Huebner, M. Olschewski, W. Sauerbrei, C. Schmoor, C. Beyerle, R.L.A. Neumann and H.F. Rauschecker for the German Breast Cancer Study Group (1994), Randomized 2x2 trial evaluating hormonal treatment and the duration of chemotherapy in node-positive breast cancer patients. Journal of Clinical Oncology, 12, 2086-2093.
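A minimal usage sketch (the covariates and the ratio chosen here are illustrative):

data("brcancer")
mod_brc <- fitSmoothHazard(cens ~ log(time) + horTh + age,
                           data = brcancer, time = "time", ratio = 10)
summary(mod_brc)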

+
+
-
- - + + diff --git a/reference/checkArgsEventIndicator.html b/reference/checkArgsEventIndicator.html index 87e86547..69f1a5b4 100644 --- a/reference/checkArgsEventIndicator.html +++ b/reference/checkArgsEventIndicator.html @@ -1,68 +1,13 @@ - - - - - - - -Check that Event is in Correct Format — checkArgsEventIndicator • casebase - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Check that Event is in Correct Format — checkArgsEventIndicator • casebase + + - - - - -
-
- -
- -
+
@@ -149,205 +88,212 @@

Check that Event is in Correct Format

level is assumed to be the reference level.

-
checkArgsEventIndicator(data, event, censored.indicator)
+
+
checkArgsEventIndicator(data, event, censored.indicator)
+
+ +
+

Arguments

+
data
+

a data.frame or data.table containing the source +dataset.

+ -

Arguments

- - - - - - - - - - - - - - -
data

a data.frame or data.table containing the source -dataset.

event

a character string giving the name of the event variable +

event
+

a character string giving the name of the event variable contained in data. See Details. If event is a numeric variable, then 0 needs to represent a censored observation, 1 needs to be the event of interest. Integers 2, 3, ... and so on are treated as competing events. If event is a factor or character and censored.indicator is not specified, this function will assume the -reference level is the censored indicator

censored.indicator

a character string of length 1 indicating which +reference level is the censored indicator

+ + +
censored.indicator
+

a character string of length 1 indicating which value in event is the censored. This function will use -relevel to set censored.indicator as the +relevel to set censored.indicator as the reference level. This argument is ignored if the event variable is a -numeric

+numeric

-

Value

+
+
+

Value

+ -

A list of length two. The first element is the factored event, and +

A list of length two. The first element is the factored event, and the second element is the numeric representation of the event

+
-

Examples

-
if (requireNamespace("survival", quietly = TRUE)) { -library(survival) # for veteran data -checkArgsEventIndicator(data = veteran, event = "celltype", - censored.indicator = "smallcell") -checkArgsEventIndicator(data = veteran, event = "status") -} -
#> assuming smallcell represents a censored observation and squamous is the event of interest
#> $event.factored -#> [1] event event event event event event event event -#> [9] event censored event event event censored event event -#> [17] event event event event censored censored event event -#> [25] event event event event event event event event -#> [33] event event event event event event event event -#> [41] event event event event event event event event -#> [49] event event event event event event event event -#> [57] event event event event event event event censored -#> [65] event event event event event event event censored -#> [73] censored event event event event event event event -#> [81] event event event event event event event event -#> [89] event event censored event event event event event -#> [97] event event event event event event event event -#> [105] event event event event event censored event event -#> [113] event event event event event event event event -#> [121] event event event event event event event event -#> [129] event event event event event event event event -#> [137] event -#> Levels: censored event -#> -#> $event.numeric -#> [1] 1 1 1 1 1 1 1 1 1 0 1 1 1 0 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 -#> [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 0 0 1 -#> [75] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 -#> [112] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 -#> -#> $nLevels -#> [1] 2 -#>
data("bmtcrr") # from casebase -checkArgsEventIndicator(data = bmtcrr, event = "Sex", - censored.indicator = "M") -
#> assuming M represents a censored observation and F is the event of interest
#> $event.factored -#> [1] M F M F F M M F M F M M F M M F M M M M F F M M M M F M M F M M F M F M M -#> [38] M F F M M F M M M M M M M F F F F M F F M F M F M M F M F F M F M M M F F -#> [75] M M F M M F M F M F F M F M M M F F M M F F M M F F F F F M M F F M F M M -#> [112] M F M F M F M M M M F F M M F M M F M F M F M M M M F M M F M F F F M M M -#> [149] F F M F M F M M M F F M F F M F F M F M M F F M M F M F M -#> Levels: M F -#> -#> $event.numeric -#> [1] 0 1 0 1 1 0 0 1 0 1 0 0 1 0 0 1 0 0 0 0 1 1 0 0 0 0 1 0 0 1 0 0 1 0 1 0 0 -#> [38] 0 1 1 0 0 1 0 0 0 0 0 0 0 1 1 1 1 0 1 1 0 1 0 1 0 0 1 0 1 1 0 1 0 0 0 1 1 -#> [75] 0 0 1 0 0 1 0 1 0 1 1 0 1 0 0 0 1 1 0 0 1 1 0 0 1 1 1 1 1 0 0 1 1 0 1 0 0 -#> [112] 0 1 0 1 0 1 0 0 0 0 1 1 0 0 1 0 0 1 0 1 0 1 0 0 0 0 1 0 0 1 0 1 1 1 0 0 0 -#> [149] 1 1 0 1 0 1 0 0 0 1 1 0 1 1 0 1 1 0 1 0 0 1 1 0 0 1 0 1 0 -#> -#> $nLevels -#> [1] 2 -#>
checkArgsEventIndicator(data = bmtcrr, event = "D", - censored.indicator = "AML") -
#> assuming AML represents a censored observation and ALL is the event of interest
#> $event.factored -#> [1] ALL AML ALL ALL ALL ALL ALL ALL ALL ALL ALL AML AML ALL ALL ALL ALL AML -#> [19] ALL AML ALL ALL ALL ALL AML ALL ALL AML ALL AML AML AML ALL ALL ALL AML -#> [37] ALL AML AML ALL AML AML AML AML AML AML ALL AML AML AML AML AML AML ALL -#> [55] ALL AML ALL AML AML ALL AML ALL AML AML AML AML AML ALL ALL ALL ALL AML -#> [73] ALL ALL ALL AML AML AML AML ALL AML AML ALL AML ALL AML AML ALL AML AML -#> [91] AML AML AML AML AML ALL AML ALL ALL ALL AML ALL AML AML ALL AML AML AML -#> [109] AML AML AML AML AML AML AML ALL AML AML ALL ALL ALL AML ALL ALL ALL ALL -#> [127] AML AML AML ALL AML AML AML AML ALL AML AML AML ALL AML AML AML AML ALL -#> [145] AML ALL ALL AML ALL AML ALL AML AML AML AML AML ALL ALL AML AML ALL AML -#> [163] AML AML ALL AML ALL ALL ALL AML AML AML AML AML AML AML AML -#> Levels: AML ALL -#> -#> $event.numeric -#> [1] 1 0 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 0 1 0 1 1 1 1 0 1 1 0 1 0 0 0 1 1 1 0 1 -#> [38] 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 0 1 0 0 1 0 1 0 0 0 0 0 1 1 1 1 0 1 1 -#> [75] 1 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0 1 0 1 1 1 0 1 0 0 1 0 0 0 0 0 0 -#> [112] 0 0 0 0 1 0 0 1 1 1 0 1 1 1 1 0 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 1 1 0 -#> [149] 1 0 1 0 0 0 0 0 1 1 0 0 1 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 -#> -#> $nLevels -#> [1] 2 -#>
checkArgsEventIndicator(data = bmtcrr, event = "Status") -
#> $event.factored -#> [1] competing event event censored competing event -#> [5] competing event competing event censored competing event -#> [9] censored event competing event competing event -#> [13] competing event event competing event competing event -#> [17] censored competing event competing event competing event -#> [21] event competing event event event -#> [25] competing event event event censored -#> [29] event censored event censored -#> [33] competing event censored event competing event -#> [37] event censored competing event competing event -#> [41] competing event competing event competing event competing event -#> [45] event censored event event -#> [49] competing event competing event censored competing event -#> [53] competing event competing event censored censored -#> [57] competing event event competing event event -#> [61] competing event competing event competing event censored -#> [65] censored event competing event event -#> [69] competing event censored censored censored -#> [73] competing event event event censored -#> [77] event event censored competing event -#> [81] event event competing event censored -#> [85] event censored censored censored -#> [89] competing event event event competing event -#> [93] competing event censored competing event event -#> [97] competing event event competing event censored -#> [101] event censored competing event censored -#> [105] competing event event censored event -#> [109] competing event censored competing event competing event -#> [113] competing event competing event competing event event -#> [117] competing event event competing event competing event -#> [121] censored censored event event -#> [125] event competing event competing event competing event -#> [129] censored competing event competing event competing event -#> [133] event censored censored censored -#> [137] event competing event event event -#> [141] censored event competing event censored -#> [145] event competing event event event -#> [149] event event censored event -#> [153] competing event competing event competing event competing event -#> [157] event event competing event competing event -#> [161] competing event competing event competing event competing event -#> [165] censored censored censored competing event -#> [169] event censored event event -#> [173] censored event censored censored -#> [177] event -#> Levels: censored event competing event -#> -#> $event.numeric -#> [1] 2 1 0 2 2 2 0 2 0 1 2 2 2 1 2 2 0 2 2 2 1 2 1 1 2 1 1 0 1 0 1 0 2 0 1 2 1 -#> [38] 0 2 2 2 2 2 2 1 0 1 1 2 2 0 2 2 2 0 0 2 1 2 1 2 2 2 0 0 1 2 1 2 0 0 0 2 1 -#> [75] 1 0 1 1 0 2 1 1 2 0 1 0 0 0 2 1 1 2 2 0 2 1 2 1 2 0 1 0 2 0 2 1 0 1 2 0 2 -#> [112] 2 2 2 2 1 2 1 2 2 0 0 1 1 1 2 2 2 0 2 2 2 1 0 0 0 1 2 1 1 0 1 2 0 1 2 1 1 -#> [149] 1 1 0 1 2 2 2 2 1 1 2 2 2 2 2 2 0 0 0 2 1 0 1 1 0 1 0 0 1 -#> -#> $nLevels -#> [1] 3 -#>
+
+

Examples

+
if (requireNamespace("survival", quietly = TRUE)) {
+library(survival) # for veteran data
+checkArgsEventIndicator(data = veteran, event = "celltype",
+                        censored.indicator = "smallcell")
+checkArgsEventIndicator(data = veteran, event = "status")
+}
+#> assuming smallcell represents a censored observation and squamous is the event of interest
+#> $event.factored
+#>   [1] event    event    event    event    event    event    event    event   
+#>   [9] event    censored event    event    event    censored event    event   
+#>  [17] event    event    event    event    censored censored event    event   
+#>  [25] event    event    event    event    event    event    event    event   
+#>  [33] event    event    event    event    event    event    event    event   
+#>  [41] event    event    event    event    event    event    event    event   
+#>  [49] event    event    event    event    event    event    event    event   
+#>  [57] event    event    event    event    event    event    event    censored
+#>  [65] event    event    event    event    event    event    event    censored
+#>  [73] censored event    event    event    event    event    event    event   
+#>  [81] event    event    event    event    event    event    event    event   
+#>  [89] event    event    censored event    event    event    event    event   
+#>  [97] event    event    event    event    event    event    event    event   
+#> [105] event    event    event    event    event    censored event    event   
+#> [113] event    event    event    event    event    event    event    event   
+#> [121] event    event    event    event    event    event    event    event   
+#> [129] event    event    event    event    event    event    event    event   
+#> [137] event   
+#> Levels: censored event
+#> 
+#> $event.numeric
+#>   [1] 1 1 1 1 1 1 1 1 1 0 1 1 1 0 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
+#>  [38] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 0 0 1
+#>  [75] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1
+#> [112] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
+#> 
+#> $nLevels
+#> [1] 2
+#> 
+data("bmtcrr") # from casebase
+checkArgsEventIndicator(data = bmtcrr, event = "Sex",
+                        censored.indicator = "M")
+#> assuming M represents a censored observation and F is the event of interest
+#> $event.factored
+#>   [1] M F M F F M M F M F M M F M M F M M M M F F M M M M F M M F M M F M F M M
+#>  [38] M F F M M F M M M M M M M F F F F M F F M F M F M M F M F F M F M M M F F
+#>  [75] M M F M M F M F M F F M F M M M F F M M F F M M F F F F F M M F F M F M M
+#> [112] M F M F M F M M M M F F M M F M M F M F M F M M M M F M M F M F F F M M M
+#> [149] F F M F M F M M M F F M F F M F F M F M M F F M M F M F M
+#> Levels: M F
+#> 
+#> $event.numeric
+#>   [1] 0 1 0 1 1 0 0 1 0 1 0 0 1 0 0 1 0 0 0 0 1 1 0 0 0 0 1 0 0 1 0 0 1 0 1 0 0
+#>  [38] 0 1 1 0 0 1 0 0 0 0 0 0 0 1 1 1 1 0 1 1 0 1 0 1 0 0 1 0 1 1 0 1 0 0 0 1 1
+#>  [75] 0 0 1 0 0 1 0 1 0 1 1 0 1 0 0 0 1 1 0 0 1 1 0 0 1 1 1 1 1 0 0 1 1 0 1 0 0
+#> [112] 0 1 0 1 0 1 0 0 0 0 1 1 0 0 1 0 0 1 0 1 0 1 0 0 0 0 1 0 0 1 0 1 1 1 0 0 0
+#> [149] 1 1 0 1 0 1 0 0 0 1 1 0 1 1 0 1 1 0 1 0 0 1 1 0 0 1 0 1 0
+#> 
+#> $nLevels
+#> [1] 2
+#> 
+checkArgsEventIndicator(data = bmtcrr, event = "D",
+                        censored.indicator = "AML")
+#> assuming AML represents a censored observation and ALL is the event of interest
+#> $event.factored
+#>   [1] ALL AML ALL ALL ALL ALL ALL ALL ALL ALL ALL AML AML ALL ALL ALL ALL AML
+#>  [19] ALL AML ALL ALL ALL ALL AML ALL ALL AML ALL AML AML AML ALL ALL ALL AML
+#>  [37] ALL AML AML ALL AML AML AML AML AML AML ALL AML AML AML AML AML AML ALL
+#>  [55] ALL AML ALL AML AML ALL AML ALL AML AML AML AML AML ALL ALL ALL ALL AML
+#>  [73] ALL ALL ALL AML AML AML AML ALL AML AML ALL AML ALL AML AML ALL AML AML
+#>  [91] AML AML AML AML AML ALL AML ALL ALL ALL AML ALL AML AML ALL AML AML AML
+#> [109] AML AML AML AML AML AML AML ALL AML AML ALL ALL ALL AML ALL ALL ALL ALL
+#> [127] AML AML AML ALL AML AML AML AML ALL AML AML AML ALL AML AML AML AML ALL
+#> [145] AML ALL ALL AML ALL AML ALL AML AML AML AML AML ALL ALL AML AML ALL AML
+#> [163] AML AML ALL AML ALL ALL ALL AML AML AML AML AML AML AML AML
+#> Levels: AML ALL
+#> 
+#> $event.numeric
+#>   [1] 1 0 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 0 1 0 1 1 1 1 0 1 1 0 1 0 0 0 1 1 1 0 1
+#>  [38] 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 0 1 0 0 1 0 1 0 0 0 0 0 1 1 1 1 0 1 1
+#>  [75] 1 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0 1 0 1 1 1 0 1 0 0 1 0 0 0 0 0 0
+#> [112] 0 0 0 0 1 0 0 1 1 1 0 1 1 1 1 0 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 1 1 0
+#> [149] 1 0 1 0 0 0 0 0 1 1 0 0 1 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0
+#> 
+#> $nLevels
+#> [1] 2
+#> 
+checkArgsEventIndicator(data = bmtcrr, event = "Status")
+#> $event.factored
+#>   [1] competing event event           censored        competing event
+#>   [5] competing event competing event censored        competing event
+#>   [9] censored        event           competing event competing event
+#>  [13] competing event event           competing event competing event
+#>  [17] censored        competing event competing event competing event
+#>  [21] event           competing event event           event          
+#>  [25] competing event event           event           censored       
+#>  [29] event           censored        event           censored       
+#>  [33] competing event censored        event           competing event
+#>  [37] event           censored        competing event competing event
+#>  [41] competing event competing event competing event competing event
+#>  [45] event           censored        event           event          
+#>  [49] competing event competing event censored        competing event
+#>  [53] competing event competing event censored        censored       
+#>  [57] competing event event           competing event event          
+#>  [61] competing event competing event competing event censored       
+#>  [65] censored        event           competing event event          
+#>  [69] competing event censored        censored        censored       
+#>  [73] competing event event           event           censored       
+#>  [77] event           event           censored        competing event
+#>  [81] event           event           competing event censored       
+#>  [85] event           censored        censored        censored       
+#>  [89] competing event event           event           competing event
+#>  [93] competing event censored        competing event event          
+#>  [97] competing event event           competing event censored       
+#> [101] event           censored        competing event censored       
+#> [105] competing event event           censored        event          
+#> [109] competing event censored        competing event competing event
+#> [113] competing event competing event competing event event          
+#> [117] competing event event           competing event competing event
+#> [121] censored        censored        event           event          
+#> [125] event           competing event competing event competing event
+#> [129] censored        competing event competing event competing event
+#> [133] event           censored        censored        censored       
+#> [137] event           competing event event           event          
+#> [141] censored        event           competing event censored       
+#> [145] event           competing event event           event          
+#> [149] event           event           censored        event          
+#> [153] competing event competing event competing event competing event
+#> [157] event           event           competing event competing event
+#> [161] competing event competing event competing event competing event
+#> [165] censored        censored        censored        competing event
+#> [169] event           censored        event           event          
+#> [173] censored        event           censored        censored       
+#> [177] event          
+#> Levels: censored event competing event
+#> 
+#> $event.numeric
+#>   [1] 2 1 0 2 2 2 0 2 0 1 2 2 2 1 2 2 0 2 2 2 1 2 1 1 2 1 1 0 1 0 1 0 2 0 1 2 1
+#>  [38] 0 2 2 2 2 2 2 1 0 1 1 2 2 0 2 2 2 0 0 2 1 2 1 2 2 2 0 0 1 2 1 2 0 0 0 2 1
+#>  [75] 1 0 1 1 0 2 1 1 2 0 1 0 0 0 2 1 1 2 2 0 2 1 2 1 2 0 1 0 2 0 2 1 0 1 2 0 2
+#> [112] 2 2 2 2 1 2 1 2 2 0 0 1 1 1 2 2 2 0 2 2 2 1 0 0 0 1 2 1 1 0 1 2 0 1 2 1 1
+#> [149] 1 1 0 1 2 2 2 2 1 1 2 2 2 2 2 2 0 0 0 2 1 0 1 1 0 1 0 0 1
+#> 
+#> $nLevels
+#> [1] 3
+#> 
+
+
+
-

- - + + diff --git a/reference/confint.absRiskCB.html b/reference/confint.absRiskCB.html new file mode 100644 index 00000000..c20a1833 --- /dev/null +++ b/reference/confint.absRiskCB.html @@ -0,0 +1,162 @@ + +Compute confidence intervals for risks — confint.absRiskCB • casebase + + +
+
+ + + +
+
+ + +
+

This function uses parametric bootstrap to compute confidence intervals for the risk estimates. Since it relies on MLE theory for the validity of these intervals, it only works when the hazard model (the fitSmoothHazard output passed as parm) was fitted using family = "glm" (i.e. the default).

+
+ +
+
# S3 method for absRiskCB
+confint(object, parm, level = 0.95, nboot = 500, ...)
+
+ +
+

Arguments

+
object
+

Output of function absoluteRisk.

+ + +
parm
+

Output of function fitSmoothHazard that was used to +compute object.

+ + +
level
+

The confidence level required.

+ + +
nboot
+

The number of bootstrap samples to use.

+ + +
...
+

Additional arguments for methods.

+ +
+
+

Value

+ + +

If there is only one covariate profile, the function returns a matrix with the time points, the risk estimates, and the confidence intervals. If there is more than one covariate profile, the function returns a list with three components.

+
+
+

Details

+

If the package progress is available, the function also reports on the progress of the sampling (which can take some time if there are many covariate profiles and/or time points).
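As a rough sketch of typical usage (reusing the eprchd example from its reference page; nboot = 200 is an arbitrary choice to keep the run short):

data("eprchd")
fit <- fitSmoothHazard(status ~ time + treatment, data = eprchd)
risk <- absoluteRisk(fit, time = 3, newdata = eprchd[1, ])
risk_ci <- confint(risk, parm = fit, level = 0.95, nboot = 200)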

+
+ +
+ +
+ + +
+ + + + + + + + diff --git a/reference/eprchd.html b/reference/eprchd.html index 1641c019..3281624d 100644 --- a/reference/eprchd.html +++ b/reference/eprchd.html @@ -1,68 +1,13 @@ - - - - - - - -Estrogen plus Progestin and the Risk of Coronary Heart Disease (eprchd) — eprchd • casebase - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Estrogen plus Progestin and the Risk of Coronary Heart Disease (eprchd) — eprchd • casebase - - + + - - -
-
- -
- -
+
@@ -149,52 +88,57 @@

Estrogen plus Progestin and the Risk of Coronary Heart Disease (eprchd)

(Manson 2003). Compares placebo to hormone treatment.

-
eprchd
- - -

Format

+
+
eprchd
+
-

A dataframe with 16608 observations and 3 variables:

-
time

Years (continuous)

status

0=censored, 1=event

treatment

placebo, +

+

Format

+

A dataframe with 16608 observations and 3 variables:

time
+

Years (continuous)

+
status
+

0=censored, 1=event

+
treatment
+

placebo, estPro

-
- -

References

+
+
+

References

Manson, J. E., Hsia, J., Johnson, K. C., Rossouw, J. E., Assaf, A. R., Lasser, N. L., ... & Strickland, O. L. (2003). Estrogen plus progestin and the risk of coronary heart disease. New England Journal of Medicine, 349(6), 523-534.

+
-

Examples

-
data("eprchd") -fit <- fitSmoothHazard(status ~ time + treatment, data = eprchd) -
#> 'time' will be used as the time variable
+
+

Examples

+
data("eprchd")
+fit <- fitSmoothHazard(status ~ time + treatment, data = eprchd)
+#> 'time' will be used as the time variable
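# A possible follow-up (sketch): absolute risk of CHD at five years of
# follow-up for each treatment arm (levels as listed under Format above).
new_trt <- data.frame(treatment = factor(c("placebo", "estPro"),
                                         levels = levels(eprchd$treatment)))
risk5 <- absoluteRisk(fit, time = 5, newdata = new_trt)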
+
+
+
-
- - + + diff --git a/reference/figures/README-plot-mason-1.png b/reference/figures/README-plot-mason-1.png index a37e0959..3bb3fbbb 100644 Binary files a/reference/figures/README-plot-mason-1.png and b/reference/figures/README-plot-mason-1.png differ diff --git a/reference/figures/README-unnamed-chunk-2-1.png b/reference/figures/README-unnamed-chunk-2-1.png index 151a83d1..e471ffb3 100644 Binary files a/reference/figures/README-unnamed-chunk-2-1.png and b/reference/figures/README-unnamed-chunk-2-1.png differ diff --git a/reference/figures/README-unnamed-chunk-3-1.png b/reference/figures/README-unnamed-chunk-3-1.png index 3d4f527e..cb2d0c80 100644 Binary files a/reference/figures/README-unnamed-chunk-3-1.png and b/reference/figures/README-unnamed-chunk-3-1.png differ diff --git a/reference/figures/README-unnamed-chunk-4-1.png b/reference/figures/README-unnamed-chunk-4-1.png index aa6ef89a..a68e353c 100644 Binary files a/reference/figures/README-unnamed-chunk-4-1.png and b/reference/figures/README-unnamed-chunk-4-1.png differ diff --git a/reference/fitSmoothHazard.html b/reference/fitSmoothHazard.html index bd628f49..763ee8d6 100644 --- a/reference/fitSmoothHazard.html +++ b/reference/fitSmoothHazard.html @@ -1,70 +1,15 @@ - - - - - - - -Fit smooth-in-time parametric hazard functions. — fitSmoothHazard • casebase - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Fit smooth-in-time parametric hazard functions. — fitSmoothHazard • casebase - - - - - - - - - - - - - + + -
-
- -
- -
+
@@ -153,105 +92,107 @@

Fit smooth-in-time parametric hazard functions.

hazard using logistic regression.

-
fitSmoothHazard(
-  formula,
-  data,
-  time,
-  family = c("glm", "gam", "gbm", "glmnet"),
-  censored.indicator,
-  ratio = 100,
-  ...
-)
-
-fitSmoothHazard.fit(
-  x,
-  y,
-  formula_time,
-  time,
-  event,
-  family = c("glm", "gbm", "glmnet"),
-  censored.indicator,
-  ratio = 100,
-  ...
-)
-
-prepareX(formula, data)
- -

Arguments

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
formula

an object of class "formula" (or one that can be coerced to +

+
fitSmoothHazard(
+  formula,
+  data,
+  time,
+  family = c("glm", "gam", "glmnet"),
+  censored.indicator,
+  ratio = 100,
+  ...
+)
+
+fitSmoothHazard.fit(
+  x,
+  y,
+  formula_time,
+  time,
+  event,
+  family = c("glm", "glmnet"),
+  censored.indicator,
+  ratio = 100,
+  ...
+)
+
+prepareX(formula, data)
+
+ +
+

Arguments

+
formula
+

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details -of model specification are given under Details.

data

a data frame, list or environment containing the variables in the +of model specification are given under Details.

+ + +
data
+

a data frame, list or environment containing the variables in the model. If not found in data, the variables are taken from -environment(formula), typically the environment from which -fitSmoothHazard is called.

time

a character string giving the name of the time variable. See -Details.

family

a character string specifying the family of regression models -used to fit the hazard.

censored.indicator

a character string of length 1 indicating which +environment(formula), typically the environment from which +fitSmoothHazard is called.

+ + +
time
+

a character string giving the name of the time variable. See +Details.

+ + +
family
+

a character string specifying the family of regression models +used to fit the hazard.

+ + +
censored.indicator
+

a character string of length 1 indicating which value in event is the censored. This function will use -relevel to set censored.indicator as the +relevel to set censored.indicator as the reference level. This argument is ignored if the event variable is a -numeric.

ratio

integer, giving the ratio of the size of the base series to that -of the case series. Defaults to 100.

...

Additional parameters passed to fitting functions (e.g. -glm, glmnet, gam).

x

Matrix containing covariates.

y

Matrix containing two columns: one corresponding to time, the other -to the event type.

formula_time

A formula describing how the hazard depends on time. -Defaults to linear.

event

a character string giving the name of the event variable.

- -

Value

- -

An object of glm and lm when there is only one event of -interest, or of class CompRisk, which inherits from +numeric.

+ + +
ratio
+

integer, giving the ratio of the size of the base series to that +of the case series. Defaults to 100.

+ + +
...
+

Additional parameters passed to fitting functions (e.g. +glm, glmnet, gam).

+ + +
x
+

Matrix containing covariates.

+ + +
y
+

Matrix containing two columns: one corresponding to time, the other +to the event type.

+ + +
formula_time
+

A formula describing how the hazard depends on time. +Defaults to linear.

+ + +
event
+

a character string giving the name of the event variable.

+ +
+
+

Value

+ + +

An object of glm and lm when there is only one event of +interest, or of class CompRisk, which inherits from vglm, for a competing risk analysis. As such, functions like summary, deviance and coefficients give familiar results.

-

Details

- +
+
+

Details

The object data should either be the output of the function -sampleCaseBase or the source dataset on which case-base +sampleCaseBase or the source dataset on which case-base sampling will be performed. In the latter case, it is assumed that data contains the two columns corresponding to the supplied time and event variables. The variable time is used for the sampling the base @@ -259,128 +200,136 @@

Details (i.e. non transformed) scale. If time is missing, the function looks for a column named "time" in the data. Note that the event variable is inferred from formula, since it is the left hand side.

-

For single-event survival analysis, it is possible to fit the hazard function -using glmnet, gam, or gbm. The choice of fitting family -is controlled by the parameter family. The default value is glm, +

For single-event survival analysis, it is also possible to fit the hazard +function using glmnet or gam. The choice of fitting family is +controlled by the parameter family. The default value is glm, which corresponds to logistic regression. For competing risk analysis, only glm and glmnet are allowed.

We also provide a matrix interface through fitSmoothHazard.fit, which -mimics glm.fit and gbm.fit. This is mostly convenient for -family = "glmnet", since a formula interface becomes quickly -cumbersome as the number of variables increases. In this setting, the matrix -y should have two columns and contain the time and event variables -(e.g. like the output of survival::Surv). We need this linear function -of time in order to perform case-base sampling. Therefore, nonlinear -functions of time should be specified as a one-sided formula through the -argument formula_time (the left-hand side is always ignored).

+mimics glm.fit. This is mostly convenient for family = +"glmnet", since a formula interface becomes quickly cumbersome as the number +of variables increases. In this setting, the matrix y should have two +columns and contain the time and event variables (e.g. like the output of +survival::Surv). We need this linear function of time in order to +perform case-base sampling. Therefore, nonlinear functions of time should be +specified as a one-sided formula through the argument formula_time +(the left-hand side is always ignored).

prepareX is a slightly modified version of the same function from the glmnet package. It can be used to convert a data.frame to a matrix with categorical variables converted to dummy variables using one-hot encoding
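As a rough sketch of the matrix interface (using the brcancer data documented elsewhere on this site and only its numeric covariates; prepareX() could be used instead to one-hot encode factor covariates):

data("brcancer")
x <- as.matrix(brcancer[, c("hormon", "age", "meno", "tsize",
                            "pnodes", "progrec", "estrec")])
y <- as.matrix(brcancer[, c("time", "cens")])
fit_pen <- fitSmoothHazard.fit(x, y,
                               formula_time = ~ log(time),
                               time = "time", event = "cens",
                               family = "glmnet", ratio = 10)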

+

-

Examples

-
+
+

Examples

+
# Simulate censored survival data for two outcome types from exponential
+# distributions
+library(data.table)
+nobs <- 500
+tlim <- 20
+
+# simulation parameters
+b1 <- 200
+b2 <- 50
+
+# event type 0-censored, 1-event of interest, 2-competing event
+# t observed time/endpoint
+# z is a binary covariate
+DT <- data.table(z = rbinom(nobs, 1, 0.5))
+DT[, `:=`(
+  "t_event" = rweibull(nobs, 1, b1),
+  "t_comp" = rweibull(nobs, 1, b2)
+)]
+#>      z   t_event    t_comp
+#>   1: 0 125.67322 53.090763
+#>   2: 0  27.92365 41.517160
+#>   3: 1 625.41749 23.956800
+#>   4: 0 279.93382 62.361918
+#>   5: 1 315.90292 19.739612
+#>  ---                      
+#> 496: 1 111.76799 57.900562
+#> 497: 1 153.70621 63.501104
+#> 498: 0  38.02085 14.325562
+#> 499: 0 109.52164 20.573577
+#> 500: 0 575.37257  8.339102
+DT[, `:=`(
+  "event" = 1 * (t_event < t_comp) + 2 * (t_event >= t_comp),
+  "time" = pmin(t_event, t_comp)
+)]
+#>      z   t_event    t_comp event      time
+#>   1: 0 125.67322 53.090763     2 53.090763
+#>   2: 0  27.92365 41.517160     1 27.923655
+#>   3: 1 625.41749 23.956800     2 23.956800
+#>   4: 0 279.93382 62.361918     2 62.361918
+#>   5: 1 315.90292 19.739612     2 19.739612
+#>  ---                                      
+#> 496: 1 111.76799 57.900562     2 57.900562
+#> 497: 1 153.70621 63.501104     2 63.501104
+#> 498: 0  38.02085 14.325562     2 14.325562
+#> 499: 0 109.52164 20.573577     2 20.573577
+#> 500: 0 575.37257  8.339102     2  8.339102
+DT[time >= tlim, `:=`("event" = 0, "time" = tlim)]
+#>      z   t_event    t_comp event      time
+#>   1: 0 125.67322 53.090763     0 20.000000
+#>   2: 0  27.92365 41.517160     0 20.000000
+#>   3: 1 625.41749 23.956800     0 20.000000
+#>   4: 0 279.93382 62.361918     0 20.000000
+#>   5: 1 315.90292 19.739612     2 19.739612
+#>  ---                                      
+#> 496: 1 111.76799 57.900562     0 20.000000
+#> 497: 1 153.70621 63.501104     0 20.000000
+#> 498: 0  38.02085 14.325562     2 14.325562
+#> 499: 0 109.52164 20.573577     0 20.000000
+#> 500: 0 575.37257  8.339102     2  8.339102
+
+out_linear <- fitSmoothHazard(event ~ time + z, DT, ratio = 10)
+#> 'time' will be used as the time variable
+out_log <- fitSmoothHazard(event ~ log(time) + z, DT, ratio = 10)
+#> 'time' will be used as the time variable
+
+# Use GAMs
+library(mgcv)
+#> Loading required package: nlme
+#> This is mgcv 1.8-42. For overview type 'help("mgcv-package")'.
+DT[event == 2, event := 1]
+#>      z   t_event    t_comp event      time
+#>   1: 0 125.67322 53.090763     0 20.000000
+#>   2: 0  27.92365 41.517160     0 20.000000
+#>   3: 1 625.41749 23.956800     0 20.000000
+#>   4: 0 279.93382 62.361918     0 20.000000
+#>   5: 1 315.90292 19.739612     1 19.739612
+#>  ---                                      
+#> 496: 1 111.76799 57.900562     0 20.000000
+#> 497: 1 153.70621 63.501104     0 20.000000
+#> 498: 0  38.02085 14.325562     1 14.325562
+#> 499: 0 109.52164 20.573577     0 20.000000
+#> 500: 0 575.37257  8.339102     1  8.339102
+out_gam <- fitSmoothHazard(event ~ s(time) + z, DT,
+                           ratio = 10, family = "gam")
+#> 'time' will be used as the time variable
+
+
+
- - - + + diff --git a/reference/hazardPlot-1.png b/reference/hazardPlot-1.png index 844da76a..da23a500 100644 Binary files a/reference/hazardPlot-1.png and b/reference/hazardPlot-1.png differ diff --git a/reference/hazardPlot.html b/reference/hazardPlot.html index dcc6daad..560dc76b 100644 --- a/reference/hazardPlot.html +++ b/reference/hazardPlot.html @@ -1,72 +1,17 @@ - - - - - - - -Plot Fitted Hazard Curve as a Function of Time — hazardPlot • casebase - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Plot Fitted Hazard Curve as a Function of Time — hazardPlot • casebase - - - - - - - - - - - - - + + -
-
- -
- -
+

Visualize estimated hazard curves as a function of time, with confidence intervals. This function takes as input the result from the fitSmoothHazard() function. The user can also specify a sequence of times at which to estimate the hazard function. These plots are useful for visualizing non-proportional hazards, i.e., time-dependent interactions with a covariate.

-
hazardPlot(
-  object,
-  newdata,
-  type = c("hazard"),
-  xlab = NULL,
-  breaks = 100,
-  ci.lvl = 0.95,
-  ylab = NULL,
-  line.col = 1,
-  ci.col = "grey",
-  lty = par("lty"),
-  add = FALSE,
-  ci = !add,
-  rug = !add,
-  s = c("lambda.1se", "lambda.min"),
-  times = NULL,
-  ...
-)
- -

Arguments

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
object

Fitted object of class glm, gam, cv.glmnet or gbm. This -is the result from the fitSmoothHazard() function.

newdata

A data frame in which to look for variables with which to +

+
hazardPlot(
+  object,
+  newdata,
+  type = c("hazard"),
+  xlab = NULL,
+  breaks = 100,
+  ci.lvl = 0.95,
+  ylab = NULL,
+  line.col = 1,
+  ci.col = "grey",
+  lty = par("lty"),
+  add = FALSE,
+  ci = !add,
+  rug = !add,
+  s = c("lambda.1se", "lambda.min"),
+  times = NULL,
+  ...
+)
+
+ +
+

Arguments

+
object
+

Fitted object of class glm, gam, cv.glmnet or gbm. This +is the result from the fitSmoothHazard() function.

+ + +
newdata
+

A data frame in which to look for variables with which to predict. This is required and must contain all the variables used in the model. Only one covariate profile can be used. If more than one row is -provided, only the first row will be used.

type

Type of plot. Currently, only "hazard" has been implemented. -Default: c("hazard")

xlab

x-axis label. Default: the name of the time variable from the -fitted object.

breaks

Number of points at which to estimate the hazard. This argument +provided, only the first row will be used.

+ + +
type
+

Type of plot. Currently, only "hazard" has been implemented. +Default: c("hazard")

+ + +
xlab
+

x-axis label. Default: the name of the time variable from the +fitted object.

+ + +
breaks
+

Number of points at which to estimate the hazard. This argument is only used if argument times=NULL. This function will calculate a sequence of times between the minimum and maximum of observed event times. -Default: 100.

ci.lvl

Confidence level. Must be in (0,1), Default: 0.95

ylab

y-axis label. Default: NULL which means the function will put -sensible defaults.

line.col

Line color, Default: 1. See graphics::par() for details.

ci.col

Confidence band color. Only used if argument ci=TRUE, -Default: 'grey'

lty

Line type. See graphics::par() for details, Default: par("lty")

add

Logical; if TRUE add to an already existing plot; Default: FALSE

ci

Logical; if TRUE confidence bands are calculated. Only available -for family="glm" and family="gam", Default: !add

rug

Logical. Adds a rug representation (1-d plot) of the event times -(only for status=1), Default: !add

s

Value of the penalty parameter lambda at which predictions are +Default: 100.

+ + +
ci.lvl
+

Confidence level. Must be in (0,1), Default: 0.95

+ + +
ylab
+

y-axis label. Default: NULL which means the function will put +sensible defaults.

+ + +
line.col
+

Line color, Default: 1. See graphics::par() for details.

+ + +
ci.col
+

Confidence band color. Only used if argument ci=TRUE, +Default: 'grey'

+ + +
lty
+

Line type. See graphics::par() for details, Default: par("lty")

+ + +
add
+

Logical; if TRUE add to an already existing plot; Default: FALSE

+ + +
ci
+

Logical; if TRUE confidence bands are calculated. Only available +for family="glm" and family="gam", Default: !add

+ + +
rug
+

Logical. Adds a rug representation (1-d plot) of the event times +(only for status=1), Default: !add

+ + +
s
+

Value of the penalty parameter lambda at which predictions are required (for class cv.glmnet only). Only the first entry will be used if more than one numeric value is provided, Default: c("lambda.1se", -"lambda.min")

times

Vector of numeric values at which the hazard should be +"lambda.min")

+ + +
times
+

Vector of numeric values at which the hazard should be calculated. Default: NULL which means this function will use the minimum -and maximum of observed event times with the breaks argument.

...

further arguments passed to graphics::matplot()

+and maximum of observed event times with the breaks argument.

+ + +
...
+

further arguments passed to graphics::matplot()

-

Value

+
+
+

Value

+ -

a plot of the hazard function and a data.frame of original data used +

a plot of the hazard function and a data.frame of original data used in the fitting along with the data used to create the plots including predictedhazard which is the predicted hazard for a given covariate pattern and time predictedloghazard is the predicted hazard on the log @@ -273,61 +213,64 @@

Value

interval bounds on the hazard scale (i.e. used to plot the confidence bands). standarderror is the standard error of the log hazard (only if family="glm" or family="gam")

-

Details

- +
+
+

Details

This is an earlier version of a function to plot hazards. We recommend instead using the plot method for objects returned by -fitSmoothHazard(). See plot.singleEventCB().

-

See also

- - - -

Examples

-
data("simdat") -mod_cb <- fitSmoothHazard(status ~ trt * eventtime, - time = "eventtime", - data = simdat[1:200,], - ratio = 1, - family = "glm") - -results0 <- hazardPlot(object = mod_cb, newdata = data.frame(trt = 0), - ci.lvl = 0.95, ci = FALSE, lty = 1, line.col = 1, lwd = 2) -
head(results0) -
#> trt eventtime offset predictedloghazard predictedhazard -#> 1 0 0.2695865 0 -2.186588 0.1122993 -#> 1.1 0 0.3173684 0 -2.164438 0.1148144 -#> 1.2 0 0.3651504 0 -2.142289 0.1173859 -#> 1.3 0 0.4129323 0 -2.120139 0.1200149 -#> 1.4 0 0.4607143 0 -2.097989 0.1227029 -#> 1.5 0 0.5084963 0 -2.075840 0.1254510
hazardPlot(object = mod_cb, newdata = data.frame(trt = 1), ci = FALSE, - ci.lvl = 0.95, add = TRUE, lty = 2, line.col = 2, lwd = 2) -
legend("topleft", c("trt=0","trt=1"),lty=1:2,col=1:2,bty="y", lwd = 2) -
+fitSmoothHazard(). See plot.singleEventCB().

+
+
+

See also

+ +
+ +
+

Examples

+
data("simdat")
+mod_cb <- fitSmoothHazard(status ~ trt * eventtime,
+                                    time = "eventtime",
+                                    data = simdat[1:200,],
+                                    ratio = 1,
+                                    family = "glm")
+
+results0 <- hazardPlot(object = mod_cb, newdata = data.frame(trt = 0),
+           ci.lvl = 0.95, ci = FALSE, lty = 1, line.col = 1, lwd = 2)
+head(results0)
+#>     trt eventtime offset predictedloghazard predictedhazard
+#> 1     0 0.2695865      0          -2.186588       0.1122993
+#> 1.1   0 0.3173684      0          -2.164438       0.1148144
+#> 1.2   0 0.3651504      0          -2.142289       0.1173859
+#> 1.3   0 0.4129323      0          -2.120139       0.1200149
+#> 1.4   0 0.4607143      0          -2.097989       0.1227029
+#> 1.5   0 0.5084963      0          -2.075840       0.1254510
+hazardPlot(object = mod_cb, newdata = data.frame(trt = 1), ci = FALSE,
+           ci.lvl = 0.95, add = TRUE, lty = 2, line.col = 2, lwd = 2)
+legend("topleft", c("trt=0","trt=1"),lty=1:2,col=1:2,bty="y", lwd = 2)
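A brief follow-up sketch (not part of the original example): the ci, ci.lvl and ci.col arguments documented above can be used to draw a confidence band around the fitted curve. This assumes the mod_cb object created in the example above.

# same fit, trt = 0, now with a 95% confidence band (available for family = "glm")
hazardPlot(object = mod_cb, newdata = data.frame(trt = 0),
           ci = TRUE, ci.lvl = 0.95, ci.col = "grey",
           lty = 1, line.col = 1, lwd = 2)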
+
+
+
+
- - - + + diff --git a/reference/index.html b/reference/index.html index fad9317c..3e141d7e 100644 --- a/reference/index.html +++ b/reference/index.html @@ -1,66 +1,12 @@ - - - - - - - -Function reference • casebase - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Function reference • casebase + + - - - - -
-
- -
- -
+
- - - - - - - - - - -
-

Data Visualization

-

Visualizing survival data with population time plots

+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
+

Data Visualization

+

Visualizing survival data with population time plots

+

plot(<popTime>) popTime() checkArgsTimeEvent()

Population Time Plot

-

Model Fitting

-

Main functions for fitting smooth hazard functions and plotting them

+
+

Model Fitting

+

Main functions for fitting smooth hazard functions and plotting them

+

fitSmoothHazard() fitSmoothHazard.fit() prepareX()

Fit smooth-in-time parametric hazard functions.

+

absoluteRisk.CompRisk() absoluteRisk() print(<absRiskCB>) plot(<absRiskCB>)

Compute absolute risks using the fitted hazard function.

+

plotHazardRatio() plot(<singleEventCB>) incrVar()

Plot Hazards and Hazard Ratios

+

hazardPlot()

Plot Fitted Hazard Curve as a Function of Time

-

Datasets

-

Datasets that ship with the package to illustrate usage

+
+

confint(<absRiskCB>)

+

Compute confidence intervals for risks

+

Datasets

+

Datasets that ship with the package to illustrate usage

+

ERSPC

Data on the men in the European Randomized Study of Prostate Cancer Screening

+

bmtcrr

Data on transplant patients

+

brcancer

German Breast Cancer Study Group 2

+

eprchd

Estrogen plus Progestin and the Risk of Coronary Heart Disease (eprchd)

+

simdat

Simulated data under Weibull model with Time-Dependent Treatment Effect

+

support

Study to Understand Prognoses Preferences Outcomes and Risks of Treatment (SUPPORT)

-

Internal Utility Functions

-

Functions not really meant for the user

+
+

Internal Utility Functions

+

Functions not really meant for the user

+

sampleCaseBase()

Create case-base dataset for use in fitting parametric hazard functions

+

checkArgsEventIndicator()

Check that Event is in Correct Format

+

summary()

An S4 class to store the output of fitSmoothHazard

- +
+
-
- - + + diff --git a/reference/plot.singleEventCB-1.png b/reference/plot.singleEventCB-1.png index d22893dd..01239615 100644 Binary files a/reference/plot.singleEventCB-1.png and b/reference/plot.singleEventCB-1.png differ diff --git a/reference/plot.singleEventCB.html b/reference/plot.singleEventCB.html index 5abb8219..a907845c 100644 --- a/reference/plot.singleEventCB.html +++ b/reference/plot.singleEventCB.html @@ -1,70 +1,15 @@ - - - - - - - -Plot Hazards and Hazard Ratios — plotHazardRatio • casebase - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Plot Hazards and Hazard Ratios — plotHazardRatio • casebase - - + + - - -
-
- -
- -
+
@@ -153,38 +92,38 @@

Plot Hazards and Hazard Ratios

function accounts for the possible time-varying exposure effects.

-
plotHazardRatio(x, newdata, newdata2, ci, ci.lvl, ci.col, rug, xvar, ...)
-
-# S3 method for singleEventCB
-plot(
-  x,
-  ...,
-  type = c("hazard", "hr"),
-  hazard.params = list(),
-  newdata,
-  exposed,
-  increment = 1,
-  var,
-  xvar = NULL,
-  ci = FALSE,
-  ci.lvl = 0.95,
-  rug = !ci,
-  ci.col = "grey"
-)
-
-incrVar(var, increment = 1)
- -

Arguments

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
x

Fitted object of class glm, gam, cv.glmnet or gbm. This is -the result from the fitSmoothHazard() function.

newdata

Required for type="hr". The newdata argument is +

+
plotHazardRatio(x, newdata, newdata2, ci, ci.lvl, ci.col, rug, xvar, ...)
+
+# S3 method for singleEventCB
+plot(
+  x,
+  ...,
+  type = c("hazard", "hr"),
+  hazard.params = list(),
+  newdata,
+  exposed,
+  increment = 1,
+  var,
+  xvar = NULL,
+  ci = FALSE,
+  ci.lvl = 0.95,
+  rug = !ci,
+  ci.col = "grey"
+)
+
+incrVar(var, increment = 1)
+
+ +
+

Arguments

+
x
+

Fitted object of class glm, gam, cv.glmnet or gbm. This is +the result from the fitSmoothHazard() function.

+ + +
newdata
+

Required for type="hr". The newdata argument is the "unexposed" group, while the exposed group is defined by either: (i) a change (defined by the increment argument) in a variable in newdata defined by the var argument ; or (ii) an exposed function that takes @@ -192,85 +131,85 @@

Arg function(data) transform(data, treat=1)). This is a generalization of the behavior of the rstpm2 plot function. It allows both numeric and factor variables to be incremented or decremented. See references for rstpm2 -package. Only used for type="hr"

newdata2

data.frame for exposed group. calculated and passed -internally to plotHazardRatio function

ci

Logical; if TRUE confidence bands are calculated. Only available +package. Only used for type="hr"

+ + +
newdata2
+

data.frame for the exposed group. Calculated and passed internally to the plotHazardRatio function.

+ + +
ci
+

Logical; if TRUE confidence bands are calculated. Only available for family="glm" and family="gam", and only used for type="hr", Default: !add. Confidence intervals for hazard ratios are calculated using -the Delta Method.

ci.lvl

Confidence level. Must be in (0,1), Default: 0.95. Only used -for type="hr".

ci.col

Confidence band color. Only used if argument ci=TRUE, -Default: 'grey'. Only used for type="hr".

rug

Logical. Adds a rug representation (1-d plot) of the event times -(only for status=1), Default: !ci. Only used for type="hr".

xvar

Variable to be used on x-axis for hazard ratio plots. If NULL, +the Delta Method.

+ + +
ci.lvl
+

Confidence level. Must be in (0,1), Default: 0.95. Only used +for type="hr".

+ + +
ci.col
+

Confidence band color. Only used if argument ci=TRUE, +Default: 'grey'. Only used for type="hr".

+ + +
rug
+

Logical. Adds a rug representation (1-d plot) of the event times +(only for status=1), Default: !ci. Only used for type="hr".

+ + +
xvar
+

Variable to be used on x-axis for hazard ratio plots. If NULL, the function defaults to using the time variable used in the call to fitSmoothHazard. In general, this should be any continuous variable which has an interaction term with another variable. Only used for -type="hr".

...

further arguments passed to plot. Only used if type="hr". +type="hr".

+ + +
...
+

further arguments passed to plot. Only used if type="hr". Any of lwd,lty,col,pch,cex will be applied to the hazard ratio -line, or point (if only one time point is supplied to newdata).

type

plot type. Choose one of either "hazard" for hazard -function or "hr" for hazard ratio. Default: type = "hazard".

hazard.params

Named list of arguments which will override the defaults -passed to visreg::visreg(), The default arguments are list(fit = x, +line, or point (if only one time point is supplied to newdata).

+ + +
type
+

Plot type. Choose either "hazard" for the hazard function or "hr" for the hazard ratio. Default: type = "hazard".

+ + +
hazard.params
+

Named list of arguments which will override the defaults +passed to visreg::visreg(), The default arguments are list(fit = x, trans = exp, plot = TRUE, rug = FALSE, alpha = 1, partial = FALSE, overlay - = TRUE). For example, if you want a 95% confidence band, specify + = TRUE). For example, if you want a 95% confidence band, specify hazard.params = list(alpha = 0.05). Note that The cond argument must be provided as a named list. Each element of that list specifies the value for one of the terms in the model; any elements left unspecified are filled in with the median/most common category. Only used for type="hazard". All other argument are used for type="hr". Note that the -visreg package must be installed for type="hazard".

exposed

function that takes newdata and returns the exposed +visreg package must be installed for type="hazard".

+ + +
exposed
+

function that takes newdata and returns the exposed dataset (e.g. function(data) transform(data, treat = 1)). This argument takes precedence over the var argument, i.e., if both var and exposed are correctly specified, only the exposed argument -will be used. Only used for type="hr".

increment

Numeric value indicating how much to increment (if positive) +will be used. Only used for type="hr".

+ + +
increment
+

Numeric value indicating how much to increment (if positive) or decrement (if negative) the var variable in newdata. See var argument for more details. Default is 1. Only used for -type="hr".

var

specify the variable name for the exposed/unexposed (name is given +type="hr".

+ + +
var
+

specify the variable name for the exposed/unexposed (name is given as a character variable). If this argument is missing, then the exposed argument must be specified. This is the variable which will be incremented by the increment argument to give the exposed @@ -283,141 +222,150 @@

Arg increment=-1 will return one level lower than the value in newdata. If var is a numeric, than increment will increment (if positive) or decrement (if negative) by the supplied value. -Only used for type="hr".

+Only used for type="hr".

-

Value

+
+
+

Value

+ -

a plot of the hazard function or hazard ratio. For type="hazard", a +

a plot of the hazard function or hazard ratio. For type="hazard", a data.frame (returned invisibly) of the original data used in the fitting -along with the data used to create the plots including predictedhazard -which is the predicted hazard for a given covariate pattern and time. -predictedloghazard is the predicted hazard on the log scale. lowerbound -and upperbound are the lower and upper confidence interval bounds on the +along with the data used to create the plots including predictedhazard

+ + +

which is the predicted hazard for a given covariate pattern and time. +predictedloghazard is the predicted hazard on the log scale. lowerbound

+ + +

and upperbound are the lower and upper confidence interval bounds on the hazard scale (i.e. used to plot the confidence bands). standarderror is the standard error of the log hazard or log hazard ratio (only if family="glm" or family="gam"). For type="hr", log_hazard_ratio and hazard_ratio is returned, and if ci=TRUE, standarderror (on the log scale) and lowerbound and upperbound of the hazard_ratio are returned.

-

Details

- +
+
+

Details

This function has only been thoroughly tested for family="glm". If the user wants more customized plot aesthetics, we recommend saving the results to a data.frame and using the graphical package of their choice.

-

References

- +
+
+

References

Mark Clements and Xing-Rong Liu (2019). rstpm2: Smooth Survival Models, Including Generalized Survival Models. R package version 1.5.1. https://CRAN.R-project.org/package=rstpm2

Breheny P and Burchett W (2017). Visualization of Regression Models Using visreg. The R Journal, 9: 56-71.

-

See also

- - - -

Examples

-
if (requireNamespace("splines", quietly = TRUE)) { -data("simdat") # from casebase package -library(splines) -simdat <- transform(simdat[sample(1:nrow(simdat), size = 200),], - treat = factor(trt, levels = 0:1, - labels = c("control","treatment"))) - -fit_numeric_exposure <- fitSmoothHazard(status ~ trt*bs(eventtime), - data = simdat, - ratio = 1, - time = "eventtime") - -fit_factor_exposure <- fitSmoothHazard(status ~ treat*bs(eventtime), - data = simdat, - ratio = 1, - time = "eventtime") - -newtime <- quantile(fit_factor_exposure[["data"]][[fit_factor_exposure[["timeVar"]]]], - probs = seq(0.05, 0.95, 0.01)) - -par(mfrow = c(1,3)) -plot(fit_numeric_exposure, - type = "hr", - newdata = data.frame(trt = 0, eventtime = newtime), - exposed = function(data) transform(data, trt = 1), - xvar = "eventtime", - ci = TRUE) - -#by default this will increment `var` by 1 for exposed category -plot(fit_factor_exposure, - type = "hr", - newdata = data.frame(treat = factor("control", - levels = c("control","treatment")), eventtime = newtime), - var = "treat", - increment = 1, - xvar = "eventtime", - ci = TRUE, - ci.col = "lightblue", - xlab = "Time", - main = "Hazard Ratio for Treatment", - ylab = "Hazard Ratio", - lty = 5, - lwd = 7, - rug = TRUE) - - -# we can also decrement `var` by 1 to give hazard ratio for control/treatment -result <- plot(fit_factor_exposure, - type = "hr", - newdata = data.frame(treat = factor("treatment", - levels = c("control","treatment")), - eventtime = newtime), - var = "treat", - increment = -1, - xvar = "eventtime", - ci = TRUE) - -# see data used to create plot -head(result) -} -
#> treat eventtime log_hazard_ratio standarderror hazard_ratio lowerbound -#> 5% treatment 0.3488444 0.5617254 0.7422200 1.753696 0.4094260 -#> 6% treatment 0.3902785 0.6649501 0.7012281 1.944394 0.4919237 -#> 7% treatment 0.4148271 0.7239973 0.6788513 2.062662 0.5452413 -#> 8% treatment 0.4512202 0.8086798 0.6482910 2.244942 0.6300556 -#> 9% treatment 0.4636149 0.8367497 0.6385929 2.308850 0.6604266 -#> 10% treatment 0.4862850 0.8870854 0.6217826 2.428043 0.7177844 -#> upperbound -#> 5% 7.511610 -#> 6% 7.685473 -#> 7% 7.803102 -#> 8% 7.998922 -#> 9% 8.071737 -#> 10% 8.213317
+
+ + +
+

Examples

+
if (requireNamespace("splines", quietly = TRUE)) {
+data("simdat") # from casebase package
+library(splines)
+simdat <- transform(simdat[sample(1:nrow(simdat), size = 200),],
+                    treat = factor(trt, levels = 0:1,
+                    labels = c("control","treatment")))
+
+fit_numeric_exposure <- fitSmoothHazard(status ~ trt*bs(eventtime),
+                                        data = simdat,
+                                        ratio = 1,
+                                        time = "eventtime")
+
+fit_factor_exposure <- fitSmoothHazard(status ~ treat*bs(eventtime),
+                                       data = simdat,
+                                       ratio = 1,
+                                       time = "eventtime")
+
+newtime <- quantile(fit_factor_exposure[["data"]][[fit_factor_exposure[["timeVar"]]]],
+                    probs = seq(0.05, 0.95, 0.01))
+
+par(mfrow = c(1,3))
+plot(fit_numeric_exposure,
+     type = "hr",
+     newdata = data.frame(trt = 0, eventtime = newtime),
+     exposed = function(data) transform(data, trt = 1),
+     xvar = "eventtime",
+     ci = TRUE)
+
+#by default this will increment `var` by 1 for exposed category
+plot(fit_factor_exposure,
+     type = "hr",
+     newdata = data.frame(treat = factor("control",
+              levels = c("control","treatment")), eventtime = newtime),
+     var = "treat",
+     increment = 1,
+     xvar = "eventtime",
+     ci = TRUE,
+     ci.col = "lightblue",
+     xlab = "Time",
+     main = "Hazard Ratio for Treatment",
+     ylab = "Hazard Ratio",
+     lty = 5,
+     lwd = 7,
+     rug = TRUE)
+
+
+# we can also decrement `var` by 1 to give hazard ratio for control/treatment
+result <- plot(fit_factor_exposure,
+               type = "hr",
+               newdata = data.frame(treat = factor("treatment",
+                                    levels = c("control","treatment")),
+                                    eventtime = newtime),
+               var = "treat",
+               increment = -1,
+               xvar = "eventtime",
+               ci = TRUE)
+
+# see data used to create plot
+head(result)
+}
+
+#>         treat eventtime log_hazard_ratio standarderror hazard_ratio lowerbound
+#> 5%  treatment 0.3488444        0.5617254     0.7422200     1.753696  0.4094260
+#> 6%  treatment 0.3902785        0.6649501     0.7012281     1.944394  0.4919237
+#> 7%  treatment 0.4148271        0.7239973     0.6788513     2.062662  0.5452413
+#> 8%  treatment 0.4512202        0.8086798     0.6482910     2.244942  0.6300556
+#> 9%  treatment 0.4636149        0.8367497     0.6385929     2.308850  0.6604266
+#> 10% treatment 0.4862850        0.8870854     0.6217826     2.428043  0.7177844
+#>     upperbound
+#> 5%    7.511610
+#> 6%    7.685473
+#> 7%    7.803102
+#> 8%    7.998922
+#> 9%    8.071737
+#> 10%   8.213317
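Following the suggestion in Details (save the returned data.frame and use the graphical package of your choice), a possible continuation is sketched below. It assumes the result object from the chunk above and the documented columns eventtime, hazard_ratio, lowerbound and upperbound; the ggplot2 styling choices are illustrative only.

# custom plot of the saved hazard-ratio curve
library(ggplot2)
ggplot(result, aes(x = eventtime, y = hazard_ratio)) +
  geom_ribbon(aes(ymin = lowerbound, ymax = upperbound), fill = "grey80") +
  geom_line() +
  labs(x = "Time", y = "Hazard ratio")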
+
+
+
- - - + + diff --git a/reference/popTime-1.png b/reference/popTime-1.png index 88ac6e64..865b6d36 100644 Binary files a/reference/popTime-1.png and b/reference/popTime-1.png differ diff --git a/reference/popTime.html b/reference/popTime.html index f2416614..d5ce0b28 100644 --- a/reference/popTime.html +++ b/reference/popTime.html @@ -1,69 +1,14 @@ - - - - - - - -Population Time Plot — plot.popTime • casebase - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Population Time Plot — plot.popTime • casebase - - - - - - - - - - - + + - - -
-
- -
- -
+
@@ -151,249 +90,256 @@

Population Time Plot

of incidence density

-
# S3 method for popTime
-plot(
-  x,
-  ...,
-  xlab = "Follow-up time",
-  ylab = "Population",
-  add.case.series = TRUE,
-  add.base.series = FALSE,
-  add.competing.event = FALSE,
-  casebase.theme = TRUE,
-  ribbon.params = list(),
-  case.params = list(),
-  base.params = list(),
-  competing.params = list(),
-  color.params = list(),
-  fill.params = list(),
-  theme.params = list(),
-  facet.params = list(),
-  ratio = 1,
-  censored.indicator,
-  comprisk = FALSE,
-  legend = TRUE,
-  ncol,
-  legend.position,
-  line.width,
-  line.colour,
-  point.size,
-  point.colour
-)
-
-popTime(data, time, event, censored.indicator, exposure, percentile_number)
-
-checkArgsTimeEvent(data, time, event)
- -

Arguments

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
x

an object of class popTime or popTimeExposure.

...

Ignored.

xlab, ylab

The title of the respective axis. Default: 'Follow-up time' -for xlab and 'Population' for ylab

add.case.series

Logical indicating if the case series should be added -to the plot. Default: TRUE

add.base.series

Logical indicating if the base series should be added -to the plot. Default: FALSE

add.competing.event

Logical indicating if the competing event should -be added to the plot. Default: FALSE

casebase.theme

Logical indication if the casebase theme be used. The -casebase theme uses theme_minimal. Default: TRUE.

ribbon.params

A list containing arguments that are passed to -geom_ribbon which is used to plot the +

+
# S3 method for popTime
+plot(
+  x,
+  ...,
+  xlab = "Follow-up time",
+  ylab = "Population",
+  add.case.series = TRUE,
+  add.base.series = FALSE,
+  add.competing.event = FALSE,
+  casebase.theme = TRUE,
+  ribbon.params = list(),
+  case.params = list(),
+  base.params = list(),
+  competing.params = list(),
+  color.params = list(),
+  fill.params = list(),
+  theme.params = list(),
+  facet.params = list(),
+  ratio = 1,
+  censored.indicator,
+  comprisk = FALSE,
+  legend = TRUE,
+  ncol,
+  legend.position,
+  line.width,
+  line.colour,
+  point.size,
+  point.colour
+)
+
+popTime(data, time, event, censored.indicator, exposure, percentile_number)
+
+checkArgsTimeEvent(data, time, event)
+
+ +
+

Arguments

+
x
+

an object of class popTime or popTimeExposure.

+ + +
...
+

Ignored.

+ + +
xlab, ylab
+

The title of the respective axis. Default: 'Follow-up time' +for xlab and 'Population' for ylab

+ + +
add.case.series
+

Logical indicating if the case series should be added +to the plot. Default: TRUE

+ + +
add.base.series
+

Logical indicating if the base series should be added +to the plot. Default: FALSE

+ + +
add.competing.event
+

Logical indicating if the competing event should +be added to the plot. Default: FALSE

+ + +
casebase.theme
+

Logical indicating whether the casebase theme should be used. The casebase theme uses theme_minimal. Default: TRUE.

+ + +
ribbon.params
+

A list containing arguments that are passed to +geom_ribbon which is used to plot the population-time area. These arguments will override the function defaults. For example, you can set ribbon.params = list(colour = 'green') if -you want the area to be green.

case.params, base.params, competing.params

A list containing arguments -that are passed to geom_point which is used to plot +you want the area to be green.

+ + +
case.params, base.params, competing.params
+

A list containing arguments +that are passed to geom_point which is used to plot the case series, base series, competing events. These arguments will override the function defaults. For example, you can set case.params = list(size = 1.5) if you want to increase the point size for the case series points. Note: do not use this argument to change the color of the points. Doing so will result in unexpected results for the legend. See the color.params and fill.params arguments, if you want to change -the color of the points.

color.params

A list containing arguments that are passed to -scale_color_manual which is used to plot the legend. +the color of the points.

+ + +
color.params
+

A list containing arguments that are passed to +scale_color_manual which is used to plot the legend. Only used if legend=TRUE. These arguments will override the function defaults. Use this argument if you want to change the color of the points. -See examples for more details.

fill.params

A list containing arguments that are passed to -scale_fill_manual which is used to plot the legend. +See examples for more details.

+ + +
fill.params
+

A list containing arguments that are passed to +scale_fill_manual which is used to plot the legend. Only used if legend=TRUE. These arguments will override the function defaults. Use this argument if you want to change the color of the points. -See examples for more details.

theme.params

A list containing arguments that are passed to -theme. For example theme.params = - list(legend.position = 'none').

facet.params

A list containing arguments that are passed to -facet_wrap which is used to create facet plots. Only +See examples for more details.

+ + +
theme.params
+

A list containing arguments that are passed to +theme. For example theme.params = + list(legend.position = 'none').

+ + +
facet.params
+

A list containing arguments that are passed to +facet_wrap which is used to create facet plots. Only used if plotting exposure stratified population time plots. These arguments -will override the function defaults.

ratio

If add.base.series=TRUE, integer, giving the ratio of the +will override the function defaults.

+ + +
ratio
+

If add.base.series=TRUE, integer, giving the ratio of the size of the base series to that of the case series. This argument is passed -to the sampleCaseBase function. Default: 10.

censored.indicator

a character string of length 1 indicating which +to the sampleCaseBase function. Default: 10.

+ + +
censored.indicator
+

a character string of length 1 indicating which value in event is the censored. This function will use -relevel to set censored.indicator as the +relevel to set censored.indicator as the reference level. This argument is ignored if the event variable is a -numeric

comprisk

If add.base.series=TRUE, logical indicating whether we +numeric

+ + +
comprisk
+

If add.base.series=TRUE, logical indicating whether we have multiple event types and that we want to consider some of them as competing risks. This argument is passed to the -sampleCaseBase function. Note: should be TRUE if your +sampleCaseBase function. Note: should be TRUE if your data has competing risks, even if you don't want to add competing risk -points (add.competing.event=FALSE). Default: FALSE

legend

Logical indicating if a legend should be added to the plot. +points (add.competing.event=FALSE). Default: FALSE

+ + +
legend
+

Logical indicating if a legend should be added to the plot. Note that if you want to change the colors of the points, through the color.params and fill.params arguments, then set legend=TRUE. If you want to change the color of the points but not have a legend, then set legend=TRUE and theme.params = - list(legend.position = 'none'. Default: FALSE

ncol

Deprecated. Use facet.params instead.

legend.position

Deprecated. Specify the legend.position argument + list(legend.position = 'none'. Default: FALSE

+ + +
ncol
+

Deprecated. Use facet.params instead.

+ + +
legend.position
+

Deprecated. Specify the legend.position argument instead in the theme.params argument. e.g. theme.params = - list(legend.position = 'bottom').

line.width

Deprecated.

line.colour

Deprecated. specify the fill argument instead in -ribbon.params. e.g. ribbon.params = list(fill = 'red').

point.size

Deprecated. specify the size argument instead in the + list(legend.position = 'bottom').

+ + +
line.width
+

Deprecated.

+ + +
line.colour
+

Deprecated. Specify the fill argument instead in ribbon.params, e.g. ribbon.params = list(fill = 'red').

+ + +
point.size
+

Deprecated. specify the size argument instead in the case.params or base.params or competing.params -argument. e.g. case.params = list(size = 1.5).

point.colour

Deprecated. Specify the values argument instead in the +argument. e.g. case.params = list(size = 1.5).

+ + +
point.colour
+

Deprecated. Specify the values argument instead in the color.params and fill.params argument. See examples for -details.

data

a data.frame or data.table containing the source -dataset.

time

a character string giving the name of the time variable. See -Details.

event

a character string giving the name of the event variable +details.

+ + +
data
+

a data.frame or data.table containing the source +dataset.

+ + +
time
+

a character string giving the name of the time variable. See +Details.

+ + +
event
+

a character string giving the name of the event variable contained in data. See Details. If event is a numeric variable, then 0 needs to represent a censored observation, 1 needs to be the event of interest. Integers 2, 3, ... and so on are treated as competing events. If event is a factor or character and censored.indicator is not specified, this function will assume the -reference level is the censored indicator

exposure

a character string of length 1 giving the name of the +reference level is the censored indicator

+ + +
exposure
+

a character string of length 1 giving the name of the exposure variable which must be contained in data. Default is NULL. This is used to produced exposure stratified plots. If an exposure is specified, popTime returns an exposure attribute which contains the name of the exposure variable in the dataset. The plot method for objects of class popTime will use this exposure -attribute to create exposure stratified population time plots.

percentile_number

Default=0.5. Give a value between 0-1. if the +attribute to create exposure stratified population time plots.

+ + +
percentile_number
+

Default: 0.5. Provide a value between 0 and 1. If the percentile number of available subjects at any given point is less than 10, sampling is done regardless of case status. Depending on the distribution of survival times and events, event points may not be evenly distributed with the default value.

+default value.

-

Value

+
+
+

Value

+ -

The methods for plot return a population time plot, stratified +

The methods for plot return a population time plot, stratified by exposure status in the case of popTimeExposure. Note that these are ggplot2 objects and can therefore be used in subsequent ggplot2 type plots. See examples and vignette for details.

+ +

An object of class popTime (or popTimeExposure if exposure is specified), data.table and data.frame in this order! The output of this function is to be used with the plot method for objects of class popTime or of class popTimeExposure, which will produce population time plots. This dataset augments the original data -with the following columns:

-
original.event

value of +with the following columns:

original.event
+

value of the event variable in the original dataset - the one specified by the -event user argument to this function

time

renames the user -specified time column to time

event

renames the user specified event +event user argument to this function

+
time
+

renames the user +specified time column to time

+
event
+

renames the user specified event argument to event

-
- -

Details

+
+
+

Details

This function leverages the ggplot2 package to build population time plots. It builds the plot by adding layers, starting with a layer for the area representing the population time. It then sequentially @@ -408,7 +354,7 @@

Details (because the subjects with the least amount of observation time are plotted at the top of the y-axis). By randomly distributing them, we can get a better sense of the incidence density. The base series is sampled -horizontally on the plot using the sampleCaseBase function.

+horizontally on the plot using the sampleCaseBase function.

It is assumed that data contains the two columns corresponding to the supplied time and event variables. If either the time or event argument is missing, the function looks for @@ -420,9 +366,12 @@

Details indicator". This function will first (automatically) find the time variable and remove this as a possibility from subsequent searches of the event variable. The following regular expressions are used for the time and -event variables:

-
time

"[\s\W_]+time|^time\b"

-
event

"[\s\W_]+event|^event\b|[\s\W_]+status|^status\b"

+event variables:

time
+

"[\s\W_]+time|^time\b"

+ +
event
+

"[\s\W_]+event|^event\b|[\s\W_]+status|^status\b"

+

This allows for "time" to be preceded or followed by one or more white space characters, one or more non-word characters or one or more @@ -431,71 +380,76 @@

Details "death_time", "Time", "time", "diagnosis_time", "time.diag", "diag__time". But the following will not be recognized: "diagtime","eventtime", "Timediag"

-

See also

- - + -

Examples

-
# change color of points -library(ggplot2) -data("bmtcrr") -popTimeData <- popTime(data = bmtcrr, time = "ftime", event = "Status") -fill_cols <- c("Case series" = "black", "Competing event" = "#009E73", - "Base series" = "#0072B2") -color_cols <- c("Case series" = "black", "Competing event" = "black", - "Base series" = "black") - -plot(popTimeData, - add.case.series = TRUE, - add.base.series = TRUE, - add.competing.event = FALSE, - legend = TRUE, - comprisk = TRUE, - fill.params = list( - name = element_blank(), - breaks = c("Case series", "Competing event", "Base series"), - values = fill_cols - ), - color.params = list( - name = element_blank(), - breaks = c("Case series", "Competing event", "Base series"), - values = color_cols - ) -) -
data("bmtcrr") -popTimeData <- popTime(data = bmtcrr, time = "ftime") -
#> 'Status' will be used as the event variable
class(popTimeData) -
#> [1] "popTime" "data.table" "data.frame"
popTimeData <- popTime(data = bmtcrr, time = "ftime", exposure = "D") -
#> 'Status' will be used as the event variable
attr(popTimeData, "exposure") -
#> [1] "D"
+
+

Examples

+
# change color of points
+library(ggplot2)
+data("bmtcrr")
+popTimeData <- popTime(data = bmtcrr, time = "ftime", event = "Status")
+fill_cols <- c("Case series" = "black", "Competing event" = "#009E73",
+               "Base series" = "#0072B2")
+color_cols <- c("Case series" = "black", "Competing event" = "black",
+                "Base series" = "black")
+
+plot(popTimeData,
+  add.case.series = TRUE,
+  add.base.series = TRUE,
+  add.competing.event = FALSE,
+  legend = TRUE,
+  comprisk = TRUE,
+  fill.params = list(
+    name = element_blank(),
+    breaks = c("Case series", "Competing event", "Base series"),
+    values = fill_cols
+  ),
+  color.params = list(
+    name = element_blank(),
+    breaks = c("Case series", "Competing event", "Base series"),
+    values = color_cols
+  )
+)
+
+data("bmtcrr")
+popTimeData <- popTime(data = bmtcrr, time = "ftime")
+#> 'Status' will be used as the event variable
+class(popTimeData)
+#> [1] "popTime"    "data.table" "data.frame"
+popTimeData <- popTime(data = bmtcrr, time = "ftime", exposure = "D")
+#> 'Status' will be used as the event variable
+attr(popTimeData, "exposure")
+#> [1] "D"
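A short additional sketch: once the exposure attribute is set (as in the last call above), the plot method produces exposure-stratified panels, and facet.params is passed on to facet_wrap. The ncol value below is an illustrative facet_wrap argument, not a popTime-specific option.

# exposure-stratified population-time plot, one panel per level of D
plot(popTimeData,
     add.case.series = TRUE,
     facet.params = list(ncol = 1))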
+
+

+
- - - + + diff --git a/reference/sampleCaseBase.html b/reference/sampleCaseBase.html index 8b2c5e54..9e000afe 100644 --- a/reference/sampleCaseBase.html +++ b/reference/sampleCaseBase.html @@ -1,69 +1,14 @@ - - - - - - - -Create case-base dataset for use in fitting parametric hazard functions — sampleCaseBase • casebase - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Create case-base dataset for use in fitting parametric hazard functions — sampleCaseBase • casebase - - - - - - - - - + + - - - - -
-
- -
- -
+
@@ -151,157 +90,162 @@

Create case-base dataset for use in fitting parametric hazard functions

functions easily via logistic regression.

-
sampleCaseBase(
-  data,
-  time,
-  event,
-  ratio = 10,
-  comprisk = FALSE,
-  censored.indicator
-)
+
+
sampleCaseBase(
+  data,
+  time,
+  event,
+  ratio = 10,
+  comprisk = FALSE,
+  censored.indicator
+)
+
+ +
+

Arguments

+
data
+

a data.frame or data.table containing the source dataset.

+ + +
time
+

a character string giving the name of the time variable. See +Details.

+ + +
event
+

a character string giving the name of the event variable. See +Details.

+ + +
ratio
+

Integer, giving the ratio of the size of the base series to that +of the case series. Defaults to 10.

+ -

Arguments

- - - - - - - - - - - - - - - - - - - - - - - - - - -
data

a data.frame or data.table containing the source dataset.

time

a character string giving the name of the time variable. See -Details.

event

a character string giving the name of the event variable. See -Details.

ratio

Integer, giving the ratio of the size of the base series to that -of the case series. Defaults to 10.

comprisk

Logical. Indicates whether we have multiple event types and -that we want to consider some of them as competing risks.

censored.indicator

a character string of length 1 indicating which +

comprisk
+

Logical. Indicates whether we have multiple event types and +that we want to consider some of them as competing risks.

+ + +
censored.indicator
+

a character string of length 1 indicating which value in event is the censored. This function will use -relevel to set censored.indicator as the +relevel to set censored.indicator as the reference level. This argument is ignored if the event variable is a -numeric

+numeric

-

Value

+
+
+

Value

+ -

The function returns a dataset, with the same format as the source +

The function returns a dataset, with the same format as the source dataset, and where each row corresponds to a person-moment sampled from the case or the base series.

-

Details

- +
+
+

Details

The base series is sampled using a multinomial scheme: individuals are sampled proportionally to their follow-up time.

It is assumed that data contains the two columns corresponding to the supplied time and event variables. If either the time or event argument is missing, the function looks for columns with appropriate-looking -names (see checkArgsTimeEvent).

-

Warning

- +names (see checkArgsTimeEvent).

+
+
+

Warning

The offset is calculated using the total follow-up time for all individuals in the study. Therefore, we need time to be on the original scale, not a transformed scale (e.g. logarithmic). Otherwise, the offset and the estimation will be wrong.

+
-

Examples

-
# Simulate censored survival data for two outcome types from exponential -library(data.table) -set.seed(12345) -nobs <- 500 -tlim <- 10 - -# simulation parameters -b1 <- 200 -b2 <- 50 - -# event type 0-censored, 1-event of interest, 2-competing event -# t observed time/endpoint -# z is a binary covariate -DT <- data.table(z = rbinom(nobs, 1, 0.5)) -DT[, `:=`( - "t_event" = rweibull(nobs, 1, b1), - "t_comp" = rweibull(nobs, 1, b2) -)] -
#> z t_event t_comp -#> 1: 1 312.74831 127.708526 -#> 2: 1 53.25243 8.497106 -#> 3: 1 34.48639 249.441113 -#> 4: 1 13.62873 52.322220 -#> 5: 0 78.36455 18.839434 -#> --- -#> 496: 0 29.81916 142.094179 -#> 497: 1 157.21649 53.021951 -#> 498: 1 299.22847 36.967088 -#> 499: 1 194.74603 63.880643 -#> 500: 1 402.21055 55.350048
DT[, `:=`( - "event" = 1 * (t_event < t_comp) + 2 * (t_event >= t_comp), - "time" = pmin(t_event, t_comp) -)] -
#> z t_event t_comp event time -#> 1: 1 312.74831 127.708526 2 127.708526 -#> 2: 1 53.25243 8.497106 2 8.497106 -#> 3: 1 34.48639 249.441113 1 34.486389 -#> 4: 1 13.62873 52.322220 1 13.628727 -#> 5: 0 78.36455 18.839434 2 18.839434 -#> --- -#> 496: 0 29.81916 142.094179 1 29.819162 -#> 497: 1 157.21649 53.021951 2 53.021951 -#> 498: 1 299.22847 36.967088 2 36.967088 -#> 499: 1 194.74603 63.880643 2 63.880643 -#> 500: 1 402.21055 55.350048 2 55.350048
DT[time >= tlim, `:=`("event" = 0, "time" = tlim)] -
#> z t_event t_comp event time -#> 1: 1 312.74831 127.708526 0 10.000000 -#> 2: 1 53.25243 8.497106 2 8.497106 -#> 3: 1 34.48639 249.441113 0 10.000000 -#> 4: 1 13.62873 52.322220 0 10.000000 -#> 5: 0 78.36455 18.839434 0 10.000000 -#> --- -#> 496: 0 29.81916 142.094179 0 10.000000 -#> 497: 1 157.21649 53.021951 0 10.000000 -#> 498: 1 299.22847 36.967088 0 10.000000 -#> 499: 1 194.74603 63.880643 0 10.000000 -#> 500: 1 402.21055 55.350048 0 10.000000
-out <- sampleCaseBase(DT, time = "time", event = "event", comprisk = TRUE) -
+
+

Examples

+
# Simulate censored survival data for two outcome types from exponential
+library(data.table)
+set.seed(12345)
+nobs <- 500
+tlim <- 10
+
+# simulation parameters
+b1 <- 200
+b2 <- 50
+
+# event type 0-censored, 1-event of interest, 2-competing event
+# t observed time/endpoint
+# z is a binary covariate
+DT <- data.table(z = rbinom(nobs, 1, 0.5))
+DT[, `:=`(
+  "t_event" = rweibull(nobs, 1, b1),
+  "t_comp" = rweibull(nobs, 1, b2)
+)]
+#>      z   t_event     t_comp
+#>   1: 1 312.74831 127.708526
+#>   2: 1  53.25243   8.497106
+#>   3: 1  34.48639 249.441113
+#>   4: 1  13.62873  52.322220
+#>   5: 0  78.36455  18.839434
+#>  ---                       
+#> 496: 0  29.81916 142.094179
+#> 497: 1 157.21649  53.021951
+#> 498: 1 299.22847  36.967088
+#> 499: 1 194.74603  63.880643
+#> 500: 1 402.21055  55.350048
+DT[, `:=`(
+  "event" = 1 * (t_event < t_comp) + 2 * (t_event >= t_comp),
+  "time" = pmin(t_event, t_comp)
+)]
+#>      z   t_event     t_comp event       time
+#>   1: 1 312.74831 127.708526     2 127.708526
+#>   2: 1  53.25243   8.497106     2   8.497106
+#>   3: 1  34.48639 249.441113     1  34.486389
+#>   4: 1  13.62873  52.322220     1  13.628727
+#>   5: 0  78.36455  18.839434     2  18.839434
+#>  ---                                        
+#> 496: 0  29.81916 142.094179     1  29.819162
+#> 497: 1 157.21649  53.021951     2  53.021951
+#> 498: 1 299.22847  36.967088     2  36.967088
+#> 499: 1 194.74603  63.880643     2  63.880643
+#> 500: 1 402.21055  55.350048     2  55.350048
+DT[time >= tlim, `:=`("event" = 0, "time" = tlim)]
+#>      z   t_event     t_comp event      time
+#>   1: 1 312.74831 127.708526     0 10.000000
+#>   2: 1  53.25243   8.497106     2  8.497106
+#>   3: 1  34.48639 249.441113     0 10.000000
+#>   4: 1  13.62873  52.322220     0 10.000000
+#>   5: 0  78.36455  18.839434     0 10.000000
+#>  ---                                       
+#> 496: 0  29.81916 142.094179     0 10.000000
+#> 497: 1 157.21649  53.021951     0 10.000000
+#> 498: 1 299.22847  36.967088     0 10.000000
+#> 499: 1 194.74603  63.880643     0 10.000000
+#> 500: 1 402.21055  55.350048     0 10.000000
+
+out <- sampleCaseBase(DT, time = "time", event = "event", comprisk = TRUE)
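To tie this back to the Details and Warning sections, here is a hedged sketch of what is usually done with the returned sample. It assumes the output keeps the original covariates together with an offset column for the sampling correction (the column names are assumptions about the sampleCaseBase() output, not guaranteed here).

# inspect the sampled person-moments
head(out)

# single-event simplification for illustration: logistic regression of case
# versus base person-moments with the sampling offset included; with
# competing risks, fitSmoothHazard() instead fits a multinomial model
fit_manual <- glm(I(event == 1) ~ time + z + offset(offset),
                  data = out, family = binomial)
summary(fit_manual)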
+
+
+
- - - + + diff --git a/reference/simdat.html b/reference/simdat.html index e6052d76..832dd948 100644 --- a/reference/simdat.html +++ b/reference/simdat.html @@ -1,68 +1,13 @@ - - - - - - - -Simulated data under Weibull model with Time-Dependent Treatment Effect — simdat • casebase - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Simulated data under Weibull model with Time-Dependent Treatment Effect — simdat • casebase - + + - - - -
-
- -
- -
+
@@ -149,23 +88,31 @@

Simulated data under Weibull model with Time-Dependent Treatment Effect

simsurv.

-
simdat
- - -

Format

+
+
simdat
+
-

A dataframe with 1000 observations and 4 variables:

-
id

patient id

eventtime

time of event

status

event -indicator (1 = event, 0 = censored)

trt

binary treatment +

+

Format

+

A dataframe with 1000 observations and 4 variables:

id
+

patient id

+
eventtime
+

time of event

+
status
+

event +indicator (1 = event, 0 = censored)

+
trt
+

binary treatment indicator

-
- -

Source

+
+ +
+

Details

Simulated data under a standard Weibull survival model that incorporates a time-dependent treatment effect (i.e. non-proportional hazards). For the time-dependent effect we included a single binary covariate (e.g. a treatment @@ -181,48 +128,48 @@

Details the time-dependent effect is induced by interacting the log hazard ratio with log time. The true parameters are 1. \(\beta_0\) = -0.5 2. \(\beta_1\) = 0.15 3. \(\lambda\) = 0.1 4. \(\gamma\) = 1.5

-

References

- +
+
+

References

Sam Brilleman (2019). simsurv: Simulate Survival Data. R package version 0.2.3. https://CRAN.R-project.org/package=simsurv

+
-

Examples

-
if (requireNamespace("splines", quietly = TRUE)) { -library(splines) -data("simdat") -mod_cb <- casebase::fitSmoothHazard(status ~ trt + ns(log(eventtime), - df = 3) + - trt:ns(log(eventtime),df=1), - time = "eventtime", - data = simdat, - ratio = 1) -} -
+
+

Examples

+
if (requireNamespace("splines", quietly = TRUE)) {
+library(splines)
+data("simdat")
+mod_cb <- casebase::fitSmoothHazard(status ~ trt + ns(log(eventtime),
+                                                      df = 3) +
+                                   trt:ns(log(eventtime),df=1),
+                                   time = "eventtime",
+                                   data = simdat,
+                                   ratio = 1)
+}
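Because the data were simulated with a time-dependent treatment effect, a natural follow-up is to plot the fitted hazard ratio over time with the plot method described in plot.singleEventCB(). The sketch below assumes the mod_cb object from the example above; the time grid is an arbitrary illustrative choice.

# hazard ratio of trt = 1 versus trt = 0 as a function of time
newtime <- seq(0.25, 4, by = 0.25)  # illustrative grid of positive event times
plot(mod_cb,
     type = "hr",
     newdata = data.frame(trt = 0, eventtime = newtime),
     exposed = function(data) transform(data, trt = 1),
     xvar = "eventtime",
     ci = TRUE)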
+
+
+
- - - + + diff --git a/reference/support.html b/reference/support.html index 5efb6a4c..690d8774 100644 --- a/reference/support.html +++ b/reference/support.html @@ -1,72 +1,17 @@ - - - - - - - -Study to Understand Prognoses Preferences Outcomes and Risks of Treatment -(SUPPORT) — support • casebase - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -Study to Understand Prognoses Preferences Outcomes and Risks of Treatment +(SUPPORT) — support • casebase - - - - - - - - - - - + + - - -
-
- -
- -
+
@@ -156,44 +95,112 @@

Study to Understand Prognoses Preferences Outcomes and Risks of Treatment included only tracks follow-up time and death.

-
support
- - -

Format

+
+
support
+
+
+

Format

A dataframe with 9104 observations and 34 variables after imputation and the removal of response variables like hospital charges, patient ratio of costs to charges and micro-costs. Ordinal variables, namely functional disability and income, were also removed. Finally, Surrogate activities of daily living were removed due to sparsity. There were 6 other model scores -in the data-set and they were removed; only aps and sps were kept.

-
Age

Stores a double representing age.

death

Death at any time up to NDI date: 31DEC94.

sex

0=female, 1=male.

slos

Days from study entry to discharge.

d.time

days of -follow-up.

dzgroup

Each level of dzgroup: ARF/MOSF w/Sepsis, +in the data-set and they were removed; only aps and sps were kept.

Age
+

Stores a double representing age.

+
death
+

Death at any time up to NDI date: 31DEC94.

+
sex
+

0=female, 1=male.

+
slos
+

Days from study entry to discharge.

+
d.time
+

days of +follow-up.

+
dzgroup
+

Each level of dzgroup: ARF/MOSF w/Sepsis, COPD, CHF, Cirrhosis, Coma, Colon Cancer, Lung Cancer, MOSF with -malignancy.

dzclass

ARF/MOSF, COPD/CHF/Cirrhosis, Coma and cancer -disease classes.

num.co

the number of comorbidities.

-
edu

years of education of patient.

scoma

The SUPPORT coma -score based on Glasgow D3.

avtisst

Average TISS, days 3-25.

-
race

Indicates race. White, Black, Asian, Hispanic or other.

-
hday

Day in Hospital at Study Admit

diabetes

Diabetes (Com -27-28, Dx 73)

dementia

Dementia (Comorbidity 6)

ca

Cancer -State

meanbp

Mean Arterial Blood Pressure Day 3.

wblc

White blood cell count on day 3.

hrt

Heart rate day 3.

-
resp

Respiration Rate day 3.

temp

Temperature, in -Celsius, on day 3.

pafi

PaO2/(0.01*FiO2) Day 3.

alb

Serum albumin day 3.

bili

Bilirubin Day 3.

crea

Serum -creatinine day 3.

sod

Serum sodium day 3.

ph

Serum pH -(in arteries) day 3.

glucose

Serum glucose day 3.

bun

BUN day 3.

urine

urine output day 3.

adlp

ADL patient -day 3.

adlsc

Imputed ADL calibrated to surrogate, if a surrogate -was used for a follow up.

sps

SUPPORT physiology score

-
aps

Apache III physiology score

-
- -

Source

- +malignancy.

+
dzclass
+

ARF/MOSF, COPD/CHF/Cirrhosis, Coma and cancer +disease classes.

+
num.co
+

the number of comorbidities.

+ +
edu
+

years of education of patient.

+
scoma
+

The SUPPORT coma +score based on Glasgow D3.

+
avtisst
+

Average TISS, days 3-25.

+ +
race
+

Indicates race. White, Black, Asian, Hispanic or other.

+ +
hday
+

Day in Hospital at Study Admit

+
diabetes
+

Diabetes (Com +27-28, Dx 73)

+
dementia
+

Dementia (Comorbidity 6)

+
ca
+

Cancer +State

+
meanbp
+

Mean Arterial Blood Pressure Day 3.

+
wblc
+

White blood cell count on day 3.

+
hrt
+

Heart rate day 3.

+ +
resp
+

Respiration Rate day 3.

+
temp
+

Temperature, in +Celsius, on day 3.

+
pafi
+

PaO2/(0.01*FiO2) Day 3.

+
alb
+

Serum albumin day 3.

+
bili
+

Bilirubin Day 3.

+
crea
+

Serum +creatinine day 3.

+
sod
+

Serum sodium day 3.

+
ph
+

Serum pH +(in arteries) day 3.

+
glucose
+

Serum glucose day 3.

+
bun
+

BUN day 3.

+
urine
+

urine output day 3.

+
adlp
+

ADL patient +day 3.

+
adlsc
+

Imputed ADL calibrated to surrogate, if a surrogate +was used for a follow up.

+
sps
+

SUPPORT physiology score

+ +
aps
+

Apache III physiology score

+ +
+
+

Source

Available at the following website: -https://biostat.app.vumc.org/wiki/Main/SupportDesc. +https://biostat.app.vumc.org/wiki/Main/SupportDesc. note: must unzip and process this data before use.

-

Details

- +
+
+

Details

Some of the original data was missing. Before imputation, there were a total of 9105 individuals and 47 variables. Of those variables, a few were removed before imputation. We removed three response variables: @@ -215,52 +222,52 @@

Details living was not imputed. This is due to collinearity between the other two covariates for activities of daily living. Therefore, surrogate activities of daily living was removed.

-

References

- +
+
+

References

Knaus WA, Harrell FE, Lynn J et al. (1995): The SUPPORT prognostic model: Objective estimates of survival for seriously ill hospitalized adults. Annals of Internal Medicine 122:191-203. -doi: 10.7326/0003-4819-122-3-199502010-00007 +doi:10.7326/0003-4819-122-3-199502010-00007 .

http://biostat.mc.vanderbilt.edu/wiki/Main/SupportDesc

http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/Csupport.html

+
-

Examples

-
data("support") -# Using the matrix interface and log of time -x <- model.matrix(death ~ . - d.time - 1, data = support) -y <- with(support, cbind(death, d.time)) - -fit_cb <- casebase::fitSmoothHazard.fit(x, y, time = "d.time", - event = "death", - formula_time = ~ log(d.time), - ratio = 1) -
+
+

Examples

+
data("support")
+# Using the matrix interface and log of time
+x <- model.matrix(death ~ . - d.time - 1, data = support)
+y <- with(support, cbind(death, d.time))
+
+fit_cb <- casebase::fitSmoothHazard.fit(x, y, time = "d.time",
+                                        event = "death",
+                                        formula_time = ~ log(d.time),
+                                        ratio = 1)
+
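A short hedged follow-up: with the default family = "glm", the returned object is assumed to behave like an ordinary glm fit, so the usual extractors can be used to inspect the estimated coefficients (which are on the log-hazard scale).

# exponentiate the coefficients to read them as hazard ratios
summary(fit_cb)
exp(coef(fit_cb))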
+
+
- - - + + diff --git a/sitemap.xml b/sitemap.xml index b02f480a..efb55c59 100644 --- a/sitemap.xml +++ b/sitemap.xml @@ -1,66 +1,93 @@ - http://sahirbhatnagar.com/casebase/index.html + https://sahirbhatnagar.com/casebase/404.html - http://sahirbhatnagar.com/casebase/reference/CompRisk-class.html + https://sahirbhatnagar.com/casebase/CONDUCT.html - http://sahirbhatnagar.com/casebase/reference/ERSPC.html + https://sahirbhatnagar.com/casebase/LICENSE-text.html - http://sahirbhatnagar.com/casebase/reference/absoluteRisk.html + https://sahirbhatnagar.com/casebase/articles/competingRisk.html - http://sahirbhatnagar.com/casebase/reference/bmtcrr.html + https://sahirbhatnagar.com/casebase/articles/customizingpopTime.html - http://sahirbhatnagar.com/casebase/reference/brcancer.html + https://sahirbhatnagar.com/casebase/articles/index.html - http://sahirbhatnagar.com/casebase/reference/checkArgsEventIndicator.html + https://sahirbhatnagar.com/casebase/articles/plotabsRisk.html - http://sahirbhatnagar.com/casebase/reference/eprchd.html + https://sahirbhatnagar.com/casebase/articles/plotsmoothHazard.html - http://sahirbhatnagar.com/casebase/reference/fitSmoothHazard.html + https://sahirbhatnagar.com/casebase/articles/popTime.html - http://sahirbhatnagar.com/casebase/reference/hazardPlot.html + https://sahirbhatnagar.com/casebase/articles/smoothHazard.html - http://sahirbhatnagar.com/casebase/reference/plot.singleEventCB.html + https://sahirbhatnagar.com/casebase/articles/time-varying-covariates.html - http://sahirbhatnagar.com/casebase/reference/popTime.html + https://sahirbhatnagar.com/casebase/authors.html - http://sahirbhatnagar.com/casebase/reference/sampleCaseBase.html + https://sahirbhatnagar.com/casebase/index.html - http://sahirbhatnagar.com/casebase/reference/simdat.html + https://sahirbhatnagar.com/casebase/news/index.html - http://sahirbhatnagar.com/casebase/reference/support.html + https://sahirbhatnagar.com/casebase/reference/CompRisk-class.html - http://sahirbhatnagar.com/casebase/articles/competingRisk.html + https://sahirbhatnagar.com/casebase/reference/ERSPC.html - http://sahirbhatnagar.com/casebase/articles/customizingpopTime.html + https://sahirbhatnagar.com/casebase/reference/absoluteRisk.html - http://sahirbhatnagar.com/casebase/articles/plotabsRisk.html + https://sahirbhatnagar.com/casebase/reference/bmtcrr.html - http://sahirbhatnagar.com/casebase/articles/plotsmoothHazard.html + https://sahirbhatnagar.com/casebase/reference/brcancer.html - http://sahirbhatnagar.com/casebase/articles/popTime.html + https://sahirbhatnagar.com/casebase/reference/checkArgsEventIndicator.html - http://sahirbhatnagar.com/casebase/articles/smoothHazard.html + https://sahirbhatnagar.com/casebase/reference/confint.absRiskCB.html + + + https://sahirbhatnagar.com/casebase/reference/eprchd.html + + + https://sahirbhatnagar.com/casebase/reference/fitSmoothHazard.html + + + https://sahirbhatnagar.com/casebase/reference/hazardPlot.html + + + https://sahirbhatnagar.com/casebase/reference/index.html + + + https://sahirbhatnagar.com/casebase/reference/plot.singleEventCB.html + + + https://sahirbhatnagar.com/casebase/reference/popTime.html + + + https://sahirbhatnagar.com/casebase/reference/sampleCaseBase.html + + + https://sahirbhatnagar.com/casebase/reference/simdat.html + + + https://sahirbhatnagar.com/casebase/reference/support.html