7. Conditional Survival Analysis

Conditional survival analysis is a statistical method used to estimate the probability of surviving a specific period of time given that a patient has already survived a certain amount of time. This method is particularly useful in cancer research, where patients may have already survived for a certain period after diagnosis or treatment. Conditional survival analysis can provide more accurate estimates of survival probabilities by taking into account the time that has already elapsed since diagnosis or treatment.
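
For a single event time, the idea reduces to a ratio of survival probabilities, \(P(T > y | T > x) = S(y) / S(x)\) for \(y > x\). The minimal sketch below illustrates this with a Kaplan-Meier fit to the lung data set shipped with the {survival} package. The data set and the time points (one and two years) are assumptions chosen only for illustration and are not part of the analysis in this tutorial.

```r
library(survival)

# Kaplan-Meier estimate of overall survival (illustrative data, not colonCS)
fit <- survfit(Surv(time, status) ~ 1, data = lung)

# Survival probabilities at x = 365 and y = 730 days
s <- summary(fit, times = c(365, 730))$surv

# Conditional survival: P(T > 730 | T > 365) = S(730) / S(365)
cond_surv <- s[2] / s[1]
cond_surv
```

Because \(S(x) \leq 1\), the conditional probability is never smaller than the unconditional probability \(S(y)\).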

Overview

Ordered multivariate failure time data refers to datasets where multiple failure times are recorded for each subject, and these times have a natural order (e.g., times to successive events for the same subject). The conditional survival function in this context represents the probability of surviving beyond a specific time point for one event, given the survival status of previous events.

Let \(T_1, T_2, \ldots, T_k\) denote the ordered failure times for a subject, where \(T_1 \leq T_2 \leq \ldots \leq T_k\).

The conditional survival function for the \(j\)-th failure time, \(T_j\), given that the first \(i\) events have already occurred (i.e., \(T_1, T_2, \ldots, T_i \leq t\)), is defined as:

\[ S_j(t | T_1, T_2, \ldots, T_i) = P(T_j > t | T_1, T_2, \ldots, T_i \leq t) \]

Here:

  • \(S_j(t | T_1, \ldots, T_i)\): The conditional survival probability for \(T_j\) given the earlier times \(T_1, T_2, \ldots, T_i\).
  • \(T_j\): The \(j\)-th failure time.
  • \(T_1, \ldots, T_i\): Previous ordered failure times.

Key Concepts

  1. Dependence Between Events:

    • Ordered failure times are typically dependent because they belong to the same subject.
    • Dependencies can be modeled using copulas or multivariate survival models.
  2. Joint Survival Function: The joint survival function for \(T_1, T_2, \ldots, T_k\) is:

    \[ S(t_1, t_2, \ldots, t_k) = P(T_1 > t_1, T_2 > t_2, \ldots, T_k > t_k) \]

  3. Marginal Survival Functions:

    The marginal survival function for \(T_j\) is:

    \[ S_j(t) = P(T_j > t) \]

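  4. Conditional Survival from the Definition of Conditional Probability:

    The conditional survival function is the ratio of a joint probability to the probability of the conditioning event:

    \[ S_j(t | T_1, T_2, \ldots, T_i) = \frac{P(T_1, T_2, \ldots, T_i \leq t, \; T_j > t)}{P(T_1, T_2, \ldots, T_i \leq t)} \]

    • The numerator is the joint probability that the first \(i\) events have occurred by time \(t\) and that \(T_j > t\).
    • The denominator normalizes by the probability of the condition \(T_1, T_2, \ldots, T_i \leq t\). Because the times are ordered, this condition is equivalent to \(T_i \leq t\).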

Example: Two Failure Times \(T_1\) and \(T_2\)

For two ordered failure times, \(T_1\) (first event) and \(T_2\) (second event):

  1. Marginal Survival Functions:

\[ S_1(t) = P(T_1 > t), \quad S_2(t) = P(T_2 > t) \]

  2. Joint Survival Function:

\[ S(t_1, t_2) = P(T_1 > t_1, T_2 > t_2) \]

  3. Conditional Survival for \(T_2\) Given \(T_1 > t_1\):

\[ S_2(t | T_1 > t_1) = \frac{P(T_1 > t_1, T_2 > t)}{P(T_1 > t_1)} = \frac{S(t_1, t)}{S_1(t_1)} \]
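
The ratio of a joint to a marginal probability can be checked numerically. The sketch below simulates two ordered failure times without censoring and computes the probability that the second event occurs after \(y\) given that the first occurs after \(x\), in two ways. The exponential distributions, the sample size, and the values of `x` and `y` are arbitrary assumptions made for illustration.

```r
set.seed(1)
n <- 10000

# Hypothetical ordered failure times: the second event follows the first
t1 <- rexp(n, rate = 1 / 2)        # time to first event
t2 <- t1 + rexp(n, rate = 1)       # time to second event, so t1 <= t2

x <- 1   # conditioning time for the first event
y <- 3   # time at which survival of the second event is evaluated

# Ratio form: P(T1 > x, T2 > y) / P(T1 > x)
joint    <- mean(t1 > x & t2 > y)
marginal <- mean(t1 > x)
cond_ratio <- joint / marginal

# Direct form: proportion with T2 > y among subjects with T1 > x
cond_direct <- mean(t2[t1 > x] > y)

c(ratio = cond_ratio, direct = cond_direct)
```

Both computations give the same value, since restricting to the subjects with \(T_1 > x\) is exactly what dividing by the marginal probability does.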

Applications

  1. Medical Studies:
    • Modeling times to recurrent events (e.g., cancer relapse).
    • Estimating survival probabilities for successive treatments.
  2. Reliability Engineering:
    • Time to failure for components in a system with dependent failure risks.
  3. Actuarial Science:
    • Modeling dependent life events in joint-life insurance policies.

This framework combines the joint survival function and marginal survival to provide dynamic risk assessments in multivariate contexts.

Conditional Survival Analysis in R

The {condSURV} package in R provides tools for estimating the (conditional) survival function for ordered multivariate failure time data. It is particularly useful for survival data with multiple events, where the occurrence of one event may affect the risk of subsequent events.

This tutorial mostly follows the example from "condSURV: An R Package for the Estimation of the Conditional Survival Function for Ordered Multivariate Failure Time Data" to demonstrate how to use the {condSURV} package.

Install Required R Packages

The following R packages are required to run this notebook. If any of them are not installed, you can install them using the code below:

Code
#| warning: false
#| error: false
packages <- c(
         'tidyverse',
         'report',
         'performance',
         'gtsummary',
         'MASS',
         'epiDisplay',
         'survival',
         'survminer',
         'ggsurvfit',
         'tidycmprsk',
         'ggfortify',
         'timereg',
         'cmprsk',
         'condSURV',
         'riskRegression'
         )

# Install missing packages
new_packages <- packages[!(packages %in% installed.packages()[,"Package"])]
if(length(new_packages)) install.packages(new_packages)
 devtools::install_github("ItziarI/WeDiBaDis")

# Verify installation
cat("Installed packages:\n")
print(sapply(packages, requireNamespace, quietly = TRUE))

Load Packages

Code
# Load packages with suppressed messages
invisible(lapply(packages, function(pkg) {
  suppressPackageStartupMessages(library(pkg, character.only = TRUE))
}))
Code
# Check loaded packages
cat("Successfully loaded packages:\n")
Successfully loaded packages:
Code
print(search()[grepl("package:", search())])
 [1] "package:riskRegression" "package:condSURV"       "package:cmprsk"        
 [4] "package:timereg"        "package:ggfortify"      "package:tidycmprsk"    
 [7] "package:ggsurvfit"      "package:survminer"      "package:ggpubr"        
[10] "package:epiDisplay"     "package:nnet"           "package:survival"      
[13] "package:foreign"        "package:MASS"           "package:gtsummary"     
[16] "package:performance"    "package:report"         "package:lubridate"     
[19] "package:forcats"        "package:stringr"        "package:dplyr"         
[22] "package:purrr"          "package:readr"          "package:tidyr"         
[25] "package:tibble"         "package:ggplot2"        "package:tidyverse"     
[28] "package:stats"          "package:graphics"       "package:grDevices"     
[31] "package:utils"          "package:datasets"       "package:methods"       
[34] "package:base"          

Data

We will use the colonCS data set from the {condSURV} package, which contains data from a large clinical trial on Dukes' stage II patients with colon cancer who underwent curative surgery for colorectal cancer. Out of a total of 929 patients, 468 experienced a recurrence, and of those, 414 died. For each patient, the recorded data include the final vital status (censored or not), the survival times (time to recurrence and time to death, measured in days from the start of the study), and a set of covariates such as age (in years) and recurrence status (coded as 1 for yes and 0 for no). Note that recurrence is a time-dependent covariate that can be regarded as an intermediate event.

The data frame colonCS has 929 observations on the following 15 variables. A brief description of each variable is given below.

time1: Time to recurrence/censoring/death, whichever occurs first.

event1: Recurrence/censoring indicator (recurrence=1, alive=0).

Stime: Time to censoring/death, whichever occurs first.

event: Death/censoring indicator (death=1, alive=0).

rx: Treatment - Obs(ervation), Lev(amisole), Lev(amisole)+5-FU.

sex: Sex indicator (male=1, female=0).

age: Age in years.

obstruct: Obstruction of colon by tumour.

perfor: Perforation of colon.

adhere: Adherence to nearby organs.

nodes: Number of lymph nodes with detectable cancer.

differ: Differentiation of tumour (1=well, 2=moderate, 3=poor).

extent: Extent of local spread (1=submucosa, 2=muscle, 3=serosa, 4=contiguous structures).

surg: Time from surgery to registration (0=short, 1=long).

node4: More than 4 positive lymph nodes.

Code
data(colonCS, package = "condSURV")
str(colonCS)
'data.frame':   929 obs. of  15 variables:
 $ time1   : num  968 3087 542 245 523 ...
 $ event1  : num  1 0 1 1 1 1 1 0 0 0 ...
 $ Stime   : num  1521 3087 963 293 659 ...
 $ event   : num  1 0 1 1 1 1 1 0 0 0 ...
 $ rx      : Factor w/ 3 levels "Obs","Lev","Lev+5FU": 3 3 1 3 1 3 2 1 2 3 ...
 $ sex     : num  1 1 0 0 1 0 1 1 1 0 ...
 $ age     : num  43 63 71 66 69 57 77 54 46 68 ...
 $ obstruct: num  0 0 0 1 0 0 0 0 0 0 ...
 $ perfor  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ adhere  : num  0 0 1 0 0 0 0 0 1 0 ...
 $ nodes   : num  5 1 7 6 22 9 5 1 2 1 ...
 $ differ  : num  2 2 2 2 2 2 2 2 2 2 ...
 $ extent  : num  3 3 2 3 3 3 3 3 3 3 ...
 $ surg    : num  0 0 0 1 1 0 1 0 0 1 ...
 $ node4   : num  1 0 1 1 1 1 1 0 0 0 ...
Code
head(colonCS[, 1:7])
  time1 event1 Stime event      rx sex age
1   968      1  1521     1 Lev+5FU   1  43
2  3087      0  3087     0 Lev+5FU   1  63
3   542      1   963     1     Obs   0  71
4   245      1   293     1 Lev+5FU   0  66
5   523      1   659     1     Obs   1  69
6   904      1  1767     1 Lev+5FU   0  57

The individuals in rows 1, 3, 4, 5, and 6 experienced a recurrence of their tumor and subsequently died. The individual in row 2 was alive and free of recurrence at the end of follow-up, so time1 and Stime coincide for this patient. Individuals who experienced a recurrence but were still alive at the end of follow-up are identified by event1 = 1 together with event = 0, whereas event1 = 0 indicates that no recurrence was observed.
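
The coding of the two indicators can be made explicit with a small base R sketch. It re-enters by hand the six rows printed by head() above, so it runs without {condSURV}; the data frame name `rows` and the column `pattern` are hypothetical names introduced only for this illustration.

```r
# The six rows shown by head(colonCS[, 1:7]), re-entered by hand
rows <- data.frame(
  time1  = c(968, 3087, 542, 245, 523, 904),
  event1 = c(1, 0, 1, 1, 1, 1),
  Stime  = c(1521, 3087, 963, 293, 659, 1767),
  event  = c(1, 0, 1, 1, 1, 1)
)

# Ordered event times: the first time can never exceed the total time
stopifnot(all(rows$time1 <= rows$Stime))

# Translate the pair (event1, event) into a readable label
rows$pattern <- with(rows, ifelse(
  event1 == 1 & event == 1, "recurrence, then death",
  ifelse(event1 == 1 & event == 0, "recurrence, alive at end of follow-up",
  ifelse(event1 == 0 & event == 0, "no recurrence, alive at end of follow-up",
         "other"))))

rows
```

The same check and labelling can be applied to the full colonCS data frame once the package is loaded.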

Survival Object and Conditional Survival Probabilities

The survCS() function in the {condSURV} package creates a survival object based on the selected variables for analysis. This function checks whether the data has been entered correctly and generates a survCS object. The arguments for this function must be provided in the following order: time1, event1, time2, event2, …, Stime, and event, where time1, time2, …, Stime represent the ordered event times, and event1, event2, …, event are their corresponding indicator statuses. This function serves a similar purpose to the Surv() function in the {survival} package.

Recurrence strongly affects patient outcomes, influencing both the course of treatment and survival. To study this, we analyze ordered multivariate event time data that track each patient from enrollment to cancer recurrence and then to death. Let \(T_1\) denote the time to recurrence and \(T\) the time to death. The conditional survival probability \(S(y | x) = P(T > y | T_1 > x)\) is the probability that a patient is alive at time \(y\) given that they were free of recurrence at time \(x\). Such estimates describe the prognosis of patients who have undergone surgery and remained recurrence-free for a given period. They can inform clinical decisions, for example by indicating which patients may need more frequent follow-up and closer monitoring.

You can estimate conditional survival probabilities using the survCOND() function. Start by providing a formula, with the response on the left side of the tilde (~) symbol. This response needs to be a “survCS” object, which you create with the survCS() function. You can add one covariate, either qualitative or quantitative, on the right side of the formula. This allows you to estimate survival probabilities based on current or past covariate measures. In the absence of covariates, two main methods are available. The first is based on Kaplan-Meier weights (KMW), which account for censoring and yield a step function that estimates survival over time. The second is the landmark approach (LDM), which estimates survival from a given time point onward using only the patients who satisfy the condition at that time point. A presmoothed version of the landmark approach (PLDM) is also available, which can reduce the variability of the estimates.

Kaplan-Meier Weights

First, we estimate the survival probability \(S(y | x) = P(T > y | T_1 > x)\) given \(x = 365\) (one year) and \(y = 1825\) (five years). We use the function survCOND() with the method based on Kaplan-Meier weights (method = "KMW").

Code
# set seed for reproducibility
set.seed(123)
# Conditional survival probabilities
colon.kmw.1 <- survCOND(survCS(time1, event1, Stime, event) ~ 1, 
                              x = 365, 
                              y = 1825,
                              data = colonCS,
                              method = "KMW")
# summary
summary(colon.kmw.1)

P(T>y|T1>365) 

    y  estimate lower 95% CI upper 95% CI
 1825 0.7303216    0.7003249    0.7643409

The output gives the estimated conditional probability of surviving beyond five years (y = 1825) given no recurrence during the first year (x = 365), together with a 95% confidence interval (conf = TRUE) based on 200 bootstrap replicates (n.boot = 200). The estimated probability is 0.73, with a 95% confidence interval of [0.70, 0.76].
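
To show how a bootstrap confidence interval of this kind can arise, here is a minimal percentile-bootstrap sketch in base R. It is an illustration under simplifying assumptions, not the {condSURV} implementation: the data are simulated, there is no censoring, and the function `cond_surv()` and all variable names are hypothetical.

```r
set.seed(2024)
n <- 500

# Hypothetical uncensored data (days)
t1 <- rexp(n, rate = 1 / 1500)        # time to recurrence
t  <- t1 + rexp(n, rate = 1 / 600)    # time to death, after recurrence

x <- 365
y <- 1825

# Empirical P(T > y | T1 > x); valid here only because nothing is censored
cond_surv <- function(t1, t, x, y) mean(t[t1 > x] > y)

est <- cond_surv(t1, t, x, y)

# Resample subjects with replacement and re-estimate 200 times
boot <- replicate(200, {
  idx <- sample(n, replace = TRUE)
  cond_surv(t1[idx], t[idx], x, y)
})

# Percentile 95% confidence interval
ci <- quantile(boot, probs = c(0.025, 0.975))
c(estimate = est, ci)
```

Because the interval depends on random resampling, repeated runs give slightly different limits unless the seed is fixed, which is why the examples in this tutorial call set.seed() first.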

For a fixed value of \(x\), conditional survival estimates can be obtained for a vector of \(y\) values, which shows how the survival probability changes over time. The following example estimates the conditional survival given \(x = 365\) at yearly intervals from one to seven years.

Code
# Conditional survival probabilities
colon.kmw.2 <- survCOND(survCS(time1, event1, Stime, event) ~ 1,
                              x = 365,
                              y = 365 * 1:7, # for 1 to 7 years
                              data = colonCS, 
                              method = "KMW")
# summary
summary(colon.kmw.2)

P(T>y|T1>365) 

    y  estimate lower 95% CI upper 95% CI
  365 1.0000000    1.0000000    1.0000000
  730 0.9441430    0.9284067    0.9597404
 1095 0.8624983    0.8374853    0.8887853
 1460 0.7750519    0.7424305    0.8044838
 1825 0.7303216    0.6980508    0.7642995
 2190 0.6879923    0.6519382    0.7225499
 2555 0.6548414    0.6079402    0.6949559

If argument \(y\) is omitted, then the survCOND() function allows the user to obtain estimates for all possible \(y\) values.

Code
colon.kmw.3 <- survCOND(survCS(time1, event1, Stime, event) ~ 1, 
                              x = 365,
                              data = colonCS,
                              method = "KMW")
summary(colon.kmw.3)

P(T>y|T1>365) 

      y  estimate lower 95% CI upper 95% CI
  365.0 1.0000000    1.0000000    1.0000000
  421.0 0.9985694    0.9943893    1.0000000
  430.0 0.9971388    0.9915472    1.0000000
  448.0 0.9957082    0.9900095    1.0000000
  454.5 0.9942758    0.9879848    0.9986053
  465.0 0.9928434    0.9858531    0.9985368
  485.0 0.9914111    0.9842044    0.9971188
  486.0 0.9899787    0.9820033    0.9971017
  499.0 0.9885463    0.9805065    0.9956731
  510.0 0.9871140    0.9775674    0.9944475
  522.0 0.9856816    0.9757786    0.9941463
  529.0 0.9842492    0.9745089    0.9929294
  580.0 0.9828169    0.9731241    0.9916148
  589.0 0.9813845    0.9713181    0.9911094
  591.0 0.9799521    0.9690553    0.9898848
  599.0 0.9785198    0.9676484    0.9886242
  603.0 0.9770874    0.9665513    0.9872702
  616.0 0.9756551    0.9650985    0.9870508
  628.0 0.9742227    0.9635436    0.9858558
  629.0 0.9727903    0.9610313    0.9841993
  641.0 0.9713580    0.9593378    0.9825457
  642.0 0.9699256    0.9578226    0.9815200
  659.0 0.9684932    0.9564035    0.9807230
  664.0 0.9670609    0.9543266    0.9800897
  665.0 0.9656285    0.9528982    0.9786769
  666.0 0.9641961    0.9507844    0.9771472
  669.0 0.9627638    0.9496542    0.9750831
  674.0 0.9613314    0.9476276    0.9745659
  675.0 0.9598991    0.9456536    0.9731503
  678.0 0.9584667    0.9430785    0.9718170
  685.0 0.9570343    0.9415362    0.9709185
  692.0 0.9556020    0.9401030    0.9709185
  696.0 0.9541696    0.9374342    0.9704017
  708.0 0.9527372    0.9367063    0.9685134
  712.0 0.9513049    0.9350019    0.9670759
  716.0 0.9498725    0.9338013    0.9670263
  720.0 0.9484401    0.9310326    0.9655837
  721.0 0.9470078    0.9296863    0.9636654
  729.0 0.9455754    0.9280299    0.9614612
  730.0 0.9441430    0.9274801    0.9614261
  739.0 0.9427107    0.9258112    0.9599432
  743.0 0.9412783    0.9237359    0.9578589
  755.0 0.9398460    0.9219075    0.9576566
  759.0 0.9384136    0.9201178    0.9556967
  764.0 0.9369812    0.9184469    0.9550049
  765.0 0.9355489    0.9172699    0.9529495
  774.0 0.9341165    0.9167045    0.9517245
  795.0 0.9326841    0.9150295    0.9502963
  802.0 0.9298194    0.9121068    0.9486257
  806.0 0.9269547    0.9093378    0.9450792
  811.0 0.9255223    0.9067044    0.9446142
  832.0 0.9240900    0.9064696    0.9432265
  833.0 0.9226576    0.9040464    0.9423567
  840.0 0.9212252    0.9025273    0.9418352
  844.0 0.9197929    0.9006190    0.9409317
  845.0 0.9183605    0.8992883    0.9391476
  846.0 0.9169281    0.8978693    0.9381157
  854.0 0.9154958    0.8972156    0.9354852
  858.0 0.9140634    0.8943440    0.9338375
  862.0 0.9126310    0.8929152    0.9323770
  863.0 0.9111987    0.8908743    0.9296945
  874.0 0.9097663    0.8890771    0.9277896
  883.0 0.9083339    0.8862518    0.9261894
  884.0 0.9069016    0.8839661    0.9252764
  887.0 0.9026045    0.8784291    0.9228282
  905.0 0.8997398    0.8751707    0.9208680
  909.0 0.8983074    0.8744877    0.9194254
  928.0 0.8968750    0.8737030    0.9182873
  929.0 0.8954427    0.8722353    0.9182140
  936.0 0.8940103    0.8703484    0.9166870
  938.0 0.8925779    0.8678321    0.9153899
  939.0 0.8911456    0.8678321    0.9130830
  940.0 0.8897132    0.8661382    0.9122347
  942.0 0.8882808    0.8650009    0.9111160
  944.0 0.8868485    0.8629139    0.9099948
  957.0 0.8854161    0.8616977    0.9072310
  961.0 0.8839838    0.8599037    0.9053912
  963.0 0.8825514    0.8584249    0.9049047
  969.0 0.8811190    0.8570123    0.9034697
  977.0 0.8796867    0.8569769    0.9030688
  986.0 0.8782543    0.8543976    0.9008660
  993.0 0.8768219    0.8516866    0.9008314
  997.0 0.8739572    0.8493454    0.8977283
 1018.0 0.8725248    0.8465257    0.8967662
 1021.0 0.8710925    0.8451662    0.8953679
 1031.0 0.8696601    0.8425653    0.8938958
 1037.0 0.8682278    0.8423896    0.8934023
 1041.0 0.8667954    0.8410579    0.8914748
 1046.0 0.8653630    0.8384669    0.8914748
 1048.0 0.8639307    0.8382214    0.8901470
 1055.0 0.8624983    0.8367595    0.8890527
 1101.0 0.8610659    0.8338694    0.8886632
 1103.0 0.8596336    0.8337661    0.8886598
 1105.0 0.8582012    0.8323307    0.8850947
 1112.0 0.8567688    0.8308728    0.8847530
 1117.0 0.8553365    0.8296411    0.8833780
 1122.0 0.8539041    0.8277706    0.8827300
 1133.0 0.8524717    0.8248892    0.8812476
 1134.0 0.8510394    0.8248892    0.8800418
 1135.0 0.8496070    0.8233274    0.8764228
 1136.0 0.8481747    0.8219421    0.8737150
 1138.0 0.8467423    0.8177401    0.8718462
 1139.5 0.8453099    0.8172466    0.8709329
 1145.0 0.8424452    0.8116952    0.8696324
 1151.0 0.8410128    0.8104128    0.8682652
 1154.0 0.8395805    0.8100914    0.8662661
 1159.0 0.8381481    0.8075195    0.8654373
 1166.0 0.8367157    0.8060852    0.8640434
 1178.0 0.8352834    0.8032064    0.8626391
 1186.0 0.8338510    0.8017477    0.8612348
 1191.0 0.8324186    0.8002781    0.8611862
 1193.0 0.8309863    0.7988420    0.8600510
 1195.0 0.8295539    0.7986909    0.8571916
 1198.0 0.8281216    0.7973384    0.8556081
 1201.0 0.8266892    0.7958829    0.8546984
 1209.0 0.8252568    0.7939436    0.8530882
 1215.0 0.8238245    0.7915248    0.8518572
 1219.0 0.8223921    0.7900521    0.8515787
 1230.0 0.8209597    0.7886111    0.8503848
 1237.0 0.8195274    0.7866554    0.8475478
 1246.0 0.8166626    0.7833522    0.8452178
 1252.0 0.8152303    0.7819280    0.8441126
 1262.0 0.8123656    0.7782456    0.8422867
 1273.0 0.8109332    0.7756785    0.8409426
 1276.0 0.8080685    0.7713209    0.8366651
 1279.0 0.8066361    0.7694316    0.8355832
 1290.0 0.8052013    0.7669869    0.8338097
 1295.0 0.8037664    0.7669869    0.8335735
 1302.0 0.8023316    0.7655340    0.8323838
 1304.0 0.8008968    0.7650378    0.8321367
 1306.0 0.7994619    0.7647982    0.8306955
 1314.0 0.7980271    0.7644569    0.8289415
 1325.0 0.7965923    0.7603908    0.8268859
 1327.0 0.7951574    0.7589216    0.8241209
 1363.0 0.7937226    0.7586594    0.8235182
 1365.0 0.7922878    0.7572011    0.8223505
 1375.0 0.7908529    0.7557427    0.8220771
 1387.0 0.7894181    0.7542843    0.8201087
 1388.0 0.7879833    0.7521469    0.8195169
 1399.0 0.7865484    0.7515610    0.8172674
 1405.0 0.7851136    0.7505895    0.8158094
 1424.0 0.7836762    0.7501277    0.8150956
 1434.0 0.7808014    0.7462562    0.8102589
 1437.0 0.7793640    0.7440517    0.8081555
 1439.0 0.7779267    0.7420116    0.8074822
 1446.0 0.7764893    0.7405819    0.8060939
 1447.0 0.7750519    0.7386737    0.8047412
 1482.0 0.7736119    0.7366378    0.8034393
 1495.0 0.7721719    0.7352918    0.8019849
 1509.0 0.7707319    0.7351938    0.8005837
 1511.0 0.7692920    0.7351739    0.7991427
 1521.0 0.7678520    0.7321932    0.7986229
 1530.0 0.7664120    0.7308769    0.7967287
 1540.0 0.7649694    0.7306560    0.7949166
 1548.0 0.7620842    0.7263540    0.7919716
 1550.0 0.7606415    0.7258219    0.7910292
 1568.0 0.7591989    0.7243395    0.7904035
 1607.0 0.7577563    0.7221208    0.7895754
 1620.0 0.7563137    0.7206734    0.7876362
 1637.0 0.7548711    0.7192626    0.7862919
 1656.0 0.7534285    0.7170881    0.7858791
 1668.5 0.7519859    0.7170650    0.7848140
 1671.0 0.7505432    0.7160778    0.7815020
 1679.0 0.7491006    0.7132896    0.7813583
 1692.0 0.7476580    0.7131771    0.7804999
 1709.0 0.7462154    0.7130410    0.7786699
 1723.0 0.7447728    0.7111790    0.7760934
 1745.0 0.7433302    0.7097618    0.7746071
 1767.0 0.7418876    0.7068906    0.7737979
 1768.0 0.7404449    0.7054368    0.7716025
 1772.0 0.7390023    0.7042771    0.7714011
 1783.0 0.7375597    0.7027646    0.7698322
 1788.0 0.7361171    0.7014702    0.7678321
 1790.0 0.7346745    0.6984899    0.7648339
 1798.0 0.7332291    0.6967220    0.7633250
 1812.0 0.7317782    0.6942086    0.7633250
 1818.0 0.7303216    0.6932213    0.7618057
 1829.0 0.7288450    0.6931720    0.7602191
 1831.0 0.7273683    0.6931566    0.7587560
 1839.0 0.7258887    0.6916855    0.7572926
 1850.0 0.7244092    0.6896339    0.7559696
 1851.0 0.7229296    0.6880611    0.7556843
 1856.0 0.7214412    0.6850763    0.7529820
 1875.0 0.7199406    0.6836528    0.7523271
 1884.0 0.7184340    0.6835558    0.7499666
 1885.0 0.7169274    0.6817957    0.7469686
 1896.0 0.7154176    0.6793582    0.7468797
 1907.0 0.7138954    0.6793559    0.7464564
 1915.0 0.7123700    0.6775051    0.7453761
 1950.0 0.7108219    0.6759298    0.7438519
 1995.0 0.7092367    0.6759143    0.7416357
 2021.0 0.7076198    0.6739822    0.7407939
 2052.0 0.7059771    0.6718190    0.7406892
 2077.0 0.7042883    0.6672636    0.7406742
 2079.0 0.7025995    0.6668771    0.7377287
 2083.0 0.7009106    0.6647065    0.7342710
 2085.0 0.6992178    0.6636076    0.7342290
 2127.0 0.6974319    0.6628497    0.7328266
 2128.0 0.6956459    0.6611240    0.7311417
 2133.0 0.6938419    0.6575560    0.7288527
 2171.0 0.6899458    0.6555291    0.7255066
 2174.0 0.6879923    0.6523813    0.7217793
 2197.0 0.6858927    0.6522170    0.7206178
 2213.0 0.6837009    0.6480457    0.7194311
 2257.0 0.6813301    0.6451224    0.7182217
 2284.0 0.6788314    0.6429198    0.7157753
 2287.0 0.6763237    0.6377525    0.7132777
 2318.0 0.6736713    0.6367081    0.7103960
 2351.0 0.6708209    0.6366735    0.7103328
 2458.0 0.6673304    0.6332311    0.7069235
 2482.0 0.6636712    0.6298875    0.7004010
 2527.0 0.6593865    0.6268316    0.7002959
 2542.0 0.6548414    0.6177049    0.6955052
 2593.0 0.6495760    0.6104359    0.6884365
 2683.0 0.6430595    0.5960433    0.6853414
 2718.0 0.6355693    0.5869412    0.6784064
 2725.0 0.6277136    0.5788900    0.6741559
 2789.0 0.6163510    0.5635200    0.6704738
 2910.0 0.5961915    0.5152072    0.6585469

Landmark Approach

The landmark approach is another method for estimating conditional survival probabilities. This method focuses on specific time points, using data from patients who have survived up to those points to estimate survival probabilities moving forward. The landmark approach is particularly useful when researchers want to assess survival probabilities at specific time points, such as one year, two years, or five years after a particular event. This method allows for a more focused analysis of survival trends at key time intervals, providing valuable insights into patient prognosis and treatment outcomes.
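
The core of the landmark idea can be sketched with the {survival} package alone: keep only the subjects who satisfy the condition \(T_1 > x\) and apply the ordinary Kaplan-Meier estimator to their total survival time. The example below uses simulated data with censoring; the distributions, the sample size, and the data frame `d` are assumptions for illustration, and the sketch does not reproduce the internals of survCOND().

```r
library(survival)
set.seed(42)
n <- 500

# Hypothetical latent times (days)
rec   <- rexp(n, rate = 1 / 1500)         # time to recurrence
death <- rec + rexp(n, rate = 1 / 600)    # time to death, after recurrence
cens  <- runif(n, min = 1000, max = 3500) # censoring time

# Observed data in the same layout as colonCS
d <- data.frame(
  time1  = pmin(rec, cens),
  event1 = as.numeric(rec <= cens),
  Stime  = pmin(death, cens),
  event  = as.numeric(death <= cens)
)

x <- 365  # landmark time

# Landmark subsample: subjects with no recurrence observed by time x
sub <- d[d$time1 > x, ]

# Kaplan-Meier estimate of total survival within the subsample
km <- survfit(Surv(Stime, event) ~ 1, data = sub)

# Estimated P(T > y | T1 > x) at two and five years
est <- summary(km, times = c(730, 1825))$surv
est
```

Every subject in the subsample is still alive at the landmark, so the estimated curve equals 1 up to \(x\) and decreases afterwards.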

You can estimate the conditional survival probability \(S(y | x) = P(T > y | T_1 > x)\) using landmark methods, specifically LDM (landmark method) and PLDM (presmoothed landmark method), using the same function, survCOND(). To calculate the unsmoothed landmark estimator, set the argument method = "LDM".

Code
colon.ldm.1 <- survCOND(survCS(time1, event1, Stime, event) ~ 1, 
                              x = 365,
                              data = colonCS, 
                              method = "LDM")
summary(colon.ldm.1, 
        times = 365 * 1:7) # summary for 1 to 7 years
    y  estimate lower 95% CI upper 95% CI
  365 1.0000000    1.0000000    1.0000000
  730 0.9441319    0.9273492    0.9617597
 1095 0.8624695    0.8390540    0.8847895
 1460 0.7750019    0.7430522    0.8063969
 1825 0.7302521    0.6931427    0.7595598
 2190 0.6878056    0.6466317    0.7195994
 2555 0.6543273    0.6093782    0.6907188

To obtain the presmoothed landmark estimator, you should include the argument presmooth = TRUE as well.

Code
colon.pldm.1 <- survCOND(survCS(time1, event1, Stime, event) ~ 1,
                                x = 365,
                                data = colonCS, 
                                method = "LDM", 
                                presmooth = TRUE)
summary(colon.pldm.1, times = 365 * 1:7) # summary for 1 to 7 years
    y  estimate lower 95% CI upper 95% CI
  365 1.0000000    1.0000000    1.0000000
  730 0.9429609    0.9255522    0.9626508
 1095 0.8624778    0.8375309    0.8849554
 1460 0.7788757    0.7431374    0.8153376
 1825 0.7411599    0.7080464    0.7804742
 2190 0.6795849    0.6440411    0.7176257
 2555 0.6467549    0.6106700    0.6919127

We may also be interested in the conditional survival function \(S(y | x) = P(T > y | T_1 \leq x)\). This function represents the probability that an individual is alive at time \(y\), given that they were alive with a recurrence at a prior time \(x\). This quantity can also be estimated with survCOND() by setting the argument lower.tail = TRUE.

Note that, for a given value of \(x\), setting lower.tail = TRUE gives survival estimates conditional on \(T_1 \leq x\), whereas lower.tail = FALSE gives estimates conditional on \(T_1 > x\). By default, survCOND() conditions on \(T_1 > x\).

Code
colon.ldm.2 <- survCOND(survCS(time1, event1, Stime, event) ~ 1,
                          x = 365,
                          data = colonCS,
                          method = "LDM", 
                          lower.tail = TRUE)
summary(colon.ldm.2, times = c(90, 180, 365, 730, 1095, 1460, 1825))
    y   estimate lower 95% CI upper 95% CI
   90 0.96956522   0.94362414   0.98735085
  180 0.89565217   0.85530311   0.93275037
  365 0.66086957   0.58894389   0.72344415
  730 0.25652174   0.20183383   0.30419848
 1095 0.10434783   0.06993006   0.14597033
 1460 0.06956522   0.03652553   0.10527988
 1825 0.06086957   0.02803412   0.08925399
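
The two conditioning events \(T_1 \leq x\) and \(T_1 > x\) partition the population, so the two conditional probabilities recombine into the overall survival probability: \(P(T > y) = P(T > y | T_1 \leq x) P(T_1 \leq x) + P(T > y | T_1 > x) P(T_1 > x)\). The sketch below verifies this on simulated, uncensored data; all distributions and names are assumptions for illustration.

```r
set.seed(99)
n <- 20000

# Hypothetical uncensored times (days)
t1 <- rexp(n, rate = 1 / 1500)        # time to recurrence
t  <- t1 + rexp(n, rate = 1 / 600)    # time to death, after recurrence

x <- 365
y <- 1095

p_upper <- mean(t[t1 > x] > y)    # P(T > y | T1 > x), like lower.tail = FALSE
p_lower <- mean(t[t1 <= x] > y)   # P(T > y | T1 <= x), like lower.tail = TRUE
w_upper <- mean(t1 > x)           # P(T1 > x)

# Law of total probability
recombined <- p_lower * (1 - w_upper) + p_upper * w_upper
overall    <- mean(t > y)

c(p_lower = p_lower, p_upper = p_upper,
  recombined = recombined, overall = overall)
```

In this simulation, as in the colon cancer output above, survival conditional on an early recurrence is much lower than survival conditional on remaining recurrence-free.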

Plotting Conditional Survival Probabilities

The plot() function can be applied to the output of survCOND() to visualize the estimated conditional survival probabilities. Its arguments control the line color (col), the color of the confidence limits (confcol), the axis labels (xlab, ylab), and the y-axis limits (ylim). To place several plots in a single window, set the layout with par() beforehand.

Code
par(mfrow = c(2, 2))
# Landmark (LDM) and presmoothed landmark (PLDM) estimates given T1 > 365
colon.ldm.1 <- survCOND(survCS(time1, event1, Stime, event) ~ 1, x = 365,
  data = colonCS, method = "LDM")
plot(colon.ldm.1, col = 1, confcol = 2, xlab = "Time (days)", ylab = "S(y|365)",
  ylim = c(0.3, 1))
colon.pldm.1 <- survCOND(survCS(time1, event1, Stime, event) ~ 1, x = 365,
  data = colonCS, method = "LDM", presmooth = TRUE)
plot(colon.pldm.1, col = 1, confcol = 2, xlab = "Time (days)", ylab = "S(y|365)",
  ylim = c(0.3, 1))
# The same two estimators given T1 > 1095
colon.ldm.2 <- survCOND(survCS(time1, event1, Stime, event) ~ 1, x = 1095,
  data = colonCS, method = "LDM")
plot(colon.ldm.2, col = 1, confcol = 2, xlab = "Time (days)", ylab = "S(y|1095)")
colon.pldm.2 <- survCOND(survCS(time1, event1, Stime, event) ~ 1, x = 1095,
  data = colonCS, method = "LDM", presmooth = TRUE)
plot(colon.pldm.2, col = 1, confcol = 2, xlab = "Time (days)", ylab = "S(y|1095)")

Comparing the two landmark estimators, the presmoothed (semiparametric) estimator, PLDM, shows less variability than LDM and has more jump points, which is most visible in the right tail. It also tends to yield higher values in the right tail. The reduced variability is a consequence of the presmoothing step, which makes the estimated curve less erratic and therefore easier to interpret.
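
The effect of presmoothing on the number of jump points can be illustrated in the simpler univariate setting. In the sketch below, the censoring indicator in the Kaplan-Meier product is replaced by a fitted probability from a logistic regression, which is one common way to presmooth. This is an assumption-laden illustration on simulated data, not a reproduction of what {condSURV} does internally; all names are hypothetical.

```r
set.seed(7)
n <- 200

# Hypothetical censored data (days)
t_event <- rexp(n, rate = 1 / 1000)
t_cens  <- rexp(n, rate = 1 / 2000)
time    <- pmin(t_event, t_cens)
status  <- as.numeric(t_event <= t_cens)   # 1 = event observed, 0 = censored

# Sort by observed time
ord    <- order(time)
time   <- time[ord]
status <- status[ord]

# Number at risk just before each ordered time (no ties in continuous data)
at_risk <- n - seq_len(n) + 1

# Classical Kaplan-Meier: the curve drops only at observed events
km <- cumprod(1 - status / at_risk)

# Presmoothed version: replace the 0/1 indicator by a fitted probability
p_hat <- fitted(glm(status ~ time, family = binomial))
pkm   <- cumprod(1 - p_hat / at_risk)

# Count the jump points of each estimator
jumps_km  <- sum(diff(c(1, km)) < 0)
jumps_pkm <- sum(diff(c(1, pkm)) < 0)
c(km = jumps_km, presmoothed = jumps_pkm)
```

The classical estimator jumps only at uncensored times, whereas the presmoothed estimator takes a small step at every observed time, censored or not, which is why its curve has more jump points and looks less erratic.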

Conditional Survival Probabilities with Covariates

The survCOND() function can also be used to estimate conditional survival probabilities with covariates. This allows you to assess how different factors influence the survival outcomes of patients over time. By including covariates in the analysis, you can identify the key predictors that impact patient prognosis and tailor treatment strategies accordingly. The conditional survival probabilities can be estimated based on the covariate values, providing valuable insights into the relationship between the covariates and survival outcomes.

Conditional Survival Probabilities with Treatment Covariate (rx)

The current version of the {condSURV} package allows for the inclusion of a single covariate. The following input commands provide estimates of the conditional survival function \(S(y | x) = P(T > y | T_1 > x)\) for the three treatment groups by incorporating the covariate (rx) on the right-hand side of the formula argument:

Code
colon.rx.ldm <- survCOND(survCS(time1, event1, Stime, event) ~ rx, 
                         x = 365,
                         data = colonCS,
                         method = "LDM")
summary(colon.rx.ldm, times = 365 * 1:6)
    rx = Obs 
    y  estimate lower 95% CI upper 95% CI
  365 1.0000000    1.0000000    1.0000000
  730 0.9469212    0.9169480    0.9725662
 1095 0.8672736    0.8240595    0.8995649
 1460 0.7655017    0.7162151    0.8178623
 1825 0.7123480    0.6562263    0.7685700
 2190 0.6562687    0.5956060    0.7193032

    rx = Lev 
    y  estimate lower 95% CI upper 95% CI
  365 1.0000000    1.0000000    1.0000000
  730 0.9411765    0.9137892    0.9680402
 1095 0.8280543    0.7797524    0.8785421
 1460 0.7375566    0.6695665    0.7964706
 1825 0.7102667    0.6390013    0.7700467
 2190 0.6704293    0.6041240    0.7267357

    rx = Lev+5FU 
    y  estimate lower 95% CI upper 95% CI
  365 1.0000000    1.0000000    1.0000000
  730 0.9442231    0.9112707    0.9722384
 1095 0.8884462    0.8488861    0.9297573
 1460 0.8165244    0.7670171    0.8622294
 1825 0.7639544    0.7092288    0.8176257
 2190 0.7314409    0.6698974    0.7846168

The output lists the estimated conditional survival probabilities for the three treatment groups (Obs, Lev, Lev+5FU) at yearly intervals from one to six years, together with 95% confidence intervals. At 2190 days the estimates are 0.656 (Obs), 0.670 (Lev), and 0.731 (Lev+5FU). The confidence intervals of the three groups overlap, so the table alone does not establish a difference between treatments.

We can plot the estimated conditional survival probabilities for the three treatment groups using the plot() function.

Code
 plot(colon.rx.ldm, 
      xlab = "Time (days)", 
      ylab = "S(y|365)", conf = FALSE)
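
The landmark idea behind these estimates can be sketched with the {survival} package alone: keep the patients whose first time exceeds the landmark x, then apply the Kaplan-Meier estimator to the total time within that subsample. The sketch below assumes that colonCS is derived from survival::colon (recurrence rows give time1, death rows give Stime); d, lmk and fit_rx are illustrative names, and this is a rough cross-check rather than the package's own code.

```r
library(survival)

# Assumption: one row per patient, recurrence time next to death time
rec   <- subset(colon, etype == 1, select = c(id, time, status))
death <- subset(colon, etype == 2, select = c(id, time, status, rx))
names(rec)   <- c("id", "time1", "event1")
names(death) <- c("id", "Stime", "event", "rx")
d <- merge(rec, death, by = "id")

# Landmark step: keep patients with first time beyond x = 365 days
x   <- 365
lmk <- subset(d, time1 > x)

# Kaplan-Meier on total survival time, separately for each treatment group
fit_rx <- survfit(Surv(Stime, event) ~ rx, data = lmk)
summary(fit_rx, times = 365 * 2:6)
```

Every patient in the subsample is still at risk at day 365, so the curve starts at 1 there and later values can be read directly as \(P(T > y | T_1 > 365)\). The confidence limits need not match the survCOND() output, since survfit() computes its own intervals.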

Conditional Survival Probabilities for Male and Female Patients

The same approach gives the conditional survival probabilities \(S(y | x) = P(T > y | T_1 > x)\) for both sexes (coded 1 = male). Because sex is stored in the colonCS data frame as class “integer”, it must be wrapped in factor() in the formula.

Code
colon.sex.ldm <- survCOND(survCS(time1, event1, Stime, event) ~ factor(sex), 
                          x = 365,
                          data = colonCS, method = "LDM")
summary(colon.sex.ldm, times = 365 * 1:6)
    factor(sex) = 0 
    y  estimate lower 95% CI upper 95% CI
  365 1.0000000    1.0000000    1.0000000
  730 0.9569231    0.9341484    0.9755533
 1095 0.8769231    0.8417339    0.9085529
 1460 0.7876565    0.7453274    0.8264345
 1825 0.7475015    0.6963559    0.7883385
 2190 0.6940773    0.6403759    0.7406388

    factor(sex) = 1 
    y  estimate lower 95% CI upper 95% CI
  365 1.0000000    1.0000000    1.0000000
  730 0.9329893    0.9115717    0.9593632
 1095 0.8498782    0.8162112    0.8837580
 1460 0.7639861    0.7214863    0.8025647
 1825 0.7152471    0.6698563    0.7561516
 2190 0.6822945    0.6329177    0.7223319
Code
 plot(colon.sex.ldm, 
      xlab = "Time (days)", 
      ylab = "S(y|365)", conf = FALSE)

Conditional Survival Probabilities with Age Covariate

The {condSURV} package can also estimate conditional survival given a continuous covariate of class “integer” or “numeric”. The covariate value of interest is supplied through the z.value argument. For example, the following code estimates and plots \(S(y | x, Z = z) = P(T > y | T_1 > x, \text{age} = 48)\), the conditional survival for patients aged 48 years.

Code
colon.ipcw.age <- survCOND(survCS(time1, event1, Stime, event) ~ age,
                             x = 365,
                             z.value = 48, 
                             data = colonCS, 
                             lower.tail = FALSE)
 summary(colon.ipcw.age, times = 365 * 1:7)
    y  estimate lower 95% CI upper 95% CI
  365 1.0000000    1.0000000    1.0000000
  730 0.9582900    0.9086672    0.9939365
 1095 0.8994077    0.8420146    0.9501387
 1460 0.8069071    0.7311283    0.8931517
 1825 0.7490154    0.6526834    0.8356264
 2190 0.7211058    0.6374155    0.8108205
 2555 0.6860070    0.5729589    0.7830038
Code
plot(colon.ipcw.age, col = 1, confcol = 2, xlab = "Time (days)",
     ylab = "P(T>y|T1>365,age=48)", ylim = c(0.5, 1))
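
One way to build intuition for conditioning on a continuous covariate is to weight patients by how close their age is to the target value. The sketch below uses a kernel-weighted Kaplan-Meier fit on the landmark subsample. It is a swapped-in illustration, not the estimator used by survCOND(): it assumes colonCS is derived from survival::colon, and the Gaussian kernel and the bandwidth h = 5 years are arbitrary choices made for this example.

```r
library(survival)

# Assumption: one row per patient, recurrence time next to death time
rec   <- subset(colon, etype == 1, select = c(id, time, status))
death <- subset(colon, etype == 2, select = c(id, time, status, age))
names(rec)   <- c("id", "time1", "event1")
names(death) <- c("id", "Stime", "event", "age")
d <- merge(rec, death, by = "id")

# Landmark subsample: first time beyond x = 365 days
lmk <- subset(d, time1 > 365)

# Kernel weights: patients aged near z = 48 count most
z <- 48   # target age
h <- 5    # hypothetical bandwidth in years
lmk$w <- dnorm((lmk$age - z) / h)

# Weighted Kaplan-Meier on total survival time
fit_age <- survfit(Surv(Stime, event) ~ 1, data = lmk, weights = w)
summary(fit_age, times = 365 * 2:7)
```

A smaller h makes the estimate more specific to age 48 but noisier; a larger h borrows more from other ages and moves the curve toward the unconditional landmark estimate.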

Summary and Conclusions

This tutorial has covered conditional survival analysis for ordered multivariate failure time data with the {condSURV} package in R: estimating the conditional survival function with the landmark (LDM) and presmoothed landmark (PLDM) methods, summarizing and plotting the estimates, and conditioning on a categorical or continuous covariate. Conditional survival probabilities update a patient's prognosis as time since surgery accumulates, which helps identify patients whose outlook has improved or remains poor. This information can support clinical decision-making and the planning of personalized follow-up care.

References

  1. condSURV: An R Package for the Estimation of the Conditional Survival Function for Ordered Multivariate Failure Time Data

  2. Chapter 4 Joint Models for Longitudinal and Time-to-Event Data

  3. Conditional survival

  4. condsurv