Title: | Sieve Analysis Methods for Proportional Hazards Models |
---|---|
Description: | Implements a suite of semiparametric and nonparametric kernel-smoothed estimation and testing procedures for continuous mark-specific stratified hazard ratio (treatment/placebo) models in a randomized treatment efficacy trial with a time-to-event endpoint. Semiparametric methods, allowing multivariate marks, are described in Juraska M and Gilbert PB (2013), Mark-specific hazard ratio model with multivariate continuous marks: an application to vaccine efficacy. Biometrics 69(2):328-337 <doi:10.1111/biom.12016>, and in Juraska M and Gilbert PB (2016), Mark-specific hazard ratio model with missing multivariate marks. Lifetime Data Analysis 22(4):606-25 <doi:10.1007/s10985-015-9353-9>. Nonparametric kernel-smoothed methods, allowing univariate marks only, are described in Sun Y and Gilbert PB (2012), Estimation of stratified mark‐specific proportional hazards models with missing marks. Scandinavian Journal of Statistics}, 39(1):34-52 <doi:10.1111/j.1467-9469.2011.00746.x>, and in Gilbert PB and Sun Y (2015), Inferences on relative failure rates in stratified mark-specific proportional hazards models with missing marks, with application to human immunodeficiency virus vaccine efficacy trials. Journal of the Royal Statistical Society Series C: Applied Statistics, 64(1):49-73 <doi:10.1111/rssc.12067>. Both semiparametric and nonparametric approaches consider two scenarios: (1) the mark is fully observed in all subjects who experience the event of interest, and (2) the mark is subject to missingness-at-random in subjects who experience the event of interest. For models with missing marks, estimators are implemented based on (i) inverse probability weighting (IPW) of complete cases (for the semiparametric framework), and (ii) augmentation of the IPW estimating functions by leveraging correlations between the mark and auxiliary data to 'impute' the augmentation term for subjects with missing marks (for both the semiparametric and nonparametric framework). The augmented IPW estimators are doubly robust and recommended for use with incomplete mark data. The semiparametric methods make two key assumptions: (i) the time-to-event is assumed to be conditionally independent of the mark given treatment, and (ii) the weight function in the semiparametric density ratio/biased sampling model is assumed to be exponential. Diagnostic testing procedures for evaluating validity of both assumptions are implemented. Summary and plotting functions are provided for estimation and inferential results. |
Authors: | Michal Juraska [aut, cre], Li Li [ctb], Stephanie Wu [ctb] |
Maintainer: | Michal Juraska <[email protected]> |
License: | GPL-2 |
Version: | 1.1 |
Built: | 2024-10-29 05:17:01 UTC |
Source: | https://github.com/mjuraska/sieveph |
ggplot
ggplot
-style plotting for univariate marks. Point and interval estimates of the mark-specific treatment effect parameter specified by component
contrast
in summary.sievePH
or summary.kernel_sievePH
are plotted, together with scatter and box plots of the observed mark values by treatment.
ggplot_sieve( x, mark = NULL, tx = NULL, xlim = NULL, ylim = NULL, xtickAt = NULL, xtickLab = NULL, ytickAt = NULL, ytickLab = NULL, tickLabSize = 14, xlab = NULL, ylab = NULL, axisLabSize = 15, title = NULL, titleSize = 16, subtitle = NULL, subtitleSize = 10, txLab = c("Placebo", "Treatment"), txLabSize = 5, legendLabSize = 12, legendPosition = c(0.96, 1.08), legendJustification = c(1, 1), estLineSize = 1.6, ciLineSize = 1.2, boxplotWidth = 0.8, jitterFactor = 0.1, jitterSeed = 0, pointColor = c("blue", "red3"), pointSize = 1.7, bottomPlotMargin = c(-0.5, 0.3, 0, 0), topPlotMargin = c(0, 0.3, -0.12, 1.83), margin.axis.title.x = 10, margin.axis.title.y = 10, plotHeights = c(0.33, 0.67) )
ggplot_sieve( x, mark = NULL, tx = NULL, xlim = NULL, ylim = NULL, xtickAt = NULL, xtickLab = NULL, ytickAt = NULL, ytickLab = NULL, tickLabSize = 14, xlab = NULL, ylab = NULL, axisLabSize = 15, title = NULL, titleSize = 16, subtitle = NULL, subtitleSize = 10, txLab = c("Placebo", "Treatment"), txLabSize = 5, legendLabSize = 12, legendPosition = c(0.96, 1.08), legendJustification = c(1, 1), estLineSize = 1.6, ciLineSize = 1.2, boxplotWidth = 0.8, jitterFactor = 0.1, jitterSeed = 0, pointColor = c("blue", "red3"), pointSize = 1.7, bottomPlotMargin = c(-0.5, 0.3, 0, 0), topPlotMargin = c(0, 0.3, -0.12, 1.83), margin.axis.title.x = 10, margin.axis.title.y = 10, plotHeights = c(0.33, 0.67) )
x |
an object returned by |
mark |
a numeric vector specifying a univariate continuous mark. For subjects with a right-censored time-to-event, the value(s) in |
tx |
a numeric vector indicating the treatment group (1 if treatment, 0 if placebo) |
xlim |
a numeric vector of length 2 specifying the x-axis range ( |
ylim |
a numeric vector of length 2 specifying the y-axis range ( |
xtickAt |
a numeric vector specifying the position of x-axis tickmarks ( |
xtickLab |
a numeric vector specifying labels for tickmarks listed in |
ytickAt |
a numeric vector specifying the position of y-axis tickmarks ( |
ytickLab |
a numeric vector specifying labels for tickmarks listed in |
tickLabSize |
a numeric value specifying the font size of tickmark labels along both axes in the bottom panel ( |
xlab |
a character string specifying the x-axis label ( |
ylab |
a character string specifying the y-axis label ( |
axisLabSize |
a numeric value specifying the font size of both axis labels in the bottom panel ( |
title |
a character string specifying the plot title ( |
titleSize |
a numeric value specifying the font size of the plot title ( |
subtitle |
a character string specifying the plot subtitle ( |
subtitleSize |
a numeric value specifying the font size of the plot subtitle ( |
txLab |
a character vector of length 2 specifying the placebo and treatment labels (in this order). The default labels are |
txLabSize |
a numeric value specifying the font size of labels |
legendLabSize |
a numeric value specifying the font size of legend labels in the bottom panel ( |
legendPosition |
a numeric vector of length 2 specifying the position of the legend in the bottom panel ( |
legendJustification |
a numeric vector of length 2 specifying the justification of the legend in the bottom panel ( |
estLineSize |
a numeric value specifying the line width for the point estimate of the mark-specific treatment effect ( |
ciLineSize |
a numeric value specifying the line width for the confidence limits for the mark-specific treatment effect ( |
boxplotWidth |
a numeric value specifying the width of each box in the box plot ( |
jitterFactor |
a numeric value specifying the amount of vertical jitter ( |
jitterSeed |
a numeric value setting the seed of R's random number generator for jitter in the scatter plot ( |
pointColor |
a character vector of length 2 color-coding the placebo and treatment group (in this order) in the scatter plot ( |
pointSize |
a numeric value specifying the size of data points in the scatter plot ( |
bottomPlotMargin |
a numeric vector, using cm as the unit, passed on to argument |
topPlotMargin |
a numeric vector, using |
margin.axis.title.x |
a numeric number, using |
margin.axis.title.y |
a numeric number, using |
plotHeights |
a numeric vector specifying relative heights of the top and bottom panels ( |
A ggplot
object.
plot.summary.sievePH
, sievePH
, summary.sievePH
, kernel_sievePH
, summary.kernel_sievePH
n <- 200 tx <- rep(0:1, each=n/2) tm <- c(rexp(n/2, 0.2), rexp(n/2, 0.2 * exp(-0.4))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) mark <- ifelse(eventInd==1, c(rbeta(n/2, 2, 5), rbeta(n/2, 2, 2)), NA) markRng <- range(mark, na.rm=TRUE) # fit a model with a univariate mark using the sievePH method fit1 <- sievePH(eventTime, eventInd, mark, tx) sfit1 <- summary(fit1, markGrid=seq(markRng[1], markRng[2], length.out=10)) print(ggplot_sieve(sfit1, mark, tx)) # fit a model with a univariate mark using the kernel_sievePH method fit2 <- kernel_sievePH(eventTime, eventInd, mark, tx, tau = 3, tband = 0.5, hband = 0.3, nvgrid = 20, nboot = NULL) sfit2 <- summary(fit2) print(ggplot_sieve(sfit2, mark, tx, xlim = markRng))
n <- 200 tx <- rep(0:1, each=n/2) tm <- c(rexp(n/2, 0.2), rexp(n/2, 0.2 * exp(-0.4))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) mark <- ifelse(eventInd==1, c(rbeta(n/2, 2, 5), rbeta(n/2, 2, 2)), NA) markRng <- range(mark, na.rm=TRUE) # fit a model with a univariate mark using the sievePH method fit1 <- sievePH(eventTime, eventInd, mark, tx) sfit1 <- summary(fit1, markGrid=seq(markRng[1], markRng[2], length.out=10)) print(ggplot_sieve(sfit1, mark, tx)) # fit a model with a univariate mark using the kernel_sievePH method fit2 <- kernel_sievePH(eventTime, eventInd, mark, tx, tau = 3, tband = 0.5, hband = 0.3, nvgrid = 20, nboot = NULL) sfit2 <- summary(fit2) print(ggplot_sieve(sfit2, mark, tx, xlim = markRng))
kernel_sievePH
implements estimation and hypothesis testing method of
Sun et al. (2009) for a mark-specific proportional hazards model. The methods
allow separate baseline mark-specific hazard functions for different baseline
subgroups.
kernel_sievePH( eventTime, eventInd, mark, tx, zcov = NULL, strata = NULL, formulaPH = ~tx, tau = NULL, tband = NULL, hband = NULL, nvgrid = 100, a = NULL, b = NULL, ntgrid = NULL, nboot = 500, seed = NULL, maxit = 6 )
kernel_sievePH( eventTime, eventInd, mark, tx, zcov = NULL, strata = NULL, formulaPH = ~tx, tau = NULL, tband = NULL, hband = NULL, nvgrid = 100, a = NULL, b = NULL, ntgrid = NULL, nboot = 500, seed = NULL, maxit = 6 )
eventTime |
a numeric vector specifying the observed right-censored event time. |
eventInd |
a numeric vector indicating the event of interest (1 if event, 0 if right-censored). |
mark |
a numeric vector specifying a univariate continuous mark. No
missing values are permitted for subjects with |
tx |
a numeric vector indicating the treatment group (1 if treatment, 0 if placebo). |
zcov |
a data frame with one row per subject specifying possibly
time-dependent covariate(s) (not including |
strata |
a numeric vector specifying baseline strata ( |
formulaPH |
a one-sided formula object (on the right side of the
|
tau |
a numeric value specifying the duration of study follow-up period.
Failures beyond |
tband |
a numeric value between 0 and |
hband |
a numeric value between 0 and 1 specifying the bandwidth of the
kernel smoothing function over min-max normalized mark which is defined as
|
nvgrid |
an integer value (100 by default) specifying the number of equally spaced mark values between the minimum and maximum of the observed mark for which the treatment effects are evaluated. |
a |
a numeric value between the minimum and maximum of observed mark
values specifying the lower bound of the range for testing the null
hypotheses |
b |
a numeric value between the minimum and maximum of observed mark
specifying the upper bound of the range for testing the null hypotheses
|
ntgrid |
an integer value ( |
nboot |
number of bootstrap iterations (500 by default) for simulating
the distributions of the test statistics. If |
seed |
an integer specifying the random seed for reproducing the test statistics and p-values. By default, a specific seed is not set. |
maxit |
Maximum number of iterations to attempt for convergence in estimation. The default number is 6. |
kernel_sievePH
analyzes data from a randomized
placebo-controlled trial that evaluates treatment efficacy for a
time-to-event endpoint with a continuous mark. The parameter of interest is
the ratio of the conditional mark-specific hazard functions
(treatment/placebo), which is based on a stratified mark-specific
proportional hazards model. This model assumes no parametric form for the
baseline hazard function nor the treatment effect across different mark
values.
An object of class kernel_sievePH
which can be processed by
summary.kernel_sievePH
to obtain or print a summary of the
results. An object of class kernel_sievePH
is a list containing the
following components:
H10
: a data frame with test statistics (first row) and
corresponding p-values (second row) for testing for
. Columns
TSUP1
and Tint1
include test
statistics and p-values for testing vs.
for some v
(general alternative). Columns
TSUP1m
and Tint1m
include test statistics and p-values for testing
vs.
with strict inequality for some
in
(monotone alternative).
TSUP1
and TSUP1m
are
based on extensions of the classic Kolmogorov-Smirnov supremum-based test.
Tint1
and Tint1m
are based on generalizations of the
integration-based Cramer-von Mises test. Tint1
and Tint1m
involve integration of deviations over the whole range of the mark. If
nboot
is NULL
, H10
is returned as NULL
.
H20
: a data frame with test statistics (first row) and
corresponding p-values (second row) for testing : HR(v) does not
depend on
. Columns
TSUP2
and Tint2
include
test statistics and p-values for testing vs.
: HR
depends on
(general alternative). Columns
TSUP2minc
and Tint2minc
include test statistics and p-values for testing
vs.
: HR increases as
increases in
.
Columns
TSUP2mdec
and Tint2mdec
include test statistics and p-values for testing
vs.
: HR decreases as
increases in
.
TSUP2
, TSUP2mdec
, and TSUP2minc
are based on
extensions of the classic Kolmogorov-Smirnov supremum-based test.
Tint2
, Tint2mdec
, and Tint2minc
are based on generalizations of the
integration-based Cramer-von Mises test. Tint2
, Tint2mdec
, and Tint2minc
involve integration of deviations over the whole range of the mark. If
nboot
is NULL
, H20
is returned as NULL
.
estBeta
: a data frame summarizing point estimates and standard
errors of the mark-specific coefficients for treatment at equally-spaced
values between the minimum and the maximum of the observed mark values.
cBproc1
: a data frame containing equally-spaced mark values in
the column Mark
, test processes for observed data in
the column
Observed
, and for
nboot
independent
sets of normal samples in the columns S1, S2, . If
nboot
is NULL
, cBproc1
is returned as NULL
.
cBproc2
: a data frame containing equally-spaced mark values in
the column Mark
, test processes for observed data in
the column
Observed
, and for
nboot
independent
sets of normal samples in the columns S1, S2, . If
nboot
is NULL
, cBproc2
is returned as NULL
.
Lambda0
: an array of dimension K x nvgrid x ntgrid for the
kernel-smoothed baseline hazard function where
is the number of strata. If
ntgrid
is NULL
(by default), Lambda0
is returned as NULL
.
Sun, Y., Gilbert, P. B., & McKeague, I. W. (2009). Proportional hazards models with continuous marks. Annals of statistics, 37(1), 394.
Yang, G., Sun, Y., Qi, L., & Gilbert, P. B. (2017). Estimation of stratified mark-specific proportional hazards models under two-phase sampling with application to HIV vaccine efficacy trials. Statistics in biosciences, 9, 259-283.
set.seed(20240410) beta <- 2.1 gamma <- -1.3 n <- 200 tx <- rep(0:1, each = n / 2) tm <- c(rexp(n / 2, 0.2), rexp(n / 2, 0.2 * exp(gamma))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) alpha <- function(b){ log((1 - exp(-2)) * (b - 2) / (2 * (exp(b - 2) - 1))) } mark0 <- log(1 - (1 - exp(-2)) * runif(n / 2)) / (-2) mark1 <- log(1 + (beta - 2) * (1 - exp(-2)) * runif(n / 2) / (2 * exp(alpha(beta)))) / (beta - 2) mark <- ifelse(eventInd == 1, c(mark0, mark1), NA) # the true TE(v) curve underlying the data-generating mechanism is: # TE(v) = 1 - exp{alpha(beta) + beta * v + gamma} # complete-case estimation discards rows with a missing mark fit <- kernel_sievePH(eventTime, eventInd, mark, tx, tau = 3, tband = 0.5, hband = 0.3, nvgrid = 20, nboot = 20)
set.seed(20240410) beta <- 2.1 gamma <- -1.3 n <- 200 tx <- rep(0:1, each = n / 2) tm <- c(rexp(n / 2, 0.2), rexp(n / 2, 0.2 * exp(gamma))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) alpha <- function(b){ log((1 - exp(-2)) * (b - 2) / (2 * (exp(b - 2) - 1))) } mark0 <- log(1 - (1 - exp(-2)) * runif(n / 2)) / (-2) mark1 <- log(1 + (beta - 2) * (1 - exp(-2)) * runif(n / 2) / (2 * exp(alpha(beta)))) / (beta - 2) mark <- ifelse(eventInd == 1, c(mark0, mark1), NA) # the true TE(v) curve underlying the data-generating mechanism is: # TE(v) = 1 - exp{alpha(beta) + beta * v + gamma} # complete-case estimation discards rows with a missing mark fit <- kernel_sievePH(eventTime, eventInd, mark, tx, tau = 3, tband = 0.5, hband = 0.3, nvgrid = 20, nboot = 20)
kernel_sievePH
implements estimation methods of Sun and Gilbert (2012)
and hypothesis testing methods of Gilbert and Sun (2015) for a mark-specific
proportional hazards model accommodating that some failures have a missing
mark. The methods allow separate baseline mark-specific hazard functions for
different baseline subgroups. Missing marks are handled via augmented IPW
(AIPW) approach.
kernel_sievePHaipw( eventTime, eventInd, mark, tx, aux = NULL, auxType = NULL, zcov = NULL, strata = NULL, formulaPH = ~tx, formulaMiss = NULL, formulaAux = NULL, tau = NULL, tband = NULL, hband = NULL, nvgrid = 100, a = NULL, b = NULL, ntgrid = NULL, nboot = 500, seed = NULL, maxit = 6 )
kernel_sievePHaipw( eventTime, eventInd, mark, tx, aux = NULL, auxType = NULL, zcov = NULL, strata = NULL, formulaPH = ~tx, formulaMiss = NULL, formulaAux = NULL, tau = NULL, tband = NULL, hband = NULL, nvgrid = 100, a = NULL, b = NULL, ntgrid = NULL, nboot = 500, seed = NULL, maxit = 6 )
eventTime |
a numeric vector specifying the observed right-censored event time. |
eventInd |
a numeric vector indicating the event of interest (1 if event, 0 if right-censored). |
mark |
a numeric vector specifying a univariate continuous mark subject
to missingness at random. Missing mark values should be set to |
tx |
a numeric vector indicating the treatment group (1 if treatment, 0 if placebo). |
aux |
a numeric vector specifying a binary or a continuous auxiliary
covariate which may be potentially useful for predicting missingness, i.e,
the probability of missing, and for informing about the distribution of
missing marks. The mark missingness model only requires that the auxiliary
covariates be observed in subjects who experienced the event of interest.
For subjects with |
auxType |
a character string describing the data type of |
zcov |
a data frame with one row per subject specifying possibly
time-dependent covariate(s) (not including |
strata |
a numeric vector specifying baseline strata ( |
formulaPH |
a one-sided formula object (on the right side of the
|
formulaMiss |
a one-sided formula object (on the right side of the
|
formulaAux |
a one-sided formula object (on the right side of the
|
tau |
a numeric value specifying the duration of study follow-up period.
Failures beyond |
tband |
a numeric value between 0 and |
hband |
a numeric value between 0 and 1 specifying the bandwidth of the
kernel smoothing function over min-max normalized mark which is defined as
|
nvgrid |
an integer value (100 by default) specifying the number of equally spaced mark values between the minimum and maximum of the observed mark for which the treatment effects are evaluated. |
a |
a numeric value between the minimum and maximum of observed mark
values specifying the lower bound of the range for testing the null
hypotheses |
b |
a numeric value between the minimum and maximum of observed mark
specifying the upper bound of the range for testing the null hypotheses
|
ntgrid |
an integer value ( |
nboot |
number of bootstrap iterations (500 by default) for simulating
the distributions of the test statistics. If |
seed |
an integer specifying the random seed for reproducing the test statistics and p-values. By default, a specific seed is not set. |
maxit |
Maximum number of iterations to attempt for convergence in estimation. The default number is 6. |
kernel_sievePH
analyzes data from a randomized
placebo-controlled trial that evaluates treatment efficacy for a
time-to-event endpoint with a continuous mark. The parameter of interest is
the ratio of the conditional mark-specific hazard functions
(treatment/placebo), which is based on a stratified mark-specific
proportional hazards model. This model assumes no parametric form for the
baseline hazard function nor the treatment effect across different mark
values. For data with missing marks, the estimation procedure leverages
auxiliary predictors of whether the mark is observed and augments the IPW
estimator with auxiliary predictors of the missing mark value.
An object of class kernel_sievePH
which can be processed by
summary.kernel_sievePH
to obtain or print a summary of the
results. An object of class kernel_sievePH
is a list containing the
following components:
H10
: a data frame with test statistics (first row) and
corresponding p-values (second row) for testing for v
. Columns
TSUP1
and Tint1
include test
statistics and p-values for testing vs.
for any
(general alternative). Columns
TSUP1m
and Tint1m
include test statistics and p-values for testing
vs.
with strict inequality for some
in
(monotone alternative).
TSUP1
and TSUP1m
are
based on extensions of the classic Kolmogorov-Smirnov supremum-based test.
Tint1
and Tint1m
are based on generalizations of the
integration-based Cramer-von Mises test. Tint1
and Tint1m
involve integration of deviations over the whole range of the mark. If
nboot
is NULL
, H10
is returned as NULL
.
H20
: a data frame with test statistics (first row) and
corresponding p-values (second row) for testing : HR(v) does not
depend on
. Columns
TSUP2
and Tint2
include
test statistics and p-values for testing vs.
: HR
depends on
(general alternative). Columns
TSUP2minc
and Tint2minc
include test statistics and p-values for testing
vs.
: HR increases as
increases in
.
Columns
TSUP2mdec
and Tint2mdec
include test statistics and p-values for testing
vs.
: HR decreases as
increases in
.
TSUP2
, TSUP2mdec
, and TSUP2minc
are based on
extensions of the classic Kolmogorov-Smirnov supremum-based test.
Tint2
, Tint2mdec
, and Tint2minc
are based on generalizations of the
integration-based Cramer-von Mises test. Tint2
, Tint2mdec
, and Tint2minc
involve integration of deviations over the whole range of the mark. If
nboot
is NULL
, H20
is returned as NULL
.
estBeta
: a data frame summarizing point estimates and standard
errors of the mark-specific coefficients for treatment at equally-spaced
values between the minimum and the maximum of the observed mark values.
cBproc1
: a data frame containing equally-spaced mark values in
the column Mark
, test processes for observed data in
the column
Observed
, and for
nboot
independent
sets of normal samples in the columns S1, S2, . If
nboot
is NULL
, cBproc1
is returned as NULL
.
cBproc2
: a data frame containing equally-spaced mark values in
the column Mark
, test processes for observed data in
the column
Observed
, and for
nboot
independent
sets of normal samples in the columns S1, S2, . If
nboot
is NULL
, cBproc2
is returned as NULL
.
Lambda0
: an array of dimension K x nvgrid x ntgrid for the
kernel-smoothed baseline hazard function where
is the number of strata. If
ntgrid
is NULL
(by default), Lambda0
is returned as NULL
.
Gilbert, P. B. and Sun, Y. (2015). Inferences on relative failure rates in stratified mark-specific proportional hazards models with missing marks, with application to human immunodeficiency virus vaccine efficacy trials. Journal of the Royal Statistical Society Series C: Applied Statistics, 64(1), 49-73.
Sun, Y. and Gilbert, P. B. (2012). Estimation of stratified mark‐specific proportional hazards models with missing marks. Scandinavian Journal of Statistics, 39(1), 34-52.
Yang, G., Sun, Y., Qi, L., & Gilbert, P. B. (2017). Estimation of stratified mark-specific proportional hazards models under two-phase sampling with application to HIV vaccine efficacy trials. Statistics in biosciences, 9, 259-283.
set.seed(20240410) beta <- 4.1 gamma <- -1.3 n <- 200 tx <- rep(0:1, each = n / 2) tm <- c(rexp(n / 2, 0.2), rexp(n / 2, 0.2 * exp(gamma))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) alpha <- function(b){ log((1 - exp(-2)) * (b - 2) / (2 * (exp(b - 2) - 1))) } mark0 <- log(1 - (1 - exp(-2)) * runif(n / 2)) / (-2) mark1 <- log(1 + (beta - 2) * (1 - exp(-2)) * runif(n / 2) / (2 * exp(alpha(beta)))) / (beta - 2) mark <- ifelse(eventInd == 1, c(mark0, mark1), NA) # the true TE(v) curve underlying the data-generating mechanism is: # TE(v) = 1 - exp{alpha(beta) + beta * v + gamma} # a binary auxiliary covariate A <- sapply(exp(-0.5 - 0.2 * mark) / (1 + exp(-0.5 - 0.2 * mark)), function(p){ ifelse(is.na(p), NA, rbinom(1, 1, p)) }) linPred <- 1 + 0.4 * tx - 0.2 * A probs <- exp(linPred) / (1 + exp(linPred)) R <- rep(NA, n) while (sum(R, na.rm = TRUE) < 10){ R[eventInd == 1] <- sapply(probs[eventInd == 1], function(p){ rbinom(1, 1, p) }) } # a missing-at-random mark mark[eventInd == 1] <- ifelse(R[eventInd == 1] == 1, mark[eventInd == 1], NA) # AIPW estimation; auxiliary covariate is used (not required) fit <- kernel_sievePHaipw(eventTime, eventInd, mark, tx, aux = A, auxType = "binary", formulaMiss = ~ eventTime, formulaAux = ~ eventTime + tx + mark, tau = 3, tband = 0.5, hband = 0.3, nvgrid = 20, nboot = 20)
set.seed(20240410) beta <- 4.1 gamma <- -1.3 n <- 200 tx <- rep(0:1, each = n / 2) tm <- c(rexp(n / 2, 0.2), rexp(n / 2, 0.2 * exp(gamma))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) alpha <- function(b){ log((1 - exp(-2)) * (b - 2) / (2 * (exp(b - 2) - 1))) } mark0 <- log(1 - (1 - exp(-2)) * runif(n / 2)) / (-2) mark1 <- log(1 + (beta - 2) * (1 - exp(-2)) * runif(n / 2) / (2 * exp(alpha(beta)))) / (beta - 2) mark <- ifelse(eventInd == 1, c(mark0, mark1), NA) # the true TE(v) curve underlying the data-generating mechanism is: # TE(v) = 1 - exp{alpha(beta) + beta * v + gamma} # a binary auxiliary covariate A <- sapply(exp(-0.5 - 0.2 * mark) / (1 + exp(-0.5 - 0.2 * mark)), function(p){ ifelse(is.na(p), NA, rbinom(1, 1, p)) }) linPred <- 1 + 0.4 * tx - 0.2 * A probs <- exp(linPred) / (1 + exp(linPred)) R <- rep(NA, n) while (sum(R, na.rm = TRUE) < 10){ R[eventInd == 1] <- sapply(probs[eventInd == 1], function(p){ rbinom(1, 1, p) }) } # a missing-at-random mark mark[eventInd == 1] <- ifelse(R[eventInd == 1] == 1, mark[eventInd == 1], NA) # AIPW estimation; auxiliary covariate is used (not required) fit <- kernel_sievePHaipw(eventTime, eventInd, mark, tx, aux = A, auxType = "binary", formulaMiss = ~ eventTime, formulaAux = ~ eventTime + tx + mark, tau = 3, tband = 0.5, hband = 0.3, nvgrid = 20, nboot = 20)
plot
method for class summary.sievePH
and summary.kernel_sievePH
. For univariate marks, it plots point and interval estimates of the mark-specific treatment effect parameter specified by contrast
in summary.sievePH
, and,
optionally, scatter/box plots of the observed mark values by treatment. For bivariate marks, plotting is restricted to the point estimate, which is displayed as a surface. No plotting is provided for marks of higher dimensions.
## S3 method for class 'summary.sievePH' plot( x, mark = NULL, tx = NULL, xlim = NULL, ylim = NULL, zlim = NULL, xtickAt = NULL, xtickLab = NULL, ytickAt = NULL, ytickLab = NULL, xlab = NULL, ylab = NULL, zlab = NULL, txLab = c("Placebo", "Treatment"), title = NULL, ... )
## S3 method for class 'summary.sievePH' plot( x, mark = NULL, tx = NULL, xlim = NULL, ylim = NULL, zlim = NULL, xtickAt = NULL, xtickLab = NULL, ytickAt = NULL, ytickLab = NULL, xlab = NULL, ylab = NULL, zlab = NULL, txLab = c("Placebo", "Treatment"), title = NULL, ... )
x |
an object returned by |
mark |
either a numeric vector specifying a univariate continuous mark or a data frame specifying a multivariate continuous mark.
For subjects with a right-censored time-to-event, the value(s) in |
tx |
a numeric vector indicating the treatment group (1 if treatment, 0 if placebo) |
xlim |
a numeric vector of length 2 specifying the x-axis range ( |
ylim |
a numeric vector of length 2 specifying the y-axis range ( |
zlim |
a numeric vector of length 2 specifying the z-axis range in a 3-dimensional plot ( |
xtickAt |
a numeric vector specifing the position of x-axis tickmarks ( |
xtickLab |
a numeric vector specifying labels for tickmarks listed in |
ytickAt |
a numeric vector specifing the position of y-axis tickmarks ( |
ytickLab |
a numeric vector specifying labels for tickmarks listed in |
xlab |
a character string specifying the x-axis label ( |
ylab |
a character string specifying the y-axis label ( |
zlab |
a character string specifying the z-axis label in a 3-dimensional plot ( |
txLab |
a character vector of length 2 specifying the placebo and treatment labels (in this order). The default labels are |
title |
a character string specifying the plot title ( |
... |
other arguments to be passed to plotting functions |
For bivariate marks, markGrid
in summary.sievePH
must have equally spaced values for each component.
None. The function is called solely for plot generation.
sievePH
, sievePHipw
, sievePHaipw
and summary.sievePH
n <- 500 tx <- rep(0:1, each=n/2) tm <- c(rexp(n/2, 0.2), rexp(n/2, 0.2 * exp(-0.4))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) mark <- ifelse(eventInd==1, c(rbeta(n/2, 2, 5), rbeta(n/2, 2, 2)), NA) markRng <- range(mark, na.rm=TRUE) # fit a model with a univariate mark fit <- sievePH(eventTime, eventInd, mark, tx) sfit <- summary(fit, markGrid=seq(markRng[1], markRng[2], length.out=10)) plot(sfit, mark, tx)
n <- 500 tx <- rep(0:1, each=n/2) tm <- c(rexp(n/2, 0.2), rexp(n/2, 0.2 * exp(-0.4))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) mark <- ifelse(eventInd==1, c(rbeta(n/2, 2, 5), rbeta(n/2, 2, 2)), NA) markRng <- range(mark, na.rm=TRUE) # fit a model with a univariate mark fit <- sievePH(eventTime, eventInd, mark, tx) sfit <- summary(fit, markGrid=seq(markRng[1], markRng[2], length.out=10)) plot(sfit, mark, tx)
sievePH
implements the semiparametric estimation method of Juraska and Gilbert (2013) for the multivariate mark-
specific hazard ratio in the competing risks failure time analysis framework. It employs (i) the semiparametric
method of maximum profile likelihood estimation in the treatment-to-placebo mark density
ratio model (Qin, 1998) and (ii) the ordinary method of maximum partial likelihood estimation of the overall log hazard ratio in the Cox model.
sievePH
requires that the multivariate mark data are fully observed in all failures.
sievePH(eventTime, eventInd, mark, tx, strata = NULL)
sievePH(eventTime, eventInd, mark, tx, strata = NULL)
eventTime |
a numeric vector specifying the observed right-censored time to the event of interest |
eventInd |
a numeric vector indicating the event of interest (1 if event, 0 if right-censored) |
mark |
either a numeric vector specifying a univariate continuous mark or a data frame specifying a multivariate continuous mark.
No missing values are permitted for subjects with |
tx |
a numeric vector indicating the treatment group (1 if treatment, 0 if placebo) |
strata |
a numeric vector specifying baseline strata ( |
sievePH
considers data from a randomized placebo-controlled treatment efficacy trial with a time-to-event endpoint.
The parameter of interest, the mark-specific hazard ratio, is the ratio (treatment/placebo) of the conditional mark-specific hazard functions.
It factors as the product of the mark density ratio (treatment/placebo) and the ordinary marginal hazard function ignoring mark data.
The mark density ratio is estimated using the method of Qin (1998), while the marginal hazard ratio is estimated using coxph()
in the survival
package.
Both estimators are consistent and asymptotically normal. The joint asymptotic distribution of the estimators is detailed in Juraska and Gilbert (2013).
An object of class sievePH
which can be processed by
summary.sievePH
to obtain or print a summary of the results. An object of class
sievePH
is a list containing the following components:
DRcoef
: a numeric vector of estimates of coefficients in the weight function
in the density ratio model
DRlambda
: an estimate of the Lagrange multiplier in the profile score functions for (that arises by profiling out the nuisance parameter)
DRconverged
: a logical value indicating whether the estimation procedure in the density ratio model converged
logHR
: an estimate of the marginal log hazard ratio from coxph()
in the survival
package
cov
: the estimated joint covariance matrix of DRcoef
and logHR
coxphFit
: an object returned by the call of coxph()
nPlaEvents
: the number of events observed in the placebo group
nTxEvents
: the number of events observed in the treatment group
mark
: the input object
tx
: the input object
Juraska, M. and Gilbert, P. B. (2013), Mark-specific hazard ratio model with multivariate continuous marks: an application to vaccine efficacy. Biometrics 69(2):328–337.
Qin, J. (1998), Inferences for case-control and semiparametric two-sample density ratio models. Biometrika 85, 619–630.
summary.sievePH
, plot.summary.sievePH
, testIndepTimeMark
and testDensRatioGOF
n <- 500 tx <- rep(0:1, each=n/2) tm <- c(rexp(n/2, 0.2), rexp(n/2, 0.2 * exp(-0.4))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) mark1 <- ifelse(eventInd==1, c(rbeta(n/2, 2, 5), rbeta(n/2, 2, 2)), NA) mark2 <- ifelse(eventInd==1, c(rbeta(n/2, 1, 3), rbeta(n/2, 5, 1)), NA) # fit a model with a univariate mark fit <- sievePH(eventTime, eventInd, mark1, tx) # fit a model with a bivariate mark fit <- sievePH(eventTime, eventInd, data.frame(mark1, mark2), tx)
n <- 500 tx <- rep(0:1, each=n/2) tm <- c(rexp(n/2, 0.2), rexp(n/2, 0.2 * exp(-0.4))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) mark1 <- ifelse(eventInd==1, c(rbeta(n/2, 2, 5), rbeta(n/2, 2, 2)), NA) mark2 <- ifelse(eventInd==1, c(rbeta(n/2, 1, 3), rbeta(n/2, 5, 1)), NA) # fit a model with a univariate mark fit <- sievePH(eventTime, eventInd, mark1, tx) # fit a model with a bivariate mark fit <- sievePH(eventTime, eventInd, data.frame(mark1, mark2), tx)
sievePHaipw
implements the semiparametric augmented inverse probability weighted (AIPW) complete-case estimation method of Juraska and Gilbert (2015) for the multivariate mark-
specific hazard ratio, with the mark subject to missingness at random. It extends Juraska and Gilbert (2013) by (i) weighting complete cases (i.e., subjects with complete marks) by the inverse of their estimated
probabilities given auxiliary covariates and/or treatment, and (ii) adding an augmentation term (the conditional expected profile score given auxiliary covariates and/or treatment) to the IPW estimating equations in the density
ratio model for increased efficiency and robustness to mis-specification of the missingness model (Robins et al., 1994). The probabilities of observing the mark are estimated by fitting a logistic regression model with a user-specified linear predictor.
The mean profile score vector (the augmentation term) in the density ratio model is estimated by fitting a linear regression model with a user-specified linear predictor. Coefficients in the treatment-to-placebo mark density ratio model
(Qin, 1998) are estimated by solving the AIPW estimating equations. The ordinary method of maximum partial likelihood estimation is employed for estimating the overall log hazard ratio in the Cox model.
sievePHaipw( eventTime, eventInd, mark, tx, aux = NULL, strata = NULL, formulaMiss, formulaScore )
sievePHaipw( eventTime, eventInd, mark, tx, aux = NULL, strata = NULL, formulaMiss, formulaScore )
eventTime |
a numeric vector specifying the observed right-censored event time |
eventInd |
a numeric vector indicating the event of interest (1 if event, 0 if right-censored) |
mark |
either a numeric vector specifying a univariate continuous mark or a data frame specifying a multivariate continuous mark subject to missingness at random. Missing mark values should be set to |
tx |
a numeric vector indicating the treatment group (1 if treatment, 0 if placebo) |
aux |
a data frame specifying auxiliary covariates predictive of the probability of observing the mark. The mark missingness model only requires that the auxiliary covariates be observed in subjects who experienced the event of interest. For subjects with |
strata |
a numeric vector specifying baseline strata ( |
formulaMiss |
a one-sided formula object specifying (on the right side of the |
formulaScore |
a one-sided formula object specifying (on the right side of the |
sievePHaipw
considers data from a randomized placebo-controlled treatment efficacy trial with a time-to-event endpoint.
The parameter of interest, the mark-specific hazard ratio, is the ratio (treatment/placebo) of the conditional mark-specific hazard functions.
It factors as the product of the mark density ratio (treatment/placebo) and the ordinary marginal hazard function ignoring mark data.
The mark density ratio is estimated using the AIPW complete-case estimation method, following Robins et al. (1994) and extending Qin (1998), and
the marginal hazard ratio is estimated using coxph()
in the survival
package.
The asymptotic properties of the AIPW complete-case estimator are detailed in Juraska and Gilbert (2015).
An object of class sievePH
which can be processed by
summary.sievePH
to obtain or print a summary of the results. An object of class
sievePH
is a list containing the following components:
DRcoef
: a numeric vector of estimates of coefficients in the weight function
in the density ratio model
DRlambda
: an estimate of the Lagrange multiplier in the profile score functions for (that arises by profiling out the nuisance parameter)
DRconverged
: a logical value indicating whether the estimation procedure in the density ratio model converged
logHR
: an estimate of the marginal log hazard ratio from coxph()
in the survival
package
cov
: the estimated joint covariance matrix of DRcoef
and logHR
coxphFit
: an object returned by the call of coxph()
nPlaEvents
: the number of events observed in the placebo group
nTxEvents
: the number of events observed in the treatment group
mark
: the input object
tx
: the input object
Juraska, M., and Gilbert, P. B. (2015), Mark-specific hazard ratio model with missing multivariate marks. Lifetime Data Analysis 22(4): 606-25.
Juraska, M. and Gilbert, P. B. (2013), Mark-specific hazard ratio model with multivariate continuous marks: an application to vaccine efficacy. Biometrics 69(2):328-337.
Qin, J. (1998), Inferences for case-control and semiparametric two-sample density ratio models. Biometrika 85, 619-630.
Robins, J. M., Rotnitzky, A., and Zhao, L. P. (1994), Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association 89(427): 846-866.
summary.sievePH
, plot.summary.sievePH
, testIndepTimeMark
and testDensRatioGOF
n <- 500 tx <- rep(0:1, each=n / 2) tm <- c(rexp(n / 2, 0.2), rexp(n / 2, 0.2 * exp(-0.4))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) mark1 <- ifelse(eventInd==1, c(rbeta(n / 2, 2, 5), rbeta(n / 2, 2, 2)), NA) mark2 <- ifelse(eventInd==1, c(rbeta(n / 2, 1, 3), rbeta(n / 2, 5, 1)), NA) # a continuous auxiliary covariate A <- (mark1 + 0.4 * runif(n)) / 1.4 linPred <- -0.8 + 0.4 * tx + 0.8 * A probs <- exp(linPred) / (1 + exp(linPred)) R <- rep(NA, length(probs)) while (sum(R, na.rm=TRUE) < 10){ R[eventInd==1] <- sapply(probs[eventInd==1], function(p){ rbinom(1, 1, p) }) } # produce missing-at-random marks mark1[eventInd==1] <- ifelse(R[eventInd==1]==1, mark1[eventInd==1], NA) mark2[eventInd==1] <- ifelse(R[eventInd==1]==1, mark2[eventInd==1], NA) # fit a model with a bivariate mark fit <- sievePHaipw(eventTime, eventInd, mark=data.frame(mark1, mark2), tx, aux=data.frame(A), formulaMiss= ~ tx * A, formulaScore= ~ tx * A + I(A^2))
n <- 500 tx <- rep(0:1, each=n / 2) tm <- c(rexp(n / 2, 0.2), rexp(n / 2, 0.2 * exp(-0.4))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) mark1 <- ifelse(eventInd==1, c(rbeta(n / 2, 2, 5), rbeta(n / 2, 2, 2)), NA) mark2 <- ifelse(eventInd==1, c(rbeta(n / 2, 1, 3), rbeta(n / 2, 5, 1)), NA) # a continuous auxiliary covariate A <- (mark1 + 0.4 * runif(n)) / 1.4 linPred <- -0.8 + 0.4 * tx + 0.8 * A probs <- exp(linPred) / (1 + exp(linPred)) R <- rep(NA, length(probs)) while (sum(R, na.rm=TRUE) < 10){ R[eventInd==1] <- sapply(probs[eventInd==1], function(p){ rbinom(1, 1, p) }) } # produce missing-at-random marks mark1[eventInd==1] <- ifelse(R[eventInd==1]==1, mark1[eventInd==1], NA) mark2[eventInd==1] <- ifelse(R[eventInd==1]==1, mark2[eventInd==1], NA) # fit a model with a bivariate mark fit <- sievePHaipw(eventTime, eventInd, mark=data.frame(mark1, mark2), tx, aux=data.frame(A), formulaMiss= ~ tx * A, formulaScore= ~ tx * A + I(A^2))
sievePHipw
implements the semiparametric inverse probability weighted (IPW) complete-case estimation method of Juraska and Gilbert (2015) for the multivariate mark-
specific hazard ratio, with the mark subject to missingness at random. It extends Juraska and Gilbert (2013) by weighting complete cases by the inverse of their estimated
probabilities given auxiliary covariates and/or treatment. The probabilities are estimated by fitting a logistic regression model with a user-specified linear predictor.
Coefficients in the treatment-to-placebo mark density ratio model (Qin, 1998) are estimated by solving the IPW estimating equations. The ordinary method of maximum partial likelihood
estimation is employed for estimating the overall log hazard ratio in the Cox model.
sievePHipw( eventTime, eventInd, mark, tx, aux = NULL, strata = NULL, formulaMiss )
sievePHipw( eventTime, eventInd, mark, tx, aux = NULL, strata = NULL, formulaMiss )
eventTime |
a numeric vector specifying the observed right-censored event time |
eventInd |
a numeric vector indicating the event of interest (1 if event, 0 if right-censored) |
mark |
either a numeric vector specifying a univariate continuous mark or a data frame specifying a multivariate continuous mark subject to missingness at random. Missing mark values should be set to |
tx |
a numeric vector indicating the treatment group (1 if treatment, 0 if placebo) |
aux |
a data frame specifying auxiliary covariates predictive of the probability of observing the mark. The mark missingness model only requires that the auxiliary covariates be observed in subjects who experienced the event of interest. For subjects with |
strata |
a numeric vector specifying baseline strata ( |
formulaMiss |
a one-sided formula object specifying (on the right side of the |
sievePHipw
considers data from a randomized placebo-controlled treatment efficacy trial with a time-to-event endpoint.
The parameter of interest, the mark-specific hazard ratio, is the ratio (treatment/placebo) of the conditional mark-specific hazard functions.
It factors as the product of the mark density ratio (treatment/placebo) and the ordinary marginal hazard function ignoring mark data.
The mark density ratio is estimated using the IPW complete-case estimation method, extending Qin (1998), and
the marginal hazard ratio is estimated using coxph()
in the survival
package.
The asymptotic properties of the IPW complete-case estimator are detailed in Juraska and Gilbert (2015).
An object of class sievePH
which can be processed by
summary.sievePH
to obtain or print a summary of the results. An object of class
sievePH
is a list containing the following components:
DRcoef
: a numeric vector of estimates of coefficients in the weight function
in the density ratio model
DRlambda
: an estimate of the Lagrange multiplier in the profile score functions for (that arises by profiling out the nuisance parameter)
DRconverged
: a logical value indicating whether the estimation procedure in the density ratio model converged
logHR
: an estimate of the marginal log hazard ratio from coxph()
in the survival
package
cov
: the estimated joint covariance matrix of DRcoef
and logHR
coxphFit
: an object returned by the call of coxph()
nPlaEvents
: the number of events observed in the placebo group
nTxEvents
: the number of events observed in the treatment group
mark
: the input object
tx
: the input object
Juraska, M., and Gilbert, P. B. (2015), Mark-specific hazard ratio model with missing multivariate marks. Lifetime Data Analysis 22(4): 606-25.
Juraska, M. and Gilbert, P. B. (2013), Mark-specific hazard ratio model with multivariate continuous marks: an application to vaccine efficacy. Biometrics 69(2):328-337.
Qin, J. (1998), Inferences for case-control and semiparametric two-sample density ratio models. Biometrika 85, 619-630.
summary.sievePH
, plot.summary.sievePH
, testIndepTimeMark
and testDensRatioGOF
n <- 500 tx <- rep(0:1, each=n / 2) tm <- c(rexp(n / 2, 0.2), rexp(n / 2, 0.2 * exp(-0.4))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) mark1 <- ifelse(eventInd==1, c(rbeta(n / 2, 2, 5), rbeta(n / 2, 2, 2)), NA) mark2 <- ifelse(eventInd==1, c(rbeta(n / 2, 1, 3), rbeta(n / 2, 5, 1)), NA) # a continuous auxiliary covariate A <- (mark1 + 0.4 * runif(n)) / 1.4 linPred <- -0.8 + 0.4 * tx + 0.8 * A probs <- exp(linPred) / (1 + exp(linPred)) R <- rep(NA, length(probs)) while (sum(R, na.rm=TRUE) < 10){ R[eventInd==1] <- sapply(probs[eventInd==1], function(p){ rbinom(1, 1, p) }) } # produce missing-at-random marks mark1[eventInd==1] <- ifelse(R[eventInd==1]==1, mark1[eventInd==1], NA) mark2[eventInd==1] <- ifelse(R[eventInd==1]==1, mark2[eventInd==1], NA) # fit a model with a bivariate mark fit <- sievePHipw(eventTime, eventInd, mark=data.frame(mark1, mark2), tx, aux=data.frame(A), formulaMiss= ~ tx * A)
n <- 500 tx <- rep(0:1, each=n / 2) tm <- c(rexp(n / 2, 0.2), rexp(n / 2, 0.2 * exp(-0.4))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) mark1 <- ifelse(eventInd==1, c(rbeta(n / 2, 2, 5), rbeta(n / 2, 2, 2)), NA) mark2 <- ifelse(eventInd==1, c(rbeta(n / 2, 1, 3), rbeta(n / 2, 5, 1)), NA) # a continuous auxiliary covariate A <- (mark1 + 0.4 * runif(n)) / 1.4 linPred <- -0.8 + 0.4 * tx + 0.8 * A probs <- exp(linPred) / (1 + exp(linPred)) R <- rep(NA, length(probs)) while (sum(R, na.rm=TRUE) < 10){ R[eventInd==1] <- sapply(probs[eventInd==1], function(p){ rbinom(1, 1, p) }) } # produce missing-at-random marks mark1[eventInd==1] <- ifelse(R[eventInd==1]==1, mark1[eventInd==1], NA) mark2[eventInd==1] <- ifelse(R[eventInd==1]==1, mark2[eventInd==1], NA) # fit a model with a bivariate mark fit <- sievePHipw(eventTime, eventInd, mark=data.frame(mark1, mark2), tx, aux=data.frame(A), formulaMiss= ~ tx * A)
summary
method for an object of class kernel_sievePH
.
## S3 method for class 'kernel_sievePH' summary( object, contrast = c("te", "hr", "loghr"), sieveAlternative = c("twoSided", "oneSided"), confLevel = 0.95, ... ) ## S3 method for class 'summary.kernel_sievePH' print(x, digits = 4, ...)
## S3 method for class 'kernel_sievePH' summary( object, contrast = c("te", "hr", "loghr"), sieveAlternative = c("twoSided", "oneSided"), confLevel = 0.95, ... ) ## S3 method for class 'summary.kernel_sievePH' print(x, digits = 4, ...)
object |
an object of class |
contrast |
a character string specifying the treatment effect parameter
of interest. The default value is |
sieveAlternative |
a character string specifying the alternative
hypothesis for the sieve tests, which can be either |
confLevel |
the confidence level (0.95 by default) of reported confidence intervals |
... |
further arguments passed to or from other methods |
x |
an object of class |
digits |
the number of significant digits to use when printing (4 by default) |
print.summary.kernel_sievePH
prints a formatted summary of
results. Inference about coefficients in the kernel-smoothed mark-specific
proportional hazards model is tabulated. Additionally, a summary is
generated
from the tests of two relevant null hypotheses: (1) { for
all
}, and (2) {
for all
}. For the tests
of (2),
sieveAlternative
controls the choice of the alternative
hypothesis.
An object of class summary.kernel_sievePH
, which is a list
with the following components:
estBeta
: a data frame summarizing point estimates and standard
errors of the mark-specific coefficients for treatment.
HRunity.2sided
: a data frame with test statistics (first row) and
corresponding p-values (second row) for testing vs.
for any v
(general
alternative).
TSUP1
is based on an extension of the classic Kolmogorov-Smirnov
supremum-based test. Tint1
is a generalization
of the integration-based Cramer-von Mises test.
HRunity.1sided
: a data frame with test statistics (first row) and
corresponding p-values (second row) for testing vs.
with strict inequality for some v
(monotone alternative).
TSUP1m
is based on an extension of the classic
Kolmogorov-Smirnov supremum-based test. Tint1m
is a generalization
of the integration-based Cramer-von Mises test.
HRconstant.2sided
: a data frame with test statistics (first row) and
corresponding p-values (second row) for testing : HR(v) does not depend on v
vs.
: HR depends on v
(general alternative).
TSUP2
is based on an extension of the classic
Kolmogorov-Smirnov supremum-based test. Tint2
is a generalization
of the integration-based Cramer-von Mises test.
This component is available if sieveAlternative="twoSided"
.
HRconstant.1sided
: a data frame with test statistics (first row) and
corresponding p-values (second row) for testing : HR(v) does not depend on v
vs.
: HR increases as v increases in
and testing
vs.
: HR decreases as v increases in
.
TSUP2mdec
and TSUP2minc
are based on an extension of the classic
Kolmogorov-Smirnov supremum-based test. Tint2mdec
and Tint2minc
are a generalization
of the integration-based Cramer-von Mises test.
This component is available if sieveAlternative="oneSided"
.
te
: a data frame summarizing point and interval estimates of the
mark-specific treatment efficacy on the grid of mark values defined by
nvgrid
spanning from the minimum and maximum of the mark (available if contrast="te"
).
The confidence level is specified by confLevel
.
hr
: a data frame summarizing point and interval estimates of the
mark-specific hazard ratio on the grid of mark values defined by
nvgrid
spanning from the minimum and maximum of the mark (available if
contrast="hr"
). The confidence level is specified by confLevel
.
loghr
: a data frame summarizing point and interval estimates of
the mark-specific log hazard ratio on the grid of mark values defined by
nvgrid
spanning from the minimum and maximum of the mark (available if
contrast="loghr"
). The confidence level is specified by
confLevel
.
Gilbert, P. B. and Sun, Y. (2015). Inferences on relative failure rates in stratified mark-specific proportional hazards models with missing marks, with application to human immunodeficiency virus vaccine efficacy trials. Journal of the Royal Statistical Society Series C: Applied Statistics, 64(1), 49-73.
Sun, Y. and Gilbert, P. B. (2012). Estimation of stratified mark‐specific proportional hazards models with missing marks. Scandinavian Journal of Statistics, 39(1), 34-52.
set.seed(20240410) beta <- 2.1 gamma <- -1.3 n <- 200 tx <- rep(0:1, each = n / 2) tm <- c(rexp(n / 2, 0.2), rexp(n / 2, 0.2 * exp(gamma))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) alpha <- function(b){ log((1 - exp(-2)) * (b - 2) / (2 * (exp(b - 2) - 1))) } mark0 <- log(1 - (1 - exp(-2)) * runif(n / 2)) / (-2) mark1 <- log(1 + (beta - 2) * (1 - exp(-2)) * runif(n / 2) / (2 * exp(alpha(beta)))) / (beta - 2) mark <- ifelse(eventInd == 1, c(mark0, mark1), NA) # the true TE(v) curve underlying the data-generating mechanism is: # TE(v) = 1 - exp{alpha(beta) + beta * v + gamma} # a binary auxiliary covariate A <- sapply(exp(-0.5 - 0.2 * mark) / (1 + exp(-0.5 - 0.2 * mark)), function(p){ ifelse(is.na(p), NA, rbinom(1, 1, p)) }) linPred <- 1 + 0.4 * tx - 0.2 * A probs <- exp(linPred) / (1 + exp(linPred)) R <- rep(NA, n) while (sum(R, na.rm = TRUE) < 10){ R[eventInd == 1] <- sapply(probs[eventInd == 1], function(p){ rbinom(1, 1, p) }) } # a missing-at-random mark mark[eventInd == 1] <- ifelse(R[eventInd == 1] == 1, mark[eventInd == 1], NA) # AIPW estimation; auxiliary covariate is used (not required) fit <- kernel_sievePHaipw(eventTime, eventInd, mark, tx, aux = A, auxType = "binary", formulaMiss = ~ eventTime, formulaAux = ~ eventTime + tx + mark, tau = 3, tband = 0.5, hband = 0.3, nvgrid = 20, nboot = 20) sfit <- summary(fit) # print the formatted summary sfit # treatment efficacy estimates on the grid sfit$te
set.seed(20240410) beta <- 2.1 gamma <- -1.3 n <- 200 tx <- rep(0:1, each = n / 2) tm <- c(rexp(n / 2, 0.2), rexp(n / 2, 0.2 * exp(gamma))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) alpha <- function(b){ log((1 - exp(-2)) * (b - 2) / (2 * (exp(b - 2) - 1))) } mark0 <- log(1 - (1 - exp(-2)) * runif(n / 2)) / (-2) mark1 <- log(1 + (beta - 2) * (1 - exp(-2)) * runif(n / 2) / (2 * exp(alpha(beta)))) / (beta - 2) mark <- ifelse(eventInd == 1, c(mark0, mark1), NA) # the true TE(v) curve underlying the data-generating mechanism is: # TE(v) = 1 - exp{alpha(beta) + beta * v + gamma} # a binary auxiliary covariate A <- sapply(exp(-0.5 - 0.2 * mark) / (1 + exp(-0.5 - 0.2 * mark)), function(p){ ifelse(is.na(p), NA, rbinom(1, 1, p)) }) linPred <- 1 + 0.4 * tx - 0.2 * A probs <- exp(linPred) / (1 + exp(linPred)) R <- rep(NA, n) while (sum(R, na.rm = TRUE) < 10){ R[eventInd == 1] <- sapply(probs[eventInd == 1], function(p){ rbinom(1, 1, p) }) } # a missing-at-random mark mark[eventInd == 1] <- ifelse(R[eventInd == 1] == 1, mark[eventInd == 1], NA) # AIPW estimation; auxiliary covariate is used (not required) fit <- kernel_sievePHaipw(eventTime, eventInd, mark, tx, aux = A, auxType = "binary", formulaMiss = ~ eventTime, formulaAux = ~ eventTime + tx + mark, tau = 3, tband = 0.5, hband = 0.3, nvgrid = 20, nboot = 20) sfit <- summary(fit) # print the formatted summary sfit # treatment efficacy estimates on the grid sfit$te
summary
method for an object of class sievePH
.
## S3 method for class 'sievePH' summary( object, markGrid, contrast = c("te", "hr", "loghr"), sieveAlternative = c("twoSided", "HRincrease", "HRdecrease"), confLevel = 0.95, ... ) ## S3 method for class 'summary.sievePH' print(x, digits = 4, ...)
## S3 method for class 'sievePH' summary( object, markGrid, contrast = c("te", "hr", "loghr"), sieveAlternative = c("twoSided", "HRincrease", "HRdecrease"), confLevel = 0.95, ... ) ## S3 method for class 'summary.sievePH' print(x, digits = 4, ...)
object |
an object of class |
markGrid |
a matrix specifying a grid of multivariate mark values, where rows correspond to different values on the (multivariate) grid and columns correspond to components of the mark. A numeric vector is allowed
for univariate marks. The point and interval estimates of the |
contrast |
a character string specifying the treatment effect parameter of interest. The default value is |
sieveAlternative |
a character string specifying the alternative hypothesis for the sieve tests, which can be either |
confLevel |
the confidence level (0.95 by default) of reported confidence intervals |
... |
further arguments passed to or from other methods |
x |
an object of class |
digits |
the number of significant digits to use when printing (4 by default) |
print.summary.sievePH
prints a formatted summary of results. Inference about coefficients in the mark-specific proportional hazards model is tabulated. Additionally, a summary is generated
from the likelihood-ratio and Wald tests of two relevant null hypotheses: (1) { for all
}, and (2) {
for all
}. For the tests of (2) and a univariate
mark,
sieveAlternative
controls the choice of the alternative hypothesis.
An object of class summary.sievePH
, which is a list with the following components:
coef
: a data frame summarizing point and interval estimates of the density ratio model coefficients and the marginal log hazard ratio (the confidence level is specified by confLevel
), and p-values from the
two-sided Wald test of the null hypothesis that the parameter equals zero
pLR.HRunity.2sided
: a numeric vector with two named components: pLR.dRatio.2sided
is a p-value from the two-sided profile likelihood-ratio test of the null hypothesis , where
is the
vector of mark coefficients in the mark density ratio model, and
pLR.cox.2sided
is a p-value from the two-sided partial likelihood-ratio test of the null hypothesis , where
is the
marginal log hazard ratio in the Cox model. The two p-values are intended for the use of the Simes (1986) procedure as described on page 4 in Juraska and Gilbert (2013).
pWald.HRunity.2sided
: a p-value from the two-sided Wald test of the null hypothesis { for all
}
pWtWald.HRunity.1sided
: a p-value from the one-sided weighted Wald test of the null hypothesis { for all
} against the alternative hypothesis {
and
is
increasing in each component of
}
pLR.HRconstant.2sided
: a p-value from the two-sided profile likelihood-ratio test of the null hypothesis { for all
}. This component is available if
sieveAlternative="twoSided"
.
pLR.HRconstant.1sided
: a numeric vector with two named components: pLR.dRatio.2sided
is a p-value from the two-sided profile likelihood-ratio test of the null hypothesis { for all
},
and
estBeta
is the point estimate of the univariate mark coefficient in the density ratio model. This component is available if the mark is univariate and sieveAlternative="oneSided"
.
pWald.HRconstant.2sided
: a p-value from the two-sided Wald test of the null hypothesis { for all
}. This component is available if
sieveAlternative="twoSided"
.
pWald.HRconstant.1sided
: a p-value from the one-sided Wald test of the null hypothesis { for all
} against the alternative hypothesis {
is increasing in
} or {
is decreasing in
}.
This component is available if the mark is univariate and
sieveAlternative="HRdecrease"
or sieveAlternative="HRincrease"
.
te
: a data frame summarizing point and interval estimates of the mark-specific treatment efficacy on the grid of mark values in markGrid
(available if contrast="te"
). The confidence level is specified
by confLevel
.
hr
: a data frame summarizing point and interval estimates of the mark-specific hazard ratio on the grid of mark values in markGrid
(available if contrast="hr"
). The confidence level is specified by
confLevel
.
loghr
: a data frame summarizing point and interval estimates of the mark-specific log hazard ratio on the grid of mark values in markGrid
(available if contrast="loghr"
). The confidence level is specified by
confLevel
.
Juraska, M. and Gilbert, P. B. (2013), Mark-specific hazard ratio model with multivariate continuous marks: an application to vaccine efficacy. Biometrics 69(2):328–337.
n <- 500 tx <- rep(0:1, each=n/2) tm <- c(rexp(n/2, 0.2), rexp(n/2, 0.2 * exp(-0.4))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) mark1 <- ifelse(eventInd==1, c(rbeta(n/2, 2, 5), rbeta(n/2, 2, 2)), NA) mark2 <- ifelse(eventInd==1, c(rbeta(n/2, 1, 3), rbeta(n/2, 5, 1)), NA) # fit a model with a bivariate mark fit <- sievePH(eventTime, eventInd, data.frame(mark1, mark2), tx) sfit <- summary(fit, markGrid=matrix(c(0.3, 0.3, 0.6, 0.3, 0.3, 0.6, 0.6, 0.6), ncol=2, byrow=TRUE)) # print the formatted summary sfit # treatment efficacy estimates on the grid sfit$te
n <- 500 tx <- rep(0:1, each=n/2) tm <- c(rexp(n/2, 0.2), rexp(n/2, 0.2 * exp(-0.4))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) mark1 <- ifelse(eventInd==1, c(rbeta(n/2, 2, 5), rbeta(n/2, 2, 2)), NA) mark2 <- ifelse(eventInd==1, c(rbeta(n/2, 1, 3), rbeta(n/2, 5, 1)), NA) # fit a model with a bivariate mark fit <- sievePH(eventTime, eventInd, data.frame(mark1, mark2), tx) sfit <- summary(fit, markGrid=matrix(c(0.3, 0.3, 0.6, 0.3, 0.3, 0.6, 0.6, 0.6), ncol=2, byrow=TRUE)) # print the formatted summary sfit # treatment efficacy estimates on the grid sfit$te
testDensRatioGoF
implements the complete-case goodness-of-fit test of Qin and Zhang (1997) for evaluating the validity of the specified mark density ratio model used for modeling a component of
the mark-specific hazard ratio model in Juraska and Gilbert (2013). Multivariate marks are accommodated. Subjects who experienced the event of interest but their mark is missing are discarded.
testDensRatioGOF( eventInd, mark, tx, DRcoef = NULL, DRlambda = NULL, iter = 1000 )
testDensRatioGOF( eventInd, mark, tx, DRcoef = NULL, DRlambda = NULL, iter = 1000 )
eventInd |
a numeric vector indicating the event of interest (1 if event, 0 if right-censored) |
mark |
either a numeric vector specifying a univariate continuous mark or a data frame specifying a multivariate continuous mark.
For subjects with a right-censored time-to-event, the value(s) in |
tx |
a numeric vector indicating the treatment group (1 if treatment, 0 if placebo) |
DRcoef |
a numeric vector of the coefficients |
DRlambda |
the Lagrange multiplier in the profile score functions for |
iter |
the number of bootstrap iterations (1000 by default) |
testDensRatioGoF
performs a goodness-of-fit test for the exponential form of the weight function, i.e., . Other weight functions are not considered.
Returns a list containing the following components:
teststat
: the value of the Kolmogorov-Smirnov-type test statistic
pval
: the bootstrap p-value from the Kolmogorov-Smirnov-type test of validity of the mark density ratio model
DRcoef
: the input object if different from NULL
or a numeric vector of estimates of coefficients in the weight function
in the density ratio model
DRlambda
: the input object if different from NULL
or an estimate of the Lagrange multiplier in the profile score functions for
Qin, J., & Zhang, B. (1997). A goodness-of-fit test for logistic regression models based on case-control data. Biometrika, 84(3), 609-618.
Juraska, M. and Gilbert, P. B. (2013), Mark-specific hazard ratio model with multivariate continuous marks: an application to vaccine efficacy. Biometrics 69(2):328-337.
Qin, J. (1998), Inferences for case-control and semiparametric two-sample density ratio models. Biometrika 85, 619-630.
n <- 500 tx <- rep(0:1, each=n/2) tm <- c(rexp(n/2, 0.2), rexp(n/2, 0.2 * exp(-0.4))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) mark1 <- ifelse(eventInd==1, c(rbeta(n/2, 2, 5), rbeta(n/2, 2, 2)), NA) mark2 <- ifelse(eventInd==1, c(rbeta(n/2, 1, 3), rbeta(n/2, 5, 1)), NA) # test goodness-of-fit for a univariate mark testDensRatioGOF(eventInd, mark1, tx, iter=15) # test goodness-of-fit for a bivariate mark testDensRatioGOF(eventInd, data.frame(mark1, mark2), tx, iter=15)
n <- 500 tx <- rep(0:1, each=n/2) tm <- c(rexp(n/2, 0.2), rexp(n/2, 0.2 * exp(-0.4))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) mark1 <- ifelse(eventInd==1, c(rbeta(n/2, 2, 5), rbeta(n/2, 2, 2)), NA) mark2 <- ifelse(eventInd==1, c(rbeta(n/2, 1, 3), rbeta(n/2, 5, 1)), NA) # test goodness-of-fit for a univariate mark testDensRatioGOF(eventInd, mark1, tx, iter=15) # test goodness-of-fit for a bivariate mark testDensRatioGOF(eventInd, data.frame(mark1, mark2), tx, iter=15)
A nonparametric Komogorov-Smirnov-type test of the null hypothesis that the time-to-event and a possibly multivariate mark
are conditionally independent given treatment
as described in Juraska and Gilbert (2013). The conditional independence is a necessary assumption for parameter identifiability in the time-independent density ratio model. A bootstrap
algorithm is used to compute the p-value.
testIndepTimeMark(data, iter = 1000)
testIndepTimeMark(data, iter = 1000)
data |
a data frame restricted to subjects in a given treatment group with the following columns (in this order): the observed right-censored time to the event of interest, the event indicator (1 if event, 0 if right-censored), and the mark variable (one column for each component, if multivariate) |
iter |
the number of bootstrap iterations (1000 by default) used for computing the p-value |
The test statistic is the supremum of the difference between the estimated conditional joint cumulative distribution function (cdf) of given
and the product of
the estimated conditional cdfs of
and
given
. The joint cdf is estimated by the nonparametric maximum likelihood estimator developed by
Huang and Louis (1998). The marginal cdf of
is estimated as one minus the Kaplan-Meier estimator for the conditional survival function of
, and the
cdf of
is estimated as the empirical cdf of the observed values of
. A bootstrap algorithm is used to compute the p-value.
Returns the bootstrap p-value from the test of conditional independence between and
given
.
Juraska, M. and Gilbert, P. B. (2013), Mark-specific hazard ratio model with multivariate continuous marks: an application to vaccine efficacy. Biometrics 69(2):328–337.
Huang, Y. and Louis, T. (1998), Nonparametric estimation of the joint distribution of survival time and mark variables. Biometrika 85, 785–798.
n <- 500 tx <- rep(0:1, each=n/2) tm <- c(rexp(n/2, 0.2), rexp(n/2, 0.2 * exp(-0.4))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) mark1 <- ifelse(eventInd==1, c(rbeta(n/2, 2, 5), rbeta(n/2, 2, 2)), NA) mark2 <- ifelse(eventInd==1, c(rbeta(n/2, 1, 3), rbeta(n/2, 5, 1)), NA) # perform the test for a univariate mark in the placebo group testIndepTimeMark(data.frame(eventTime, eventInd, mark1)[tx==0, ], iter=20) # perform the test for a bivariate mark in the placebo group testIndepTimeMark(data.frame(eventTime, eventInd, mark1, mark2)[tx==0, ], iter=20)
n <- 500 tx <- rep(0:1, each=n/2) tm <- c(rexp(n/2, 0.2), rexp(n/2, 0.2 * exp(-0.4))) cens <- runif(n, 0, 15) eventTime <- pmin(tm, cens, 3) eventInd <- as.numeric(tm <= pmin(cens, 3)) mark1 <- ifelse(eventInd==1, c(rbeta(n/2, 2, 5), rbeta(n/2, 2, 2)), NA) mark2 <- ifelse(eventInd==1, c(rbeta(n/2, 1, 3), rbeta(n/2, 5, 1)), NA) # perform the test for a univariate mark in the placebo group testIndepTimeMark(data.frame(eventTime, eventInd, mark1)[tx==0, ], iter=20) # perform the test for a bivariate mark in the placebo group testIndepTimeMark(data.frame(eventTime, eventInd, mark1, mark2)[tx==0, ], iter=20)