Title: | Flexible and Efficient Evaluation of Principal Surrogates/Treatment Effect Modifiers |
---|---|
Description: | Implements estimation and testing procedures for evaluating an intermediate biomarker response as a principal surrogate of a clinical response to treatment (i.e., principal stratification effect modification analysis), as described in Juraska M, Huang Y, and Gilbert PB (2020), Inference on treatment effect modification by biomarker response in a three-phase sampling design, Biostatistics, 21(3): 545-560 <doi:10.1093/biostatistics/kxy074>. The methods avoid the restrictive 'placebo structural risk' modeling assumption common to past methods and further improve robustness by the use of nonparametric kernel smoothing for biomarker density estimation. A randomized controlled two-group clinical efficacy trial is assumed with an ordered categorical or continuous univariate biomarker response measured at a fixed timepoint post-randomization and with a univariate baseline surrogate measure allowed to be observed in only a subset of trial participants with an observed biomarker response (see the flexible three-phase sampling design in the paper for details). Bootstrap-based procedures are available for pointwise and simultaneous confidence intervals and testing of four relevant hypotheses. Summary and plotting functions are provided for estimation results. |
Authors: | Michal Juraska [aut, cre] |
Maintainer: | Michal Juraska <[email protected]> |
License: | GPL-2 |
Version: | 1.0.3 |
Built: | 2024-11-08 04:13:01 UTC |
Source: | https://github.com/mjuraska/pssmooth |
Estimates $P(Y(z)=1 \mid S(1)=s_1)$, $z=0,1$, on a grid of $s_1$ values in bootstrap resamples (see riskCurve for notation introduction). Cases ($Y=1$) and controls ($Y=0$) are sampled separately, yielding a fixed number of cases and controls in each bootstrap sample. Consequently, the number of controls with available phase 2 data varies across bootstrap samples.
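The resampling scheme described above can be sketched in base R. This is only an illustration under assumed toy data, not the package's internal code; the helper name stratifiedBootIdx and the simulated variables are hypothetical.

```r
# Toy data (assumed for illustration): Y = 1 for cases, Y = 0 for controls
set.seed(1)
n <- 200
Y <- rbinom(n, 1, 0.2)
phase2 <- rbinom(n, 1, 0.4)   # indicator of being sampled into the phase 2 subset
# biomarker observed in all cases and in phase 2 controls only
S <- ifelse(Y == 1 | phase2 == 1, rnorm(n, mean = 2), NA)

# Hypothetical helper: resample row indices within cases and within controls
stratifiedBootIdx <- function(y) {
  cases <- which(y == 1)
  controls <- which(y == 0)
  c(cases[sample.int(length(cases), replace = TRUE)],
    controls[sample.int(length(controls), replace = TRUE)])
}

idx <- replicate(5, stratifiedBootIdx(Y))   # n-by-5 matrix of row indices

# the number of cases is the same in every bootstrap sample ...
nCases <- apply(idx, 2, function(i) sum(Y[i] == 1))
# ... whereas the number of controls with an observed biomarker varies
nControlsPhase2 <- apply(idx, 2, function(i) sum(Y[i] == 0 & !is.na(S[i])))
```

Comparing nCases with nControlsPhase2 across the five resamples shows the fixed case count alongside the varying count of controls with phase 2 data.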
bootRiskCurve(
  formula,
  bsm,
  tx,
  data,
  pstype = c("continuous", "ordered"),
  bsmtype = c("continuous", "ordered"),
  bwtype = c("fixed", "generalized_nn", "adaptive_nn"),
  hinge = FALSE,
  weights = NULL,
  psGrid = NULL,
  iter,
  seed = NULL,
  saveFile = NULL,
  saveDir = NULL
)
formula |
a formula object with the binary clinical endpoint on the left of the |
bsm |
a character string specifying the variable name in |
tx |
a character string specifying the variable name in |
data |
a data frame with one row per randomized participant endpoint-free at |
pstype |
a character string specifying whether the biomarker response shall be treated as a |
bsmtype |
a character string specifying whether the baseline surrogate measure shall be treated as a |
bwtype |
a character string specifying the bandwidth type for continuous variables in the kernel density estimation. The options are |
hinge |
a logical value ( |
weights |
either a numeric vector of weights or a character string specifying the variable name in |
psGrid |
a numeric vector of |
iter |
the number of bootstrap iterations |
seed |
a seed of the random number generator supplied to |
saveFile |
a character string specifying the name of an |
saveDir |
a character string specifying a path for the output directory. If |
If saveFile and saveDir are both specified, the output list (named bList) is saved as an .RData file; otherwise it is returned only. The output object is a list with the following components:
- psGrid: a numeric vector of values at which the conditional clinical endpoint risk is estimated in the components plaRiskCurveBoot and txRiskCurveBoot
- plaRiskCurveBoot: a length(psGrid)-by-iter matrix of estimates of $P(Y(0)=1 \mid S(1)=s_1)$ for $s_1$ in psGrid, with columns representing bootstrap samples
- txRiskCurveBoot: a length(psGrid)-by-iter matrix of estimates of $P(Y(1)=1 \mid S(1)=s_1)$ for $s_1$ in psGrid, with columns representing bootstrap samples
- cpointPboot: if hinge=TRUE, a numeric vector of estimates of the hinge point in the placebo group in each bootstrap sample
- cpointTboot: if hinge=TRUE, a numeric vector of estimates of the hinge point in the treatment group in each bootstrap sample
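To make the layout of the returned matrices concrete, the sketch below builds a mock list with the documented component names and derives a bootstrap standard error at each grid point. The numbers are random placeholders, not output of bootRiskCurve, and taking the row-wise standard deviation is an assumption about how one might use the replicates.

```r
# Mock object mimicking the documented structure (random values, not real estimates)
set.seed(2)
psGrid <- c(1.5, 2.5, 3.5)
iter <- 200
bList <- list(
  psGrid = psGrid,
  plaRiskCurveBoot = matrix(plogis(rnorm(length(psGrid) * iter, -1.5, 0.2)),
                            nrow = length(psGrid)),
  txRiskCurveBoot = matrix(plogis(rnorm(length(psGrid) * iter, -2.0, 0.2)),
                           nrow = length(psGrid))
)

# rows index grid points, columns index bootstrap samples
dim(bList$txRiskCurveBoot)

# bootstrap standard error of the estimated risk at each grid point
sePla <- apply(bList$plaRiskCurveBoot, 1, sd)
seTx <- apply(bList$txRiskCurveBoot, 1, sd)
```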
Fong, Y., Huang, Y., Gilbert, P. B., and Permar, S. R. (2017), chngpt: threshold regression model estimation and inference, BMC Bioinformatics, 18.
riskCurve, summary.riskCurve and plotMCEPcurve
library(pssmooth)

n <- 500
Z <- rep(0:1, each=n/2)
S <- MASS::mvrnorm(n, mu=c(2,2,3), Sigma=matrix(c(1,0.9,0.7,0.9,1,0.7,0.7,0.7,1), nrow=3))
p <- pnorm(drop(cbind(1,Z,(1-Z)*S[,2],Z*S[,3]) %*% c(-1.2,0.2,-0.02,-0.2)))
Y <- sapply(p, function(risk){ rbinom(1,1,risk) })
X <- rbinom(n,1,0.5)
# delete S(1) in placebo recipients
S[Z==0,3] <- NA
# delete S(0) in treatment recipients
S[Z==1,2] <- NA
# generate the indicator of being sampled into the phase 2 subset
phase2 <- rbinom(n,1,0.4)
# delete Sb, S(0) and S(1) in controls not included in the phase 2 subset
S[Y==0 & phase2==0,] <- c(NA,NA,NA)
# delete Sb in cases not included in the phase 2 subset
S[Y==1 & phase2==0,1] <- NA
data <- data.frame(X,Z,S[,1],ifelse(Z==0,S[,2],S[,3]),Y)
colnames(data) <- c("X","Z","Sb","S","Y")
qS <- quantile(data$S, probs=c(0.05,0.95), na.rm=TRUE)
grid <- seq(qS[1], qS[2], length.out=3)

out <- bootRiskCurve(formula=Y ~ S + factor(X), bsm="Sb", tx="Z", data=data,
                     psGrid=grid, iter=1, seed=10)

# alternatively, to save the .RData output file (no '<-' needed):
bootRiskCurve(formula=Y ~ S + factor(X), bsm="Sb", tx="Z", data=data,
              psGrid=grid, iter=1, seed=10, saveFile="out.RData", saveDir="./")
Plots point estimates and, if available, pointwise and simultaneous Wald-type bootstrap confidence intervals for the specified marginal causal effect predictiveness (mCEP) curve.
plotMCEPcurve(
  object,
  confLevel = 0.95,
  hingePoint = NULL,
  title = NULL,
  xLab = NULL,
  yLab = NULL,
  yLim = NULL,
  pType = c("l", "p")
)
object |
an object returned by |
confLevel |
the confidence level (0.95 by default) of pointwise and simultaneous confidence intervals |
hingePoint |
the hinge point estimate ( |
title |
a character string specifying the plot title |
xLab |
a character string specifying the x-axis label ( |
yLab |
a character string specifying the y-axis label ( |
yLim |
a numeric vector of length 2 specifying the y-axis range ( |
pType |
a character string specifying the type of plot. Possible options are |
None. The function is called solely for plot generation.
riskCurve, bootRiskCurve and summary.riskCurve
library(pssmooth)

n <- 500
Z <- rep(0:1, each=n/2)
S <- MASS::mvrnorm(n, mu=c(2,2,3), Sigma=matrix(c(1,0.9,0.7,0.9,1,0.7,0.7,0.7,1), nrow=3))
p <- pnorm(drop(cbind(1,Z,(1-Z)*S[,2],Z*S[,3]) %*% c(-1.2,0.2,-0.02,-0.2)))
Y <- sapply(p, function(risk){ rbinom(1,1,risk) })
X <- rbinom(n,1,0.5)
# delete S(1) in placebo recipients
S[Z==0,3] <- NA
# delete S(0) in treatment recipients
S[Z==1,2] <- NA
# generate the indicator of being sampled into the phase 2 subset
phase2 <- rbinom(n,1,0.3)
# delete Sb, S(0) and S(1) in controls not included in the phase 2 subset
S[Y==0 & phase2==0,] <- c(NA,NA,NA)
# delete Sb in cases not included in the phase 2 subset
S[Y==1 & phase2==0,1] <- NA
data <- data.frame(X,Z,S[,1],ifelse(Z==0,S[,2],S[,3]),Y)
colnames(data) <- c("X","Z","Sb","S","Y")
qS <- quantile(data$S, probs=c(0.05,0.95), na.rm=TRUE)
grid <- seq(qS[1], qS[2], length.out=3)

out <- riskCurve(formula=Y ~ S + factor(X), bsm="Sb", tx="Z", data=data, psGrid=grid)
boot <- bootRiskCurve(formula=Y ~ S + factor(X), bsm="Sb", tx="Z", data=data,
                      psGrid=grid, iter=2, seed=10)
sout <- summary(out, boot, contrast="te")
plotMCEPcurve(sout)
Estimates $P(Y(z)=1 \mid S(1)=s_1)$, $z=0,1$, on a grid of $s_1$ values following the estimation method of Juraska, Huang, and Gilbert (2020), where $Z$ is the treatment group indicator ($Z=1$, treatment; $Z=0$, placebo), $S(z)$ is a continuous or ordered categorical univariate biomarker under assignment to $Z=z$ measured at fixed time $t_0$ after randomization, and $Y$ is a binary clinical endpoint ($Y=1$, disease; $Y=0$, no disease) measured after $t_0$. The estimator employs the generalized product kernel density/probability estimation method of Hall, Racine, and Li (2004) implemented in the np package. The risks $P(Y(z)=1 \mid S(z)=s_1, X=x)$, $z=0,1$, where $X$ is a vector of discrete baseline covariates, are estimated by fitting inverse probability-weighted logistic regression models using the osDesign package.
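The idea behind the inverse probability-weighted logistic regression step can be illustrated with base R's glm() standing in for the osDesign routines. This is a simplified sketch under assumed toy data and an assumed known sampling probability (pi2 is hypothetical); it is not the estimator used by riskCurve.

```r
# Toy single-arm data (assumed for illustration)
set.seed(3)
n <- 1000
S <- rnorm(n, mean = 2)
Y <- rbinom(n, 1, pnorm(-1.2 - 0.2 * S))

pi2 <- 0.4                           # assumed known phase 2 sampling probability for controls
phase2 <- rbinom(n, 1, pi2)
observed <- Y == 1 | phase2 == 1     # biomarker measured in all cases and in sampled controls
w <- ifelse(Y == 1, 1, 1 / pi2)      # inverse sampling probability weights

# weighted logistic regression on participants with an observed biomarker;
# quasibinomial avoids the non-integer-count warning triggered by weights
fit <- glm(Y ~ S, family = quasibinomial(link = "logit"),
           weights = w, subset = observed)

# estimated endpoint risk at a few biomarker values
riskAt <- predict(fit, newdata = data.frame(S = c(1, 2, 3)), type = "response")
```

Weighting each sampled control by 1/pi2 lets the subset with measured biomarkers represent the full set of controls, which is the role the weights argument plays when sampling probabilities are supplied by the user.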
riskCurve(
  formula,
  bsm,
  tx,
  data,
  pstype = c("continuous", "ordered"),
  bsmtype = c("continuous", "ordered"),
  bwtype = c("fixed", "generalized_nn", "adaptive_nn"),
  hinge = FALSE,
  weights = NULL,
  psGrid = NULL,
  saveFile = NULL,
  saveDir = NULL
)
formula |
a formula object with the binary clinical endpoint on the left of the |
bsm |
a character string specifying the variable name in |
tx |
a character string specifying the variable name in |
data |
a data frame with one row per randomized participant endpoint-free at |
pstype |
a character string specifying whether the biomarker response shall be treated as a |
bsmtype |
a character string specifying whether the baseline surrogate measure shall be treated as a |
bwtype |
a character string specifying the bandwidth type for continuous variables in the kernel density estimation. The options are |
hinge |
a logical value ( |
weights |
either a numeric vector of weights or a character string specifying the variable name in |
psGrid |
a numeric vector of |
saveFile |
a character string specifying the name of an |
saveDir |
a character string specifying a path for the output directory. If |
If saveFile and saveDir are both specified, the output list (named oList) is saved as an .RData file; otherwise it is returned only. The output object (of class riskCurve) is a list with the following components:
- psGrid: a numeric vector of values at which the conditional clinical endpoint risk is estimated in the components plaRiskCurve and txRiskCurve
- plaRiskCurve: a numeric vector of estimates of $P(Y(0)=1 \mid S(1)=s_1)$ for $s_1$ in psGrid
- txRiskCurve: a numeric vector of estimates of $P(Y(1)=1 \mid S(1)=s_1)$ for $s_1$ in psGrid
- fOptBandwidths: a conbandwidth object returned by the call of the function npcdensbw containing the optimal bandwidths, selected by likelihood cross-validation, in the kernel estimation of the conditional density of $S(1)$ given the baseline surrogate measure and any other specified baseline covariates
- gOptBandwidths: a conbandwidth object returned by the call of the function npcdensbw or npudensbw containing the optimal bandwidths, selected by likelihood cross-validation, in the kernel estimation of the conditional density of $S(0)$ given any specified baseline covariates or the marginal density of $S(0)$ if no baseline covariates are specified in formula
- cpointP: if hinge=TRUE, the estimate of the hinge point in the placebo group
- cpointT: if hinge=TRUE, the estimate of the hinge point in the treatment group
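The save-and-restore behavior described above can be mimicked with base R's save() and load(). The sketch below uses a mock list with made-up numbers and a temporary directory; how the package builds the file path internally is an assumption.

```r
# Mock output list with some of the documented component names (illustrative values)
oList <- list(
  psGrid = c(1.5, 2.5, 3.5),
  plaRiskCurve = c(0.20, 0.18, 0.17),
  txRiskCurve = c(0.15, 0.10, 0.07)
)

saveDir <- tempdir()        # stands in for the user-supplied output directory
saveFile <- "out.RData"

# an .RData file stores the object together with its name
save(oList, file = file.path(saveDir, saveFile))
rm(oList)

# load() restores the object under its saved name, here oList
load(file.path(saveDir, saveFile))
oList$txRiskCurve
```

Because load() restores objects under their saved names, a file written by riskCurve would be expected to yield an object called oList, and one written by bootRiskCurve an object called bList.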
Fong, Y., Huang, Y., Gilbert, P. B., and Permar, S. R. (2017), chngpt: threshold regression model estimation and inference, BMC Bioinformatics, 18.
Hall, P., Racine, J., and Li, Q. (2004), Cross-validation and the estimation of conditional probability densities, JASA 99(468), 1015-1026.
Juraska, M., Huang, Y., and Gilbert, P. B. (2020), Inference on treatment effect modification by biomarker response in a three-phase sampling design, Biostatistics, 21(3): 545-560, https://doi.org/10.1093/biostatistics/kxy074.
bootRiskCurve, summary.riskCurve and plotMCEPcurve
library(pssmooth)

n <- 500
Z <- rep(0:1, each=n/2)
S <- MASS::mvrnorm(n, mu=c(2,2,3), Sigma=matrix(c(1,0.9,0.7,0.9,1,0.7,0.7,0.7,1), nrow=3))
p <- pnorm(drop(cbind(1,Z,(1-Z)*S[,2],Z*S[,3]) %*% c(-1.2,0.2,-0.02,-0.2)))
Y <- sapply(p, function(risk){ rbinom(1,1,risk) })
X <- rbinom(n,1,0.5)
# delete S(1) in placebo recipients
S[Z==0,3] <- NA
# delete S(0) in treatment recipients
S[Z==1,2] <- NA
# generate the indicator of being sampled into the phase 2 subset
phase2 <- rbinom(n,1,0.4)
# delete Sb, S(0) and S(1) in controls not included in the phase 2 subset
S[Y==0 & phase2==0,] <- c(NA,NA,NA)
# delete Sb in cases not included in the phase 2 subset
S[Y==1 & phase2==0,1] <- NA
data <- data.frame(X,Z,S[,1],ifelse(Z==0,S[,2],S[,3]),Y)
colnames(data) <- c("X","Z","Sb","S","Y")
qS <- quantile(data$S, probs=c(0.05,0.95), na.rm=TRUE)
grid <- seq(qS[1], qS[2], length.out=3)

out <- riskCurve(formula=Y ~ S + factor(X), bsm="Sb", tx="Z", data=data, psGrid=grid)

# alternatively, to save the .RData output file (no '<-' needed):
riskCurve(formula=Y ~ S + factor(X), bsm="Sb", tx="Z", data=data,
          saveFile="out.RData", saveDir="./")
Summarizes point estimates and pointwise and simultaneous Wald-type bootstrap confidence intervals for a specified marginal causal effect predictiveness (mCEP) curve (see, e.g., Juraska, Huang, and Gilbert (2020) for the definition).
## S3 method for class 'riskCurve'
summary(
  object,
  boot = NULL,
  contrast = c("te", "rr", "logrr", "rd"),
  confLevel = 0.95,
  ...
)
object |
an object of class |
boot |
an object returned by |
contrast |
a character string specifying the mCEP curve. It must be one of |
confLevel |
the confidence level of pointwise and simultaneous confidence intervals |
... |
for other methods |
A data frame containing point and possibly interval estimates of the specified mCEP curve.
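As a rough illustration of how the four contrast options relate to the two risk curves, and of what a pointwise Wald-type bootstrap interval looks like, consider the sketch below. The contrast definitions (in particular the sign convention for rd), the choice of the log relative risk scale for the interval, the helper name mCEP and all numbers are assumptions made for illustration; the package's exact conventions should be checked in its help pages.

```r
# Hypothetical risk estimates on a 3-point grid (illustrative numbers only)
plaRisk <- c(0.20, 0.18, 0.17)
txRisk <- c(0.15, 0.10, 0.07)

# Assumed contrast definitions; rd is taken here as placebo minus treatment
mCEP <- function(pla, tx, contrast = c("te", "rr", "logrr", "rd")) {
  contrast <- match.arg(contrast)
  switch(contrast,
         te = 1 - tx / pla,
         rr = tx / pla,
         logrr = log(tx / pla),
         rd = pla - tx)
}

# Mock bootstrap replicates (rows = grid points, columns = bootstrap samples)
set.seed(5)
B <- 500
plaBoot <- plaRisk * exp(matrix(rnorm(3 * B, sd = 0.10), nrow = 3))
txBoot <- txRisk * exp(matrix(rnorm(3 * B, sd = 0.15), nrow = 3))

# Pointwise Wald-type interval on the log relative risk scale, mapped back to TE
confLevel <- 0.95
z <- qnorm(1 - (1 - confLevel) / 2)
logrrHat <- mCEP(plaRisk, txRisk, "logrr")
seLogrr <- apply(mCEP(plaBoot, txBoot, "logrr"), 1, sd)
teSummary <- data.frame(
  te = 1 - exp(logrrHat),
  lower = 1 - exp(logrrHat + z * seLogrr),
  upper = 1 - exp(logrrHat - z * seLogrr)
)
```

Each row of teSummary corresponds to one grid point, in the spirit of the data frame returned by the summary method.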
Juraska, M., Huang, Y., and Gilbert, P. B. (2020), Inference on treatment effect modification by biomarker response in a three-phase sampling design, Biostatistics, 21(3): 545-560, https://doi.org/10.1093/biostatistics/kxy074.
library(pssmooth)

n <- 500
Z <- rep(0:1, each=n/2)
S <- MASS::mvrnorm(n, mu=c(2,2,3), Sigma=matrix(c(1,0.9,0.7,0.9,1,0.7,0.7,0.7,1), nrow=3))
p <- pnorm(drop(cbind(1,Z,(1-Z)*S[,2],Z*S[,3]) %*% c(-1.2,0.2,-0.02,-0.2)))
Y <- sapply(p, function(risk){ rbinom(1,1,risk) })
# delete S(1) in placebo recipients
S[Z==0,3] <- NA
# delete S(0) in treatment recipients
S[Z==1,2] <- NA
# generate the indicator of being sampled into the phase 2 subset
phase2 <- rbinom(n,1,0.4)
# delete Sb, S(0) and S(1) in controls not included in the phase 2 subset
S[Y==0 & phase2==0,] <- c(NA,NA,NA)
# delete Sb in cases not included in the phase 2 subset
S[Y==1 & phase2==0,1] <- NA
data <- data.frame(Z,S[,1],ifelse(Z==0,S[,2],S[,3]),Y)
colnames(data) <- c("Z","Sb","S","Y")
qS <- quantile(data$S, probs=c(0.05,0.95), na.rm=TRUE)
grid <- seq(qS[1], qS[2], length.out=2)

out <- riskCurve(formula=Y ~ S, bsm="Sb", tx="Z", data=data, psGrid=grid)
boot <- bootRiskCurve(formula=Y ~ S, bsm="Sb", tx="Z", data=data, psGrid=grid,
                      iter=2, seed=10)
summary(out, boot, contrast="te")
Computes a two-sided p-value either from the test of {$H_0^1: mCEP(s_1) = CE$ for all $s_1$}, where $CE$ is the overall causal treatment effect on the clinical endpoint, or from the test of {$H_0^2: mCEP(s_1) = c$ for all $s_1$ in the interval limS1 and a specified constant $c$}, each against a general alternative hypothesis. The testing procedures are described in Juraska, Huang, and Gilbert (2020) and are based on the simultaneous estimation method of Roy and Bose (1953).
testConstancy(
  object,
  boot,
  contrast = c("te", "rr", "logrr", "rd"),
  null = c("H01", "H02"),
  overallPlaRisk = NULL,
  overallTxRisk = NULL,
  MCEPconstantH02 = NULL,
  limS1 = NULL
)
object |
an object returned by |
boot |
an object returned by |
contrast |
a character string specifying the mCEP curve. It must be one of |
null |
a character string specifying the null hypothesis to be tested; one of |
overallPlaRisk |
a numeric value of the estimated overall clinical endpoint risk in the placebo group. It is required when |
overallTxRisk |
a numeric value of the estimated overall clinical endpoint risk in the treatment group. It is required when |
MCEPconstantH02 |
the constant |
limS1 |
a numeric vector of length 2 specifying an interval that is a subset of the support of |
A numeric value representing the two-sided p-value from the test of either $H_0^1$ or $H_0^2$.
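One generic way to test a curve against a constant over a grid is a supremum-type bootstrap test, sketched below. It is offered only to convey the flavor of a simultaneous test and may differ in detail from the procedure implemented in testConstancy; the helper supTest and all inputs are hypothetical.

```r
# Hypothetical mCEP point estimates on a 5-point grid
set.seed(4)
est <- c(0.10, 0.25, 0.40, 0.55, 0.70)
B <- 500
# mock bootstrap replicates: grid points in rows, bootstrap samples in columns
bootMat <- est + matrix(rnorm(length(est) * B, sd = 0.08), nrow = length(est))

# Hypothetical helper: sup-type test of {curve(s) = nullValue for all s on the grid}
supTest <- function(est, bootMat, nullValue) {
  se <- apply(bootMat, 1, sd)                        # bootstrap SE at each grid point
  obs <- max(abs(est - nullValue) / se)              # observed maximal standardized deviation
  supBoot <- apply(abs(bootMat - est) / se, 2, max)  # bootstrap distribution of the maximum
  mean(supBoot >= obs)                               # two-sided p-value
}

pConst <- supTest(est, bootMat, nullValue = 0.4)     # constant null value
pSelf <- supTest(est, bootMat, nullValue = est)      # null equal to the estimate itself
```

Taking the maximum over the grid is what makes the comparison simultaneous rather than pointwise: a single p-value summarizes the evidence against constancy across all grid points.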
Juraska, M., Huang, Y., and Gilbert, P. B. (2020), Inference on treatment effect modification by biomarker response in a three-phase sampling design, Biostatistics, 21(3): 545-560, https://doi.org/10.1093/biostatistics/kxy074.
Roy, S. N. and Bose, R. C. (1953), Simultaneous confidence interval estimation, The Annals of Mathematical Statistics, 24, 513-536.
riskCurve, bootRiskCurve and testEquality
library(pssmooth)

n <- 500
Z <- rep(0:1, each=n/2)
S <- MASS::mvrnorm(n, mu=c(2,2,3), Sigma=matrix(c(1,0.9,0.7,0.9,1,0.7,0.7,0.7,1), nrow=3))
p <- pnorm(drop(cbind(1,Z,(1-Z)*S[,2],Z*S[,3]) %*% c(-1.2,0.2,-0.02,-0.2)))
Y <- sapply(p, function(risk){ rbinom(1,1,risk) })
X <- rbinom(n,1,0.5)
# delete S(1) in placebo recipients
S[Z==0,3] <- NA
# delete S(0) in treatment recipients
S[Z==1,2] <- NA
# generate the indicator of being sampled into the phase 2 subset
phase2 <- rbinom(n,1,0.4)
# delete Sb, S(0) and S(1) in controls not included in the phase 2 subset
S[Y==0 & phase2==0,] <- c(NA,NA,NA)
# delete Sb in cases not included in the phase 2 subset
S[Y==1 & phase2==0,1] <- NA
data <- data.frame(X,Z,S[,1],ifelse(Z==0,S[,2],S[,3]),Y)
colnames(data) <- c("X","Z","Sb","S","Y")
qS <- quantile(data$S, probs=c(0.05,0.95), na.rm=TRUE)
grid <- seq(qS[1], qS[2], length.out=3)

out <- riskCurve(formula=Y ~ S + factor(X), bsm="Sb", tx="Z", data=data, psGrid=grid)
boot <- bootRiskCurve(formula=Y ~ S + factor(X), bsm="Sb", tx="Z", data=data,
                      psGrid=grid, iter=2, seed=10)
# overall clinical endpoint risks in the placebo and treatment groups
fit <- glm(Y ~ Z, data=data, family=binomial)
prob <- predict(fit, newdata=data.frame(Z=0:1), type="response")

testConstancy(out, boot, contrast="te", null="H01", overallPlaRisk=prob[1],
              overallTxRisk=prob[2])
testConstancy(out, boot, contrast="te", null="H02", MCEPconstantH02=0,
              limS1=c(qS[1],1.5))
Computes a two-sided p-value either from the test of {$H_0^3: mCEP_1(s_1) = mCEP_2(s_1)$ for all $s_1$ in limS1}, where $mCEP_1$ and $mCEP_2$ are each associated with either a different biomarker (measured in the same units) or a different endpoint or both, or from the test of {$H_0^4: mCEP(s_1 \mid X=0) = mCEP(s_1 \mid X=1)$ for all $s_1$ in limS1}, where $X$ is a baseline dichotomous phase 1 covariate of interest, each against a general alternative hypothesis. The testing procedures are described in Juraska, Huang, and Gilbert (2020) and are based on the simultaneous estimation method of Roy and Bose (1953).
testEquality(
  object1,
  object2,
  boot1,
  boot2,
  contrast = c("te", "rr", "logrr", "rd"),
  null = c("H03", "H04"),
  limS1 = NULL
)
object1 |
an object returned by |
object2 |
an object returned by |
boot1 |
an object returned by |
boot2 |
an object returned by |
contrast |
a character string specifying the mCEP curve. It must be one of |
null |
a character string specifying the null hypothesis to be tested; one of |
limS1 |
a numeric vector of length 2 specifying an interval that is a subset of the support of |
A numeric value representing the two-sided p-value from the test of either $H_0^3$ or $H_0^4$.
Juraska, M., Huang, Y., and Gilbert, P. B. (2020), Inference on treatment effect modification by biomarker response in a three-phase sampling design, Biostatistics, 21(3): 545-560, https://doi.org/10.1093/biostatistics/kxy074.
Roy, S. N. and Bose, R. C. (1953), Simultaneous confidence interval estimation, The Annals of Mathematical Statistics, 24, 513-536.
riskCurve, bootRiskCurve and testConstancy
library(pssmooth)

n <- 500
Z <- rep(0:1, each=n/2)
S <- MASS::mvrnorm(n, mu=c(2,2,3), Sigma=matrix(c(1,0.9,0.7,0.9,1,0.7,0.7,0.7,1), nrow=3))
p <- pnorm(drop(cbind(1,Z,(1-Z)*S[,2],Z*S[,3]) %*% c(-1.2,0.2,-0.02,-0.2)))
Y <- sapply(p, function(risk){ rbinom(1,1,risk) })
X <- rbinom(n,1,0.5)
# delete S(1) in placebo recipients
S[Z==0,3] <- NA
# delete S(0) in treatment recipients
S[Z==1,2] <- NA
# generate the indicator of being sampled into the phase 2 subset
phase2 <- rbinom(n,1,0.4)
# delete Sb, S(0) and S(1) in controls not included in the phase 2 subset
S[Y==0 & phase2==0,] <- c(NA,NA,NA)
# delete Sb in cases not included in the phase 2 subset
S[Y==1 & phase2==0,1] <- NA
data <- data.frame(X,Z,S[,1],ifelse(Z==0,S[,2],S[,3]),Y)
colnames(data) <- c("X","Z","Sb","S","Y")
qS <- quantile(data$S, probs=c(0.05,0.95), na.rm=TRUE)
grid <- seq(qS[1], qS[2], length.out=3)

# fit the risk curves separately within each level of X
out0 <- riskCurve(formula=Y ~ S, bsm="Sb", tx="Z", data=data[data$X==0,], psGrid=grid)
out1 <- riskCurve(formula=Y ~ S, bsm="Sb", tx="Z", data=data[data$X==1,], psGrid=grid)
boot0 <- bootRiskCurve(formula=Y ~ S, bsm="Sb", tx="Z", data=data[data$X==0,],
                       psGrid=grid, iter=2, seed=10)
boot1 <- bootRiskCurve(formula=Y ~ S, bsm="Sb", tx="Z", data=data[data$X==1,],
                       psGrid=grid, iter=2, seed=15)

testEquality(out0, out1, boot0, boot1, contrast="te", null="H04")