Title: | General Semiparametric Shared Frailty Model |
---|---|
Description: | Simulates and fits semiparametric shared frailty models under a wide range of frailty distributions using a consistent and asymptotically-normal estimator. Currently supports: gamma, power variance function, log-normal, and inverse Gaussian frailty models. |
Authors: | Vinnie Monaco [aut, cre], Malka Gorfine [aut], Li Hsu [aut] |
Maintainer: | Vinnie Monaco <[email protected]> |
License: | LGPL-2 |
Version: | 1.3.8 |
Built: | 2025-03-06 03:48:06 UTC |
Source: | https://github.com/vmonaco/frailtysurv |
The Diabetic Retinopathy Study (DRS) was performed to determine whether the onset of blindness in 197 high-risk diabetic patients could be delayed by laser treatment. The treatment was administered to one randomly-selected eye in each patient, leaving the other eye untreated. Thus, there are 394 observations, which are clustered by patient since the level of risk will tend to vary between patients. A failure occurred when visual acuity dropped to below 5/200. All patients had a visual acuity of at least 20/100 at the beginning of the study.
data("drs")
data("drs")
A data frame with 394 rows and 8 columns. There are two rows for each subject, one row for each eye:
unique identifier for each subject
subject's eye, where 1=right and 2=left
the observed follow-up time
outcome at the end of the observation period, where 1=blindness and 0 indicates censorship
a binary covariate, where 1=treated or 0=untreated
age (in years) at the onset of diabetes
type of laser used for treatment, where 1=xenon, 2=argon
type of diabetes, where 1=juvenile (age at dx < 20) and 2=adult
https://www.mayo.edu/research/documents/diabeteshtml/doc-10027460
## Not run: data(drs) # Clustered by subject fit.drs <- fitfrail(Surv(time, status) ~ treated + cluster(subject_id), drs, frailty="gamma") fit.drs # Variance estimates vcov(fit.drs) # Plot the estimated cumulative baseline hazard plot(fit.drs, type="cumhaz") ## End(Not run)
## Not run: data(drs) # Clustered by subject fit.drs <- fitfrail(Surv(time, status) ~ treated + cluster(subject_id), drs, frailty="gamma") fit.drs # Variance estimates vcov(fit.drs) # Plot the estimated cumulative baseline hazard plot(fit.drs, type="cumhaz") ## End(Not run)
Fit an extended Cox proportional hazards model with unobserved shared frailty variate and unspecified baseline hazard function, using a semiparametric estimation technique. See Gorfine et al.~(2006) and Zucker et al.~(2008) for details.
fitfrail(formula, dat, control, frailty, weights = NULL, se = FALSE, ...)
fitfrail(formula, dat, control, frailty, weights = NULL, se = FALSE, ...)
formula |
a |
dat |
data.frame that provides context for the formula |
control |
control parameters in the form of a |
frailty |
string name of the shared frailty distribution |
weights |
vector of cluster weights |
se |
logical value, whether the standard errors of the regression coefficient and frailty distribution parameter estimates should be calculated. These are obtained using the |
... |
additional arguments will be passed to |
A fitfrail object representing the shared frailty model.
beta |
the estimated regression coefficients |
theta |
the estimated frailty distribution parameters |
Lambda |
a |
Lambda |
a |
Lambda.fun |
a function of time that returns the estimated baseline |
loglik |
the log-likelihood |
iter |
the number of iterations performed |
trace |
the parameter trace during estimation |
The initial values of the regression coefficients are provided by coxph
. Convergence is reached when either the relative reduction or absolute reduction in loglikelihood or score equations (depending on the fitmethod used) are below a threshold. If the maxit iterations are performed before convergence, then the algorithm terminates with a warning.
The estimation method was developed by Malka Gorfine, Li Hsu, and David Zucker; implemented by John V. Monaco.
Gorfine M, Zucker DM, Hsu L (2006) Prospective survival analysis with a general semiparametric shared frailty model: A pseudo full likelihood approach. Biometrika, 93(3), 735-741.
Monaco JV, Gorfine M, Hsu L (2018) General Semiparametric Shared Frailty Model: Estimation and Simulation with frailtySurv Journal of Statistical Software, 86(4), 1-42
Zucker DM, Gorfine M, Hsu L (2008) Pseudo-full likelihood estimation for prospective survival analysis with a general semiparametric shared frailty model: Asymptotic theory. Journal of Statistical Planning and Inference, 138(7), 1998-2016.
vcov.fitfrail
, genfrail
, simfrail
,
survfit
, coxph
## Not run: # # Generate synthetic survival data with regression coefficients # beta = c(log(2),log(3)) and theta = 2, where the shared frailty # values from a gamma distribution with expectation 1 and variance theta. # dat <- genfrail(N=300, K=2, beta=c(log(2),log(3)), frailty="gamma", theta=2, censor.rate=0.35, Lambda_0=function(t, tau=4.6, C=0.01) (C*t)^tau) # Fit a shared frailty model fit <- fitfrail(Surv(time, status) ~ Z1 + Z2 + cluster(family), dat, frailty="gamma") fit # The Lambda.fun function can give the estimated cumulative baseline hazard at # any time fit$Lambda.fun(seq(0, 100, by=10)) # Fit the DRS data, clustered on patient data(drs) fit.drs <- fitfrail(Surv(time, status) ~ treated + cluster(subject_id), drs, frailty="gamma") fit.drs ## End(Not run) # # A small example with c(log(2),log(3)) coefficients, Gamma(2) frailty, and # 0.10 censorship. # dat <- genfrail(N=30, K=2, beta=c(log(2),log(3)), frailty="gamma", theta=2, censor.rate=0.10, Lambda_0=function(t, tau=4.6, C=0.01) (C*t)^tau) # Fit a shared frailty model fit <- fitfrail(Surv(time, status) ~ Z1 + Z2 + cluster(family), dat, frailty="gamma", se=TRUE) fit # Summarize the survival curve head(summary(fit))
## Not run: # # Generate synthetic survival data with regression coefficients # beta = c(log(2),log(3)) and theta = 2, where the shared frailty # values from a gamma distribution with expectation 1 and variance theta. # dat <- genfrail(N=300, K=2, beta=c(log(2),log(3)), frailty="gamma", theta=2, censor.rate=0.35, Lambda_0=function(t, tau=4.6, C=0.01) (C*t)^tau) # Fit a shared frailty model fit <- fitfrail(Surv(time, status) ~ Z1 + Z2 + cluster(family), dat, frailty="gamma") fit # The Lambda.fun function can give the estimated cumulative baseline hazard at # any time fit$Lambda.fun(seq(0, 100, by=10)) # Fit the DRS data, clustered on patient data(drs) fit.drs <- fitfrail(Surv(time, status) ~ treated + cluster(subject_id), drs, frailty="gamma") fit.drs ## End(Not run) # # A small example with c(log(2),log(3)) coefficients, Gamma(2) frailty, and # 0.10 censorship. # dat <- genfrail(N=30, K=2, beta=c(log(2),log(3)), frailty="gamma", theta=2, censor.rate=0.10, Lambda_0=function(t, tau=4.6, C=0.01) (C*t)^tau) # Fit a shared frailty model fit <- fitfrail(Surv(time, status) ~ Z1 + Z2 + cluster(family), dat, frailty="gamma", se=TRUE) fit # Summarize the survival curve head(summary(fit))
This function creates a list of control parameters needed by fitfrail.
fitfrail.control(fitmethod = "loglik", abstol = 0, reltol = 1e-6, maxit = 100, int.abstol = 0, int.reltol = 1, int.maxit = 1000, init.beta="coxph", init.theta=NULL, verbose = FALSE)
fitfrail.control(fitmethod = "loglik", abstol = 0, reltol = 1e-6, maxit = 100, int.abstol = 0, int.reltol = 1, int.maxit = 1000, init.beta="coxph", init.theta=NULL, verbose = FALSE)
fitmethod |
string indicating which fit method should be used: "loglik" or "score". If "loglik", then the loglikelihood is maximized directily using optim (L-BFGS-B algorithm). If "score", then the system of normalized score equations is solved using nleqslv (Newton algorithm). |
abstol |
numeric absolute tolerance for convergence. If fitmethod is "loglik", convergence is reached when the absolute reduction in loglikelihood is less than abstol. If fitmethod is "score", convergence is reached when the absolute value of each normalized score equation is less then abstol. |
reltol |
numeric relative tolerance for convergence. If fitmethod is "loglik", convergence is reached when the relative reduction in loglikelihood is less than reltol. If fitmethod is "score", convergence is reached when the relative reduction of each estimated parameter is less than reltol. |
maxit |
integer, maximum number of iterations to perform |
int.abstol |
numeric absolute tolerence for convergence of the numeric integration. Only applicable for frailty distributions that require numerical integration. If 0, then ignore. Default is 0. |
int.reltol |
numeric relative tolerence for convergence of the numeric integration. Only applicable for frailty distributions that require numerical integration. If 0, then ignore. Default is 1. |
int.maxit |
integer, maximum number of numeric integration function evaluations. Only applicable for frailty distributions that require numerical integration. If 0, then no limit. Default is 100. |
init.beta |
initial regression coefficients paramater estimates, can be numeric, string, or NULL. If NULL, then a zero vector is used. If "coxph" is passed, then regression coefficients are initialized to the estimates given by coxph (with or without frailty, depending on the init.theta parameter, see below). If numeric, then the initial regression coefficients are specified explicitly. |
init.theta |
initial frailty distribution parameter estimates, can be numeric, string, or NULL. If NULL, then a sensible starting value is chosen for the specified frailty distribution. If "coxph", then hat.theta is initialized such that the initial frailty distribution has the same rank correlation as the estimated gamma frailty distribution. This is achieved by assuming gamma frailty and estimating theta using coxph. This estimate is then transferred to the appropriate frailty distribution through Kendall's tau. This method only works well for weak cluster dependence, i.e., small Kendall's tau. If numeric, then the initial frailty distribution parameters are specified explicitly. |
verbose |
logical value, whether to print a trace of the parameter estimation. |
A list of control parameters.
John V. Monaco, Malka Gorfine, Li Hsu
frailtySurv provides a suite of functions for generating clustered survival data, fitting multivariate shared frailty models under a wide range of frailty distributions, and visualizing the output. The semi-parametric estimators have better asymptotic properties than most existing implementations, including consistent and asymptotically-normal estimators. Moreover, this is the first package that implements semi-parametric estimators with inverse Gaussian and PVF frailty models.
The frailtySurv package provides functions
Package: | frailtySurv |
Type: | Package |
Version: | 1.2.0 |
Date: | August 2015 |
License: | LGPL-2 |
LazyLoad: | Yes |
John V. Monaco, Malka Gorfine, and Li Hsu.
Gorfine M, Zucker DM, Hsu L (2006) Prospective survival analysis with a general semiparametric shared frailty model: A pseudo full likelihood approach. Biometrika, 93(3), 735-741.
Zucker DM, Gorfine M, Hsu L (2008) Pseudo-full likelihood estimation for prospective survival analysis with a general semiparametric shared frailty model: Asymptotic theory. Journal of Statistical Planning and Inference, 138(7), 1998-2016.
Generate clustered survival data from a shared frailty model, with hazard function given by
where is the cumulative baseline hazard,
is the frailty value of cluster
,
is the regression coefficient vector, and
is the covariate vector for individual
in cluster
.
The baseline hazard can be specified by the inverse cumualative baseline hazard, cumulative baseline hazard, or simply the baseline hazard. Frailty values can be sampled from gamma, power variance function (PVF), log-normal, inverse Gaussian, and positive stable distributions.
genfrail(N = 300, K = 2, K.param = c(2, 0), beta = c(log(2)), frailty = "gamma", theta = c(2), covar.distr = "normal", covar.param = c(0, 1), covar.matrix = NULL, censor.distr = "normal", censor.param = c(130, 15), censor.rate = NULL, censor.time = NULL, lambda_0 = NULL, Lambda_0 = NULL, Lambda_0_inv = NULL, round.base = NULL, control, ...)
genfrail(N = 300, K = 2, K.param = c(2, 0), beta = c(log(2)), frailty = "gamma", theta = c(2), covar.distr = "normal", covar.param = c(0, 1), covar.matrix = NULL, censor.distr = "normal", censor.param = c(130, 15), censor.rate = NULL, censor.time = NULL, lambda_0 = NULL, Lambda_0 = NULL, Lambda_0_inv = NULL, round.base = NULL, control, ...)
N |
integer; number of clusters |
K |
integer, string, or vector; If an integer, the number of members in each cluster. If a string, the name of the distribution to sample the cluster sizes from. This can be one of: "poisson", "pareto", or "uniform". The |
K.param |
vector of the cluster size distribution parameters if |
beta |
vector of regression coefficients. |
frailty |
string name of the frailty distribution. Can be one of: "gamma", "pvf", "lognormal", "invgauss", "posstab", or "none". See |
theta |
vector the frailty distribution parameters |
covar.distr |
string distribution to sample covariates from. Can be one of: "normal", "uniform", "zero" |
covar.param |
vector covariate distribution parameters. |
covar.matrix |
matrix with dimensions |
censor.distr |
string censoring distribution to use. Followup times are sampled from the censoring distribution to simulate non-informative right censorship. The censoring distribution can be one of: "normal", "lognormal", "uniform", "none". |
censor.param |
vector of censoring distribution parameters. For normal and lognormal censorship, this should be c(mu,sigma) where mu is the mean and sigma is the standard deviation (Note: this is still the mean and standard deviation for lognormal). For uniform censorship, the vector |
censor.rate |
numeric value between 0 and 1 to specify the empirical censoring rate. The mean specified in the |
censor.time |
vector of right-censorship times. This must have length N*K and specifies the right-censoring times of each observation. Note that this overrides all other censor.* params and cannot be used with variable cluster sizes. |
lambda_0 |
function baseline hazard. Only one of |
Lambda_0 |
function cumulative baseline hazard. This overrides |
Lambda_0_inv |
function inverse cumulative baseline hazard. This overrides both |
round.base |
numeric if specified, round the followup times to the nearest |
control |
control parameters in the form of a |
... |
additional arguments will be passed to |
A data.frame
with row-observations is returned.
family |
the cluster |
rep |
the member within each cluster |
time |
observed followup time |
status |
failure indicator |
Z1... |
covariates, where there are |
John V. Monaco, Malka Gorfine, and Li Hsu.
# Generate the same dataset 3 different ways # Using the baseline hazard (least efficient) set.seed(1234) dat.1 <- genfrail(N = 300, K = 2, beta = c(log(2),log(3)), frailty = "gamma", theta = 2, lambda_0=function(t, tau=4.6, C=0.01) (tau*(C*t)^tau)/t) # Using the cumulative baseline hazard set.seed(1234) dat.2 <- genfrail(N = 300, K = 2, beta = c(log(2),log(3)), frailty = "gamma", theta = 2, Lambda_0 = function(t, tau=4.6, C=0.01) (C*t)^tau) # Using the inverse cumulative baseline hazard (most efficient) set.seed(1234) dat.3 <- genfrail(N = 300, K = 2, beta = c(log(2),log(3)), frailty = "gamma", theta = 2, Lambda_0_inv=function(t, tau=4.6, C=0.01) (t^(1/tau))/C) # Generate data with PVF frailty, truncated Poisson cluster sizes, normal # covariates, and 0.35 censorship from a lognormal distribution set.seed(1234) dat.4 <- genfrail(N = 100, K = "poisson", K.param=c(5, 1), beta = c(log(2),log(3)), frailty = "pvf", theta = 0.3, covar.distr = "lognormal", censor.rate = 0.35) # Use the default baseline hazard # Cluster sizes have size >= 2, summarized by summary(dat.4) # An oscillating baseline hazard set.seed(1234) dat.5 <- genfrail(lambda_0=function(t, tau=4.6, C=0.01, A=2, f=0.1) A^sin(f*pi*t) * (tau*(C*t)^tau)/t) # Uniform censorship with 0.25 censoring rate set.seed(1234) dat.6 <- genfrail(N = 300, K = 2, beta = c(log(2),log(3)), frailty = "gamma", theta = 2, censor.distr = "uniform", censor.param = c(50, 150), censor.rate = 0.25, Lambda_0_inv=function(t, tau=4.6, C=0.01) (t^(1/tau))/C)
# Generate the same dataset 3 different ways # Using the baseline hazard (least efficient) set.seed(1234) dat.1 <- genfrail(N = 300, K = 2, beta = c(log(2),log(3)), frailty = "gamma", theta = 2, lambda_0=function(t, tau=4.6, C=0.01) (tau*(C*t)^tau)/t) # Using the cumulative baseline hazard set.seed(1234) dat.2 <- genfrail(N = 300, K = 2, beta = c(log(2),log(3)), frailty = "gamma", theta = 2, Lambda_0 = function(t, tau=4.6, C=0.01) (C*t)^tau) # Using the inverse cumulative baseline hazard (most efficient) set.seed(1234) dat.3 <- genfrail(N = 300, K = 2, beta = c(log(2),log(3)), frailty = "gamma", theta = 2, Lambda_0_inv=function(t, tau=4.6, C=0.01) (t^(1/tau))/C) # Generate data with PVF frailty, truncated Poisson cluster sizes, normal # covariates, and 0.35 censorship from a lognormal distribution set.seed(1234) dat.4 <- genfrail(N = 100, K = "poisson", K.param=c(5, 1), beta = c(log(2),log(3)), frailty = "pvf", theta = 0.3, covar.distr = "lognormal", censor.rate = 0.35) # Use the default baseline hazard # Cluster sizes have size >= 2, summarized by summary(dat.4) # An oscillating baseline hazard set.seed(1234) dat.5 <- genfrail(lambda_0=function(t, tau=4.6, C=0.01, A=2, f=0.1) A^sin(f*pi*t) * (tau*(C*t)^tau)/t) # Uniform censorship with 0.25 censoring rate set.seed(1234) dat.6 <- genfrail(N = 300, K = 2, beta = c(log(2),log(3)), frailty = "gamma", theta = 2, censor.distr = "uniform", censor.param = c(50, 150), censor.rate = 0.25, Lambda_0_inv=function(t, tau=4.6, C=0.01) (t^(1/tau))/C)
This function creates a list of control parameters needed by genfrail.
genfrail.control(censor.reltol = 1e-4, censor.subdivisions = 1000L, crowther.reltol = 1e-4, crowther.subdivisions = 1000L)
genfrail.control(censor.reltol = 1e-4, censor.subdivisions = 1000L, crowther.reltol = 1e-4, crowther.subdivisions = 1000L)
censor.reltol |
numeric relative tolerence for convergence of the censorship numerical integration. Default is 0.001. |
censor.subdivisions |
integer, maximum number of censorship numerical integration subdivisions. Default is 1000. |
crowther.reltol |
numeric relative tolerence for convergence of the numerical integration in Crowther's formula. Default is 0.001. |
crowther.subdivisions |
integer, maximum number of numerical integration subdivisions in Crowther's formula. Default is 1000. |
A list of control parameters.
John V. Monaco, Malka Gorfine, Li Hsu
This dataset contains the observed follow-up times and SMART statistics of 52k unique hard drives.
Daily snapshots of a large backup storage provider over 2 years were made publicly available. On each day, the Self-Monitoring, Analysis, and Reporting Technology (SMART) statistics of operational drives are recorded. When a hard drive is no longer operational, it is marked as a failure and removed from the subsequent daily snapshots. New hard drives are also continuously added to the population. In total, there are over 52k unique hard drives over approximately 2 years and 2885 (5.5%) failures.
data("hdfail")
data("hdfail")
A data frame with 52422 observations on the following 8 variables.
serial
unique serial number of the hard drive
model
hard drive model
time
the observed followup time
status
failure indicator
temp
temperature in Celsius
rsc
binary covariate, where 1 indicates sectors that encountered read, write, or verification errors
rer
binary covariate, where 1 indicates a non-zero rate of errors that occur in hardware when reading from data from disk.
psc
binary covariate, where 1 indicates there were sectors waiting to be remapped due to an unrecoverable error.
https://www.backblaze.com/cloud-storage/resources/hard-drive-test-data
## Not run: data(hdfail) # Select only Western Digital hard drives dat <- subset(hdfail, grepl("WDC", model)) fit.hd <- fitfrail(Surv(time, status) ~ temp + rer + rsc + psc + cluster(model), dat, frailty="gamma", fitmethod="score") fit.hd ## End(Not run)
## Not run: data(hdfail) # Select only Western Digital hard drives dat <- subset(hdfail, grepl("WDC", model)) fit.hd <- fitfrail(Surv(time, status) ~ temp + rer + rsc + psc + cluster(model), dat, frailty="gamma", fitmethod="score") fit.hd ## End(Not run)
fitfrail
objects
Plot the cumulative baseline hazard estimates or the parameter trace from model estimation.
## S3 method for class 'fitfrail' plot(x, type = c("cumhaz", "trace"), ...)
## S3 method for class 'fitfrail' plot(x, type = c("cumhaz", "trace"), ...)
x |
a |
type |
string, the type of plot. Can be either "cumhaz" to plot the mean estimated cumulative hazard or "trace" to plot the paramater and log-likelihood trace. |
... |
extra arguments include:
|
The plot object.
John. V Monaco, Malka Gorfine, Li Hsu
## Not run: data(drs) fit.drs <- fitfrail(Surv(time, status) ~ treated + cluster(subject_id), drs, frailty="gamma") # Plot the parameter and log-likelihood trace plot(fit.drs, type="trace") # This may take a while to run. # Use parameter B to specify the number of repetitions in the weighted bootstrap plot(fit.drs, type="cumhaz", CI=0.95) ## End(Not run)
## Not run: data(drs) fit.drs <- fitfrail(Surv(time, status) ~ treated + cluster(subject_id), drs, frailty="gamma") # Plot the parameter and log-likelihood trace plot(fit.drs, type="trace") # This may take a while to run. # Use parameter B to specify the number of repetitions in the weighted bootstrap plot(fit.drs, type="cumhaz", CI=0.95) ## End(Not run)
simfrail
objects
Plot the estimated parameter residuals or the mean estimated cumulative baseline hazard.
## S3 method for class 'simfrail' plot(x, type = c("residuals", "cumhaz"), ...)
## S3 method for class 'simfrail' plot(x, type = c("residuals", "cumhaz"), ...)
x |
a |
type |
string, the type of plot. Can be either "residuals" or "cumhaz". If |
... |
extra arguments include:
|
The plot object.
John. V Monaco, Malka Gorfine, Li Hsu
## Not run: set.seed(2015) sim <- simfrail(1000, genfrail.args=alist(beta=c(log(2),log(3)), frailty="gamma", censor.rate=0.30, N=300, K=2, theta=2, covar.distr="uniform", covar.param=c(0, 1), Lambda_0=function(t, tau=4.6, C=0.01) (C*t)^tau), fitfrail.args=alist(formula=Surv(time, status) ~ Z1 + Z2 + cluster(family), frailty="gamma"), Lambda.times=1:120) # Make a boxplot of residuals plot(sim, type="residuals") # Plot the mean estimated cumulative baseline hazard and empirical 0.95 CI plot(sim, type="cumhaz") ## End(Not run)
## Not run: set.seed(2015) sim <- simfrail(1000, genfrail.args=alist(beta=c(log(2),log(3)), frailty="gamma", censor.rate=0.30, N=300, K=2, theta=2, covar.distr="uniform", covar.param=c(0, 1), Lambda_0=function(t, tau=4.6, C=0.01) (C*t)^tau), fitfrail.args=alist(formula=Surv(time, status) ~ Z1 + Z2 + cluster(family), frailty="gamma"), Lambda.times=1:120) # Make a boxplot of residuals plot(sim, type="residuals") # Plot the mean estimated cumulative baseline hazard and empirical 0.95 CI plot(sim, type="cumhaz") ## End(Not run)
Generates simulated clustered survival data by repeatedly generating data, using a shared frailty model, and fitting the models. Respective arguments are passed to genfrail and coxph, and the resulting parameter estimates are aggregated and summarized.
This function is similar to simfrail
, except models are fitted using the coxph
.
simcoxph(reps, genfrail.args, coxph.args, Lambda.times, cores = 0)
simcoxph(reps, genfrail.args, coxph.args, Lambda.times, cores = 0)
reps |
number of times to repeat the simulation |
genfrail.args |
list of arguments to pass to |
coxph.args |
list of arguments to pass to |
Lambda.times |
vector of time points to obtain baseline hazard estimates at |
cores |
integer; if > 0, the number of cores to use; if < 0, the number of cores not to use; if 0, use all available cores |
A simcoxph
object that is essentially a data.frame
of the resulting parameter estimates. Each row is a single run, and columns are as follows.
seed |
the seed used for the run |
runtime |
the time it took to fit the model |
N |
number of clusters |
mean.K |
average cluster size |
cens |
empirical censorship |
beta |
true regression coefficients |
hat.beta |
estimated regression coefficients |
se.beta |
standard error of each regression coefficient |
theta |
true frailty distribution parameters |
hat.theta |
estimated frailty distribution parameters |
se.theta |
standard error of each frailty distribution parameter (NA since coxph does not currently provide this.) |
Lambda |
true cumulative baseline hazard at each Lambda.times point |
hat.Lambda |
estimated cumulative baseline hazard at each Lambda.times point |
se.Lambda |
standard error at each Lambda.times point (NA since coxph does not currently provide this) |
John. V Monaco, Malka Gorfine, Li Hsu
## Not run: sim <- simcoxph(reps=100, genfrail.args=alist( N=50, K=2, beta=c(log(2),log(3)), frailty="gamma", theta=2, Lambda_0 = function(t, tau=4.6, C=0.01) (C*t)^tau), coxph.args=alist( formula=Surv(time, status) ~ Z1 + Z2 + cluster(family), frailty="gamma"), Lambda.times=1:120, cores = 0) # Summarize the results summary(sim) # Plot the residuals plot(sim, "residuals") ## End(Not run)
## Not run: sim <- simcoxph(reps=100, genfrail.args=alist( N=50, K=2, beta=c(log(2),log(3)), frailty="gamma", theta=2, Lambda_0 = function(t, tau=4.6, C=0.01) (C*t)^tau), coxph.args=alist( formula=Surv(time, status) ~ Z1 + Z2 + cluster(family), frailty="gamma"), Lambda.times=1:120, cores = 0) # Summarize the results summary(sim) # Plot the residuals plot(sim, "residuals") ## End(Not run)
Generates simulated clustered survival data by repeatedly generating data, using a shared frailty model, and fitting the models. Respective arguments are passed to genfrail and fitfrail, and the resulting parameter estimates are aggregated and summarized.
simfrail(reps, genfrail.args, fitfrail.args, Lambda.times, vcov.args = list(), cores = 0, skip.SE = FALSE)
simfrail(reps, genfrail.args, fitfrail.args, Lambda.times, vcov.args = list(), cores = 0, skip.SE = FALSE)
reps |
number of times to repeat the simulation |
genfrail.args |
list of arguments to pass to |
fitfrail.args |
list of arguments to pass to |
Lambda.times |
vector of time points to obtain baseline hazard estimates at |
vcov.args |
list of arguments to pass to vcov.fitfrail for variance estimates. This is mainly used to specify whether bootstrap or estimated variances should be obtained. |
cores |
integer; if > 0, the number of cores to use; if < 0, the number of cores not to use; if 0, use all available cores |
skip.SE |
logical value, whether to skip the standard error estimates (saves time) |
A simfrail
object that is essentially a data.frame
of the resulting parameter estimates. Each row is a single run, and columns are as follows.
seed |
the seed used for the run |
runtime |
the time it took to fit the model |
N |
number of clusters |
mean.K |
average cluster size |
cens |
empirical censorship |
beta |
true regression coefficients |
hat.beta |
estimated regression coefficients |
se.beta |
standard error of each regression coefficient |
theta |
true frailty distribution parameters |
hat.theta |
estimated frailty distribution parameters |
se.theta |
standard error of each frailty distribution parameter |
Lambda |
true cumulative baseline hazard at each Lambda.times point |
hat.Lambda |
estimated cumulative baseline hazard at each Lambda.times point |
se.Lambda |
standard error at each Lambda.times point |
John. V Monaco, Malka Gorfine, Li Hsu
## Not run: sim <- simfrail(reps=100, genfrail.args=alist( N=50, K=2, beta=c(log(2),log(3)), frailty="gamma", theta=2, Lambda_0 = function(t, tau=4.6, C=0.01) (C*t)^tau), fitfrail.args=alist( formula=Surv(time, status) ~ Z1 + Z2 + cluster(family), frailty="gamma"), Lambda.times=1:120, cores = 0) # Summarize the results summary(sim) # Plot the residuals plot(sim, "residuals") ## End(Not run)
## Not run: sim <- simfrail(reps=100, genfrail.args=alist( N=50, K=2, beta=c(log(2),log(3)), frailty="gamma", theta=2, Lambda_0 = function(t, tau=4.6, C=0.01) (C*t)^tau), fitfrail.args=alist( formula=Surv(time, status) ~ Z1 + Z2 + cluster(family), frailty="gamma"), Lambda.times=1:120, cores = 0) # Summarize the results summary(sim) # Plot the residuals plot(sim, "residuals") ## End(Not run)
Returns a data.frame
summarizing the survival curve of the fitted model. If specified, this function uses a weighted bootstrap procedure to calculate SE of the survival curve estimates. Subsequente calls with the same arguments will use the cached SE and avoid performing the weighted bootstrap again.
## S3 method for class 'fitfrail' summary(object, type = "survival", Lambda.times = NULL, censored = FALSE, se = FALSE, CI = 0.95, ...)
## S3 method for class 'fitfrail' summary(object, type = "survival", Lambda.times = NULL, censored = FALSE, se = FALSE, CI = 0.95, ...)
object |
a |
type |
string indicating the type of summary: either "survival" for a summary of the survival curve, or "cumhaz" for a summary of the cumulative baseline hazard. |
Lambda.times |
vector of times where the curve should be evaluated. The resulting |
censored |
logical value, whether the survival curve should contain the censored times. Ignored if |
se |
logical value, whether the survival SE should be included with the results. If se=TRUE, a weighted bootstrap procedure is used to determine estimated survival SE. |
CI |
numeric, the confidence interval to evaluate upper and lower limits for the survival estimate at each time point |
... |
extra arguments will be passed to |
A data.frame
summarizing the survival curve with the following columns.
time |
the time points |
surv/cumhaz |
survival/cumulative hazard estimate at time t+ |
n.risk |
number of subjects at risk at time t- |
n.event |
the number of failures that occured from the last time point to time t+ |
std.err |
the SE of the survival estimate |
lower.ci |
lower bound on the specified confidence interval |
upper.ci |
upper bound on the specified confidence interval |
Similar to summary.survfit
function in the survival
package.
## Not run: dat <- genfrail(N=200, K=2, beta=c(log(2),log(3)), frailty="gamma", theta=2, censor.rate=0.35, Lambda_0=function(t, tau=4.6, C=0.01) (C*t)^tau) fit <- fitfrail(Surv(time, status) ~ Z1 + Z2 + cluster(family), dat, frailty="gamma") surv <- summary(fitfrail, B=50, se=TRUE, CI=0.95) head(surv) ## End(Not run)
## Not run: dat <- genfrail(N=200, K=2, beta=c(log(2),log(3)), frailty="gamma", theta=2, censor.rate=0.35, Lambda_0=function(t, tau=4.6, C=0.01) (C*t)^tau) fit <- fitfrail(Surv(time, status) ~ Z1 + Z2 + cluster(family), dat, frailty="gamma") surv <- summary(fitfrail, B=50, se=TRUE, CI=0.95) head(surv) ## End(Not run)
Compute the variance/covariance matrix for fitfrail estimated parameters. This can be performed by a an asymptotically-normal and consistent variance estimator or a weighted bootstrap. The resulting covariance matrix is cached in the fitted object and later retrieved if the same arguments to vcov.fitfrail
are supplied.
## S3 method for class 'fitfrail' vcov(object, boot=FALSE, B=100, Lambda.times=NULL, cores=0, ...)
## S3 method for class 'fitfrail' vcov(object, boot=FALSE, B=100, Lambda.times=NULL, cores=0, ...)
object |
a |
boot |
logical value, whether to use a weighted bootstrap. If boot == FALSE, a consistent estimator is used and the cumulative baseline hazard variance will not be estimated. |
B |
number of repetitions in the weighted bootstrap. |
Lambda.times |
time points where the variance/covariance should be evaluated. If Lambda.times == NULL, then the points where the cumulative baseline hazard increases (where failures occur) are used. |
cores |
number of cores to use when computing the covariance matrix in parallel |
... |
extra arguments are not used |
variance/covariance matrix for the fitfrail model parameters
## Not run: dat <- genfrail(N=200, K=2, beta=c(log(2),log(3)), frailty="gamma", theta=2, censor.rate=0.35, Lambda_0=function(t, tau=4.6, C=0.01) (C*t)^tau) fit <- fitfrail(Surv(time, status) ~ Z1 + Z2 + cluster(family), dat, frailty="gamma") # boot=TRUE will give the weighted bootstrap variance estimates COV <- vcov(fit, boot=FALSE) COV ## End(Not run)
## Not run: dat <- genfrail(N=200, K=2, beta=c(log(2),log(3)), frailty="gamma", theta=2, censor.rate=0.35, Lambda_0=function(t, tau=4.6, C=0.01) (C*t)^tau) fit <- fitfrail(Surv(time, status) ~ Z1 + Z2 + cluster(family), dat, frailty="gamma") # boot=TRUE will give the weighted bootstrap variance estimates COV <- vcov(fit, boot=FALSE) COV ## End(Not run)