tvma Function

Yajnaseni Chakraborti edited this page May 13, 2022 · 14 revisions

Time Varying Mediation Function: Continuous Outcome and Two Treatment Arms (Exposure Groups)

Authors:

Introduction

The purpose of this vignette is to provide users with a step-by-step guide for performing a time varying mediation analysis with 2 treatment (exposure) groups and a continuous outcome, and for interpreting its results. The package was built around the structure of panel data, where each subject/participant has repeated responses (collected over time) for the variables of interest (outcome and mediator). The package does not address the dynamic treatment regimens problem; we assume that the treatment (exposure) is constant and does not change over time. For more information on the mathematical model, refer to Cai et al., 2022.

Data

We will rely on a dataset simulated from the Wisconsin Smokers' Health Study 2 dataset (Baker et al., 2016), which includes 1086 individuals across 3 treatment conditions. One-third of participants received only a nicotine patch, another third received varenicline, and the final third received combination nicotine replacement therapy (NRT), which is nicotine patch + nicotine mini-lozenge. For our illustration, the outcome of interest is cessation fatigue, which records how tired a participant felt of trying to quit smoking (7-point Likert scale). In addition, mediator variables were collected by asking participants whether they felt a negative mood in the last 15 minutes and whether they wanted to smoke in the last 15 minutes, also recorded on a 7-point Likert scale. Both the outcome and the mediator variables were assessed twice per day over the course of 28 days. The assessments were recorded daily (twice per day) for the first two weeks, yielding 30 time points because assessments start on day 0 post target quit day, and every other day (twice per day) for weeks 3 and 4, yielding 14 more time points.

A traditional approach to analyzing this type of data would be a standard mediation analysis. First, a single direct effect would be calculated by regressing the outcome on the treatment condition while adjusting for the mediator. Next, a single indirect effect would be computed by multiplying the effect of the treatment condition on the mediator by the effect of the mediator on the outcome. However, this method potentially misses important information about the dynamic effect that a mediator may have over time. Specifically, we hypothesize that mood changes and, thus, that its mediating effect on one’s feelings about quitting smoking is likely to vary over time. We therefore propose a time varying mediation analysis, which estimates the mediation effect as a function that varies over time.
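As a point of comparison, the traditional single-effect approach can be sketched in base R. This is only an illustration on simulated toy data: the variable names `x`, `m`, `y` and the effect sizes are assumptions, and none of it uses the tvmediation package.

```r
# Toy data: names and effect sizes are illustrative assumptions only.
set.seed(1)
n <- 200
x <- rbinom(n, 1, 0.5)              # binary treatment indicator
m <- 0.5 * x + rnorm(n)             # mediator depends on treatment
y <- 0.3 * x + 0.7 * m + rnorm(n)   # outcome depends on treatment and mediator

fit_m     <- lm(m ~ x)       # treatment -> mediator
fit_y     <- lm(y ~ x + m)   # treatment and mediator -> outcome
fit_total <- lm(y ~ x)       # treatment -> outcome, ignoring the mediator

alpha <- unname(coef(fit_m)["x"])      # effect of treatment on mediator
gamma <- unname(coef(fit_y)["x"])      # direct effect (adjusted for mediator)
beta  <- unname(coef(fit_y)["m"])      # effect of mediator on outcome
tau   <- unname(coef(fit_total)["x"])  # total effect (not adjusted)

indirect <- alpha * beta               # single product-of-coefficients estimate
c(direct = gamma, indirect = indirect, total = tau)
```

Each quantity here is a single number for the whole study period, which is exactly the limitation the time varying approach is meant to address.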

The tvma function was developed for analyzing the mediation effect with 2 treatment (exposure) groups. To illustrate the analysis, we consider the scenario of varenicline vs. not varenicline; that is, the subjects taking the nicotine patch or combination NRT are combined into a single group.

Getting started

To use the time varying mediation analysis package in R, you must first install the package and load it. Before that, make sure you have R version 4.0.3. You can install the package from the CRAN (Comprehensive R Archive Network) repository using either install.packages or devtools.

install.packages("tvmediation", dependencies = TRUE)
library(tvmediation)

The equivalent code using devtools is:

devtools::install_cran("tvmediation", dependencies = TRUE) # MAKE SURE YOU HAVE devtools INSTALLED AND LOADED #
library(tvmediation)

If you do not have devtools installed and loaded, you can do so using the following code:

install.packages("devtools", dependencies = TRUE)
library(devtools)

Alternatively, if you want to install the package directly from the GitHub repository to access new or revised functions in development, use the following code:

devtools::install_github("dcoffman/tvmediation", dependencies = TRUE) # MAKE SURE YOU HAVE devtools INSTALLED AND LOADED #
library(tvmediation)
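Before or after installing, a quick check along the following lines may help confirm the setup. It is a small sketch that treats the version mentioned above as a minimum, which is our assumption rather than something stated by the package.

```r
# Compare the running R version against 4.0.3 (treated here as a minimum; an assumption).
r_ok <- getRversion() >= "4.0.3"

# requireNamespace() reports whether the package is installed without attaching it.
has_pkg <- requireNamespace("tvmediation", quietly = TRUE)

message("R version ", getRversion(), if (r_ok) " is at least 4.0.3" else " is older than 4.0.3")
message("tvmediation installed: ", has_pkg)
```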

Formatting your data before calling the tvma function

Once the package is installed, you can type ?tvmediation in the console to view the package documentation, as well as links to the important functions and data included in the package. The time-varying mediation analysis for a continuous outcome and 2 exposure groups relies on 2 user functions, tvma and LongToWide, as well as a number of internal functions of the tvmediation package.

The tvma function takes 4 required and 5 optional arguments. The required inputs are:

  1. treatment A binary vector with treatment schedule
  2. t.seq A vector of the time sequence of the measures
  3. mediator The matrix of mediator values in wide format
  4. outcome The matrix of outcome values in wide format

The optional inputs are:

  1. t.est The time sequence of estimation. This is by default equal to t.seq.
  2. plot TRUE or FALSE for plotting mediation effect. The default value is FALSE.
  3. CI "none" or "boot" method of deriving confidence intervals. The default value is "boot".
  4. replicates Number of replicates for bootstrapping confidence intervals. The default value is 1000.
  5. verbose TRUE or FALSE for printing results to screen. The default value is FALSE.

The dataset we will use for our illustration is named smoker and is also included in the package.

To load the simulated dataset smoker.rda, type:

data(smoker)

As discussed earlier, the current version of the tvma function only supports 2 treatment options. A separate function tvma_3trt for 3 treatment options is also available.

The smoker data frame is organized in long format, with SubjectID repeating over multiple rows for each participant. The tvma function requires the data to be in wide format in order to accurately estimate the time varying mediator coefficients. The tvmediation package includes a useful function, LongToWide, to help users properly format their data for analysis.

LongToWide has 3 main input arguments and a fourth optional argument.

  1. subject.id requires a column of subject identifiers.
  2. time.sequences requires a column of time points.
  3. outcome requires a column of measures that are to be transposed.
  4. verbose is an option that can be turned on to print the output of LongToWide to the console.

The output of LongToWide is a matrix in wide format where columns represent the subjects and rows represent the time sequence. Thus the cell in row i and column j contains the j-th subject's response at the i-th time point.

As the tvma function will require two matrices, one for mediators, and one for outcomes, we will use the LongToWide function twice as seen below:

mediator <- LongToWide(smoker$SubjectID, smoker$timeseq, smoker$NegMoodLst15min)
outcome <- LongToWide(smoker$SubjectID, smoker$timeseq, smoker$cessFatig)
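To make the target layout concrete, the following base R sketch builds the same kind of matrix (rows are time points, columns are subjects) from a small toy data set. It is not the implementation of LongToWide; the column names `id`, `time`, and `value` are hypothetical, and the handling of a missed assessment as NA reflects this sketch only.

```r
# Toy long-format data; column names are hypothetical.
long <- data.frame(
  id    = rep(c("s1", "s2", "s3"), each = 4),
  time  = rep(c(0, 0.5, 1, 1.5), times = 3),
  value = c(3, 4, 2, 5,  1, 1, 2, 2,  6, 5, 7, 4)
)
long <- long[-6, ]   # drop subject s2 at time 0.5 to mimic a missed assessment

# Rows = time points, columns = subjects; unobserved cells become NA in this sketch.
wide <- tapply(long$value, list(long$time, long$id), function(v) v[1])

dim(wide)   # 4 time points by 3 subjects
wide
```

Inspecting `dim()` on your own mediator and outcome matrices in the same way is a quick check that the number of rows matches the number of time points and the number of columns matches the number of subjects.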

Your dataset might not be in long format. If it is already in wide format, there is no need to use the LongToWide function, and you can simply subset your dataset/data frame. However, please note that mediator and outcome must be of class matrix, so convert the subsetted datasets to matrices before proceeding. This can be done using the R function as.matrix.
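For data that are already wide, the conversion might look like the sketch below. The data frame `wide_df` and its column names are hypothetical; the point is only the subsetting, the `as.matrix` call, and the orientation (time points in rows, subjects in columns).

```r
# Hypothetical wide data frame: one row per time point, one column per subject,
# plus a time column that is not itself a measurement.
wide_df <- data.frame(
  time = c(0, 0.5, 1),
  s1   = c(3, 4, 2),
  s2   = c(1, 1, 2)
)

# Keep only the measurement columns and convert to a numeric matrix.
mediator_toy <- as.matrix(wide_df[, c("s1", "s2")])
is.matrix(mediator_toy)
dim(mediator_toy)   # 3 time points by 2 subjects

# If your data store subjects in rows instead, t() swaps rows and columns.
subjects_in_rows <- t(mediator_toy)
dim(subjects_in_rows)
```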

The tvma function still requires 2 more variables that we have not yet created:

  1. treatment A binary numeric vector with treatment schedule
  2. t.seq A numeric vector of the time sequence of the measures

We can create these variables in the following way. For treatment, we looked at only one instance of each subject's response for varenicline, converted it to a numeric value, and subtracted 1 to yield a vector of zeros and ones.

# Step 1: Each subject has multiple rows of data, so extract one row per subject with that subject's varenicline status. The data is still in data frame format.
trt1 <- unique(smoker[ , c("SubjectID","varenicline")])

# Step 2: Convert to numeric, which assigns `2` to the subjects who received `varenicline` and `1` to the rest. The data is now in vector format.
trt2 <- as.numeric(trt1[ , 2])

# Step 3: Subtract 1 from these numeric responses to obtain a vector of zeros and ones.
treatment <- trt2 - 1

These steps can alternatively be combined into a single step and written as follows:

treatment <- as.numeric(unique(smoker[ , c("SubjectID","varenicline")])[, 2]) - 1
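Because `as.numeric` applied to a factor returns the position of each level, the resulting 0/1 coding depends on the order of the factor levels. The toy sketch below shows the mechanics and a more explicit alternative; the level names "No" and "Yes" are an assumption and may differ from those in the smoker data.

```r
# Toy long-format data; the factor levels "No"/"Yes" are an assumption.
toy <- data.frame(
  SubjectID   = rep(1:3, each = 2),
  varenicline = factor(rep(c("No", "Yes", "No"), each = 2), levels = c("No", "Yes"))
)

# One row per subject
one_row_each <- unique(toy[, c("SubjectID", "varenicline")])

levels(one_row_each$varenicline)   # first level maps to 1, second to 2
treatment_toy <- as.numeric(one_row_each$varenicline) - 1
treatment_toy
table(treatment_toy)               # counts per arm

# An alternative that names the level explicitly instead of relying on level order
treatment_alt <- as.numeric(one_row_each$varenicline == "Yes")
```

Checking `levels()` and `table()` on your own treatment variable confirms which arm ends up coded as 1.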

As discussed earlier, our goal in this example is to estimate the time varying effect of varenicline on cessation fatigue, compared to not varenicline, mediated via negative mood in the last fifteen minutes. We could instead look at the effect of varenicline vs. nicotine patch only, or combination NRT vs. varenicline, by first excluding the subjects belonging to the treatment group that is not of interest and then following the steps above. Please refer to the vignette on the tvmb function to learn the steps involved when comparing varenicline vs. nicotine patch only, combination NRT vs. varenicline, or combination NRT vs. nicotine patch only.

To generate t.seq, we took one instance of each time point and sorted the values from smallest to largest. There are 44 unique time points in our dataset, where 0 after the decimal point indicates a measurement recorded in the morning and 5 after the decimal point indicates a measurement recorded in the evening.

t.seq <- sort(unique(smoker$timeseq))
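The count of 44 time points follows from the assessment schedule, as the sketch below reproduces on a constructed grid. The exact days used for weeks 3 and 4 (16, 18, ..., 28) are a guess made only to match the counts given earlier; the actual values in smoker$timeseq may differ.

```r
# Assumed coding: whole numbers for morning, .5 for evening.
# Days 0-14 are assessed daily; the later days below are a guess.
daily_days     <- 0:14
alternate_days <- seq(16, 28, by = 2)
days <- c(daily_days, alternate_days)

t_seq_toy <- sort(c(days, days + 0.5))   # two assessments per assessed day
length(t_seq_toy)                        # 30 + 14 = 44

# Decode morning vs. evening from the decimal part
is_evening <- (t_seq_toy %% 1) == 0.5
table(ifelse(is_evening, "evening", "morning"))
```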

We are now ready to perform our time varying mediation analysis.
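Before calling tvma, it can be worth confirming that the four inputs agree with each other in shape. The helper below, `check_tvma_inputs`, is hypothetical and not part of tvmediation; it only encodes the layout described above (time points in rows, subjects in columns, one 0/1 treatment value per subject) and is demonstrated on toy inputs.

```r
# Hypothetical helper; not part of tvmediation.
check_tvma_inputs <- function(treatment, t.seq, mediator, outcome) {
  stopifnot(
    is.matrix(mediator), is.matrix(outcome),
    identical(dim(mediator), dim(outcome)),
    nrow(mediator) == length(t.seq),        # one row per time point
    ncol(mediator) == length(treatment),    # one column per subject
    is.numeric(treatment), all(treatment %in% c(0, 1)),
    is.numeric(t.seq), !is.unsorted(t.seq)
  )
  invisible(TRUE)
}

# Toy inputs: 4 time points, 3 subjects
t_toy   <- c(0, 0.5, 1, 1.5)
trt_toy <- c(0, 1, 0)
med_toy <- matrix(rnorm(12), nrow = 4, ncol = 3)
out_toy <- matrix(rnorm(12), nrow = 4, ncol = 3)

check_tvma_inputs(trt_toy, t_toy, med_toy, out_toy)   # passes silently
```

With the real data, the same call would be `check_tvma_inputs(treatment, t.seq, mediator, outcome)`.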

Calling the tvma function

As discussed earlier, the tvma function takes 4 required and 5 optional arguments. The required inputs are:

  1. treatment A binary vector with treatment schedule
  2. t.seq A vector of the time sequence of the measures
  3. mediator The matrix of mediator values in wide format
  4. outcome The matrix of outcome values in wide format

The optional inputs are:

  1. t.est The time sequence of estimation. This is by default equal to t.seq.
  2. plot TRUE or FALSE for plotting mediation effect. The default value is FALSE.
  3. CI "none" or "boot" method of deriving confidence intervals. The default value is "boot".
  4. replicates Number of replicates for bootstrapping confidence intervals. The default value is 1000.
  5. verbose TRUE or FALSE for printing results to screen. The default value is FALSE.

We will call the function with the additional optional argument plot = TRUE. The remaining optional arguments are left at their default values.

results <- tvma(treatment, t.seq, mediator, outcome, plot = TRUE)

Results

The tvma function returns a list of results that include:

  1. hat.alpha the estimated effect of the treatment arm (exposure group) of interest on the mediator
  2. CI.lower.alpha the lower limit of the confidence interval for coefficient hat.alpha
  3. CI.upper.alpha the upper limit of the confidence interval for coefficient hat.alpha
  4. hat.gamma the estimated effect of the treatment arm (exposure group) of interest on the outcome, adjusted for the mediator
  5. CI.lower.gamma the lower limit of the confidence interval for coefficient hat.gamma
  6. CI.upper.gamma the upper limit of the confidence interval for coefficient hat.gamma
  7. hat.beta the estimated effect of the mediator on the outcome
  8. CI.lower.beta the lower limit of the confidence interval for coefficient hat.beta
  9. CI.upper.beta the upper limit of the confidence interval for coefficient hat.beta
  10. hat.tau the estimated effect of the treatment arm (exposure group) of interest on the outcome, not adjusted for the mediator
  11. CI.lower.tau the lower limit of the confidence interval for coefficient hat.tau
  12. CI.upper.tau the upper limit of the confidence interval for coefficient hat.tau
  13. est.M the time varying mediation effect of the treatment arm (exposure group) of interest on the outcome

Additional returns when CI = "boot" include:

  1. boot.se.m the estimated standard error of the time varying mediation effect est.M
  2. CI.lower the lower limit of confidence intervals of the time varying mediation effect est.M
  3. CI.upper the upper limit of confidence intervals of the time varying mediation effect est.M

The above estimates are compiled in a single data frame, which can be accessed using results$Estimates.

At each time point of interest t.est, which in this case is equal to t.seq, the effects of treatment (exposure) on the mediator, of exposure on the outcome (adjusted and not adjusted for the mediator), and of the mediator on the outcome are estimated along with their respective 95% CIs. The CIs are computed via a non-parametric bootstrap method (Efron and Tibshirani, 1986): samples of size 1086 are drawn from the original sample with replacement, the sample mean is estimated, and the percentile method is applied to compute the 95% CIs. Note that the confidence intervals for the alpha, gamma, beta and tau coefficients (hat.alpha, hat.gamma, hat.beta, hat.tau) are computed regardless of the value of the CI argument. est.M is the estimated mediation effect of varenicline compared to the other two treatment groups, which varies over t.est. For CI = "boot" (the default), the standard error and 95% CI of the estimated mediation effect are estimated via a bootstrapping technique similar to the one described above for the coefficients.
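The general idea of a percentile bootstrap can be sketched in base R for a single time point. This is an illustration on simulated toy data using a simple product-of-coefficients estimate; it is not the package's internal routine, and the names `med_effect`, `dat`, and `B` are our own.

```r
set.seed(2022)
n <- 150
x <- rbinom(n, 1, 0.5)              # binary treatment
m <- 0.6 * x + rnorm(n)             # mediator at one time point
y <- 0.2 * x + 0.5 * m + rnorm(n)   # outcome at one time point
dat <- data.frame(x, m, y)

# Product-of-coefficients estimate from one data set
med_effect <- function(d) {
  a <- coef(lm(m ~ x, data = d))["x"]
  b <- coef(lm(y ~ x + m, data = d))["m"]
  unname(a * b)
}

B <- 500   # number of bootstrap replicates
boot_est <- replicate(B, {
  idx <- sample.int(n, size = n, replace = TRUE)  # resample subjects with replacement
  med_effect(dat[idx, ])
})

est <- med_effect(dat)                               # point estimate
se  <- sd(boot_est)                                  # bootstrap standard error
ci  <- quantile(boot_est, probs = c(0.025, 0.975))   # percentile 95% CI
c(estimate = est, se = se, lower = ci[[1]], upper = ci[[2]])
```

Resampling whole subjects, rather than individual observations, keeps each participant's treatment, mediator, and outcome values together.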

If plot = TRUE argument is passed, the results will also include the following figures:

  1. Alpha_CI plot for hat.alpha with 95% CIs across timeseq
  2. Gamma_CI plot for hat.gamma with 95% CIs across timeseq
  3. Tau_CI plot for hat.tau with 95% CIs across timeseq
  4. Beta_CI plot for hat.beta with 95% CIs across timeseq
  5. MedEff plot for est.M across timeseq
  6. MedEff_CI plot for est.M with 95% CIs across timeseq

We recommend using the plots to interpret your findings, as it may be difficult to derive meaning from the numerical values alone. To display a plot, use results$ followed by the name of the plot. For example, calling results$Alpha_CI will display the plot for hat.alpha with 95% CIs across the time sequence.

As discussed earlier, tvma accepts additional input arguments that allow the user to perform different analyses.

  1. t.est can be specified to select only certain time points at which to make the estimations.
  2. replicates can be set to any number to specify the number of bootstrap replicates to compute. The default value is 1000.

For example, specifying t.est and replicates in the following code will produce estimates at time points 0.2, 0.4, 0.6, and 0.8 using 500 bootstrap replicates.

results_new <- tvma(treatment, t.seq, mediator, outcome, t.est = c(0.2, 0.4, 0.6, 0.8), replicates = 500)

The tvma function computes bootstrap confidence intervals by default. If the user chooses not to bootstrap CIs for the mediation effect by specifying CI = "none" but also specifies replicates = 500 by mistake, the function will not raise an error; it will simply run without computing the CIs for the mediation effect. Note that the CIs for the individual effects of exposure on the mediator and of the mediator on the outcome are computed even if the user passes the argument CI = "none".

Summary

The tvmediation package provides a set of functions for estimating mediation effects that vary over time. This tool has wide application in human behavior research, clinical trials, addiction research, and other fields. By applying time varying mediation analysis to your data, you can characterize how an effect changes over time, as opposed to the traditional approach of estimating a single effect.

References

  1. Cai X, Coffman DL, Piper ME, Li R. Estimation and inference for the mediation effect in a time-varying mediation model. BMC Med Res Methodol. 2022;22(1):1-12. doi:10.1186/s12874-022-01585-x

  2. Baker TB, Piper ME, Stein JH, et al. Effects of Nicotine Patch vs Varenicline vs Combination Nicotine Replacement Therapy on Smoking Cessation at 26 Weeks: A Randomized Clinical Trial. JAMA. 2016;315(4):371. doi:10.1001/jama.2015.19284

  3. Efron B, Tibshirani R. Bootstrap Methods for Standard Errors, Confidence Intervals, and Other Measures of Statistical Accuracy. Statistical Science. 1986;1(1):54-75. doi:10.1214/ss/1177013815