Perform a Monte Carlo power analysis for the random intercept cross-lagged panel model (RI-CLPM) and the stable trait autoregressive trait state model (STARTS). This function computes performance metrics such as bias, mean square error, coverage, and power for all model parameters, and can perform power analyses across multiple experimental conditions simultaneously. Conditions are defined in terms of sample size, number of time points, proportion of between-unit variance (ICC), and indicator reliability. See "Details" for information on (a) internal data simulation, (b) internal model estimation, (c) powRICLPM's naming conventions for parameters, (d) parallel execution capabilities for speeding up the analysis, and (e) various extensions, such as including measurement errors in data generation and estimation (i.e., the STARTS model) and imposing various constraints over time.
Usage
powRICLPM(
  target_power = 0.8,
  search_lower = NULL,
  search_upper = NULL,
  search_step = 20,
  sample_size = NULL,
  time_points,
  ICC,
  RI_cor,
  Phi,
  within_cor,
  reliability = 1,
  skewness = 0,
  kurtosis = 0,
  estimate_ME = FALSE,
  significance_criterion = 0.05,
  alpha = NULL,
  reps = 20,
  bootstrap_reps = NULL,
  seed = NA,
  constraints = "none",
  bounds = FALSE,
  estimator = "ML",
  save_path = NULL,
  software = "lavaan"
)
Arguments
- target_power
A numeric value between 0 and 1, denoting the targeted power level.
- search_lower
A positive integer, denoting the lower bound of a range of sample sizes.
- search_upper
A positive integer, denoting the upper bound of a range of sample sizes.
- search_step
A positive integer, denoting an increment in sample size.
- sample_size
(optional) An integer (vector), indicating specific sample sizes at which to evaluate power, rather than specifying a range using the search_* arguments.
- time_points
An integer (vector) with elements of at least 3, indicating the number of time points.
- ICC
A double (vector) with elements between 0 and 1, denoting the proportion of (true score) variance at the between-unit level. When measurement error is included in the data generating model, ICC is computed as the variance of the random intercept factor divided by the true score variance (i.e., controlled for measurement error).
- RI_cor
A double between 0 and 1, denoting the correlation between random intercepts.
- Phi
A matrix, with standardized autoregressive effects (on the diagonal) and cross-lagged effects (off-diagonal) in the population. Columns represent predictors and rows represent outcomes.
- within_cor
A double between 0 and 1, denoting the correlation between the within-unit components.
- reliability
(optional) A numeric vector with elements between 0 and 1, denoting the reliability of the variables (see "Details").
- skewness
(optional) A numeric value, denoting the skewness of the observed variables (see simulateData).
- kurtosis
(optional) A numeric value, denoting the excess kurtosis (i.e., compared to the kurtosis of a normal distribution) of the observed variables (see simulateData).
- estimate_ME
(optional) A logical, denoting if measurement error variance should be estimated in the RI-CLPM (see "Details").
- significance_criterion
(optional) A double, denoting the significance criterion.
- alpha
(don't use) Deprecated, use significance_criterion instead.
- reps
A positive integer, denoting the number of Monte Carlo replications to be used during simulations.
- bootstrap_reps
(superseded) Uncertainty regarding simulation estimates is now computed analytically based on Morris et al. (2017). This argument is not used anymore.
- seed
An integer of length 1. If multiple cores are used, the seed is used to generate a full L'Ecuyer-CMRG seed for all cores.
- constraints
(optional) A character string, specifying the type of constraints that should be imposed on the estimation model (see "Details").
- bounds
(optional) A logical, denoting if bounded estimation should be used for the latent variable variances in the model (see "Details").
- estimator
(optional) A character string of length 1, denoting the estimator to be used (default: ML, see "Details").
- save_path
A character string of length 1, naming the directory to save (data) files to (used for validation purposes of this package). Variables are saved in alphabetical and numerical order.
- software
A character string of length 1, naming which software to use for simulations; either "lavaan" or "Mplus" (see "Details").
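As a rough illustration of how the vector-valued arguments define experimental conditions, the base R sketch below builds a grid of conditions from the search_* arguments and the time_points, ICC, and reliability vectors. It assumes that all factors are fully crossed; the object names sample_sizes and conditions are hypothetical and are not part of the package.

```r
# Values mirroring the search_* arguments of powRICLPM()
search_lower <- 500
search_upper <- 700
search_step  <- 100

# Candidate sample sizes implied by the search_* arguments
sample_sizes <- seq(search_lower, search_upper, by = search_step)

# Assumption: conditions are formed by fully crossing all factors
conditions <- expand.grid(
  sample_size = sample_sizes,
  time_points = c(3, 4),
  ICC         = c(0.4, 0.6),
  reliability = c(1, 0.8)
)

# Number of experimental conditions under this assumption
nrow(conditions)
```

Under this assumption, each additional value in any of these vectors multiplies the number of conditions, so the total run time grows quickly; this is worth checking before choosing reps.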
Value
An object of class powRICLPM, upon which summary(), print(), and plot() can be used. The returned object is a list with a conditions and a session element. conditions is itself a list of experimental conditions, where each element is again a list containing the input and output of the power analysis for that particular experimental condition. session is a list containing information common to all experimental conditions.
Details
A rationale for the power analysis strategy implemented in this package can be found in Mulder (2023).
Data Generation
Data are generated using simulateData from the lavaan package. Based on Phi and within_cor, the residual variances and covariances for the within-components at wave 2 and later are computed such that the within-components themselves have a variance of 1. This implies that the lagged effects in Phi can be interpreted as standardized effects.
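The computation described above can be sketched in base R as follows. This is an illustration under the assumption that the within-component covariance matrix is the same at every wave (unit variances, correlation within_cor); it is not necessarily how the package computes these values internally. The names Sigma_w and Psi are hypothetical.

```r
# Population lagged effects (rows are outcomes, columns are predictors)
Phi <- matrix(c(.4, .1, .2, .3), ncol = 2, byrow = TRUE)
within_cor <- 0.3

# Covariance matrix of the within-components: unit variances, correlation within_cor
Sigma_w <- matrix(c(1, within_cor, within_cor, 1), ncol = 2)

# Residual (co)variances at wave 2 and later, chosen such that
# Phi %*% Sigma_w %*% t(Phi) + Psi reproduces Sigma_w
Psi <- Sigma_w - Phi %*% Sigma_w %*% t(Phi)
Psi
```

If Psi is not positive definite for a given Phi and within_cor, the requested lagged effects are too large to be compatible with within-component variances of 1 under this assumption.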
Model Estimation using lavaan
When software = "lavaan" (default), generated data are analyzed using lavaan from the lavaan package. The default estimator is maximum likelihood (ML). Other maximum likelihood based estimators implemented in lavaan can be specified as well. When skewed or kurtotic data are generated (using the skewness and kurtosis arguments), the estimator defaults to robust maximum likelihood (MLR). The population parameter values are used as starting values.
Parameter estimates from non-converged model solutions are discarded from the results. When bounds = FALSE, inadmissible parameter estimates from converged solutions (e.g., a negative random intercept variance) are discarded. When bounds = TRUE, inadmissible parameter estimates are retained following advice by De Jonckere and Rosseel (2022). The results include the minimum estimates for all parameters across replications to help diagnose which parameter(s) might be the cause of an inadmissible solution.
Using Mplus
When software = "Mplus", Mplus input files will be generated and saved into save_path. Note that it is not possible to generate skewed or kurtotic data in Mplus via the powRICLPM package. Furthermore, bounded estimation is not available in Mplus. Therefore, the skewness, kurtosis, and bounds arguments will be ignored when software = "Mplus".
Naming Conventions Observed and Latent Variables
The observed variables in the RI-CLPM are given default names, namely capital letters in alphabetical order, with numbers denoting the measurement occasion. For example, for a bivariate RI-CLPM with 3 time points, we observe A1, A2, A3, B1, B2, and B3. Their within-components are denoted by wA1, wA2, ..., wB3, respectively. The between-components have RI_ prepended to the variable name, resulting in RI_A and RI_B.
Parameters are denoted using lavaan model syntax (see the lavaan website). For example, the random intercept variances are denoted by RI_A~~RI_A and RI_B~~RI_B, the cross-lagged effects at the first wave as wB2~wA1 and wA2~wB1, and the autoregressive effects as wA2~wA1 and wB2~wB1. Use give(object, "names") to extract parameter names from the powRICLPM object.
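The naming scheme can be reproduced with a few lines of base R, which may help when constructing parameter names programmatically. This sketch only illustrates the conventions described above for a bivariate model with 3 time points; the object names are hypothetical, and give(object, "names") remains the way to obtain the names actually used in a powRICLPM object.

```r
time_points <- 3
vars <- LETTERS[1:2] # "A" and "B" for a bivariate model

# Observed variables: A1, A2, A3, B1, B2, B3
observed <- as.vector(t(outer(vars, seq_len(time_points), paste0)))

# Within-components (prefix "w") and random intercepts (prefix "RI_")
within <- paste0("w", observed)
RI <- paste0("RI_", vars)

# Random intercept variances in lavaan syntax
RI_var <- paste0(RI, "~~", RI)

# Lagged effects from wave 1 to wave 2: outcome ~ predictor
lagged <- expand.grid(outcome = vars, predictor = vars, stringsAsFactors = FALSE)
lagged_names <- paste0("w", lagged$outcome, 2, "~", "w", lagged$predictor, 1)

observed
RI_var
lagged_names
```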
Parallel Processing and Progress Bar
To speed up the analysis, power analysis for multiple experimental conditions can be executed in parallel. This has been implemented using future. By default the analysis is executed sequentially (i.e., single-core). Parallel execution (i.e., multicore) can be set up using plan, for example plan(multisession, workers = 4). For more information and options, see https://future.futureverse.org/articles/future-1-overview.html#controlling-how-futures-are-resolved.
A progress bar displaying the status of the power analysis has been implemented using progressr. By default, a simple progress bar will be shown. For more information on how to control this progress bar and several other notification options (e.g., auditory notifications), see https://progressr.futureverse.org.
Extension: Measurement Errors (STARTS model)
Adding measurement error to the RI-CLPM makes the model equivalent to the bivariate STARTS model by Kenny and Zautra (2001) without constraints over time. Measurement error can be added to the generated data through the reliability argument. Setting the reliability argument to 0.8 implies that 80 percent of the observed variance is true score variance and 20 percent is measurement error variance. ICC then denotes the proportion of true score variance captured by the random intercept factors. Estimating measurement errors (i.e., the STARTS model) is done by setting estimate_ME = TRUE.
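The arithmetic implied by these definitions can be worked through in base R. The sketch below assumes a within-component variance of 1 (as stated under "Data Generation") and derives the remaining variances from the definitions of ICC and reliability; the variable names are hypothetical, and the package's internal computation may be organized differently.

```r
ICC <- 0.5
reliability <- 0.8
var_within <- 1 # within-components have a variance of 1

# ICC = var_RI / (var_RI + var_within), solved for var_RI
var_RI <- ICC / (1 - ICC) * var_within

# True score variance is the sum of the between and within parts
var_true <- var_RI + var_within

# reliability = var_true / (var_true + var_ME), solved for var_ME
var_ME <- var_true * (1 - reliability) / reliability

c(var_RI = var_RI, var_true = var_true, var_ME = var_ME)
```

With these values, the true score variance is 2 and the measurement error variance is 0.5, so that 2 / 2.5 = 0.8 of the observed variance is true score variance.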
Extension: Imposing Constraints
The following constraints can be imposed on the estimation model using the constraints = "..." argument:
- lagged: Time-invariant autoregressive and cross-lagged effects.
- residuals: Time-invariant residual variances.
- within: Time-invariant lagged effects and residual variances.
- stationarity: Constraints such that at the within-unit level a stationary process is estimated. This includes time-invariant lagged effects, and constraints on the residual variances.
- ME: Time-invariant measurement error variances. Only possible when estimate_ME = TRUE.
Extension: Bounded Estimation
Bounded estimation is useful to avoid nonconvergence in small samples. Here, automatic wide bounds are used as advised by De Jonckere and Rosseel (2022); see optim.bounds in lavOptions. This option can only be used when no constraints are imposed on the estimation model.
References
De Jonckere, J., & Rosseel, Y. (2022). Using bounded estimation to avoid nonconvergence in small sample structural equation modeling. Structural Equation Modeling, 29(3), 412-427. doi:10.1080/10705511.2021.1982716
Kenny, D. A., & Zautra, A. (2001). Trait–state models for longitudinal data. New methods for the analysis of change (pp. 243–263). American Psychological Association. doi:10.1037/10409-008
Mulder, J. D. (2023). Power analysis for the random intercept cross-lagged panel model using the powRICLPM R-package. Structural Equation Modeling. doi:10.1080/10705511.2022.2122467
See also
- summary.powRICLPM: Summarize the setup of a powRICLPM object.
- give: Extract information from powRICLPM objects.
- plot.powRICLPM: Visualize results of a powRICLPM object for a specific parameter.
Author
Jeroen D. Mulder j.d.mulder@uu.nl
Examples
# Define population parameters for lagged effects
Phi <- matrix(c(.4, .1, .2, .3), ncol = 2, byrow = TRUE)
# (optional) Set up parallel computing (i.e., multicore, speeding up the analysis)
library(future)
library(progressr)
future::plan(multisession, workers = 6)
if (FALSE) { # \dontrun{
  # Run analysis (`reps` is small, because this is an example)
  with_progress({
    out_preliminary <- powRICLPM(
      target_power = 0.8,
      search_lower = 500,
      search_upper = 700,
      search_step = 100,
      time_points = c(3, 4),
      ICC = c(0.4, 0.6),
      reliability = c(1, 0.8),
      RI_cor = 0.3,
      Phi = Phi,
      within_cor = 0.3,
      reps = 100,
      seed = 1234
    )
  })
} # }