This vignette details how you can set up and execute a basic power
analysis for a bivariate random intercept cross-lagged panel model
(RI-CLPM) using the powRICLPM
package. Throughout, an
illustrating example will be used in which we wish to detect a small
cross-lagged effect
(defined here as the effect of
to
,
where
and
denote the latent within-unit components of
and
,
respectively) of 0.2 (standardized). For the design of our power
analysis we follow the steps in the strategy as described in Mulder (2023). Various extensions are available
for this basic power analysis, and are described in the Vignette Extensions.
Step 1: Determine experimental conditions
Before performing the power analysis, you must first determine the experimental conditions of interest. Experimental conditions (or: simulation conditions) are defined by characteristics of the study design that can impact statistical power. This includes, among others, characteristics like the sample size and the number of repeated measures. Decide on the number of repeated measures that will be used in the simulations, as well as the range of sample sizes over which you want to simulate the power.
For this example, we take a sample size range from 100 to 1000 first, increasing with steps of 100. Let the numbers of repeated measures range from 3 to 5. If these experimental conditions do not lead to the desired amount of power for detecting the small cross-lagged effect, the ranges can be extended later.
Step 2: Choose population parameter values
Next, determine population parameter values for generating data from the RI-CLPM. This requires the specification of:
-
Phi
: Standardized autoregressive and cross-lagged effects for the within-unit components of the model. These values are collected in a matrix, with columns representing predictors and rows representing outcomes. -
within_cor
: A correlation for the within-unit components. -
ICC
: The proportion of variance at the between-unit level (relative to the total variance). -
RI_cor
: The correlation between the random intercepts.
For our example, the parameter values are set to:
Phi <- matrix(c(.4, .1, .2, .3), ncol = 2, byrow = T)
# The .2 refers to our standardized cross-lagged effect of interest
within_cor <- 0.3
ICC <- 0.5
RI_cor <- 0.3
If you are unsure if you have specified the Phi
matrix
as intended, you can use the check_Phi()
function to give
you a summary of how the effects in your Phi
are
interpreted.
# Check `Phi` argument
check_Phi(Phi)
## According to this `Phi`, the lagged effects are:
## • Autoregressive effect of A: 0.4
## • Autoregressive effect of B: 0.3
## • Cross-lagged effect of A -> B: 0.2
## • Cross-lagged effect of B -> A: 0.1
Steps 3-5: Perform the power analysis
Steps 3 to 5 are automated by the powRICLPM()
function.
As input, you must provide:
- the desired power level using the
target_power
argument, - the range of sample sizes to simulate the power for using the
search_lower
,search_upper
, andsearch_step
arguments (alternatively, you can specify this directly by providing a vector of sample sizes to thesample_size
argument), - the number of time points for the simulated data using the
time_points
argument, - the population values
Phi
,within_cor
,ICC
, andRI_cor
, and - the number of Monte Carlo replications we want to perform per
experimental condition in the
reps
argument.
You can optionally specify:
-
alpha
: A numeric value denoting the significance criterion (default: 0.05). -
seed
: An integer to control the starting point of the random number generator. This is important to use if you want to replicate the results. When no seed it specified, a random seed will be generated and reported back to you.
Options to extend this basic power analysis setup are described in the Vignette Extensions
Now, we can perform the power analysis by running:
# Set number of replications
n_reps <- 100
output <- powRICLPM(
target_power = 0.8,
search_lower = 500,
search_upper = 1000,
search_step = 50,
time_points = c(3, 4),
ICC = ICC,
RI_cor = RI_cor,
Phi = Phi,
within_cor = 0.3,
reps = n_reps
)
Parallel processing using future
Performing a Monte Carlo power analysis with a large number of
replications, and across multiple experimental conditions can be
time-consuming. To speed up the process, it is recommended to perform
the power analysis across simulation conditions in parallel
(i.e., on multiple cores). To this end, the powRICLPM()
function has implemented future
’s parallel processing
capabilities.
Load the future
package, and use its plan()
function to change the power analysis execution from sequential
(i.e., single-core, the default), to multisession (i.e.,
multicore). Use the workers
argument to specify how many
cores you want to use. Next, run the powRICLPM
analysis,
and the power analysis will run on the specified number of cores. This
can result in a significant reduction of computing time. For more
information on other parallel execution strategies using futures, see
?future::plan()
.
Progress bar using progressr
It can be useful to get an approximation of the progress of the
powRICLPM
analysis while running the code, especially when
running the analysis in parallel. powRICLPM()
has
implemented progress notifications using the progressr
package. Simply put, there are two options through which you can get
progress notifications:
- You can subscribe to progress updates from a specific expression by
wrapping this expression with
with_progress({...})
. - You can subscribe to progress updates from everywhere by running
handlers(global = T)
.
Implementing the with_progress({...})
option, as well as
parallel execution of the powRICLPM
analysis, results in
the below code for the example:
# Load `future` and `progressr` packages
library(future)
library(progressr)
# Check how many cores are available
future::availableCores()
# Plan powRICLPM analysis to run on 1 core less than number of available cores
plan(multisession, workers = 7) # For the case of 8 available cores
# Run the powRICLPM analysis
with_progress({ # Subscribe to progress updates
output <- powRICLPM(
target_power = 0.8,
search_lower = 500,
search_upper = 1000,
search_step = 50,
time_points = c(3, 4),
ICC = ICC,
RI_cor = RI_cor,
Phi = Phi,
within_cor = 0.3,
reps = n_reps
)
})
# Revert back to sequential execution of code
plan(sequential)
For more information about progress notification options using
progressr
for end-users, including auditory and email
updates, see https://progressr.futureverse.org.
Step 6: Summarize results
The powRICLPM()
function creates a
powRICLPM
object: A list with results, upon which we can
call print()
, summary()
, give()
,
and plot()
functions to print, summarize, extract results,
and visualize the results, respectively.
print()
outputs a textual summary of the power analysis
design contained within the object it was called upon. It does not
output any performance metrics computed by the power analysis.
summary()
can be used in one of four ways. First,
summary can be used simply like print()
to get information
about the design of the power analysis (the different experimental
conditions), as well as the number of problems the occurred per
condition (e.g., non-convergence, fatal estimation errors, or
inadmissible results). Second, by specifying the
parameter = "..."
argument in summary()
, the
function will print the results specifically for that parameter across
all experimental conditions. Third, if you specify a specific
experimental condition using summary()
’s
sample_size
, time_points
, ICC
and
reliability
arguments, performance measures are outputted
for all parameters in that experimental condition.
The interpretation of the various performance measures available is
explained in the function documentation ?summary.powRICLPM()
.
# Summary of study design
summary(output)
# Summary of results for a specific parameter, across simulation conditions
summary(output, parameter = "wB2~wA1")
# Summary of all parameter for a specific simulation condition
summary(output, sample_size = 500, time_points = 4, ICC = 0.5, reliability = 1)
give()
extracts various bits of information from an
powRICLPM
object. The exact information to be extracted is
given by the what = "..."
argument:
-
what = "conditions"
gives the different experimental conditions per row, where each condition is defined by a unique combination of sample size, number of time points and ICC. -
what = "estimation_problems"
gives the proportion of fatal errors, inadmissible values, or non-converged estimations (columns) per experimental conditions (row). -
what = "results"
gives the average estimateaverage
, minimum estimateminimum
, standard deviation of parameter estimatesSD
, the average standard errorSEavg
, the mean square errorMSE
, the average width of the confidence intervalaccuracy
, the coverage ratecoverage
, and the proportion of times the p-value was lower than the significance criterionpower
. It requires setting theparameter = "..."
argument. -
what = "names"
gives the parameter names contained within thepowRICLPM
object.
# Extract experimental conditions
give(output, what = "conditions")
# Extract estimation problems
give(output, what = "estimation_problems")
# Extract results for cross-lagged effect "wB2~wA1"
give(output, what = "results", parameter = "wB2~wA1")
# Extract parameter names
give(output, what = "names")
Finally, plot()
creates a ggplot2
-plot for
a specific parameter (specified using the parameter = "..."
argument) with sample size on the x-axis, the simulated power on the
y-axis, lines grouped by number of time-points, and plots wrapped by
proportion of between-unit variance. plot()
returns a
ggplot2
object that can be fully customized using
ggplot2
functionality. For example, you can change the
scales, add titles, change geoms, etc. More information about options in
the ggplot2
framework can be found at https://ggplot2-book.org/index.html.
In the below example, I add a title and change the labels on the
x-axis:
# Create basic plot of powRICLPM object
p <- plot(output, parameter = "wB2~wA1")
p
# Adjust plot aesthetics
p2 <- p +
ggplot2::labs(
title = "Power analysis for RI-CLPM",
caption = paste0("Based on ", n_reps, " replications.")
) +
ggplot2::scale_color_discrete("Time points") +
ggplot2::guides(
color = ggplot2::guide_legend(title = "Time points", nrow = 1),
shape = ggplot2::guide_legend(title = "Reliability", nrow = 1),
fill = "none"
) +
ggplot2::scale_x_continuous(
name = "Sample size",
breaks = seq(500, 1000, 50),
guide = ggplot2::guide_axis(n.dodge = 2)
)
p2