# Clear work space and load necessary packages
rm(list = ls())
library(tidyverse) # General data handling and manipulation
library(TAM) # Item response theory modeling
library(mirt) # Simulate data from a multivariate normal distribution
library(MASS) # Simulate data from a multivariate normal distribution
# Set the seed for random number generation to ensure reproducibility
set.seed(2024)
# Number of items
<- 30
n_items
# Discrimination parameters (a_i) are randomly drawn from a log-normal
# distribution, as it reflects the natural variation and positive skew
# typically observed in empirical data.
<- rlnorm(n_items,.2,.3)
a_i
# Difficulty parameters (b_i) are randomly drawn from a normal distribution
<- rnorm(n_items, 0, 1) b_i
Objective
The example describes the validation of a newly developed computerized personality test with a forced-choice response format comparable to the Eysenck Personality Inventory (e.g., “Do you prefer reading to going out?”, yes/no). The test is assumed to contain 30 items. A random sample of the personality test items is drawn and the correlation with an external metric criterion (e.g., number of Facebook friends) is estimated. The parameter of interest is the standard error of the correlation between the latent trait and the criterion, which depends on both the sample size and the amount of missingness.
I. Determining the data generation for the complete dataset
- Number and distribution of factors (unidimensional vs. multidimensional)
- Number of items and item parameters (discriminations, difficulties)
- Item type (dichotomous, polytomous)
The true item parameters for the 30 dichotomous items are generated.
The function generate_mv_data
uses mirt::simdata
to simulate dichotomous data given the item discriminations (\(a\)), the item difficulties (\(b\)), and the trait distribution (\(\theta\)). In contrast to the first example, individuals’ abilities (\(\theta\)) are generated according to a multivariate normal distribution with a correlation of \(\rho = 0.5\) between the latent trait and the criterion.
Note that the mirt
package uses slope-intercept parameterization, so the item difficulty parameters (\(b\)) must be transformed into the item intercepts \(d_i = -a_i*b_i\). The transformation between the traditional IRT parameters (item discrimination and item difficulty) and the slope-intercept parameters can also be done with mirt::traditional2mirt
.
# Simulate dichotomous item responses based on the generated theta values
# - 'a' denotes the item discriminations
# - 'b' denotes the item difficulties
# - 'theta' denotes the univariate trait distribution
<- function(a, b, theta) {
generate_mv_data <-
resp simdata(a = a, d = -a*b, Theta = theta[, 1], itemtype = "dich") %>%
as.data.frame()
return(resp)
}
II. Defining the test design and the process of missing values
- Pattern of missingness (e.g., type of missingness, linking design)
- Amount of missing data
The items of the computer-administered personality test are randomly drawn from a larger item pool. This kind of missingness, where the probability of missing data on a variable is independent of both observed and unobserved data, is known as Missing Completely At Random (MCAR). The function data_MCAR_design
uses the complete simulated data and deletes observations under the assumption of MCAR.
# Induce missingness completely at random (MCAR) to the complete simulated data set
# - 'resp' denotes the complete data set
# - 'miss_rates' denotes the amount of missingness [0;1]
<- function(resp, miss_rates = 0) {
data_MCAR_design <- t(apply(resp, 1, function(row) {
resp sample(seq_along(row), length(row) * miss_rates)] <- NA;
row[return(row)
}))
return(resp)
}
III. Selecting the IRT model and the parameter of interest
- Underlying IRT model (e.g., 1PL, 2PL)
- IRT modeling software and estimation method
- Parameters to extract
The tam.mml.2pl
function from the TAM
package is used to estimate the two-parametric item response model (Birnbaum model) using the simulated data. The function returns the standard error of the correlation between the latent trait and the metric covariate. Note that this example uses TAM
because the criterion can be included in a latent regression model, and the standard error of the regression coefficient can be extracted directly. Alternatively, a bootstrapped standard error could be obtained via mirt
using the R package boot
.
# Estimate item response model
# - 'resp' denotes the data set
<- function(resp) {
estimate_irt
# Estimate a 2PL model using try-catch to handle errors, the 2nd dimension of 'theta'
# is included as an additional predictor variable.
<- tryCatch(
mod tam.mml.2pl(resp = resp, # 2PL model
Y = data.frame(theta[, 2]), # matrix of covariates in latent regression
verbose = FALSE),
error = function(e) NULL
)
# Calculate standard errors for the correlation if model is successfully estimated
if (!is.null(mod)) {
<- tam.se(mod)$beta[2, 2] # Extract the standard error of the correlation
est else {
} <- NA
est
}
return(est)
}
IV. Setting up the Monte Carlo Simulation
- Number of iterations
- Sample sizes to evaluate
The Monte Carlo simulation runs n_iterations
times, including the previous steps of (i) determining the data generation for the complete dataset, (ii) defining the test design and the process of missing values, (iii) selecting the IRT model and the parameter of interest. The simulation is run for different sample sizes between 200 and 700 (in increments of 50) and three different missing rates (0%, 33%, 67%).
The standard deviation of the SE of the correlation derived from 500 iterations was low (\(\sigma = .0052\)), also in the most demanding condition (\(n\) = 200, missing rate = 67%). Combined with a specified level of accuracy (\(\delta = .001\)) and a significance level (\(\alpha = .05\)), this implies a number of required iterations of approximately 104. Therefore, we use 104 iterations for the different sample sizes and missing rates.
# Number of iterations
<- 104
n_iterations
# Sample sizes (from 200 to 700, in steps of 50) and three missing rates
<- expand.grid(n_persons = seq(200, 700, 50),
grid miss_rates = c(0.67, 0.33, 0.00))
# Create data frame for results (res)
<- data.frame()
res
# Check if result file already exists
if (file.exists("example_2_res.rds")) {
<- readRDS("example_2_res.rds")
res else {
} # Loop over different sample sizes and missing rate combinations
for (g in 1:nrow(grid)) {
# Nested loop, running 'n_iterations' times
for (iter in 1:n_iterations) {
# Generate a random sample with 'n_persons' from a multivariate normal distribution
# - 'mu' denotes the mean vector for the multivariate normal distribution
# - 'Sigma' denotes the covariance matrix with a correlation of 0.5 between dimensions
<- mvrnorm(n = grid[g, "n_persons"],
theta mu = c(0, 0),
Sigma = matrix(c(1, 0.5, 0.5, 1), nrow = 2))
<- generate_mv_data(a_i, b_i, theta) %>%
dat data_MCAR_design(miss_rates = grid[g, "miss_rates"])
<- bind_rows(res,
res data.frame(iteration = iter,
n_persons = grid[g, "n_persons"],
percent_miss = grid[g, "miss_rates"],
se_est = estimate_irt(dat)))
}
}
# Save the results
saveRDS(res, file = "example_2_res.rds")
}
Results
The figure shows the average standard error for the correlation between the latent trait and an external criterion as a function of sample size on one hand, and the amount of missingness on the other.
# Preparation and aggregation of results
<- res %>%
res_plot mutate(percent_miss_dec = sprintf("%.2f", percent_miss)) %>% group_by(n_persons, percent_miss_dec) %>%
summarise(
m_SE = mean(se_est^2, na.rm = TRUE)^.5, # Calculate the average SE
.groups = 'drop')
# Plot standard error of correlation
ggplot(data=res_plot, aes(x=n_persons, y = m_SE, color = factor(percent_miss_dec))) +
geom_line(linewidth = 0.8, alpha = 0.8) +
geom_point(size = 1.5, alpha = 0.8) +
geom_abline(intercept = .05, slope = 0, col="red", lty = "twodash") +
labs(
x = "Sample size",
y = "Average SE(Correlation)",
color = "Percentage\nmissingness",
linetype = "Percentage\nmissingness"
+
) scale_color_manual(values = c("0.00" = "darkolivegreen",
"0.33" = "aquamarine3",
"0.67" = "darkolivegreen2")) +
ylim(0, 0.15) +
xlim(200, 700) +
theme_bw() +
theme(
axis.title = element_text(size = 14),
axis.text = element_text(size = 12),
legend.title = element_text(size = 14),
legend.text = element_text(size = 12),
legend.position = "inside",
legend.position.inside = c(.85, .80)
)
# Documentation for transparency and reproducibility
print(sessionInfo(), locale=FALSE)
R version 4.4.1 (2024-06-14 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26100)
Matrix products: default
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] MASS_7.3-60.2 mirt_1.42.6 lattice_0.22-6 TAM_4.2-21
[5] CDM_8.2-6 mvtnorm_1.3-0 lubridate_1.9.3 forcats_1.0.0
[9] stringr_1.5.1 dplyr_1.1.4 purrr_1.0.2 readr_2.1.5
[13] tidyr_1.3.1 tibble_3.2.1 ggplot2_3.5.1 tidyverse_2.0.0
loaded via a namespace (and not attached):
[1] tidyselect_1.2.1 farver_2.1.2 R.utils_2.12.3
[4] fastmap_1.2.0 digest_0.6.37 timechange_0.3.0
[7] lifecycle_1.0.4 Deriv_4.1.6 dcurver_0.9.2
[10] cluster_2.1.6 magrittr_2.0.3 compiler_4.4.1
[13] rlang_1.1.4 tools_4.4.1 utf8_1.2.4
[16] yaml_2.3.10 knitr_1.49 labeling_0.4.3
[19] htmlwidgets_1.6.4 curl_5.2.2 withr_3.0.2
[22] R.oo_1.27.0 grid_4.4.1 fansi_1.0.6
[25] colorspace_2.1-1 future_1.34.0 progressr_0.15.0
[28] GPArotation_2024.3-1 globals_0.16.3 scales_1.3.0
[31] cli_3.6.3 rmarkdown_2.29 vegan_2.6-8
[34] generics_0.1.3 rstudioapi_0.16.0 future.apply_1.11.3
[37] SimDesign_2.17.1 tzdb_0.4.0 sessioninfo_1.2.2
[40] pbapply_1.7-2 audio_0.1-11 splines_4.4.1
[43] parallel_4.4.1 beepr_2.0 vctrs_0.6.5
[46] Matrix_1.7-0 jsonlite_1.8.9 polycor_0.8-1
[49] hms_1.1.3 listenv_0.9.1 testthat_3.2.1.1
[52] snow_0.4-4 glue_1.8.0 parallelly_1.38.0
[55] admisc_0.35 codetools_0.2-20 stringi_1.8.4
[58] gtable_0.3.6 munsell_0.5.1 pillar_1.9.0
[61] htmltools_0.5.8.1 brio_1.1.5 R6_2.5.1
[64] evaluate_1.0.1 R.methodsS3_1.8.2 RPushbullet_0.3.4
[67] Rcpp_1.0.13-1 gridExtra_2.3 nlme_3.1-164
[70] permute_0.9-7 mgcv_1.9-1 xfun_0.49
[73] pkgconfig_2.0.3