# Clear work space and load necessary packages
rm(list = ls())
library(tidyverse) # General data handling and manipulation
library(mirt) # Simulate data from a multivariate normal distribution
library(mice) # Induce missingness at random to the data
# Set the seed for random number generation to ensure reproducibility
set.seed(2024)
# Number of items
<- 30
n_items
# Discrimination parameters for both dimensionas (a_i1, a_i2) are randomly
# drawn from a log-normal as it reflects the natural variation and
# positive skew typically observed in empirical data.
<- c(rlnorm(n_items/2,.2,.3), rep(NA, n_items/2))
a_i1 <- c(rep(NA, n_items/2), rlnorm(n_items/2,.2,.3))
a_i2 <- cbind(a_i1, a_i2) %>% as.matrix()
a_i
# Difficulty parameters (b_i) are randomly drawn from a normal distribution
<- rnorm(n_items, 0, 1) b_i
Objective
The example describes the examination of the relationship between two achievement tests: A 20-item math test and a 20-item reading test in secondary school. Both tests are scored dichotomously (true/false) with a correlation of \(\rho = .70\) between the skills. The parameter of interest is the standard error of the estimated correlation between math and reading. The test design features to be varied are the sample size and if there is a speededness effect present. In this example, we will use a speededness effect that will induce some missingness in the last third of the items.
I. Determining the data generation for the complete dataset
- Number and distribution of factors (unidimensional vs. multidimensional)
- Number of items and item parameters (discriminations, difficulties)
- Item type (dichotomous, polytomous)
True item parameters are generated for the 30 dichotomous items on the two dimensions (math and reading).
The conditional probability that a person \(p\) with ability parameters on both dimensions, \(\theta_{p1}\) and \(\theta_{p2}\), answers the item \(i\) correctly is given by the following formula:
\[ P(X_{pi} = 1 | a_{i1}, a_{i2}, b_{i}, \theta_{p1}, \theta_{p2}) = \frac{1}{1 + \exp (-(a_{i1} \theta_{p1} + a_{i2} \theta_{p2} - b_i))} \]
where \(a_{i1}\) and \(a_{i2}\) denote the discrimination parameters for the two dimensions, and \(b_i\) the difficulty parameter of item \(i\).
The function generate_data_mv
uses mirt::simdata
to simulate dichotomous data given the item discriminations (\(a\)), the item difficulties (\(b\)), and the number of observations (\(n\)). The responses are generated according to a multivariate normal distribution with a predefined covariance (or correlation) matrix (\(\Sigma\)) and a mean vector (\(\mu\)). Note that in the mirt
package, the transformation \(d = -a*b\) is used to match the slope-intercept parameterization of the logistic model.
# Simulate dichotomous item responses based on the following parameters
# - 'a' denotes the item discriminations
# - 'b' denotes and the item difficulties
# - 'n' denotes the number of observations
# The remaining arguments are preset
# - 'sigma' denotes the covariance matrix
# - 'mu' denotes the mean vector
# - 'itemtype' specifies the type of item (i.e., dichotomous)
<- function(a, b, n) {
generate_data_mv <-
resp simdata(a = a, d = -a*b, N = n,
mu = c(0, 0),
sigma = matrix(c(1, 0.7, 0.7, 1), nrow = 2),
itemtype = "dich") %>%
as.data.frame()
return(resp)
}
II. Defining the test design and the process of missing values
- Pattern of missingness (e.g., type of missingness, linking design)
- Amount of missing data
Each achievement test consists of 15 items. There is a speededness effect for the last 5 items for each of the separately timed tests. Thus, later items in both tests are less likely to be answered (response rate item 11: ~82%, items 12-13: ~73%, and items 14-15: ~46%, which results from the response pattern and the missing rate). In the present case, the probability of missing data depends on the observed data only, which is equivalent to Missing At Random (MAR).
The function data_MAR_design
uses the complete simulated data and deletes values using mice::ampute
. For more information on the multivariate amputation procedure see Schouten et al. (2016) and the accompanying vignette.
# Induce missingness at random (MAR) to the complete simulated data set
# - 'resp' denotes the complete data set
# - 'prop' denotes the severity of missingness
# (has to be seen in combination with missingness pattern)
# The remaining arguments are preset
# - 'patterns' denotes a matrix defining a specific missing data patterns
# - 'mech' denotes the mechanism of missing data
# - 'type' denotes the type of missingness, 'RIGHT' means that later
# items more likely to be missing
<- function(resp, miss_rates = 0) {
data_MAR_design
if(miss_rates != 0) {
# create some pattern of missingness for the first test (only the last 5 items)
# adp <- ampute.default.patterns(5)
<- cbind(c(0,1,1,1,1),
adp c(1,0,1,1,1),
c(1,1,0,1,1),
c(1,1,1,0,1),
c(1,1,1,1,0))
<- rbind(
pattern cbind(matrix(1, nrow =5, ncol = 10), adp),
c(1,1,1,1,1,1,1,1,1,1,1,1,1,0,0),
c(1,1,1,1,1,1,1,1,1,1,1,1,0,0,0),
c(1,1,1,1,1,1,1,1,1,1,1,0,1,0,0),
c(1,1,1,1,1,1,1,1,1,1,0,1,1,0,0),
c(1,1,1,1,1,1,1,1,1,1,1,0,0,0,0))
# Resample rows of patterns of missingness for the second test
<- pattern[sample(10, 10, replace = FALSE), ]
pattern_res1 <- pattern[sample(10, 10, replace = FALSE), ]
pattern_res2
# Append the resampled pattern
<- unique(rbind(cbind(pattern, pattern_res1),
pattern_miss cbind(pattern, pattern_res2)))
# Apply ampute to create missing data under MAR
<- ampute(data = resp,
dat_amp prop = miss_rates, # severity of missingness
patterns = pattern_miss, # custom patterns of missingness
mech = "MAR", # missing mechanism
type = "RIGHT") # missingness type
<- dat_amp$amp
resp # md.pattern(resp)
}
return(resp)
}
III. Selecting the IRT model and the parameter of interest
- Underlying IRT model (e.g., 1PL, 2PL)
- IRT modeling software and estimation method
- Parameters to extract
The estimate_irt
function estimates the multidimensional two-parametric item response model (Birnbaum model) with mirt
using the simulated data and returns the standard error of the correlation between the latent traits.
# Estimate item response model
# - 'resp' denotes the data set
<- function(resp) {
estimate_irt <- paste('math = 1-', n_items/2, "\n",
model_2dim 'read = ', n_items/2+1, '-', n_items, "\n",
'COV = math * read ')
# Fit the two-dimensional model using mirt
<- tryCatch(
mod mirt(data = resp, # Item responses
model = model_2dim,
itemtype = '2PL', # 2PL model
SE = TRUE),
error = function(e) NULL
)
# Calculate standard errors for the correlation if model is successfully estimated
if (!is.null(mod)) {
<- coef(mod, printSE = TRUE)$GroupPars["SE", "COV_21"] # Extract the standard error of the correlation
est else {
} <- NA
est
}
return(est)
}
IV. Setting up the Monte Carlo simulation
- Number of iterations
- Sample sizes to evaluate
The Monte Carlo simulation runs n_iterations
times, including the previous steps of (i) determining the data generation for the complete dataset, (ii) defining the test design and the process of missing values, (iii) selecting the IRT model and the parameter of interest. The simulation is run for different sample sizes between 100 and 500 (in increments of 50), with and without the speededness effect.
Using an estimated standard deviation for the standard error of the correlation (\(\sigma = 0.013\)) derived from 500 iterations, a specified level of accuracy (\(\delta = .001\)), and a significance level (\(\alpha = .05\)), the number of iterations needed amounts to approximately 649.
# Number of iterations
<- 649
n_iterations
# Sample sizes (from 100 to 500, in steps of 50) and
# the presence of a speededness factor (yes/no)
<- expand.grid(n_persons = seq(100, 500, 50),
grid miss_rates = c(.90, 0))
# Create data frame for results (res)
<- data.frame()
res
# Check if result file already exists
if (file.exists("example_4_res.rds")) {
<- readRDS("example_4_res.rds")
res else {
} # Loop over different sample sizes and missing rate combinations
for (g in 1:nrow(grid)) {
# Nested loop, running 'n_iterations' times
for (iter in 1:n_iterations) {
# Generate a random sample with 'n_persons' from a multivariate distribution
<-
dat generate_data_mv(a_i, b_i, grid[g, "n_persons"]) %>%
data_MAR_design(miss_rates = grid[g, "miss_rates"])
<- bind_rows(res,
res data.frame(iteration = iter,
n_persons = grid[g, "n_persons"],
percent_miss = grid[g, "miss_rates"],
se_est = estimate_irt(dat)))
}
}
# Save the results
saveRDS(res, file = "example_4_res.rds")
}
Results
The figure shows the average standard error for the correlation between the latent traits as a function of sample size on the one hand, and the presence or absence of a speededness effect on the other.
# Preparation and aggregation of results
<- res %>%
res_plot group_by(n_persons, percent_miss) %>%
summarise(
m_SE = mean(se_est^2, na.rm = TRUE)^.5, # Calculate the average SE
.groups = 'drop')
# Plot standard error of correlation
ggplot(data=res_plot, aes(x=n_persons, y = m_SE, color = factor(percent_miss))) +
geom_line(linewidth = 0.8, alpha = 0.8) +
geom_point(size = 1.5, alpha = 0.8) +
geom_abline(intercept = .05, slope = 0, col="red", lty = "twodash") +
labs(
x = "Sample size",
y = "Average SE(Correlation)",
color = "Speededness\neffect",
linetype = "Speededness\neffect"
+
) scale_color_manual(values = c("0.9" = "aquamarine3",
"0" = "darkolivegreen2"),
labels = c("0.9" = "yes",
"0" = "no")) +
ylim(0, 0.12) +
xlim(100, 500) +
theme_bw() +
theme(
axis.title = element_text(size = 14),
axis.text = element_text(size = 12),
legend.title = element_text(size = 14),
legend.text = element_text(size = 12),
legend.position = "inside",
legend.position.inside = c(.85, .80)
)
# Documentation for transparency and reproducibility
print(sessionInfo(), locale=FALSE)
R version 4.4.1 (2024-06-14 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26100)
Matrix products: default
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] mice_3.17.0 mirt_1.42.6 lattice_0.22-6 lubridate_1.9.3
[5] forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4 purrr_1.0.2
[9] readr_2.1.5 tidyr_1.3.1 tibble_3.2.1 ggplot2_3.5.1
[13] tidyverse_2.0.0
loaded via a namespace (and not attached):
[1] tidyselect_1.2.1 farver_2.1.2 R.utils_2.12.3
[4] fastmap_1.2.0 rpart_4.1.23 digest_0.6.37
[7] timechange_0.3.0 lifecycle_1.0.4 Deriv_4.1.6
[10] dcurver_0.9.2 cluster_2.1.6 survival_3.6-4
[13] magrittr_2.0.3 compiler_4.4.1 rlang_1.1.4
[16] tools_4.4.1 utf8_1.2.4 yaml_2.3.10
[19] knitr_1.49 labeling_0.4.3 htmlwidgets_1.6.4
[22] curl_5.2.2 withr_3.0.2 R.oo_1.27.0
[25] nnet_7.3-19 grid_4.4.1 fansi_1.0.6
[28] jomo_2.7-6 colorspace_2.1-1 future_1.34.0
[31] progressr_0.15.0 GPArotation_2024.3-1 iterators_1.0.14
[34] globals_0.16.3 scales_1.3.0 MASS_7.3-60.2
[37] cli_3.6.3 rmarkdown_2.29 vegan_2.6-8
[40] generics_0.1.3 rstudioapi_0.16.0 future.apply_1.11.3
[43] SimDesign_2.17.1 tzdb_0.4.0 sessioninfo_1.2.2
[46] minqa_1.2.8 pbapply_1.7-2 audio_0.1-11
[49] splines_4.4.1 parallel_4.4.1 beepr_2.0
[52] vctrs_0.6.5 boot_1.3-30 glmnet_4.1-8
[55] Matrix_1.7-0 jsonlite_1.8.9 hms_1.1.3
[58] mitml_0.4-5 listenv_0.9.1 testthat_3.2.1.1
[61] foreach_1.5.2 snow_0.4-4 glue_1.8.0
[64] parallelly_1.38.0 pan_1.9 nloptr_2.1.1
[67] codetools_0.2-20 shape_1.4.6.1 stringi_1.8.4
[70] gtable_0.3.6 lme4_1.1-35.5 munsell_0.5.1
[73] pillar_1.9.0 htmltools_0.5.8.1 brio_1.1.5
[76] R6_2.5.1 evaluate_1.0.1 R.methodsS3_1.8.2
[79] backports_1.5.0 RPushbullet_0.3.4 broom_1.0.6
[82] Rcpp_1.0.13-1 gridExtra_2.3 nlme_3.1-164
[85] permute_0.9-7 mgcv_1.9-1 xfun_0.49
[88] pkgconfig_2.0.3