TLDR: If you can assume close fit for the RMSEA, there is no reason why you cannot do the same for a Chi-Square test in SEMs. The method is relatively simple, and may cause SEM practitioners to reconsider the Chi-Square test.
When assessing the fit of structural equation models, it is common for applied researchers to dismiss the \(\chi^2\) test because it will almost always detect a statistically significant discrepancy between your model and the data, given a large enough sample size. This is because, almost always, our models are approximations of the data. If our model-implied covariance matrix actually matched the sample covariance matrix within sampling variability, the \(\chi^2\) test would not be statistically significant regardless of sample size.
Because of the sensitivity of the \(\chi^2\) test to large sample sizes, practitioners often rely on other fit indices like the RMSEA, CFI, and TLI, all of which are based on the \(\chi^2\). For the RMSEA, MacCallum, Browne and Sugawara (1996)^{1} specified values of .05 and .08 as indicating close and mediocre^{2} fit respectively. And in lavaan, you automatically get a test of close fit for the RMSEA with confidence intervals and a p-value. This test actually uses the \(\chi^2\) distribution, and there is no reason why one cannot perform a \(\chi^2\) test of close or mediocre fit depending on one’s standards.^{3}
The sections that follow include details that not everyone will want to read; you can skip to the bottom of the page for annotated lavaan code that computes a \(\chi^2\) test of close or mediocre fit.
So the formula for the RMSEA is:
$$RMSEA = \sqrt{\frac{\chi^2 - df}{df(N - 1)}}$$
where \(\chi^2\) is the \(\chi^2\) test statistic of your model, \(df\) is your model degrees of freedom, and \(N\) is sample size.
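As a quick check of the formula, here is a minimal R sketch. The function name `rmsea_from_chisq` is my own and not part of lavaan, and truncating at zero when \(\chi^2 < df\) is the usual convention rather than something the formula itself says. The inputs are the values from the demonstration further down.

```r
# Hypothetical helper (not a lavaan function): RMSEA from a chi-square statistic.
# Assumes the N - 1 version of the formula shown above.
rmsea_from_chisq <- function(chisq, df, n) {
  # Truncate the numerator at zero so a chi-square below its df
  # does not lead to the square root of a negative number
  sqrt(max(chisq - df, 0) / (df * (n - 1)))
}

# Values from the demonstration below: chi-square = 42.291, df = 21, N = 301
rmsea_hs <- rmsea_from_chisq(chisq = 42.291, df = 21, n = 301)
round(rmsea_hs, 3)
```

Rounded to three decimals this gives 0.058, the RMSEA that lavaan reports for the same model.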
If your model fit the data perfectly, the numerator, \(\chi^2 - df\), is expected to be zero; this is the hypothesis the standard \(\chi^2\)-test tests. And to test this hypothesis, it uses the \(\chi^2\)-distribution. If we want to perform a test of close fit on the RMSEA, we do not assume a nil null distribution for the \(\chi^2\). Instead, we use the noncentral \(\chi^2\) distribution with a noncentrality parameter that corresponds to an RMSEA of .05. The idea is we accept some level of misspecification, and we use a distribution that corresponds to this level of misspecification. Lavaan reports the result of this test as one of the fit statistics.
For those who are not familiar with noncentral distributions, they are the general family of distributions to which the distributions we are familiar with belong. For example, the \(t\)-test assumes a nil (zero) null effect so we use the noncentral \(t\)-distribution, with an expected value (and noncentrality parameter) of zero. This distribution is what we call the \(t\)-distribution. If we want to create confidence intervals without assuming a nil effect, we can actually use a \(t\)-distribution while specifying its noncentrality parameter \(\lambda\). With a nonzero \(\lambda\), it is the distribution of the statistic when the null of zero is false. The Wikipedia introduction to this topic for the \(t\)-distribution is decent.
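To make this concrete, here is a small base R sketch. The numbers are arbitrary illustration values, and the variable names are mine. It shows that a noncentral distribution with a noncentrality parameter of zero gives the same tail area as the familiar central one, and that raising the noncentrality parameter makes the same test statistic look less extreme.

```r
q  <- 42.291  # an arbitrary test statistic value, used only for illustration
df <- 21

# Central chi-square vs noncentral chi-square with ncp = 0: same upper-tail area
p_central <- pchisq(q, df, lower.tail = FALSE)
p_ncp0    <- pchisq(q, df, ncp = 0, lower.tail = FALSE)
all.equal(p_central, p_ncp0)

# The same holds for the t-distribution
t_central <- pt(1.5, df = 20)
t_ncp0    <- pt(1.5, df = 20, ncp = 0)
all.equal(t_central, t_ncp0)

# A larger ncp shifts the chi-square distribution to the right,
# so the upper-tail area beyond the same q grows
p_by_ncp <- sapply(c(0, 10, 20, 40), function(lambda) {
  pchisq(q, df, ncp = lambda, lower.tail = FALSE)
})
p_by_ncp
```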
So how does this help us? The noncentrality parameter \(\lambda\) for the RMSEA test in lavaan is actually the \(\chi^2 - df\) value that corresponds to an RMSEA of .05. In math:
$$RMSEA = \sqrt{\frac{\chi^2 - df}{df(N - 1)}}$$
$$RMSEA^2 = \frac{\chi^2 - df}{df(N - 1)}$$
$$RMSEA^2 \times df(N - 1) = \chi^2 - df$$
Since \(\chi^2 - df\) is \(\lambda\), then:
$$\lambda = RMSEA^2 \times df(N - 1)$$
So for a test of close fit, \(\lambda\) is:
$$RMSEA^2 \times df(N - 1) = .05^2 \times df(N - 1) = .0025 \times df(N - 1)$$
And for a test of mediocre fit, \(\lambda\) is:
$$RMSEA^2 \times df(N - 1) = .08^2 \times df(N - 1) = .0064 \times df(N - 1)$$
Note that lavaan may do things a little differently.^{4}
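The conversion between RMSEA and \(\lambda\) can be sketched in R as follows. Both function names are hypothetical (they are not lavaan functions), and the sketch assumes the \(N - 1\) version of the formula. The degrees of freedom and sample size are the ones used in the demonstration below.

```r
# Hypothetical helpers (names are mine, not lavaan's): convert between RMSEA
# and the noncentrality parameter, assuming the N - 1 formula derived above.
rmsea_to_ncp <- function(rmsea, df, n) rmsea^2 * df * (n - 1)
ncp_to_rmsea <- function(ncp, df, n) sqrt(ncp / (df * (n - 1)))

df <- 21
n  <- 301

ncp_close    <- rmsea_to_ncp(0.05, df, n)  # close fit
ncp_mediocre <- rmsea_to_ncp(0.08, df, n)  # mediocre fit
c(close = ncp_close, mediocre = ncp_mediocre)

# Round trip: converting lambda back should recover the RMSEA we started from
ncp_to_rmsea(ncp_close, df, n)
```

For these inputs the two noncentrality parameters are 15.75 and 40.32, the same values computed by hand in the demonstration.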
Hence, given a model degrees of freedom, and sample size, we can calculate the noncentrality parameter \(\lambda\). And given \(\lambda\), a \(\chi^2\) value and the degrees of freedom for the model, we can calculate the p-value for a test of close or mediocre fit.
The R syntax for this is:
pchisq(q = chisq_value, df = degrees_of_freedom, ncp = noncentrality_parameter, lower.tail = FALSE)
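If you would rather not compute \(\lambda\) by hand each time, the two steps can be wrapped in one function. This is only a sketch: `chisq_fit_test` and its argument names are hypothetical, it is not how lavaan implements its test, and it assumes the \(N - 1\) formula from above. Setting `rmsea0 = 0` gives the standard test of perfect fit.

```r
# Hypothetical wrapper (not a lavaan function) around pchisq():
# chi-square test against a null hypothesis of RMSEA = rmsea0.
chisq_fit_test <- function(chisq, df, n, rmsea0 = 0.05) {
  # Noncentrality parameter implied by the null RMSEA (N - 1 version)
  ncp <- rmsea0^2 * df * (n - 1)
  # Upper-tail area: probability of a chi-square at least this large under the null
  p <- pchisq(q = chisq, df = df, ncp = ncp, lower.tail = FALSE)
  c(ncp = ncp, p.value = p)
}

# Values from the demonstration below: chi-square = 42.291, df = 21, N = 301
fit_perfect  <- chisq_fit_test(42.291, 21, 301, rmsea0 = 0)
fit_close    <- chisq_fit_test(42.291, 21, 301, rmsea0 = 0.05)
fit_mediocre <- chisq_fit_test(42.291, 21, 301, rmsea0 = 0.08)
rbind(fit_perfect, fit_close, fit_mediocre)
```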
Demonstration
library(lavaan)
data("HolzingerSwineford1939")
# model syntax for a bifactor model with the HolzingerSwineford1939 dataset
# eliminating visual factor resolves Heywood case
writeLines(syntax <- paste(
paste("g =~", paste0("x", 1:9, collapse = " + ")),
# paste("visual =~", paste0("x", 1:3, collapse = " + ")),
paste("textual =~", paste0("x", 4:6, collapse = " + ")),
paste("speed =~", paste0("x", 7:9, collapse = " + ")),
sep = "\n"
))
g =~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9
Run model & report fit measures. Reporting only fit statistics relevant to this demonstration:
summary(hs.fit <- cfa(syntax, HolzingerSwineford1939, std.lv = TRUE,
orthogonal = TRUE), fit.measures = TRUE)
lavaan (0.5-23.1097) converged normally after 25 iterations
Number of observations 301
Estimator ML
Minimum Function Test Statistic 42.291
Degrees of freedom 21
P-value (Chi-square) 0.004
Root Mean Square Error of Approximation:
RMSEA 0.058
90 Percent Confidence Interval 0.032 0.083
P-value RMSEA <= 0.05 0.276
The chi-square is statistically significant: this test of perfect fit suggests the misfit between our model-implied covariance matrix and the sample covariance matrix is greater than expected due to sampling variability.
The default chi-square test:
pchisq(q = 42.291, df = 21, ncp = 0, lower.tail = FALSE)
[1] 0.003867178
Use the formula above to calculate the noncentrality parameter for a test of close fit: .0025 multiplied by model degrees of freedom by (sample size - 1)
(ncp.close <- .0025 * 21 * (301 - 1))
[1] 15.75
Calculate the chi-square test of close fit:
pchisq(q = 42.291, df = 21, ncp = ncp.close, lower.tail = FALSE)
[1] 0.2740353
The p-value for a test of close fit is .27, close to the value reported by lavaan. The reason they are not closer is that lavaan does not subtract 1 from the sample size when calculating the noncentrality parameter under its default settings for ML. See below for details.^{4}
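To see how much the choice between \(N\) and \(N - 1\) matters here, the two versions can be put side by side in base R. This sketch assumes the sample-size multiplier is the only difference between the two calculations; it does not call lavaan, and the variable names are mine.

```r
chisq <- 42.291
df    <- 21
n     <- 301

# Noncentrality parameter for close fit (RMSEA = .05) under each convention
ncp_n_minus_1 <- 0.05^2 * df * (n - 1)  # the formula used in this post
ncp_n         <- 0.05^2 * df * n        # no subtraction of 1

p_n_minus_1 <- pchisq(chisq, df, ncp = ncp_n_minus_1, lower.tail = FALSE)
p_n         <- pchisq(chisq, df, ncp = ncp_n, lower.tail = FALSE)

# Compare the two p-values to three decimals
round(c(N_minus_1 = p_n_minus_1, N = p_n), 3)
```

Using \(N\) gives a slightly larger noncentrality parameter, and therefore a slightly larger p-value, than using \(N - 1\).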
And if we lower our standards to conduct a chi-square test of mediocre fit: .0064 multiplied by model degrees of freedom by (sample size - 1)
(ncp.med <- .0064 * 21 * (301 - 1))
[1] 40.32
pchisq(q = 42.291, df = 21, ncp = ncp.med, lower.tail = FALSE)
[1] 0.9199686
If we assume mediocre misspecification in our model (a population RMSEA of .08), the probability of observing a \(\chi^2\) value at least as large as ours is 92%, so we cannot reject the hypothesis of mediocre fit. Pretty good.
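The same noncentral \(\chi^2\) distribution can also be used to sketch a confidence interval for the RMSEA, like the one in the lavaan output above. What follows is my assumption of the usual approach (find the noncentrality parameters at which the observed \(\chi^2\) sits at the 95th and 5th percentiles, then convert them to the RMSEA scale). It is not lavaan's own code, `rmsea_ci` is a hypothetical name, and the upper search limit passed to `uniroot()` is an arbitrary choice that is wide enough for this example.

```r
# Hypothetical helper (not lavaan's code): RMSEA confidence interval by
# inverting the noncentral chi-square distribution, using N - 1 as above.
rmsea_ci <- function(chisq, df, n, level = 0.90) {
  alpha <- 1 - level
  # Find the ncp at which the observed chi-square has the target lower-tail probability
  solve_ncp <- function(target) {
    f <- function(ncp) pchisq(chisq, df, ncp = ncp) - target
    if (f(0) <= 0) return(0)  # already below the target at ncp = 0: bound is zero
    # Upper search limit is an arbitrary, generous choice for this sketch
    uniroot(f, lower = 0, upper = 3 * chisq + 50, tol = 1e-10)$root
  }
  ncp_lower <- solve_ncp(1 - alpha / 2)  # observed chi-square at the 95th percentile
  ncp_upper <- solve_ncp(alpha / 2)      # observed chi-square at the 5th percentile
  sqrt(c(lower = ncp_lower, upper = ncp_upper) / (df * (n - 1)))
}

ci_hs <- rmsea_ci(chisq = 42.291, df = 21, n = 301)
round(ci_hs, 3)
```

The result can be compared against the 90 percent interval in the lavaan output, keeping in mind the \(N\) versus \(N - 1\) difference discussed in the footnote.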
In closing, SEM practitioners typically report the \(\chi^2\)-test, but routinely expect the test to detect model misspecification, so often ignore it in practice. I hope the steps above show how one can conduct \(\chi^2\)-tests that assume some degree of model misspecification as the null hypothesis. I guess I hope that by doing this, we can make our \(\chi^2\)-tests somewhat relevant. The nice thing about the RMSEA and CI lavaan provides is that together, they may be more informative than a p-value from a \(\chi^2\) test.
P.S.: Another approach to latent variable modeling is PLS path modeling. It is a method for SEMs based on OLS regression. It stems from the work of Hermann Wold. Wold was Joreskog’s (LISREL) advisor, Joreskog was Muthen’s (Mplus) advisor. This is why my title uses covariance-based SEM instead of latent variable models or just SEMs.

^{1} MacCallum, R. C., Browne, M. W., & Sugawara, H. M. (1996). Power analysis and determination of sample size for covariance structure modeling. Psychological Methods, 1(2), 130–149. https://doi.org/10.1037/1082-989X.1.2.130

^{2} I always thought mediocre meant a bad thing; it only means unexceptional, ordinary.

^{3} I got this unoriginal idea from discussing with one of my colleagues, Menglin Xu. We were chatting around 11 pm in the office and she mentioned the noncentral \(\chi^2\) distribution in SEMs. Given my interest in noncentral distributions in relation to confidence intervals for effect sizes, this idea came to mind.

^{4} I found out by digging around this page and calculating in R. I continued exploring and noticed that lavaan uses \(N\) only with ML estimation; if you try WLSMV estimation, lavaan uses \(N - 1\). I got very confused on noticing this and emailed one of my factor analysis professors, Paul De Boeck. He replied in an email mentioning Wishart, bias correction and the lavaan manual. From the lavaan manual, lavaan’s default for ML estimation is something it refers to as the normal likelihood approach. When it does this, it uses \(N\). If you change it to the Wishart likelihood approach by specifying likelihood = "wishart" within the sem(), cfa() or lavaan() functions, it then uses \(N - 1\). This is only relevant for ML estimation. For other estimation methods, it’s \(N - 1\). I spent a few hours learning about the problem then trying to figure out what was going on, and I got an email reply within minutes of emailing my professor :). From the lavaan website on Wishart versus Normal.