6.2 Tau-Equivalent Models#

Essentially Tau-Equivalent Model#

The Essentially tau-equivalent measurement model is also quite flexible, but it has one more restriction than the Tau Congeneric measurement model. It assumes that

  • items differ in their difficulty

  • items are equivalent in their discrimination power

  • items vary in their reliability

We therefore obtain estimates for the intercepts (Intercepts section) and for the errors (Variances section). A Latent Variables section is still present, but all loadings are fixed to 1.
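Formally, the essentially tau-equivalent model states \(Y_i = \alpha_i + \eta + \varepsilon_i\): each item has its own intercept \(\alpha_i\) and its own error variance, but a unit loading on \(\eta\). One testable consequence is that every inter-item covariance equals \(\mathrm{Var}(\eta)\). A minimal simulation sketch (synthetic data; the parameter values are hypothetical, loosely inspired by the estimates further below) illustrates this:

```python
import numpy as np

rng = np.random.default_rng(123)
n = 100_000  # large sample, so sample moments sit close to their population values

# Hypothetical parameter values (for illustration only)
var_eta = 0.064                                        # latent variance
alphas = [1.50, 1.42, 1.39, 1.30, 1.35, 1.31]          # item intercepts (difficulties)
err_vars = [0.074, 0.071, 0.081, 0.085, 0.085, 0.088]  # item error variances

eta = rng.normal(0.0, np.sqrt(var_eta), n)

# Each item: its own intercept, a unit loading on eta, and its own error variance
items = np.column_stack([
    a + eta + rng.normal(0.0, np.sqrt(v), n)
    for a, v in zip(alphas, err_vars)
])

cov = np.cov(items, rowvar=False)
off_diag = cov[~np.eye(6, dtype=bool)]
print(off_diag.min().round(3), off_diag.max().round(3))  # both approx 0.064
```

All 30 off-diagonal covariances cluster tightly around the latent variance, while the item means still differ — exactly the pattern the model's assumptions imply.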

Fit the model#

Usage#

This notebook fits essentially tau-equivalent and tau-equivalent models to the Data_EmotionalClarity.dat items. After loading the prepared item matrix, the models are specified in lavaan and compared via fit indices. First we load our packages and our data.

# General imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Rpy2 imports
from rpy2 import robjects as ro
from rpy2.robjects import pandas2ri, numpy2ri
from rpy2.robjects.packages import importr

# Automatic conversion of arrays and dataframes
pandas2ri.activate()
numpy2ri.activate()

# Set random seed for reproducibility
ro.r('set.seed(123)')

# IPython extension for magic plotting
%load_ext rpy2.ipython

# R imports
importr('base')
importr('lavaan')
importr('psych')
importr('stats')

# Load data
file_name = "data/Data_EmotionalClarity.dat"
dat = pd.read_csv(file_name, sep="\t")
dat2 = dat.iloc[:, 1:7]
ro.globalenv['dat2'] = dat2
dat2.describe()
item_1 item_2 item_3 item_4 item_5 item_6
count 238.000000 238.000000 238.000000 238.000000 238.000000 238.000000
mean 1.504005 1.422903 1.392156 1.304696 1.346359 1.305712
std 0.359060 0.368799 0.392299 0.407597 0.376931 0.383313
min 0.201307 0.046884 0.047837 0.038259 0.162969 0.061095
25% 1.281558 1.185936 1.155308 1.052121 1.129379 1.075682
50% 1.525186 1.422746 1.369148 1.289783 1.347554 1.286473
75% 1.737302 1.651186 1.643980 1.551804 1.567729 1.557722
max 2.437029 2.403697 2.455821 2.441999 2.408026 2.244108
# Specify the model: all loadings fixed to 1 (lavaan fixes the first loading to 1 by default; the rest via 1*)
ro.r("mete = 'eta=~ item_1 + 1*item_2 + 1*item_3 + 1*item_4 + 1*item_5 + 1*item_6'")
# Fit the model
ro.r('fitmete = sem(mete, data=dat2, meanstructure=TRUE)')
# Print the output of the model for interpretation
summary_fitmete = ro.r("summary(fitmete, fit.measures=TRUE, standardized=TRUE)")
print(summary_fitmete)
lavaan 0.6-19 ended normally after 12 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        13

  Number of observations                           238

Model Test User Model:
                                                      
  Test statistic                                16.949
  Degrees of freedom                                14
  P-value (Chi-square)                           0.259

Model Test Baseline Model:

  Test statistic                               435.847
  Degrees of freedom                                15
  P-value                                        0.000

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    0.993
  Tucker-Lewis Index (TLI)                       0.992

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)               -435.870
  Loglikelihood unrestricted model (H1)       -427.396
                                                      
  Akaike (AIC)                                 897.740
  Bayesian (BIC)                               942.880
  Sample-size adjusted Bayesian (SABIC)        901.674

Root Mean Square Error of Approximation:

  RMSEA                                          0.030
  90 Percent confidence interval - lower         0.000
  90 Percent confidence interval - upper         0.073
  P-value H_0: RMSEA <= 0.050                    0.737
  P-value H_0: RMSEA >= 0.080                    0.023

Standardized Root Mean Square Residual:

  SRMR                                           0.053

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  eta =~                                                                
    item_1            1.000                               0.253    0.682
    item_2            1.000                               0.253    0.689
    item_3            1.000                               0.253    0.664
    item_4            1.000                               0.253    0.656
    item_5            1.000                               0.253    0.657
    item_6            1.000                               0.253    0.650

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .item_1            1.504    0.024   62.432    0.000    1.504    4.047
   .item_2            1.423    0.024   59.697    0.000    1.423    3.870
   .item_3            1.392    0.025   56.265    0.000    1.392    3.647
   .item_4            1.305    0.025   52.077    0.000    1.305    3.376
   .item_5            1.346    0.025   53.815    0.000    1.346    3.488
   .item_6            1.306    0.025   51.677    0.000    1.306    3.350

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .item_1            0.074    0.008    9.254    0.000    0.074    0.535
   .item_2            0.071    0.008    9.186    0.000    0.071    0.525
   .item_3            0.081    0.009    9.410    0.000    0.081    0.559
   .item_4            0.085    0.009    9.475    0.000    0.085    0.570
   .item_5            0.085    0.009    9.468    0.000    0.085    0.569
   .item_6            0.088    0.009    9.518    0.000    0.088    0.577
    eta               0.064    0.007    9.004    0.000    1.000    1.000

You can see that the output looks very similar to that of the Tau Congeneric measurement model. Interpretation of the intercepts (Intercepts section) and the errors (Variances section) remains the same. The difference is that the loadings (Latent Variables section) are fixed to 1, implying equal discriminatory power for all items. Graphically, this results in parallel slopes across items. The fit indices are interpreted as before for the Tau Congeneric model.
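As a side note, the information criteria in the output can be reproduced by hand from the reported log-likelihood via \(\mathrm{AIC} = 2k - 2\ln L\) and \(\mathrm{BIC} = k\ln(n) - 2\ln L\), with \(k = 13\) model parameters and \(n = 238\) observations:

```python
import math

log_lik = -435.870  # loglikelihood of the user model (H0), from the summary
k = 13              # number of model parameters
n = 238             # number of observations

aic = 2 * k - 2 * log_lik
bic = k * math.log(n) - 2 * log_lik
print(round(aic, 2), round(bic, 2))  # 897.74 942.88, matching the output above
```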

Compare model fit#

Next, let’s compare the models we just fitted.

# Recreate the tau-congeneric model
ro.r("mtc = 'eta =~ item_1 + item_2 + item_3 + item_4 + item_5 + item_6'")
ro.r('fitmtc = sem(mtc, data=dat2, meanstructure=TRUE)')

# Essentially tau-equivalent model (notice the 1* in the model)
ro.r("mete = 'eta=~ item_1 + 1*item_2 + 1*item_3 + 1*item_4 + 1*item_5 + 1*item_6'")
ro.r('fitmete = sem(mete, data=dat2, meanstructure=TRUE)')

# Perform anova and print indexes
anova_mete_mtc = ro.r("anova(fitmete, fitmtc)")
print(anova_mete_mtc)
Chi-Squared Difference Test

        Df    AIC    BIC   Chisq Chisq diff    RMSEA Df diff Pr(>Chisq)
fitmtc   9 900.36 962.86  9.5683                                       
fitmete 14 897.74 942.88 16.9488     7.3805 0.044726       5     0.1938

According to the AIC and BIC, the more restricted Essentially tau-equivalent model fits better than the Tau Congeneric measurement model (lower AIC and BIC values indicate better model fit). The \(\chi^2\) test, however, suggests that there is no significant difference in model fit, as indicated by p > .05. This result is not too surprising, as we already saw quite similar loading estimates across items in the Tau Congeneric measurement model (see Tau-congeneric notebook). Restricting the loadings to equivalence therefore does not deviate much from the Tau Congeneric measurement model (which leaves the loadings free), resulting in a non-significant difference in model fit.
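The \(\chi^2\) difference test reported by anova() can also be computed by hand: the difference between the two test statistics is itself \(\chi^2\)-distributed, with degrees of freedom equal to the difference in model df. A quick check with scipy, plugging in the values from the anova output above:

```python
from scipy.stats import chi2

# Test statistics and degrees of freedom taken from the two model summaries
chisq_diff = 16.9488 - 9.5683  # more restricted minus less restricted model
df_diff = 14 - 9               # five loadings constrained to equality

p_value = chi2.sf(chisq_diff, df_diff)
print(round(p_value, 3))  # approx 0.194, matching Pr(>Chisq) in the anova output
```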

Tau-Equivalent Model#

The Tau-equivalent measurement model has one more restriction compared to the Essentially tau-equivalent model. It assumes that

  • items are equivalent in their difficulty

  • items are equivalent in their discrimination power

  • items vary in their reliability

We therefore obtain only the error estimates (Variances section). A Latent Variables and an Intercepts section still appear, but all loadings and intercepts are fixed.
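The degrees of freedom of this model can be anticipated by simple counting: with \(p = 6\) items the data provide \(p(p+1)/2 = 21\) variances and covariances plus 6 means (27 observed moments), while the model has 13 parameters subject to 5 equality constraints (8 free parameters):

```python
p = 6                           # number of items
moments = p * (p + 1) // 2 + p  # 21 (co)variances + 6 means = 27 observed moments
free_params = 13 - 5            # 13 parameters minus 5 equality constraints
df = moments - free_params
print(df)  # 19, as reported in the Model Test User Model section
```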

Fit the model#

Since we need to pass multiple lines of R code at once, we wrap the R syntax in Python's triple-quoted strings: everything within """ r_code_here """ will be passed to R as code.

# Specify the model
ro.r("""
      mte = 'eta =~ item_1 + 1*item_2 + 1*item_3 + 1*item_4 + 1*item_5 + 1*item_6
      item_1 ~ a*1
      item_2 ~ a*1
      item_3 ~ a*1
      item_4 ~ a*1
      item_5 ~ a*1
      item_6 ~ a*1'
      
      """)
# Fit the model
ro.r('fitmte <- sem(mte, data=dat2, meanstructure=TRUE, estimator="ML")')
# Print the output of the model for interpretation
summary_fitmte = ro.r("summary(fitmte, fit.measures=TRUE, standardized=TRUE)")
print(summary_fitmte)
lavaan 0.6-19 ended normally after 13 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        13
  Number of equality constraints                     5

  Number of observations                           238

Model Test User Model:
                                                      
  Test statistic                               100.116
  Degrees of freedom                                19
  P-value (Chi-square)                           0.000

Model Test Baseline Model:

  Test statistic                               435.847
  Degrees of freedom                                15
  P-value                                        0.000

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    0.807
  Tucker-Lewis Index (TLI)                       0.848

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)               -477.454
  Loglikelihood unrestricted model (H1)       -427.396
                                                      
  Akaike (AIC)                                 970.908
  Bayesian (BIC)                               998.686
  Sample-size adjusted Bayesian (SABIC)        973.329

Root Mean Square Error of Approximation:

  RMSEA                                          0.134
  90 Percent confidence interval - lower         0.109
  90 Percent confidence interval - upper         0.160
  P-value H_0: RMSEA <= 0.050                    0.000
  P-value H_0: RMSEA >= 0.080                    1.000

Standardized Root Mean Square Residual:

  SRMR                                           0.111

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  eta =~                                                                
    item_1            1.000                               0.252    0.637
    item_2            1.000                               0.252    0.683
    item_3            1.000                               0.252    0.663
    item_4            1.000                               0.252    0.639
    item_5            1.000                               0.252    0.655
    item_6            1.000                               0.252    0.634

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .item_1     (a)    1.381    0.018   76.284    0.000    1.381    3.485
   .item_2     (a)    1.381    0.018   76.284    0.000    1.381    3.736
   .item_3     (a)    1.381    0.018   76.284    0.000    1.381    3.630
   .item_4     (a)    1.381    0.018   76.284    0.000    1.381    3.496
   .item_5     (a)    1.381    0.018   76.284    0.000    1.381    3.585
   .item_6     (a)    1.381    0.018   76.284    0.000    1.381    3.471

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .item_1            0.093    0.010    9.529    0.000    0.093    0.594
   .item_2            0.073    0.008    9.138    0.000    0.073    0.534
   .item_3            0.081    0.009    9.318    0.000    0.081    0.560
   .item_4            0.092    0.010    9.513    0.000    0.092    0.592
   .item_5            0.085    0.009    9.387    0.000    0.085    0.571
   .item_6            0.095    0.010    9.548    0.000    0.095    0.598
    eta               0.064    0.007    8.880    0.000    1.000    1.000

Again, the output looks very similar to the previous ones, and the interpretation is the same as before. The only difference is that the loadings (Latent Variables section) and the intercepts (Intercepts section) are fixed, meaning that we assume all items have the same discriminatory power and the same difficulty. Graphically speaking, this means that the slopes and the intercepts of the items are equivalent. The interpretation of the fit indices is analogous to the Tau Congeneric measurement model (see above).
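A small aside on the Std.all column: even with all unstandardized loadings fixed to 1, the standardized loadings differ across items, because standardization divides by each item's total standard deviation, \(\sqrt{\mathrm{Var}(\eta) + \theta_i}\). Recomputing them from the (rounded) estimates above:

```python
import math

var_eta = 0.064  # estimated latent variance (rounded, from the output above)
err_vars = {"item_1": 0.093, "item_2": 0.073, "item_3": 0.081,
            "item_4": 0.092, "item_5": 0.085, "item_6": 0.095}

# Standardized loading = sqrt(Var(eta)) / sqrt(Var(eta) + error variance)
for item, theta in err_vars.items():
    std_loading = math.sqrt(var_eta) / math.sqrt(var_eta + theta)
    print(item, round(std_loading, 3))
```

The results closely match the printed Std.all values; the small discrepancies stem from rounding of the estimates.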

Compare model fit#

As before, we can use the anova() function to compare the model fits.

# Perform anova and print indexes
anova_mete_mte = ro.r("anova(fitmete, fitmte)")
print(anova_mete_mte)
Chi-Squared Difference Test

        Df    AIC    BIC   Chisq Chisq diff   RMSEA Df diff Pr(>Chisq)    
fitmete 14 897.74 942.88  16.949                                          
fitmte  19 970.91 998.69 100.116     83.168 0.25629       5  < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

In this comparison, the more restricted Tau-equivalent model fits significantly worse than the Essentially tau-equivalent model, as indicated by the significant \(\chi^2\) difference test. AIC and BIC also favor the more flexible model.