6.4 Exercises

1️⃣ Data Exploration

Goal: Inspect the dataset structure.

Instructions:

  1. Download the dataset here and load it.

  2. Explore it using the functions you know.

  3. Remove columns that are not items.

  4. Show summary statistics (.std() and .mean()) and the correlations between items (.corr()).

  5. Visualise the correlations between items (you can use sns.heatmap()).

  6. Plot the score distribution of one item (you can use sns.histplot(df['column'])).
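As a minimal sketch of steps 3 and 4, assuming a small made-up dataset (the column names id, group, and item1 to item3 are hypothetical, not those of the course dataset), the non-item columns can be dropped and the summaries computed roughly as follows. The plotting calls for steps 5 and 6 are left as comments so the sketch runs without a display.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 'id' and 'group' are non-item columns,
# item1..item3 stand in for the questionnaire items.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "id": range(1, 51),
    "group": ["a", "b"] * 25,
    "item1": rng.integers(1, 6, size=50),
    "item2": rng.integers(1, 6, size=50),
    "item3": rng.integers(1, 6, size=50),
})

# Step 3: keep only the item columns.
items = toy.drop(columns=["id", "group"])

# Step 4: summary statistics and the item correlation matrix.
item_means = items.mean()
item_sds = items.std()   # pandas uses .std(), not .sd()
item_corr = items.corr()

print(item_means)
print(item_sds)
print(item_corr)

# Steps 5 and 6 (uncomment if seaborn is installed):
# import seaborn as sns
# sns.heatmap(item_corr, annot=True)
# sns.histplot(items["item1"])
```

With the real dataset, replace toy with your loaded DataFrame and list the non-item columns you found while exploring it.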

Before starting, make sure you have lavaan installed:

  1. Open Anaconda Prompt (Windows) or Terminal (Mac).

  2. Copy and paste the appropriate command:

  • Windows:

    R -e "install.packages('lavaan', repos='https://cran.uni-muenster.de')"
    
  • Mac:

    R -e "install.packages('lavaan', repos='https://cran.r-project.org')"
    
# Use this cell to import any package you need
# Your code here

2️⃣ Model Comparison

Goal: Fit two different models and compare their performance.

Instructions:

  1. Perform a dimensionality assessment using one of the methods introduced in the Dichotomous session (e.g. princals()).

  2. Fit a new model (you can call it better_model) after excluding one item from your dataset.

  3. Fit a tau-congeneric model to the dataset containing all of the items and compare it with better_model.

  4. Discuss which model performs better and why.

# Import packages to use dimensionality assessment tools
from rpy2.robjects.packages import importr
importr("MPsychoR")
importr("psych")
importr("mirt")
# Your code here

3️⃣ Sorting item parameters

Goal: Practise basic control‑flow constructs.

Exercises:

  1. Create two new variables, load_sorted and inter_sorted.

  2. Use a for loop, if-else, and correct indexing to loop through df and sort the loading and intercept columns.

  3. Do this without using any .sort() function.
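One possible approach, shown here on a plain Python list rather than on df, is a selection sort: for each position, find the smallest remaining value and swap it into place. This is only a sketch under stated assumptions; the function name selection_sort and the example values are illustrative and not part of the exercise material.

```python
def selection_sort(values):
    """Return a new list with the values in ascending order, without .sort()."""
    result = list(values)  # copy, so the input is left unchanged
    n = len(result)
    for i in range(n):
        min_idx = i  # position of the smallest value seen so far
        for j in range(i + 1, n):
            if result[j] < result[min_idx]:
                min_idx = j
            else:
                continue  # current minimum still holds
        # Swap the smallest remaining value into position i.
        result[i], result[min_idx] = result[min_idx], result[i]
    return result


# Hypothetical loadings, not taken from the fitted model.
example_loadings = [0.71, 0.35, 0.88, 0.52]
sorted_loadings = selection_sort(example_loadings)
print(sorted_loadings)  # [0.35, 0.52, 0.71, 0.88]
```

To adapt this to the exercise, the same loop can be run over the values of a DataFrame column (for example via df['loading'].tolist()).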

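from rpy2 import robjects as ro
from rpy2.robjects import pandas2ri
import pandas as pd

# fitmtc is the fitted lavaan model; it must already exist in the R session
pe = ro.r('parameterEstimates(fitmtc)')
# Convert the R data frame to a pandas DataFrame
pe = pandas2ri.rpy2py(pe)
# The row ranges below are taken to hold the loadings (0-5) and the intercepts (13-18)
df = pd.DataFrame({
    'item': pe.loc[0:5, 'rhs'].values,
    'loading': pe.loc[0:5, 'est'].values,
    'intercept': pe.loc[13:18, 'est'].values
})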

df