Analyzing rating scale data with cumulative link models using the ordinal package in R

Hispanic Linguistics Symposium 2022

Scott James Perry & Guilherme D. Garcia

University of Alberta, Université Laval

11/3/22

Download workshop script + data



  1. Go to http://scottjamesperry.com/resources/
  2. Download HLS 2022 ordinal workshop materials
  3. Unzip folder to preferred location
  4. Click on script ordinal_workshop_script_HLS.R to open RStudio

Other decisions are made first



Type of judgement:

  • Binary
  • Slider
  • Ordered categories


What we hope you get out of this


If you aren’t familiar with R and the regression framework:

  • Be able to explain why we shouldn’t analyze ordinal data with linear regression/ANOVA
  • A conceptual understanding of how ordinal regression works
  • Direction towards materials that we have found useful




If you are familiar with R and the regression framework:

  • Code to fit and visualize ordinal models
  • Code to fit Bayesian ordinal models

Timeline


  1. Why commonly used methods aren’t appropriate

  2. Conceptual introduction to ordinal models

  3. Fit ordinal model with clm()

  4. Interpretation/visualization

  5. Mixed-effects ordinal models & addressing common problems

  6. Advantages of Bayesian ordinal models (time permitting)

Questions?

  • Submit your question(s) by visiting:
    • https://forms.office.com/r/VfFvH0mbF5
  • Or scan the QR code below:

What is the difference between numeric and ordinal data?

Numerical data:

  • Can be explicitly measured
  • Differences are constant (1-2 same difference as 50-51)
  • Allows us to perform arithmetic operations (e.g., addition, subtraction, averages)
  • E.g., Age, Height, Time


Ordinal data:

  • Values not measured
  • Differences may not be constant (‘B.A.’-‘M.A.’ \(\neq\) ‘M.A.’-‘PhD’)
  • We cannot perform arithmetic operations (e.g., addition, subtraction, averages)
  • E.g., Education level, rating scales

Some ordinal data intuitively not numbers



  • These categories have an order
  • ‘Distance’ between categories likely not equal


Rating scales less-intuitively not numbers



  • Numbers presented have clear order
  • ‘Psychological distance’ for participants cannot be claimed to be equal



Common example of distribution of rating scale responses


Example data from a two-group acceptability judgement task (AJT)

Mismatch in assumptions between rating scales and linear models

  • Linear models expect responses to be normally distributed around the mean, which is often not the most probable rating
  • Linear models assign probability to impossible values outside the rating scale

Why we can’t treat rating scales as numbers

  • Mental distance not always equal
  • Normal distribution poorly summarizes ordinal data
  • Bounded nature of ordinal scales a problem

Strategies that do not solve this problem:

  • z-scoring by participant
  • Averaging over items/participants
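To see why averaging over items or participants does not help, here is a minimal base-R sketch with invented ratings: with a bimodal distribution, the mean points at a value nobody chose.

```r
# Hypothetical bimodal ratings on a 1-7 scale:
# half the responses are 1, half are 7
ratings <- c(1, 1, 1, 7, 7, 7)

# The mean is 4, a value no participant actually selected
mean(ratings)

# The frequency table shows two modes and nothing in between
table(ratings)
```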

So what should we do?

  • Imagine acceptability as a continuous and gradient mental phenomenon

We can estimate where the amount of ‘acceptability’ changes from a 1 to a 2

Now the entire blue area is probability of a ‘1’ response

This corresponds to the probability we saw in the raw data



We model one less threshold than the number of rating categories

Our thresholds stay still, but our latent variable moves

Group A has higher than average latent variable, higher probability of higher categories

Group B has lower than average latent variable, probability of responding ‘1’ is now over 50%.
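The picture above can be reproduced numerically. A minimal sketch, assuming two hypothetical threshold values on the logit scale: shifting the latent mean redistributes probability mass across the three ordered categories.

```r
# Two hypothetical thresholds split the latent scale into three categories
thresholds <- c(-1, 1)

# Cumulative link: P(Y <= j) = plogis(threshold_j - latent_mean);
# differencing the cumulative probabilities gives per-category probabilities
category_probs <- function(latent_mean) {
  cumulative <- plogis(thresholds - latent_mean)
  diff(c(0, cumulative, 1))
}

category_probs(0)     # baseline: most mass in the middle category
category_probs(1.5)   # 'Group A': higher latent mean, higher categories favoured
category_probs(-1.5)  # 'Group B': lower latent mean, P('1') now above 50%
```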

Outline of hypothetical experiment

Interpretation of relative clauses in L2 English

(Near-)categorical interpretation in (1); ambiguity in (2)

  1. Mary saw [the cat]\(_{a}\) of [the nurse]\(_b\) who likes to dance
  2. Mary saw [the daughter]\(_{a}\) of [the nurse]\(_b\) who likes to dance
  • Different languages have been shown to have different biases:

    • English \(\rightarrow\) low attachment (b)

    • Spanish \(\rightarrow\) high attachment (a)

Can we modulate this bias by introducing a prosodic break?

  • Three conditions:

    1. Mary saw the daughter of the nurse who likes to dance (NoBreak)
    2. Mary saw the daughter # of the nurse who likes to dance (High)
    3. Mary saw the daughter of the nurse # who likes to dance (Low)
  • Two groups (L1): Spanish and English (controls)

  • Response variable 1: Who likes to dance? (primary)

  • Response variable 2: How certain are you? (secondary)

Let’s examine our data

  • Before modelling our data, let’s visualize the patterns of interest
  • English speakers appear more certain about low attachment

Loading our simulated dataset

# Loading in data and inspecting it
load("data/rc_hls.Rdata")

# Inspecting data
summary(rc_hls)
       Certainty3    Condition         L1     
 Not certain:321   NoBreak:300   English:300  
 Neutral    :289   High   :300   Spanish:600  
 Certain    :290   Low    :300                

Fitting models

  • We fit models using the clm() function
  • Same syntax as lm() and glm()
  • This code fits an additive model
# Fitting our ordinal model
library(ordinal)  # provides clm()

clm_fit_additive <- clm(Certainty3 ~
                          Condition +
                          L1,
                        data = rc_hls)

Walking through the interpretation

summary(clm_fit_additive)
formula: Certainty3 ~ Condition + L1
data:    rc_hls

 link  threshold nobs logLik  AIC     niter max.grad cond.H 
 logit flexible  900  -974.41 1958.83 3(0)  8.00e-07 2.5e+01

Coefficients:
              Estimate Std. Error z value Pr(>|z|)   
ConditionHigh  0.45710    0.15019   3.043  0.00234 **
ConditionLow  -0.30766    0.15174  -2.028  0.04260 * 
L1Spanish     -0.09735    0.13166  -0.739  0.45966   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Threshold coefficients:
                    Estimate Std. Error z value
Not certain|Neutral  -0.6153     0.1430  -4.303
Neutral|Certain       0.7506     0.1439   5.218

Fitting the model with the interaction

  • Interactions also use familiar syntax
  • This code fits the interaction
  • The model on the right is identical to Certainty3 ~ Condition * L1
# Fitting our ordinal model
clm_fit_interaction <- clm(Certainty3 ~
                             Condition +
                             L1 +
                             Condition:L1,
                           data = rc_hls)

Interpreting interaction

summary(clm_fit_interaction)
formula: Certainty3 ~ Condition + L1 + Condition:L1
data:    rc_hls

 link  threshold nobs logLik  AIC     niter max.grad cond.H 
 logit flexible  900  -923.19 1860.38 4(0)  6.65e-10 1.2e+02

Coefficients:
                        Estimate Std. Error z value Pr(>|z|)    
ConditionHigh            -0.6426     0.2644  -2.430  0.01508 *  
ConditionLow              0.7430     0.2697   2.755  0.00587 ** 
L1Spanish                -0.1696     0.2302  -0.737  0.46133    
ConditionHigh:L1Spanish   1.7096     0.3262   5.241 1.60e-07 ***
ConditionLow:L1Spanish   -1.6126     0.3316  -4.864 1.15e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Threshold coefficients:
                    Estimate Std. Error z value
Not certain|Neutral  -0.7263     0.1941  -3.742
Neutral|Certain       0.7754     0.1944   3.989

Visualizing ordinal models with sjPlot


Multiple options for visualizing ordinal models:

  • Get predicted values from predict() and manipulate them manually
  • Plot predicted values automatically with a package like sjPlot


library(sjPlot)    # plot_model()
library(ggthemes)  # theme_base()

p <- plot_model(clm_fit_interaction,
                type = "pred",
                terms = c("Condition", "L1"),
                colors = c("black", "grey50"))

p + 
  theme_base(base_size = 18) +
  ggtitle("") +
  theme(axis.text.x = element_text(angle = 310, 
                                   hjust = 0)) +
  ylab("Predicted probability of response")

Visualizing ordinal models with sjPlot

Getting predicted probabilities and plotting them
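The manual route via predict() works as follows: for a clm fit, passing newdata without the response column and type = "prob" returns one predicted probability per category. A self-contained sketch on simulated toy data (the objects from earlier slides are not reused here):

```r
library(ordinal)

# Toy data: ordinal ratings for two groups
set.seed(1)
toy <- data.frame(
  group  = rep(c("A", "B"), each = 100),
  rating = factor(sample(1:3, 200, replace = TRUE), ordered = TRUE))

fit <- clm(rating ~ group, data = toy)

# Omit the response column so predict() returns category probabilities
new_dat <- data.frame(group = c("A", "B"))
preds <- predict(fit, newdata = new_dat, type = "prob")
preds$fit  # one row per group, one column per category; rows sum to 1
```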

Fitting mixed models with package ordinal


  • Same random effects syntax as lme4
  • Harder to fit than many other GLMMs
  • Analyze matched-guise experiment (thanks Gabriella Licata!)


mg_fit_1 <- clmm(Feminine_voice ~
                   GuiseGender +
                   (1|Participant),
                data = mg_dat)

Problem with ordinal models: standard errors cannot be estimated (NaN)

Cumulative Link Mixed Model fitted with the Laplace approximation

formula: Feminine_voice ~ GuiseGender + (1 | Participant)
data:    mg_dat

 link  threshold nobs logLik  AIC    niter    max.grad cond.H
 logit flexible  260  -108.48 232.96 684(880) 1.66e-05 NaN   

Random effects:
 Groups      Name        Variance  Std.Dev. 
 Participant (Intercept) 1.039e-08 0.0001019
Number of groups:  Participant 65 

Coefficients:
                Estimate Std. Error z value Pr(>|z|)
GuiseGenderMale   -7.908        NaN     NaN      NaN

Threshold coefficients:
    Estimate Std. Error z value
1|2   -5.428        NaN     NaN
2|3   -5.315        NaN     NaN
3|4   -5.049        NaN     NaN
4|5   -4.041        NaN     NaN
5|6   -3.177        NaN     NaN
6|7   -2.128        NaN     NaN

Looking at the raw data, we can see why

Collapsing categories does not change interpretation

Here is the model with categories 2-6 collapsed

mg_fit_2 <- clmm(Feminine_voice_collapsed ~
                   GuiseGender +
                   (1|Participant),
                 data = mg_dat,
                 nAGQ = 10)
summary(mg_fit_2)
Cumulative Link Mixed Model fitted with the adaptive Gauss-Hermite 
quadrature approximation with 10 quadrature points 

formula: Feminine_voice_collapsed ~ GuiseGender + (1 | Participant)
data:    mg_dat

 link  threshold nobs logLik AIC    niter    max.grad cond.H 
 logit flexible  260  -86.59 181.19 265(301) 5.14e-08 4.7e+06

Random effects:
 Groups      Name        Variance  Std.Dev. 
 Participant (Intercept) 8.883e-09 9.425e-05
Number of groups:  Participant 65 

Coefficients:
                Estimate Std. Error z value Pr(>|z|)    
GuiseGenderMale  -7.1763     0.7992  -8.979   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Threshold coefficients:
    Estimate Std. Error z value
1|6  -4.7012     0.7326  -6.417
6|7  -2.1247     0.2827  -7.515
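The collapsed response can be built in base R. A sketch of the recoding (Feminine_voice_collapsed and mg_dat come from the workshop materials; the vector below is invented for illustration):

```r
# Invented 7-point responses standing in for mg_dat$Feminine_voice
voice <- factor(c(1, 2, 4, 6, 7, 7), levels = 1:7, ordered = TRUE)

# Collapse the sparse middle categories 2-6 into a single level,
# matching the 1|6 and 6|7 thresholds in the model above
collapse_middle <- function(x) {
  y <- as.integer(as.character(x))
  y[y >= 2 & y <= 6] <- 6
  factor(y, levels = c(1, 6, 7), ordered = TRUE)
}

collapse_middle(voice)  # ordered levels: 1 < 6 < 7
```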

Advantages of Bayesian ordinal models

  • Convergence issues less of a problem
  • Easy and powerful plotting functions
  • Increased model flexibility (relaxing model assumptions like equal variances)

Relaxing equal variance assumption

Group A has higher than average latent variable, higher probability of higher categories

Group B has lower than average latent variable, probability of responding ‘1’ is now over 50%.

Relaxing equal variance assumption

Group A has higher than average latent variable that also has a smaller variance than Group B

Group B has lower than average latent variable with a wider scale parameter.

Why would we want to model unequal variances?

All error types possible when equal variance assumption incorrect (Liddell & Kruschke, 2018):

  • Finding effect when there is none
  • Not finding true effect
  • Finding an effect in the wrong direction

Fitting unequal variance ordinal model with Stan via brms

  • brms uses lme4-style syntax to fit Bayesian models
  • An additional parameter, disc, is added to the model, so we can predict variation in the latent scale
library(brms)  # bf(), lf()

unequal_variance_formula <- bf(
  Friendly ~
    GuiseLanguage +
    GuiseGender +
    GuiseLanguage:GuiseGender +
    (1|Participant)) +
  lf(disc ~
       GuiseLanguage +
       GuiseGender +
       GuiseLanguage:GuiseGender +
       (1|Participant)
)

Interpreting model summary

 Family: cumulative 
  Links: mu = logit; disc = log 
Formula: Friendly ~ GuiseLanguage + GuiseGender + GuiseLanguage:GuiseGender + (1 | Participant) 
         disc ~ GuiseLanguage + GuiseGender + GuiseLanguage:GuiseGender + (1 | Participant)
   Data: g_ord (Number of observations: 260) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Group-Level Effects: 
~Participant (Number of levels: 65) 
                   Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)          1.63      0.47     0.83     2.64 1.01      735      853
sd(disc_Intercept)     0.93      0.17     0.63     1.32 1.00      968     1268

Population-Level Effects: 
                                          Estimate Est.Error l-95% CI u-95% CI
Intercept[1]                                 -5.72      1.41    -8.78    -3.29
Intercept[2]                                 -3.09      0.80    -4.82    -1.70
Intercept[3]                                 -2.12      0.59    -3.45    -1.11
Intercept[4]                                  0.04      0.31    -0.59     0.68
Intercept[5]                                  1.51      0.49     0.70     2.60
Intercept[6]                                  3.99      1.07     2.21     6.36
disc_Intercept                                0.09      0.31    -0.51     0.71
GuiseLanguageItalian                          0.42      0.30    -0.11     1.05
GuiseGenderMale                              -0.17      0.27    -0.72     0.33
GuiseLanguageItalian:GuiseGenderMale         -0.26      0.37    -1.02     0.47
disc_GuiseLanguageItalian                    -0.33      0.23    -0.80     0.12
disc_GuiseGenderMale                         -0.09      0.25    -0.57     0.39
disc_GuiseLanguageItalian:GuiseGenderMale     0.41      0.32    -0.22     1.04
                                          Rhat Bulk_ESS Tail_ESS
Intercept[1]                              1.01      942     1609
Intercept[2]                              1.00      950     1200
Intercept[3]                              1.00      961     1256
Intercept[4]                              1.00     1645     1936
Intercept[5]                              1.01      985     2218
Intercept[6]                              1.01      869     1425
disc_Intercept                            1.00      981     1480
GuiseLanguageItalian                      1.00     1982     2447
GuiseGenderMale                           1.00     2232     2348
GuiseLanguageItalian:GuiseGenderMale      1.00     2237     2684
disc_GuiseLanguageItalian                 1.00     1797     2266
disc_GuiseGenderMale                      1.00     1868     2387
disc_GuiseLanguageItalian:GuiseGenderMale 1.00     2030     2187

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
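Because disc uses a log link and brms parameterizes the latent standard deviation as 1/disc, a disc coefficient converts to a ratio of latent standard deviations by exponentiating and inverting; a quick check using the disc_GuiseLanguageItalian estimate from the summary above (interpretation per Bürkner & Vuorre, 2019):

```r
disc_coef <- -0.33  # disc_GuiseLanguageItalian, posterior mean from above

# log link: multiplicative change in the discrimination parameter
disc_ratio <- exp(disc_coef)

# latent SD is 1/disc, so the Italian guise's latent SD is ~1.39x the baseline
sd_ratio <- 1 / disc_ratio
round(sd_ratio, 2)
```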

Materials dealing with ordinal models

Garcia, G. D. (2021). Data visualization and analysis in second language research. Routledge.

McElreath, R. (2020). Statistical rethinking: A Bayesian course with examples in R and Stan. Chapman and Hall/CRC.

Christensen, R. H. B. (2019). A tutorial on fitting cumulative link mixed models with clmm2 from the ordinal package. R package vignette. https://cran.r-project.org/web/packages/ordinal/

Materials dealing with ordinal models

Liddell, T. M., & Kruschke, J. K. (2018). Analyzing ordinal data with metric models: What could possibly go wrong?. Journal of Experimental Social Psychology, 79, 328-348.

Bürkner, P. C., & Vuorre, M. (2019). Ordinal regression models in psychology: A tutorial. Advances in Methods and Practices in Psychological Science, 2(1), 77-101.

Veríssimo, J. (2021). Analysis of rating scales: A pervasive problem in bilingualism research and a solution with Bayesian ordinal models. Bilingualism: Language and Cognition, 24(5), 842-848.