Hispanic Linguistics Symposium 2022
University of Alberta, Université Laval
11/3/22
http://scottjamesperry.com/resources/
HLS 2022 ordinal workshop materials: open ordinal_workshop_script_HLS.R to launch RStudio
Type of judgement:
If you aren’t familiar with R and the regression framework:
If you are familiar with R and the regression framework:
Why commonly used methods aren’t appropriate
Conceptual introduction to ordinal models
Fit ordinal model with clm()
Interpretation/visualization
Mixed-effects ordinal models & addressing common problems
Advantages of Bayesian ordinal models (time permitting)
https://forms.office.com/r/VfFvH0mbF5
Numerical data:
Ordinal data:
Example data from two-group AJT task
Strategies that do not solve this problem:
Group A has a higher-than-average value on the latent variable, so the higher categories are more probable.
Group B has a lower-than-average value on the latent variable; the probability of responding ‘1’ is now over 50%.
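The two group descriptions above follow from how a cumulative logit model turns a latent variable into category probabilities. The sketch below is a minimal base-R illustration, assuming a five-point scale with made-up thresholds and group means; none of these values come from the workshop data, and `category_probs()` is a hypothetical helper.

```r
# Hypothetical thresholds cutting the latent scale into 5 categories
thresholds <- c(-1.5, -0.5, 0.5, 1.5)

# Category probabilities for a group whose latent mean is `mu`
# (cumulative logit: P(Y <= j) = plogis(threshold_j - mu))
category_probs <- function(mu, thresholds) {
  cum <- c(plogis(thresholds - mu), 1)
  probs <- diff(c(0, cum))
  names(probs) <- seq_along(probs)
  probs
}

probs_a <- category_probs(mu = 1, thresholds)   # hypothetical Group A (above average)
probs_b <- category_probs(mu = -2, thresholds)  # hypothetical Group B (below average)

round(rbind(A = probs_a, B = probs_b), 3)
```

Shifting the latent mean moves probability mass across all categories at once, which is why a low enough mean pushes the lowest category past 50%.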
Interpretation of relative clauses in English L2
(Near-)categorical interpretation in (1); ambiguity in (2)
Different languages have been shown to have different biases:
English \(\rightarrow\) low attachment (b)
Spanish \(\rightarrow\) high attachment (a)
Three conditions:
Two groups (L1): Spanish and English (controls)
Response variable 1: Who likes to dance? (primary)
formula: Certainty3 ~ Condition + L1
data: rc_hls
link threshold nobs logLik AIC niter max.grad cond.H
logit flexible 900 -974.41 1958.83 3(0) 8.00e-07 2.5e+01
Coefficients:
Estimate Std. Error z value Pr(>|z|)
ConditionHigh 0.45710 0.15019 3.043 0.00234 **
ConditionLow -0.30766 0.15174 -2.028 0.04260 *
L1Spanish -0.09735 0.13166 -0.739 0.45966
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Threshold coefficients:
Estimate Std. Error z value
Not certain|Neutral -0.6153 0.1430 -4.303
Neutral|Certain 0.7506 0.1439 5.218
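The estimates in this summary are on the logit scale, so it can help to convert them to odds ratios and predicted probabilities by hand. The sketch below copies the reported estimates and uses the `clm()` parameterization, logit P(Y <= j) = theta_j - eta. It assumes that English is the reference level of L1 and that the unnamed third condition is the reference level of Condition; `predict_probs()` is a hypothetical helper, not part of the ordinal package.

```r
# Estimates copied from the summary above
beta  <- c(ConditionHigh = 0.45710, ConditionLow = -0.30766, L1Spanish = -0.09735)
theta <- c("Not certain|Neutral" = -0.6153, "Neutral|Certain" = 0.7506)

# Odds ratios: multiplicative change in the odds of a higher response
odds_ratios <- exp(beta)

# Category probabilities for a given linear predictor eta
predict_probs <- function(eta, theta) {
  cum <- c(plogis(theta - eta), 1)
  probs <- diff(c(0, cum))
  names(probs) <- c("Not certain", "Neutral", "Certain")
  probs
}

# Reference cell (eta = 0) versus the High condition in the reference L1 group
p_ref  <- predict_probs(0, theta)
p_high <- predict_probs(beta[["ConditionHigh"]], theta)

round(odds_ratios, 2)
round(rbind(reference = p_ref, high = p_high), 3)
```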
formula: Certainty3 ~ Condition + L1 + Condition:L1
data: rc_hls
link threshold nobs logLik AIC niter max.grad cond.H
logit flexible 900 -923.19 1860.38 4(0) 6.65e-10 1.2e+02
Coefficients:
Estimate Std. Error z value Pr(>|z|)
ConditionHigh -0.6426 0.2644 -2.430 0.01508 *
ConditionLow 0.7430 0.2697 2.755 0.00587 **
L1Spanish -0.1696 0.2302 -0.737 0.46133
ConditionHigh:L1Spanish 1.7096 0.3262 5.241 1.60e-07 ***
ConditionLow:L1Spanish -1.6126 0.3316 -4.864 1.15e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Threshold coefficients:
Estimate Std. Error z value
Not certain|Neutral -0.7263 0.1941 -3.742
Neutral|Certain 0.7754 0.1944 3.989
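Whether the interaction improves the fit can be checked with a likelihood ratio test. With the fitted objects available, `anova()` on the two `clm` fits would normally do this; the sketch below instead recomputes the test from the log-likelihoods printed in the two summaries, and adds up the coefficients for the effect of High within each L1 group (assuming English is the reference level of L1).

```r
# Log-likelihoods copied from the two summaries above
loglik_main <- -974.41   # Certainty3 ~ Condition + L1
loglik_int  <- -923.19   # Certainty3 ~ Condition + L1 + Condition:L1
df_diff     <- 2         # two interaction coefficients were added

# Likelihood ratio statistic and its chi-squared p-value
lr_stat <- 2 * (loglik_int - loglik_main)
p_value <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)

# AIC difference (positive values favor the interaction model)
aic_diff <- 1958.83 - 1860.38

# Effect of High relative to the reference condition, on the logit scale
high_english <- -0.6426            # reference L1 group
high_spanish <- -0.6426 + 1.7096   # main effect plus interaction

c(lr_stat = lr_stat, p_value = p_value, aic_diff = aic_diff)
c(high_english = high_english, high_spanish = high_spanish)
```

The sign flip between the two groups is why the main-effects-only model reports much smaller condition effects: it averages over opposite patterns.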
Multiple options for visualizing ordinal models:
- Get predicted values from predict() and manipulate them manually
- Plot predicted values automatically with a package like sjPlot
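As a rough illustration of the manual route, the sketch below rebuilds the predicted probabilities for all six Condition by L1 cells from the interaction model's reported estimates and draws them with base graphics. With the fitted object in hand, `predict()` would return these probabilities directly. The label "Reference" stands in for the third condition, whose name is not shown in the output, and English is assumed to be the reference L1.

```r
# Estimates copied from the interaction model summary
theta <- c(-0.7263, 0.7754)
b <- c(High = -0.6426, Low = 0.7430, Spanish = -0.1696,
       High_Spanish = 1.7096, Low_Spanish = -1.6126)

# All six cells; "Reference" is a placeholder name for the third condition
grid <- expand.grid(Condition = c("Reference", "High", "Low"),
                    L1 = c("English", "Spanish"),
                    stringsAsFactors = FALSE)

# Linear predictor for each cell (dummy coding written out by hand)
grid$eta <- with(grid,
  b[["High"]] * (Condition == "High") +
  b[["Low"]] * (Condition == "Low") +
  b[["Spanish"]] * (L1 == "Spanish") +
  b[["High_Spanish"]] * (Condition == "High" & L1 == "Spanish") +
  b[["Low_Spanish"]] * (Condition == "Low" & L1 == "Spanish"))

# Cumulative logit: P(Y <= j) = plogis(theta_j - eta)
cum_1 <- plogis(theta[1] - grid$eta)
cum_2 <- plogis(theta[2] - grid$eta)
probs <- cbind("Not certain" = cum_1,
               "Neutral" = cum_2 - cum_1,
               "Certain" = 1 - cum_2)
rownames(probs) <- paste(grid$L1, grid$Condition)

# Stacked bars: one bar per cell, segments are response categories
barplot(t(probs), las = 2, legend.text = TRUE, ylab = "Predicted probability")
```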
ordinal
Cumulative Link Mixed Model fitted with the Laplace approximation
formula: Feminine_voice ~ GuiseGender + (1 | Participant)
data: mg_dat
link threshold nobs logLik AIC niter max.grad cond.H
logit flexible 260 -108.48 232.96 684(880) 1.66e-05 NaN
Random effects:
Groups Name Variance Std.Dev.
Participant (Intercept) 1.039e-08 0.0001019
Number of groups: Participant 65
Coefficients:
Estimate Std. Error z value Pr(>|z|)
GuiseGenderMale -7.908 NaN NaN NaN
Threshold coefficients:
Estimate Std. Error z value
1|2 -5.428 NaN NaN
2|3 -5.315 NaN NaN
3|4 -5.049 NaN NaN
4|5 -4.041 NaN NaN
5|6 -3.177 NaN NaN
6|7 -2.128 NaN NaN
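The NaN standard errors and the tightly bunched lower thresholds may indicate that some response categories are barely used. The next model uses a collapsed response with levels 1, 6 and 7. The sketch below shows one way such a variable could be built in base R; the ratings are invented for illustration and the actual recoding used in the workshop is not shown in the source.

```r
# Hypothetical ratings on a 1-7 scale (invented for illustration)
ratings <- c(7, 7, 6, 7, 1, 3, 7, 6, 2, 7, 5, 7)
fv <- factor(ratings, levels = 1:7, ordered = TRUE)

# Inspect how sparsely the lower categories are used
table(fv)

# Merge levels 1-5 into a single lowest category; repeating a label
# in `levels<-` merges the corresponding levels and keeps the ordering
collapsed <- fv
levels(collapsed) <- c("1", "1", "1", "1", "1", "6", "7")

table(collapsed)
```

Collapsing discards some information, so it is worth checking the category counts before deciding which levels to merge.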
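library(ordinal)  # provides clmm()

# Refit with the collapsed response and adaptive Gauss-Hermite quadrature
# (nAGQ = 10) in place of the default Laplace approximation
mg_fit_2 <- clmm(Feminine_voice_collapsed ~
                   GuiseGender +
                   (1 | Participant),
                 data = mg_dat,
                 nAGQ = 10)
summary(mg_fit_2)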
Cumulative Link Mixed Model fitted with the adaptive Gauss-Hermite
quadrature approximation with 10 quadrature points
formula: Feminine_voice_collapsed ~ GuiseGender + (1 | Participant)
data: mg_dat
link threshold nobs logLik AIC niter max.grad cond.H
logit flexible 260 -86.59 181.19 265(301) 5.14e-08 4.7e+06
Random effects:
Groups Name Variance Std.Dev.
Participant (Intercept) 8.883e-09 9.425e-05
Number of groups: Participant 65
Coefficients:
Estimate Std. Error z value Pr(>|z|)
GuiseGenderMale -7.1763 0.7992 -8.979 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Threshold coefficients:
Estimate Std. Error z value
1|6 -4.7012 0.7326 -6.417
6|7 -2.1247 0.2827 -7.515
Group A has a higher-than-average value on the latent variable, so the higher categories are more probable.
Group B has a lower-than-average value on the latent variable; the probability of responding ‘1’ is now over 50%.
Group A has a higher-than-average value on the latent variable and a smaller variance than Group B.
Group B has a lower-than-average value on the latent variable and a wider scale parameter.
All error types possible when equal variance assumption incorrect (Liddell & Kruschke, 2018):
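As one illustration of how unequal variances can mislead a metric analysis, the sketch below gives two hypothetical groups the same latent mean but different latent scales, then computes the mean each group would show if the categories were treated as numbers. All values are made up. It uses a logistic latent variable with P(Y <= k) = plogis(disc * (threshold_k - mu)), where the latent SD is proportional to 1/disc; Liddell & Kruschke (2018) work with an ordered-probit version of the same idea.

```r
# Hypothetical thresholds for a 5-point scale
thresholds <- c(1.5, 2.5, 3.5, 4.5)

# Category probabilities given latent mean `mu` and discrimination `disc`
category_probs <- function(mu, disc, thresholds) {
  cum <- c(plogis(disc * (thresholds - mu)), 1)
  diff(c(0, cum))
}

# Same latent mean, different latent variance
probs_narrow <- category_probs(mu = 1, disc = 1, thresholds)
probs_wide   <- category_probs(mu = 1, disc = 1 / 3, thresholds)

# Means a t-test or linear model would compare
metric_mean_narrow <- sum(1:5 * probs_narrow)
metric_mean_wide   <- sum(1:5 * probs_wide)

round(c(narrow = metric_mean_narrow, wide = metric_mean_wide), 2)
```

The two groups differ in their metric means even though their latent means are identical, which is the kind of false alarm an equal-variance assumption can produce.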
brms uses lme4-style syntax to fit Bayesian models
With disc added to the model, we can predict variation in scale
 Family: cumulative
Links: mu = logit; disc = log
Formula: Friendly ~ GuiseLanguage + GuiseGender + GuiseLanguage:GuiseGender + (1 | Participant)
disc ~ GuiseLanguage + GuiseGender + GuiseLanguage:GuiseGender + (1 | Participant)
Data: g_ord (Number of observations: 260)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Group-Level Effects:
~Participant (Number of levels: 65)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 1.63 0.47 0.83 2.64 1.01 735 853
sd(disc_Intercept) 0.93 0.17 0.63 1.32 1.00 968 1268
Population-Level Effects:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept[1] -5.72 1.41 -8.78 -3.29 1.01 942 1609
Intercept[2] -3.09 0.80 -4.82 -1.70 1.00 950 1200
Intercept[3] -2.12 0.59 -3.45 -1.11 1.00 961 1256
Intercept[4] 0.04 0.31 -0.59 0.68 1.00 1645 1936
Intercept[5] 1.51 0.49 0.70 2.60 1.01 985 2218
Intercept[6] 3.99 1.07 2.21 6.36 1.01 869 1425
disc_Intercept 0.09 0.31 -0.51 0.71 1.00 981 1480
GuiseLanguageItalian 0.42 0.30 -0.11 1.05 1.00 1982 2447
GuiseGenderMale -0.17 0.27 -0.72 0.33 1.00 2232 2348
GuiseLanguageItalian:GuiseGenderMale -0.26 0.37 -1.02 0.47 1.00 2237 2684
disc_GuiseLanguageItalian -0.33 0.23 -0.80 0.12 1.00 1797 2266
disc_GuiseGenderMale -0.09 0.25 -0.57 0.39 1.00 1868 2387
disc_GuiseLanguageItalian:GuiseGenderMale 0.41 0.32 -0.22 1.04 1.00 2030 2187
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
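The disc coefficients are on a log scale, so they are easier to read after back-transformation. The sketch below copies the posterior means from the summary, exponentiates them to get disc in each guise cell, and inverts disc to get the relative latent standard deviation. It assumes the reference levels are the non-Italian language and the female guise, and it uses point estimates only; the credible intervals above all include zero, so these differences are uncertain.

```r
# Posterior means for the disc part of the model, copied from the summary
disc_coefs <- c(Intercept = 0.09, Italian = -0.33, Male = -0.09, Italian_Male = 0.41)

# Log-discrimination in each guise cell (reference levels are assumptions)
log_disc <- c(
  ref_female     = disc_coefs[["Intercept"]],
  italian_female = disc_coefs[["Intercept"]] + disc_coefs[["Italian"]],
  ref_male       = disc_coefs[["Intercept"]] + disc_coefs[["Male"]],
  italian_male   = sum(disc_coefs)
)

disc      <- exp(log_disc)  # disc uses a log link
latent_sd <- 1 / disc       # latent SD is the inverse of disc

round(rbind(disc, latent_sd), 2)
```

A disc below 1 corresponds to a wider latent distribution, meaning responses in that cell are more spread out across the scale.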
Garcia, G. D. (2021). Data visualization and analysis in second language research. Routledge.
McElreath, R. (2020). Statistical rethinking: A Bayesian course with examples in R and Stan. Chapman and Hall/CRC.
Christensen, R. H. B. (2019). A Tutorial on fitting Cumulative Link Mixed Models with clmm2 from the ordinal Package. Tutorial for the R Package ordinal. https://cran.r-project.org/web/packages/ordinal/
Liddell, T. M., & Kruschke, J. K. (2018). Analyzing ordinal data with metric models: What could possibly go wrong? Journal of Experimental Social Psychology, 79, 328-348.
Bürkner, P. C., & Vuorre, M. (2019). Ordinal regression models in psychology: A tutorial. Advances in Methods and Practices in Psychological Science, 2(1), 77-101.
Veríssimo, J. (2021). Analysis of rating scales: A pervasive problem in bilingualism research and a solution with Bayesian ordinal models. Bilingualism: Language and Cognition, 24(5), 842-848.