Getting started with Bayesian inference

Bayesian workshop - STEP 2023

Scott James Perry

University of Alberta

Lesson outcomes

By the end of this session you will:

  1. Explain the Bayesian view of probability
  2. List and describe the components of a Bayesian model
  3. Set priors and fit a simple model in brms

Some terminology - data vs. parameters


In the Bayesian framework, it’s a simple distinction:

  • Data are observed, we have measured them
  • Parameters are unobserved, we have to infer them


For example:

  • The frequency of a word is data
  • The effect of frequency on reaction time is a parameter

Some terminology - probability distribution



  • Function that describes probabilities of the different values of a variable

  • Unknown quantities whose plausible values are described by a probability distribution are called random variables
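
For instance, a normal distribution assigns higher probability density to values near its mean. A minimal sketch in base R (the mean and standard deviation are purely illustrative):

    # Density of a Normal(500, 100) random variable: high near 500, low in the tails
    curve(dnorm(x, mean = 500, sd = 100), from = 100, to = 900,
          xlab = "value", ylab = "density")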


Bayesian vs. Frequentist probability


Bayesian probability:

  • subjective belief
  • parameters are random variables (i.e., they have a distribution of values)


Frequentist probability:

  • frequency of occurrence in repeated outcomes
  • parameters are unknown, but fixed

Bayesian vs. Frequentist probability

You flip a coin once and hide it. What is the probability that it is heads?


Frequentist: I don’t know the outcome, but it’s either heads or tails. I could only say what the probability of heads for many repeated flips would be.


Bayesian: 50%!

What is Bayesian inference?


  • Treating model parameters as random variables
  • Discussing our uncertainty about them with probability
  • Updating prior beliefs with data via Bayes’ Theorem



“Remember that using Bayes’ Theorem doesn’t make you a Bayesian. Quantifying uncertainty with probability makes you a Bayesian.” - Michael Betancourt

Bayes’ Theorem

\[P(\theta \mid D) = \frac{P(D \mid \theta)\,P(\theta)}{P(D)}\]

\(\theta\) is some parameter value (like the effect of frequency); \(D\) is the observed data.


The components of a Bayesian model

  • Posterior probability: what we want to get and interpret
  • Likelihood: the information we get from our data
  • Prior probability: the information we have without our data
  • Marginal probability: normalizing constant
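
To make these components concrete, here is a minimal sketch in base R: a grid approximation of the posterior for a coin’s probability of heads (the 7-heads-in-10-flips data are made up for illustration):

    theta      <- seq(0, 1, length.out = 101)          # candidate parameter values
    prior      <- dbeta(theta, 2, 2)                   # prior: coin is probably near fair
    likelihood <- dbinom(7, size = 10, prob = theta)   # information from the data
    numerator  <- likelihood * prior
    posterior  <- numerator / sum(numerator)           # divide by the normalizing constant
    plot(theta, posterior, type = "l")                 # what we interpret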



The prior probability

  • Giving no information is like saying all values are equally plausible
  • We can almost always do better than that
  • Explicit assumptions based on domain knowledge
  • Must be reported (can be attacked)
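
In brms, you can list every prior a model needs and overwrite the flat defaults. A sketch, assuming a data frame dat with an rt column in milliseconds:

    library(brms)
    # See which parameters need priors and what the defaults are:
    get_prior(rt ~ 1, data = dat, family = gaussian())
    # An explicit, domain-informed prior on the intercept (the average rt):
    prior(normal(500, 100), class = Intercept)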



The likelihood


  • Think of it as the data
  • Many frequentist models are based on maximum likelihood estimation (which can lead to estimation problems)



Marginalization


  • Practical approaches to Bayesian inference skip this step
  • It involves an integral that is usually impossible to solve analytically
  • Instead, we approximate the posterior using Hamiltonian Monte Carlo

Modelling rt with a normal distribution


Model assumes rt comes from a normal distribution with:

  • mean: \(\mu\) (mu)
  • standard deviation: \(\sigma\) (sigma)

We can write this succinctly as:

\(rt \sim Normal(\mu,\sigma)\)
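
Read generatively, the model describes how data could arise, so we can simulate from it. A minimal sketch with made-up values for \(\mu\) and \(\sigma\):

    set.seed(1)
    rt_sim <- rnorm(1000, mean = 500, sd = 100)  # 1000 draws from Normal(mu = 500, sigma = 100)
    hist(rt_sim, breaks = 30, main = "simulated rt")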

Workflow for model \(rt \sim Normal(\mu,\sigma)\)

  1. Choose priors for \(\mu\) and \(\sigma\)
  2. Update our priors with data
  3. Check our model
  4. Interpret our model posteriors
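
As a preview, the whole workflow in brms might look like the sketch below; the specific priors and the data frame dat are illustrative assumptions:

    library(brms)

    priors <- c(
      prior(normal(500, 100), class = Intercept),   # step 1: prior for mu
      prior(exponential(0.01), class = sigma)       # step 1: prior for sigma
    )

    fit <- brm(rt ~ 1, data = dat, family = gaussian(),
               prior = priors, chains = 4, iter = 2000)  # step 2: update with data

    plot(fit)      # step 3: check the model
    summary(fit)   # step 4: interpret the posteriors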

Choosing priors for \(\mu\) and \(\sigma\)


Remember, for the model \(rt \sim Normal(\mu,\sigma)\)

  • \(\mu\) is the average reaction time
  • \(\sigma\) is how variable reaction times are
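
A useful way to judge candidate priors is a prior predictive check: sample from the priors alone and see whether the implied reaction times look plausible. A sketch, reusing the illustrative priors and data frame from above:

    fit_prior <- brm(rt ~ 1, data = dat, family = gaussian(),
                     prior = priors, sample_prior = "only")
    pp_check(fit_prior)  # are prior-simulated rts in a plausible range?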

We are now going to move to practice with this simple model.

Please open up script S1_E2_brms_intro.R