The Name of the Fantasy-themed phonetics activity

Author

Scott James Perry

Published

December 9, 2023

I made this exercise while working as a TA for an undergraduate phonetics class. The goal was to give students additional material for practicing phonetic transcription and interpreting spectrograms. There was also the added bonus that it helped me procrastinate when writing my first Generals Paper. Included with this activity is an audio file with three corresponding .TextGrid files. The audio, Prologue_practice_audio.wav, is the prologue of a popular book, ‘The Name of the Wind’ by Patrick Rothfuss, read aloud by yours truly. Below, you will find several activities that give any phonetician in training an opportunity to practice what they have been learning.

Click here to download a zip file containing the audio file and text grids for this activity.

Transcription practice

To practice impressionistic transcription, you can open only the audio file and transcribe what you hear by ear using the IPA. If you need a little bit of help from the orthography, you can find the orthographic transcription in the file Utterance_aligned.TextGrid.

Free segmentation

Another simple exercise that can be done using this audio is to practice segmenting speech into words and phones in Praat. An easy way to start this would be to open the file Utterance_aligned.TextGrid in Praat, add a second tier and begin by segmenting the words. A third tier could be added to segment things at the phone level.

Correcting an automatic segmentation

The file Automatic_alignment_ARPAbet.TextGrid is an automatic, computer-generated alignment of the words and phones in the text. It has two tiers, one with the words segmented out and the other with the phones segmented out. It was created with a forced aligner, in this case the Montreal Forced Aligner (McAuliffe et al. 2017). I gave the Montreal Forced Aligner the Prologue_practice_audio.wav file along with Utterance_aligned.TextGrid, and it produced the automatic alignment in about 11 seconds. This is pretty cool. However, you will see that the alignment is not perfect, which is a great opportunity to put your phonetic knowledge and skills to the test.
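If you would like to look at the two tiers outside of Praat, the following is a minimal Python sketch that pulls the intervals out of a TextGrid saved in Praat's long text format. It is only an illustration: the function name read_intervals is made up for this example, the tier names "words" and "phones" and the sample content are assumptions rather than the contents of the real file, and the sketch does not handle the short text format or point tiers.

```python
import re

# A tiny made-up TextGrid in Praat's long text format (NOT the real file).
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 0.9
tiers? <exists>
size = 2
item []:
    item [1]:
        class = "IntervalTier"
        name = "words"
        xmin = 0
        xmax = 0.9
        intervals: size = 1
        intervals [1]:
            xmin = 0
            xmax = 0.9
            text = "name"
    item [2]:
        class = "IntervalTier"
        name = "phones"
        xmin = 0
        xmax = 0.9
        intervals: size = 3
        intervals [1]:
            xmin = 0
            xmax = 0.2
            text = "N"
        intervals [2]:
            xmin = 0.2
            xmax = 0.7
            text = "EY1"
        intervals [3]:
            xmin = 0.7
            xmax = 0.9
            text = "M"
'''

# One interval: xmin, xmax, then a quoted label ("" is an escaped quote).
INTERVAL_RE = re.compile(
    r'xmin\s*=\s*([\d.eE+-]+)\s*'
    r'xmax\s*=\s*([\d.eE+-]+)\s*'
    r'text\s*=\s*"((?:[^"]|"")*)"'
)
NAME_RE = re.compile(r'name\s*=\s*"((?:[^"]|"")*)"')


def read_intervals(textgrid_text):
    """Return {tier_name: [(start, end, label), ...]} for interval tiers."""
    tiers = {}
    # Each tier starts with a line such as "item [1]:"; the header is dropped.
    for chunk in re.split(r'item\s*\[\d+\]\s*:', textgrid_text)[1:]:
        name = NAME_RE.search(chunk)
        if name is None:
            continue
        tiers[name.group(1)] = [
            (float(start), float(end), label.replace('""', '"'))
            for start, end, label in INTERVAL_RE.findall(chunk)
        ]
    return tiers


tiers = read_intervals(SAMPLE)
for start, end, label in tiers["phones"]:
    print(f"{start:.2f}-{end:.2f}  {label}")
```

To try this on one of the files from the zip, read the file into a string first and pass that string to read_intervals. Praat can save TextGrids as UTF-8 or UTF-16, so you may need to adjust the encoding when opening the file.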

It is important to point out that the Montreal Forced Aligner doesn’t give the transcription using the International Phonetic Alphabet, or at least, it didn’t back then. The transcription uses another system called ARPAbet. If you want to know how to read ARPAbet, the Wikipedia page has this conversion chart that seems to be accurate.
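As a rough illustration of how ARPAbet labels relate to the IPA, here is a small Python sketch that converts one label at a time. The table follows one common broad convention for General American English, and the treatment of unstressed AH and ER as schwa symbols is an assumption of this sketch. It is not the mapping used to produce any of the files in this activity, so expect differences in detail.

```python
# One common ARPAbet-to-IPA convention (an assumption, not the author's table).
ARPABET_TO_IPA = {
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "AO": "ɔ", "AW": "aʊ", "AY": "aɪ",
    "EH": "ɛ", "ER": "ɝ", "EY": "eɪ", "IH": "ɪ", "IY": "i", "OW": "oʊ",
    "OY": "ɔɪ", "UH": "ʊ", "UW": "u",
    "B": "b", "CH": "tʃ", "D": "d", "DH": "ð", "F": "f", "G": "ɡ",
    "HH": "h", "JH": "dʒ", "K": "k", "L": "l", "M": "m", "N": "n",
    "NG": "ŋ", "P": "p", "R": "ɹ", "S": "s", "SH": "ʃ", "T": "t",
    "TH": "θ", "V": "v", "W": "w", "Y": "j", "Z": "z", "ZH": "ʒ",
}


def arpabet_to_ipa(label):
    """Convert one ARPAbet label (e.g. 'AH0') to IPA.

    Vowel labels may end in a stress digit: 0 (unstressed), 1 (primary),
    or 2 (secondary). Labels that are not in the table, such as 'sil',
    are returned unchanged.
    """
    base = label.rstrip("012").upper()
    stress = label[len(base):]
    if base == "AH" and stress == "0":
        return "ə"  # unstressed AH is usually read as schwa
    if base == "ER" and stress == "0":
        return "ɚ"  # unstressed ER is usually read as r-colored schwa
    return ARPABET_TO_IPA.get(base, label)


# "name" as a forced aligner might label it
print(" ".join(arpabet_to_ipa(p) for p in ["N", "EY1", "M"]))
```

Returning unknown labels unchanged keeps non-speech labels visible instead of silently dropping them.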

My recommended activity here is to practice reading spectrograms by correcting the boundaries placed between the sounds. While there aren’t really ‘right’ answers in this respect, there are certainly ones that are more wrong. Try moving the boundaries to less wrong places.

Checking transcription and segmentation

The last file is called Corrected_alignment_IPA.TextGrid. In this file, I changed the transcription into the probably more familiar IPA, and I corrected the alignment of the boundaries on both tiers. You can use this information to check your own transcriptions, your own segmentation of the audio, or the corrections you have made to the computer-generated boundaries.
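One way to make that comparison concrete is to measure how far each of your boundaries sits from the nearest boundary in the reference file. The Python sketch below assumes you have already collected the boundary times (in seconds) into two lists; the function name boundary_differences and the example numbers are invented for illustration. Since there are no truly ‘right’ boundaries, treat the output as a prompt for a closer look at the spectrogram and not as a score.

```python
def boundary_differences(mine, reference):
    """For each reference boundary, find the nearest boundary in `mine`.

    Both arguments are lists of times in seconds. Returns a list of
    (reference_time, nearest_time, difference_in_ms) tuples.
    """
    if not mine:
        raise ValueError("need at least one boundary to compare")
    report = []
    for ref in reference:
        # Nearest boundary by absolute distance in time.
        nearest = min(mine, key=lambda t: abs(t - ref))
        diff_ms = round(abs(nearest - ref) * 1000, 1)
        report.append((ref, nearest, diff_ms))
    return report


# Made-up boundary times, for illustration only.
my_boundaries = [0.10, 0.52, 0.90]
reference_boundaries = [0.12, 0.50]

for ref, nearest, diff_ms in boundary_differences(my_boundaries, reference_boundaries):
    print(f"reference {ref:.2f} s, yours {nearest:.2f} s, off by {diff_ms} ms")
```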

This file I made is not perfect. There are still mistakes. If you think that a transcription of mine is wrong or that I segmented some sounds badly, it is entirely possible you are right. Feel free to discuss how exactly I’m wrong with your instructor, TA, or fellow classmates (anything to get people talking about phonetics!).

Important notes for students

Connected speech and other random information

As this is connected speech, there is a lot of variability in my productions. Some sounds will not look the way you expect them to, and I want to make sure that you are not confused. This activity is supposed to help, after all. The most variable segments are the ones you would expect to be interdental fricatives, which are not always produced that way. For some sounds, I have based my transcriptions on the specific productions rather than on the expected pronunciation.

In some of the TextGrids, you will see labels such as <unk>, sil, and sp. These stand for ‘unknown’, ‘silence’, and ‘short pause’. Not all words were included in the pronunciation dictionary, so you will have to add the missing ones yourself if you want a complete transcription or segmentation.

Asking questions

If I am your TA or instructor, I encourage you to direct your questions to the discussion forum on the course page. If I am not your TA, and your instructor has shared this with you, I would recommend directing your questions to your instructor. If you have stumbled upon this out in the wilds of the internet and have any comments or questions, feel free to email me!

References

McAuliffe, Michael, Michaela Socolof, Sarah Mihuc, Michael Wagner, and Morgan Sonderegger. 2017. “Montreal Forced Aligner: Trainable Text-Speech Alignment Using Kaldi.” In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. https://doi.org/10.21437/interspeech.2017-1386.