Speech coding – third pass speech identification, affect and intensity
Coding is done on this webpage: https://uelbabydev.com/onacsa-home-audio-coding-phase-three-v1/
BEFORE YOU START: IDENTIFY WHETHER THE DECODER HAS IDENTIFIED THE CORRECT SPEAKER
Baby Coding
Vocalisation Type
Vocants
•Vocants are ‘speech-like’ sounds that occur within the infant’s normal pitch range, i.e. they are neither high-pitched nor low-pitched.
•Whilst they are speech-like, vocants do not often sound like words (see examples below).
•Vocants can be coded as one of: quasi-resonant vowels, fully resonant vowels, marginal syllables or canonical syllables. These are the first four options listed under vocalisation type. See below for definitions and examples.
Quasi-resonant vowels
•These are produced with normal phonation, i.e. the type of sound usually produced in speech sounds, especially in vowels.
•However, they are produced with the vocal tract at rest, unlike fully resonant vowels (see below).
•These vocalisations consequently sound creaky/nasal, being produced with minimal breath support.
Examples – QRV
Fully-resonant vowels
•Unlike QR vowels, these are produced when the vocal tract is open, resulting in normal phonation.
•So these sound like vowel sounds that we would normally hear in speech: they are produced with full breath support, so they don’t sound creaky/nasal.
Examples – FRV
Marginal Syllable
•A transition from a consonant sound to a vowel-like sound, e.g. ‘maa’.
•However, the transition between the consonant sound and the vowel sound is slow: around 200ms.
•This is the feature that distinguishes them from canonical syllables (see below).
Examples – MS
Canonical Syllables
•These include mature consonant-vowel transitions, and incorporate fully-resonant vowels. The transition between consonant and vowel occurs as it does in mature speech.
•For example ‘ba’ ‘da’ (see examples below)
Examples – CS
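The four vocant categories above can be read as a small decision rule. The sketch below is purely illustrative: coding is done by ear, and the function name, its parameters and the use of 200 ms as a hard cut-off are assumptions made for this example (the definition above only says the slow transition is "around 200ms").

```python
from typing import Optional

# Assumption: "around 200ms" is treated as a hard cut-off for illustration only.
SLOW_TRANSITION_MS = 200.0


def classify_vocant(transition_ms: Optional[float], fully_resonant: bool) -> str:
    """Return a vocant subtype label (hypothetical helper, not part of the coding tool).

    transition_ms: consonant-to-vowel transition duration in milliseconds,
        or None when the vocalisation has no consonant-like element.
    fully_resonant: True when the vowel is produced with an open vocal tract.
    """
    if transition_ms is None:
        # Vowel-only vocalisation: the two vowel types differ in resonance.
        return "fully-resonant vowel" if fully_resonant else "quasi-resonant vowel"
    if transition_ms >= SLOW_TRANSITION_MS:
        # A slow consonant-to-vowel transition marks a marginal syllable.
        return "marginal syllable"
    if not fully_resonant:
        # Assumption: the manual does not cover a fast transition into a
        # vowel that is not fully resonant; it is treated as marginal here.
        return "marginal syllable"
    # Fast, mature transition into a fully resonant vowel.
    return "canonical syllable"
```

For instance, `classify_vocant(None, False)` gives a quasi-resonant vowel, while `classify_vocant(80.0, True)` gives a canonical syllable.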
Other vocal types that ARE NOT VOCANTS
Squeals
• Any vocalisation that is of notably higher pitch than the normal pitch range of the infant.
Growls
•Code growl when the vocalisation meets at least one of the following two criteria:
– The most salient pitch is notably lower than the normal range of the infant.
– The pitch of the utterance is within normal range, but is produced with high tension (i.e. the vocalisation is harsh-sounding).
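The squeal and growl criteria can be summarised as a short rule. This is a sketch only: the function name and the `"above"` / `"within"` / `"below"` labels are hypothetical, and both the pitch judgement and the harshness judgement are still made by ear relative to the infant's normal range.

```python
from typing import Optional

PITCH_LEVELS = {"above", "within", "below"}


def squeal_or_growl(pitch: str, harsh: bool = False) -> Optional[str]:
    """Return 'squeal', 'growl', or None when neither code applies.

    pitch: the most salient pitch relative to the infant's normal range.
    harsh: True when the vocalisation is produced with high tension.
    """
    if pitch not in PITCH_LEVELS:
        raise ValueError(f"pitch must be one of {sorted(PITCH_LEVELS)}, got {pitch!r}")
    if pitch == "above":
        # Notably higher than the normal range: squeal.
        return "squeal"
    if pitch == "below" or harsh:
        # Growl needs only one criterion: low pitch, or normal pitch with harshness.
        return "growl"
    # Normal pitch without harshness: consider the other vocalisation types.
    return None
```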
Laughs and cries
• Code the vocalisation as cry or laugh if you feel it has been produced reflexively.
Laughs – examples
Cries – examples
Raspberries or clicks
Raspberries are trills or vibrants formed most often with the lips or the tongue and lips, and occasionally by the tongue body against the toothless alveolar ridge. Clicks are sounds that involve creating a supraglottal sub-cavity that has negative pressure. When the pressure is released, it forms a non-phonatory sound source (i.e., a sound that has no clear frequency).
Vegetative sounds
The infant is coughing or making another vegetative sound (snorting, breathing…).
Affect and Intensity
Both affect and intensity are coded on a 1-5 scale.
Affect
How positive/negative is the vocalisation? The most negative score (1) would be a cry; the most positive (5) would be just short of laughter. Neutral (3) means no positive or negative affect can be detected.
Examples
2 – Not as negative as 1 but some negative affect detected
3 – Neutral
4 – Some positive affect detected, but not enough to be considered a 5
Intensity
1 | Low intensity of affect: extremely calm, depressed or sleepy |
2 | Low-medium intensity of affect |
3 | Mid intensity of affect: normal conversation |
4 | Not quite maximum intensity, but more than mid-range intensity |
5 | Maximum intensity of affect: either proper crying (negative) or really happy/about to laugh |
Examples
1 – Low intensity
2 – Low-medium intensity
3 – Mid intensity
4 – Not quite max intensity but more than mid-range intensity
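If the ratings are later collected in a script, the two 1-5 scales can be checked on entry. The sketch below is a hypothetical record type, not part of the coding webpage; the class and field names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AffectRating:
    """One coder's rating of a single vocalisation (hypothetical structure)."""

    affect: int     # 1 = most negative, 3 = neutral, 5 = most positive
    intensity: int  # 1 = low, 3 = mid, 5 = maximum

    def __post_init__(self) -> None:
        for name in ("affect", "intensity"):
            value = getattr(self, name)
            # Reject bools explicitly, since bool is a subclass of int.
            if isinstance(value, bool) or not isinstance(value, int):
                raise ValueError(f"{name} must be an integer, got {value!r}")
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value!r}")
```

For example, `AffectRating(affect=1, intensity=5)` would describe proper crying, whereas `AffectRating(affect=0, intensity=3)` raises a `ValueError`.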
Mum Coding
Who is the vocalisation directed towards?
Use context and the semantics of the caregiver’s speech to decide who the vocalisation is directed towards. Do not spend a long time over this: make a quick decision and move on. Keep this question in mind when answering the register question below.
Vocalisation register
Normal speech is the standard speech register that adults spontaneously use when talking to other adults. Motherese is the speech register that caregivers spontaneously use when talking to an infant. Importantly, register and context are theoretically distinct: a caregiver could use a normal speech register to talk to an infant, or motherese to talk to another adult.
When coding for the speech register, try not to focus on the context, but on these acoustic features:
- Increased pitch (higher frequency) compared to normal speech
- Increased pitch variability (more melodic) compared to normal speech
- Reduced pace (slower syllabic rate) compared to normal speech
- Elongated vowels (“oooh whooo’s the gooood booyyy”…) compared to normal speech
You can classify an adult vocalisation as motherese even if you can only identify one characteristic feature.
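The "at least one feature" rule above can be written out as follows. This is a minimal sketch: the feature identifiers and the function name are hypothetical labels for the four acoustic features listed, and whether each feature is present is still judged by ear.

```python
# Hypothetical identifiers for the four acoustic features listed above.
MOTHERESE_FEATURES = frozenset({
    "increased_pitch",
    "increased_pitch_variability",
    "reduced_pace",
    "elongated_vowels",
})


def classify_register(features_heard) -> str:
    """Return 'motherese' if any characteristic feature was heard, else 'normal speech'."""
    heard = set(features_heard)
    unknown = heard - MOTHERESE_FEATURES
    if unknown:
        raise ValueError(f"unknown feature(s): {sorted(unknown)}")
    # A single identified feature is enough to code the vocalisation as motherese.
    return "motherese" if heard else "normal speech"
```

Note that the rule takes only acoustic features as input, not the addressee: this mirrors the instruction to code register independently of context.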
Examples – normal speech
Examples – motherese
Vocalisation type
Speech vocalisations
Unintelligible | If the mum is speaking but you cannot make out what she is saying. |
Whisper | If the mum is whispering. You also need to code the content of her vocalisation. |
Naming | The caregiver is naming an object/person/toy/something in the environment, e.g., “this is the dog!”; “this is a weird banana, isn’t it?”; “wow look at the sky”; “is that your brother?!” |
Declarative/narrative/affirmation | the caregiver makes an affirmation, a statement, or any declaration to the baby, and/or describes a situation or an object. e.g., “the dog likes the ball !” ; “this is a monkey” ; “is that nice and soft”… |
Wh-Q | the mum asks a wh- (why, what, where, when…) question |
Yes/no-Q | The mum asks a polar (yes/no) question, or any other kind of question; e.g., “do you like the ball?”; “is this a monkey?” |
Redirection/imperative | The mum is redirecting the infant’s attention, or using an imperative tone of voice/sentence to ask the baby to do something, e.g., ‘put it down’; “do you want to play with the other one?”; “how about this one…”; “can I have them?” |
Praise | Praise/positive feedback/wonder: The mum is praising the baby, and/or giving positive feedback and/or in wonder, marvelling at something. e.g., “well done”; “here is a good girl”; “wow” … |
Comforting | the mum comforts the baby, or proposes to give a cuddle / kiss… e.g., “oooooh… you alright”; “do you want a cuddle?” (NB: this is also a question referring to the child’s desire, so here you’d need to select 3 categories) (NB: shushing is not usually picked up by the automated voice activity detector – if you do hear shushing it should be recorded as comforting) |
Imitation | The mother imitates a sound produced by the baby (e.g., a coo, or a protophone, i.e. a speech-like sound such as a proto-syllable). |
Physical states and/or actions – INFANT | The caregiver states that the child is hungry or tired, and/or refers to physiological states (stressed, tired…). Also use this if the mum is describing what the baby is doing (e.g., “you are trying to put the acorn in the shoe?”). |
Desires/preferences/wanting – INFANT | The mum refers to something that the child may like, dislike, not like, love, want, prefer, hate, can’t stand… Also use this rather than T (action) when the mum describes what the child is doing by referring to a motivation, e.g., “Are you playing with the ball again?”; “Do you like the ball?”; “All you want to do is play with the shoe!”. |
Cognition etc – INFANT | the mum refers to the child’s mind and cognition, using words like: think, decide, know, remember, realize, interested, not interested, curious. |
Emotions – INFANT | The mum refers to the child’s emotion. e.g., had enough, fed up, shy, self-conscious, happy, sad, scared, afraid, joyful, gleeful, full of the joys of spring, serious, grumpy, stressed, moody, in a good/bad mood, being difficult, worried, anxious, dazed, confused, excited, cross, not feeling yourself, startled, surprised, disgusted, bored, angry, bad tempered… |
Reading | The mum is reading a book aloud, e.g. ‘Once upon a time…” |
Calling baby | the mum is saying the baby’s name, e.g., “Jessie” or “baba” would be calling, so type 24; “Jessie, look!” would be calling and redirecting, so type 24 AND 14 |
Causal talk | The mum talks about causation, e.g. “he is smiling because he is happy”, “he must have jumped into the pool because he is wet” etc. |
Surprise/exclamation | The mum makes an exclamation or expresses surprise. |
Physical states and/or actions – MUM | The caregiver states that she is hungry or tired, and/or refers to physiological states (stressed, tired…). Also use this if the mum is describing what she is doing (e.g., “I’m making mummy a coffee”). |
Desires / preferences / wanting – MUM | The mum refers to something that she may like, dislike, not like, love, want, prefer, hate, can’t stand… Also use this rather than 27 (action) when the mum describes what she is doing by referring to a motivation, e.g., “Are you playing with the ball again?”; “Do you like the ball?”; “All you want to do is play with the shoe!”. |
Cognition etc – MUM | the mum refers to her own mind and cognition, using words like: think, decide, know, remember, realize, interested, not interested, curious. |
Emotions – MUM | The mum refers to her own emotion. e.g., had enough, fed up, self-conscious, happy, sad, scared, afraid, joyful, gleeful, full of the joys of spring, serious, grumpy, stressed, moody, in a good/bad mood, worried, anxious, dazed, confused, excited, cross, not feeling yourself, startled, surprised, disgusted, bored, angry, bad tempered… |
Validating emotion | If the mum talks about the baby’s emotional state (22) in a validating way, e.g. by naming the emotion or state that the baby is expressing: ‘you are feeling upset’, ‘you didn’t expect that and now you are sad’. |
Invalidating emotion | If the mum talks about the baby’s emotional state (22) in an invalidating way, e.g. ‘you’re being silly’, ‘calm down’, ‘there’s no need to be upset’. |
Avoiding emotion | If the mum avoids addressing the infant’s emotional state, e.g. ‘I’m not listening to this anymore.’, ‘this just sounds like noise’. |
Non-speech vocalisations
Sing | the caregiver sings a song. |
Play | the caregiver makes non-speech play sounds, e.g., imitates the sound of the dog/cat/car… e.g., “vroooom” ; “woof woof”… |
Laughter | The caregiver is laughing. |
Veg | Vegetative sound: the caregiver is coughing or making another vegetative sound (snorting, breathing…) |
Breath intake | The mum inhales audibly. |
Sigh/yawn | The mum sighs or yawns. |
Kiss/comforting | If the mum kisses the baby or makes a comforting noise e.g. ‘ahhhh’, ‘ohhhhh’. |
Affect and Intensity
Code affect and intensity on a 1-5 scale, using criteria similar to those outlined in the section on infant coding.
Affect – examples
3 – Neutral
4 – Positive affect but not so much as to be a 5
5 – Most positive affect
Intensity – examples
1 – Low intensity
2 – Low-medium intensity
3 – Mid intensity
4 – Not quite max intensity but more than mid-range intensity
5 – Max intensity