Category: Speech

New paper by Molly Henry: Amusia diagnosis should better rely on signal detection theory

Post author By Jonas
Post date 4 October 2012

Signal detection theory – what helps prevent misdiagnoses and false positives in general can’t be bad for diagnosing Amusia either, one would think. Our very own Molly Henry and her former supervisor Devin McAuley now demonstrate in a just-accepted paper

Failure to apply signal detection theory to the Montreal Battery of Evaluation of Amusia may misdiagnose amusia

in Music Perception that this is indeed the case:

They show that analyses based on confidence ratings and ROC curves outperform simple percentage correct in diagnosing Amusia.

Here is the abstract, and watch out for the full paper to appear soon:

This article considers a signal detection theory (SDT) approach to evaluation of performance on the Montreal Battery of Evaluation of Amusia (MBEA).
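To illustrate why percentage correct can mislead, here is a minimal sketch in Python. It is not the analysis from the paper: the trial counts, the log-linear correction, and the function names (`percent_correct`, `d_prime`, `roc_area`) are all assumptions made for illustration. It shows two hypothetical listeners with identical percentage correct but different sensitivity, plus a simple trapezoidal area under an ROC curve built from confidence ratings.

```python
from statistics import NormalDist


def percent_correct(hits, misses, false_alarms, correct_rejections):
    """Proportion of all trials answered correctly."""
    total = hits + misses + false_alarms + correct_rejections
    return (hits + correct_rejections) / total


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Assumption: a log-linear correction (add 0.5 to each count) keeps
    rates of exactly 0 or 1 from producing infinite z-scores.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


def roc_area(signal_ratings, noise_ratings, n_levels):
    """Trapezoidal area under the ROC curve from confidence ratings.

    Ratings run from 1 to n_levels; higher means more confident that the
    trial contained a change ("different"). Each rating level is treated
    as a decision criterion, yielding one ROC point per level.
    """
    points = [(0.0, 0.0)]
    for criterion in range(n_levels, 0, -1):
        hit_rate = sum(r >= criterion for r in signal_ratings) / len(signal_ratings)
        fa_rate = sum(r >= criterion for r in noise_ratings) / len(noise_ratings)
        points.append((fa_rate, hit_rate))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Two hypothetical listeners, 15 "different" and 15 "same" trials each.
listener_a = dict(hits=12, misses=3, false_alarms=3, correct_rejections=12)
listener_b = dict(hits=9, misses=6, false_alarms=0, correct_rejections=15)

print(percent_correct(**listener_a), percent_correct(**listener_b))
print(d_prime(**listener_a), d_prime(**listener_b))
```

Both listeners score the same percentage correct, yet the conservative listener B (who never says "different" wrongly) has the higher d'. A cut-off on percentage correct alone would treat the two as equivalent.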


Tags amusia, Henry, McAuley, Music Perception, roc curves, signal detection theory

Auditory Working Memory Degraded Acoustics EEG / MEG Executive Functions Neural Oscillations Noise-Vocoded Speech Papers Publications Speech

New paper out: Obleser et al., The Journal of Neuroscience

Post author By Jonas
Post date 6 September 2012

Adverse Listening Conditions and Memory Load Drive a Common Alpha Oscillatory Network

Whether we are engaged in small talk or trying to memorise a telephone number — it is our short-term memory that ensures we don’t lose track. But what if the very same memory gets additionally taxed because the words to be remembered are hard to understand?

[Figure: Obleser et al., J Neurosci 2012. Alpha oscillations are enhanced both by memorised digits and by the adverse acoustic conditions that these digits had been presented in.]

Obleser, J., Woestmann, M., Hellbernd, N., Wilsch, A., Maess, B. (2012). Adverse listening conditions and memory load drive a common alpha oscillatory network. Journal of Neuroscience, September 5, 2012, 32(36):12376–12383.

References

Obleser J, Wöstmann M, Hellbernd N, Wilsch A, Maess B. Adverse listening conditions and memory load drive a common α oscillatory network. J Neurosci. 2012 Sep 5;32(36):12376–83. PMID: 22956828.

Tags Hellbernd, Journal of Neuroscience, Maess, Obleser, Wilsch, Wöstmann

Neural Oscillations Papers Publications Speech

New paper out: “Don’t be enslaved by the envelope” – Comment on Giraud & Poeppel (2012)

Post author By Jonas
Post date 31 August 2012

A comment / opinion article of ours appears today, with a bit of fresh evidence from our lab. It is mainly a reply to Anne-Lise Giraud and David Poeppel’s recent “perspective” article on neural oscillations in speech.

We loved that article, obviously, but after the initial excitement, a few concerns stuck with us. In essence, the problems are (i) how to define theta for the purposes of analysing speech comprehension processes, (ii) not to overly focus on the speech envelope (i.e., not to neglect spectral / fine-structure aspects of speech), and (iii) the unsolved chicken–egg problem of how neural entrainment and speech intelligibility really relate to each other.

But read for yourself (it’s pleasantly short!).

References

Obleser J, Herrmann B, Henry MJ. Neural Oscillations in Speech: Don’t be Enslaved by the Envelope. Front Hum Neurosci. 2012 Aug 31;6:250. PMID: 22969717.

Tags Frontiers in Human Neuroscience, Henry, Herrmann, Obleser

EEG / MEG Evoked Activity Linguistics Papers Perception Place of Articulation Features Publications Speech

New paper in press — Scharinger et al., PLOS ONE [Update]

Post author By Jonas
Post date 15 July 2012

We are happy that our paper

A Sparse Neural Code for Some Speech Sounds but Not for Others

is scheduled for publication in PLOS ONE on July 16th, 2012.

This is also our first paper in collaboration with Alexandra Bendixen from the University of Leipzig.

The research reported in this article provides an extension of the predictive coding framework onto speech sounds and assumes that auditory processing uses predictions that are not only derived from ongoing contextual updates, but also from long-term memory representations — neural codes — of speech sounds. Using the German minimal pair [lats]/[laks] (bib/salmon) in a passive-oddball design, we find the expected Mismatch Negativity (MMN) asymmetry that is compatible with a predictive coding framework, but also with linguistic underspecification theory.

[Update]

Paper is available here.

References

Scharinger M, Bendixen A, Trujillo-Barreto NJ, Obleser J. A sparse neural code for some speech sounds but not for others. PLoS One. 2012;7(7):e40953. PMID: 22815876.

Tags Bendixen, mismatch negativity, mmn, Obleser, PLOS ONE, reaction time, Scharinger, sculp, Trujillo-Barreto

Degraded Acoustics fMRI Noise-Vocoded Speech Papers Publications Speech

New paper in press: Erb et al., Neuropsychologia [Update]

Post author By Jonas
Post date 11 May 2012

I am very proud to announce our first paper that was entirely planned, conducted, analysed and written up since our group has been in existence. Julia joined me as the first PhD student in December 2010, and has since been busy doing awesome work. Check out her first paper!

Auditory skills and brain morphology predict individual differences in adaptation to degraded speech

Noise-vocoded speech is a spectrally highly degraded signal, but it preserves the temporal envelope of speech. Listeners vary considerably in their ability to adapt to this degraded speech signal. Here, we hypothesized that individual differences in adaptation to vocoded speech should be predictable by non-speech auditory, cognitive, and neuroanatomical factors. We tested eighteen normal-hearing participants in a short-term vocoded speech-learning paradigm (listening to 100 4-band-vocoded sentences). Non-speech auditory skills were assessed using amplitude modulation (AM) rate discrimination, where modulation rates were centered on the speech-relevant rate of 4 Hz. Working memory capacities were evaluated, and structural MRI scans were examined for anatomical predictors of vocoded speech learning using voxel-based morphometry. Listeners who learned faster to understand degraded speech showed smaller thresholds in the AM discrimination task. Anatomical brain scans revealed that faster learners had increased volume in the left thalamus (pulvinar). These results suggest that adaptation to vocoded speech benefits from individual AM discrimination skills. This ability to adjust to degraded speech is furthermore reflected anatomically in an increased volume in an area of the thalamus, which is strongly connected to the auditory and prefrontal cortex. Thus, individual auditory skills that are not speech-specific and left thalamus gray matter volume can predict how quickly a listener adapts to degraded speech.

Please be in touch with Julia Erb if you are interested in a preprint as soon as we get hold of the final, typeset manuscript.

[Update#1]: Julia has also published a blog post on her work.

[Update#2] Paper is available here.

References

Erb J, Henry MJ, Eisner F, Obleser J. Auditory skills and brain morphology predict individual differences in adaptation to degraded speech. Neuropsychologia. 2012 Jul;50(9):2154–64. PMID: 22609577.

Tags amplitude modulation rate, cochlear implant simulation, Eisner, Erb, Henry, Neuropsychologia, Obleser, perceptual learning, voxel-based morphometry

Auditory Cortex Auditory Speech Processing fMRI Papers Publications Speech

New paper out: McGettigan et al., Neuropsychologia

Post author By Jonas
Post date 31 January 2012

Last year’s lab guest and long-time collaborator Carolyn McGettigan has put out another one:

Speech comprehension aided by multiple modalities: Behavioural and neural interactions

I had the pleasure to be involved initially, when Carolyn conceived a lot of this, and when things came together in the end. Carolyn nicely demonstrates how varying audio and visual clarity comes together with the semantic benefits a listener can get from the famous Kalikow SPIN (speech in noise) sentences. The data highlight posterior STS and the fusiform gyrus as sites for convergence of auditory, visual and linguistic information.

Check it out!

References

McGettigan C, Faulkner A, Altarelli I, Obleser J, Baverstock H, Scott SK. Speech comprehension aided by multiple modalities: behavioural and neural interactions. Neuropsychologia. 2012 Apr;50(5):762–76. PMID: 22266262.

Tags Altarelli, Baverstock, Faulkner, individual differences, McGettigan, Neuropsychologia, Obleser, Scott

Auditory Speech Processing Media Publications

3‑D animation of brain activations illustrates the idea of “upstream delegation”

Post author By Jonas
Post date 31 January 2012

Recently, with a data set dating back to my time in Angela Friederici’s department, we proposed the idea that auditory signal degradation would affect the exact configuration of activity along the main processing streams of language, in the superior temporal and inferior frontal cortex. We tentatively coined this process “upstream delegation”: The activations that were driven by increasing syntactic demands, with the challenge of decreasing signal quality coming on top, were all of a sudden found more “upstream” from where we had located them with improving signal quality.

In a fascinating and instructive interactive 3‑D version (oh, this sounds so 1990s, but it’s true!), you can now study and manipulate (in the literal, not the scientific-misconduct sense) this and various other findings from Angela’s lab yourself: Fire up Chrome or Firefox and check it out here.

All of this is taken from a recent review by Angela [Friederici, AD (2011) Physiological Reviews, 91(4), 1357–1392], where she lays out her current take on inferior frontal cortex, the tracts connecting to and from it, and its role in syntax processing. The funky 3‑D stuff is by Ralph Schurade. Don’t ask how long it took us to get all the coordinates in place.

Auditory Perception EEG / MEG Events Evoked Activity Posters Publications Speech

Poster Presentations at SFN

Post author By Jonas
Post date 11 November 2011

There will be two poster presentations at SFN in Washington, DC, on the topic of auditory predictions in speech perception. The first poster, authored by Alexandra Bendixen, Mathias Scharinger, and Jonas Obleser, is summarized as follows:

Speech signals are often compromised by disruptions originating from external (e.g., masking noise) or internal (e.g., sluggish articulation) sources. Speech comprehension thus entails detecting and replacing missing information based on predictive and restorative mechanisms. The nature of the underlying neural mechanisms is not yet well understood. In the present study, we investigated the detection of missing information by occasionally omitting the final consonants of the German words “Lachs” (salmon) or “Latz” (bib), resulting in the syllable “La” (no semantic meaning). In three different conditions, stimulus presentation was set up so that subjects expected only the word “Lachs” (condition 1), only the word “Latz” (condition 2), or the words “Lachs” or “Latz” with equal probability (condition 3). Thus essentially, the final segment was predictable in conditions 1 and 2, but unpredictable in condition 3. Stimuli were presented outside the focus of attention while subjects were watching a silent video. Brain responses were measured with multi-channel electroencephalogram (EEG) recordings. In all conditions, an omission response was elicited from 125 to 165 ms after the expected onset of the final segment. The omission response shared characteristics of the omission mismatch negativity (MMN) with generators in auditory cortical areas. Critically, the omission response was enhanced in amplitude in the two predictable conditions (1, 2) compared to the unpredictable condition (3). Violating a strong prediction thus elicited a more pronounced omission response. Consistent with a predictive coding account, the processing of missing linguistic information appears to be modulated by predictive context.

The second poster looks at similar material, but contrasts coronal [t] with dorsal [k], yielding interesting asymmetries in MMN responses:

Research in auditory neuroscience has led to a better understanding of the neural bases of speech perception, but the representational nature of speech sounds within words is still a matter of debate. Electrophysiological research on single speech sounds provided evidence for abstract representational units that comprise information about both acoustic structure and articulator configuration (Phillips et al., 2000), thereby referring to phonological categories. Here, we test the processing of word-final consonants differing in their place of articulation (coronal [ts] vs. dorsal [ks]) and acoustic structure, as seen in the time-varying formant (resonance) frequencies. The respective consonants distinguish between the German nouns Latz (bib) and Lachs (salmon), recorded from a female native speaker. Initial consonant-vowel sequences were averaged across the two nouns in order to avoid coarticulatory cues before the release of the consonants. Latz and Lachs served as standard and deviant in a passive oddball paradigm, while the EEG from 20 participants was recorded. The change from standard [ts] to deviant [ks] and vice versa was accompanied by a discernible Mismatch Negativity (MMN) response (Näätänen et al., 2007). This response showed an intriguing asymmetry, as seen in a main effect of condition (deviant Latz vs. deviant Lachs, F(1,1920) = 291.84, p < 0.001) in an omnibus mixed-effect model. Crucially, the MMN for the deviant Latz was on average more negative than the MMN for the deviant Lachs from 135 to 185 ms post deviance onset (p < 0.001). We interpret these findings as reflecting a difference in phonological specificity: Following Eulitz and Lahiri (2004), we assume coronal segments ([ts]) to have less specific (‘featurally underspecified’) representations than dorsal segments ([ks]). While in standard position, Lachs activated a memory trace with a more specific final consonant for which the deviant provided a stronger mismatch than vice versa, i.e. when Latz activated a memory trace with a less specific final consonant. Our results support a model of speech perception where sensory information is processed in terms of discrete units independent of higher lexical properties, as the asymmetry cannot be explained by differences in lexical surface frequencies between Latz and Lachs (both log-frequencies of 0.69). We can also rule out a frequency effect on the segmental level. Thus, it appears that speech perception involves a level of processing where individual segmental representations within words are evaluated.

Tags Bendixen, Obleser, Scharinger, Society for Neuroscience
