Abstracts
included from the following articles:
Frequency modulation detection in cochlear
implant listeners.
Independent contributions of amplitude
and frequency modulations to auditory perception. II. Melody,
tone, and speaker identification.
Clear Speech Perception in Normal-Hearing
and Cochlear-Implant Listeners
Facts and Artifacts in Auditory
Chimaeras
Independent contributions of amplitude
and frequency modulations to auditory perception. I. Consonant,
vowel, and sentence recognition.
Frequency modulation detection in cochlear
implant listeners.
Hongbin Chen, Fan-Gang Zeng
Univ. of California, Irvine
Irvine, CA 92697-1257
Although amplitude modulation detection has been extensively
studied in both acoustic and electric hearing, frequency modulation
detection has rarely been studied in electric hearing. Here
we systematically studied cochlear implant listeners'
ability to detect three types of frequency modulation:
upward sweeps, downward sweeps, and sinusoidally frequency-modulated
stimuli. Difference limens (i.e., 70.7% correct response in
a 3IFC, 2-down, 1-up procedure) were measured as a function
of baseline frequency (from 75 to 1,000 Hz). Factors studied
included electrode position (apical vs. basal), stimulation
level (soft vs. comfortable), and modulation frequency (from
5 to 320 Hz, depending on the baseline frequency). Three postlingually
deafened adults using the Nucleus-22 cochlear implant participated
in the experiment. For comparison, similar data were also
collected in normal-hearing listeners. Preliminary data showed
no significant effect of electrode position or stimulation
level but significant effects of baseline frequency and modulation
type on frequency modulation detection. Consistent with previous
data in simple rate discrimination, difference limens in detecting
all three types of frequency modulation increased monotonically
as a function of the baseline frequency. Despite large
individual variability, difference limens for sinusoidal
frequency modulation were about half of those for the upward
and downward frequency sweeps. The present data suggest that
cochlear implant listeners may be more sensitive to dynamic
frequency changes than steady-state frequency changes. We
hope to explore this difference to dynamically encode the
temporal fine structure in speech and music sounds for cochlear
implant users.
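
The 70.7% figure quoted for the 2-down, 1-up procedure follows from the tracking rule itself: the track settles where two consecutive correct responses are as likely as not, so p^2 = 0.5 and p is about 0.707. The sketch below illustrates this with a simulated track. The listener model (its midpoint, spread, and the 1/3 chance level for 3IFC), the step size, and the trial count are all illustrative assumptions, not values from the study.

```python
import math
import random


def target_proportion(n_down=2):
    """Convergence point of an n-down, 1-up rule: solve p ** n_down = 0.5."""
    return 0.5 ** (1.0 / n_down)


def listener(level, midpoint=10.0, spread=2.0, chance=1.0 / 3.0):
    """Hypothetical 3IFC listener: probability correct rises from chance to 1."""
    return chance + (1.0 - chance) / (1.0 + math.exp(-(level - midpoint) / spread))


def run_staircase(start=20.0, step=0.5, n_trials=20000, seed=0):
    """Simulate a 2-down, 1-up track and return the level on every trial."""
    rng = random.Random(seed)
    level, streak, levels = start, 0, []
    for _ in range(n_trials):
        levels.append(level)
        if rng.random() < listener(level):
            streak += 1
            if streak == 2:      # two correct in a row: make the task harder
                level -= step
                streak = 0
        else:                    # any error: make the task easier
            level += step
            streak = 0
    return levels


levels = run_staircase()
tracked = levels[5000:]          # discard the initial approach to threshold
mean_level = sum(tracked) / len(tracked)
print(round(target_proportion(), 3))   # 0.707
print(listener(mean_level))            # proportion correct at the average tracked level
```

In practice a difference limen is usually estimated from the last few reversals of a much shorter track; the long run here only serves to show where the rule converges.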
Independent contributions of amplitude
and frequency modulations to auditory perception. II. Melody,
tone, and speaker identification.
Ying-Yee Kong, Michael Vongphoe and Fan-Gang Zeng
University of California, Irvine
Irvine, CA 92697-1275
In a companion paper, we showed that amplitude modulation
provides sufficient information for speech recognition in
quiet, but additional frequency modulation is needed in noise.
Here we evaluated relative contributions of amplitude and
frequency modulations to melody, tone, and speaker identification.
Twelve familiar melodies were generated with or without tempo
information. Twenty-five Mandarin syllables, each having 4
tonal variations, were produced by a male and a female talker.
Six vowel tokens (3 used for training and 3 used for testing)
produced by 3 males, 3 females, 2 boys, and 2 girls were used
for speaker identification. Stimuli were processed to extract
slowly varying amplitude and frequency modulations from a
number of frequency bands (1-64 bands). Melody and speaker
identifications were conducted in both normal-hearing and
cochlear-implant listeners, whereas tone identification was
conducted in normal listeners only. Results showed that amplitude
modulation only (i.e., 1 band) produced about 80% correct
performance for melody identification with tempo and also
for tone identification in quiet. However, for melody identification
without tempo and for tone identification in noise (0 dB S/N),
the performance dropped to about 40% even with 8 frequency
bands. Similarly, listeners could recognize most of the vowels
but could not identify the speakers. When frequency modulation
was added, performance was restored to a level similar to
the unprocessed stimuli. These results suggest that amplitude
and frequency modulations independently contribute to auditory
perception, with amplitude modulation contributing gross temporal
information and frequency modulation contributing detailed
spectral information for accurate pitch perception and signal-and-noise
separation.
Clear Speech Perception in Normal-Hearing
and Cochlear-Implant Listeners
Sheng Liu1, Elsa DelRio1, Ann R. Bradlow2, Fan-Gang Zeng1
1Hearing and Speech Research Laboratory, University of California,
Irvine.
2Department of Linguistics, Northwestern University
Abstract
Previous studies have demonstrated that when instructed to
speak clearly to people with hearing loss, talkers can produce
clear speech, which has significantly higher intelligibility
in noise than conversational speech. Here we measured
clear and conversational speech perception at various signal-to-noise
ratios covering a range over which intelligibility
increased from about 0% to 100%. Stimuli consisted of ten
sets of BKB sentences produced by a male and a female talker
in clear and conversational speech. Speech-spectrum-shaped
noise was used to produce the different signal-to-noise ratios.
Real cochlear implant users and cochlear implant simulations
were also tested to measure the contribution of temporal envelope
cues to the clear speech advantage. A sigmoid function was
used to fit the measured data, producing 2 parameters indicative
of the speech reception threshold (i.e., the signal-to-noise
ratio at which 50% intelligibility was achieved) and the slope
of the psychometric function. We found that the speech reception
threshold was 9.1 dB for clear speech and 6.3
dB for conversational speech in normal listeners, and was
correspondingly 4.6 dB and 1.0 dB in implant listeners.
The differences in speech reception threshold translated into
about 15 percentage points in improved intelligibility scores
for normal listeners and about 20 percentage points for implant
listeners. Cochlear implant simulation produced results similar
to those obtained in real implant listeners. The present results
confirmed and extended previous findings in normal listeners.
In addition, the implant and its simulation data suggest a
direct contribution of temporal envelope cues to the intelligibility
advantage of clear speech over conversational speech. Further
analysis of temporal envelope cues in clear speech should
yield results that are important not only for understanding
mechanisms of speech perception but also for developing novel
processing algorithms in auditory prostheses.
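
The two-parameter sigmoid fit described above can be sketched as follows. The abstract does not give the exact form of the function, so a logistic curve is assumed here, parameterized so that `srt_db` is the 50% point and `slope` is the change in proportion correct per dB at that point. The grid-search fit and the data points are purely illustrative: the scores are synthetic, generated from known parameters, and are not data from the study.

```python
import math


def psychometric(snr_db, srt_db, slope):
    """Logistic sigmoid: 0.5 at srt_db, with `slope` (proportion per dB) there."""
    return 1.0 / (1.0 + math.exp(-4.0 * slope * (snr_db - srt_db)))


def fit_sigmoid(snrs, scores):
    """Least-squares grid search over SRT (0.1-dB steps) and slope (0.01 steps)."""
    best = None
    for i in range(-200, 201):
        srt = i / 10.0
        for j in range(1, 31):
            slope = j / 100.0
            err = sum((psychometric(x, srt, slope) - y) ** 2
                      for x, y in zip(snrs, scores))
            if best is None or err < best[0]:
                best = (err, srt, slope)
    return best[1], best[2]


# Synthetic scores generated from known parameters (not data from the study).
snrs = [-15, -10, -5, 0, 5, 10]
scores = [psychometric(x, -2.0, 0.10) for x in snrs]
srt, slope = fit_sigmoid(snrs, scores)
print(srt, slope)  # -2.0 0.1
```

Near the midpoint of such a curve, a shift of d dB in speech reception threshold corresponds to roughly slope times d in proportion correct, which is how a threshold difference can be read as an intelligibility difference.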
Independent contributions of amplitude
and frequency modulations to auditory perception. I. Consonant,
vowel, and sentence recognition.
Kaibao Nie, Ginger Stickney, and
Fan-Gang Zeng
University of California, Irvine
Irvine, CA 92697-1275
Previous studies have demonstrated that one can understand
speech with primarily either temporal or spectral cues. However,
it is not clear why both cues are present in natural sounds
and how they are processed in the auditory system. Here we
developed a signal processing strategy that independently
extracted slowly varying amplitude and frequency modulations
within each frequency band, with the number of bands as an independent
variable. Normal-hearing listeners were presented with original
speech sounds and processed sounds including amplitude modulation
only and both amplitude and frequency modulations. The speech
materials were vowels, consonants, and IEEE sentences, presented
in quiet, in speech-shaped noise, or with a single competing talker.
The addition of frequency modulation significantly improved
speech recognition over amplitude modulation alone, particularly
with fewer frequency bands and in challenging noise conditions.
For example, the average vowel recognition score with amplitude
modulation only and four frequency bands was 62%, 38%, and
32% for quiet, 0, and 5 dB signal-to-noise ratio conditions,
respectively. With the addition of frequency modulation, the
corresponding score improved to 75%, 63%, and 52%, respectively.
Similarly, the addition of frequency modulation improved sentence
recognition by about 50 percentage points from 20% in the
presence of a competing talker. These results suggest that,
while amplitude modulation provides essential information
for speech recognition in quiet, frequency modulation can
enhance speech recognition by allowing the listener to extract
signal from noise. Further results on tonal language, music,
and speaker identification will be presented in a companion
paper. These results are relevant to the design of cochlear implants
and audio coding strategies.
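
As a rough illustration of extracting amplitude and frequency modulations from one band, the sketch below uses the analytic signal: its magnitude gives the envelope (AM) and its sample-to-sample phase advance gives the instantaneous frequency (FM). This is one common way to do it and is an assumption here; the abstract does not specify the algorithm. The band-pass filter bank and the low-pass smoothing that would make the modulations "slowly varying" are omitted, and the test signal and all of its parameters are invented for the example.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal via a naive O(N^2) DFT; adequate for short examples."""
    n = len(x)
    spec = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0           # keep DC and Nyquist unchanged
        elif k < (n + 1) // 2:
            gain = 2.0           # double positive frequencies
        else:
            gain = 0.0           # remove negative frequencies
        spec[k] *= gain
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def am_fm(x, fs):
    """Return the envelope (AM) and instantaneous frequency in Hz (FM)."""
    z = analytic_signal(x)
    am = [abs(v) for v in z]
    fm = [cmath.phase(z[t + 1] * z[t].conjugate()) * fs / (2.0 * math.pi)
          for t in range(len(z) - 1)]
    return am, fm


# Toy band signal: 1000-Hz carrier with 62.5-Hz AM and 31.25-Hz FM (all assumed).
fs, n = 8000, 256
true_am = [1.0 + 0.5 * math.sin(2 * math.pi * 62.5 * t / fs) for t in range(n)]
phase = [2 * math.pi * 1000 * t / fs + 2.0 * math.sin(2 * math.pi * 31.25 * t / fs)
         for t in range(n)]
x = [a * math.cos(p) for a, p in zip(true_am, phase)]
am, fm = am_fm(x, fs)
print(min(fm), max(fm))  # instantaneous frequency swings around the 1000-Hz carrier
```

An amplitude-modulation-only stimulus would keep `am` and replace the carrier with a fixed tone or noise, while an AM-plus-FM stimulus would also let the carrier frequency follow `fm`.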
Facts and Artifacts in Auditory
Chimaeras
Fan-Gang Zeng, Kai-Bao Nie, Ginger Stickney,
Sheng Liu, Elsa Del Rio, Ying-Yee Kong, Hong-Bin Chen
Hearing and Speech Research Laboratory,
University of California, Irvine, California 92697-1275, USA
Smith, Delgutte and Oxenham (Nature, 416:87-90, 2002)
produced auditory chimaeras by systematically
mixing one sound's temporal envelope with another sound's
fine temporal structure as a function of frequency bands (1-64).
They found that the envelope is most important for speech
reception, and the fine structure is most important for pitch
perception and sound localization. Here we identified
two technical problems that one should be aware of when interpreting
results derived from auditory chimaeras. First, one should
be aware of the ear's natural ability to recover the
narrow-band envelope with the broad-band processing for a
small number of frequency bands (e.g., 1 and 2). Second, one
should be concerned about filter artifacts with the narrow-band
processing for a large number of bands (e.g., 32, 48, and
64). In addition, we conducted two experiments to challenge
Smith et al.'s assertion regarding the envelope and fine
structure as the acoustic basis for the "what" and
"where" mechanisms. In one experiment, we used Smith
et al.'s program to chimaerize two sentences that had
either a 15-dB interaural level difference or realistic interaural
differences through HRTF filters. Under these conditions,
we found that it was the envelope, rather than the fine structure,
that determines sound localization. In another experiment,
we performed classic filtering manipulation on the chimaerized
sounds with 16 bands and a 700-µs delayed envelope or fine
structure. With a low-pass filter having an 800-Hz cutoff
frequency, we found that one could lateralize the sound to
the side with leading fine structure but could not recognize
speech. Conversely, with a high-pass filter having an 800-Hz
cutoff frequency, one could easily recognize speech but could
not lateralize the sound. This result suggests that the dichotomy
revealed by the auditory chimaeras is an epiphenomenon of
classic duplex perception between low- and high-frequency
pathways.
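
For readers unfamiliar with the construction, a single-band chimaera can be sketched as below: the envelope of one sound multiplies the fine structure (cosine of the analytic-signal phase) of another. This is a simplified illustration under stated assumptions, not Smith et al.'s program. A full implementation would first split both sounds into matched frequency bands and sum the per-band chimaeras, and the two test sounds here are invented tones.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal via a naive O(N^2) DFT; adequate for short examples."""
    n = len(x)
    spec = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0           # keep DC and Nyquist unchanged
        elif k < (n + 1) // 2:
            gain = 2.0           # double positive frequencies
        else:
            gain = 0.0           # remove negative frequencies
        spec[k] *= gain
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def chimaera(env_source, fine_source):
    """Single-band chimaera: envelope of one sound, fine structure of the other."""
    za = analytic_signal(env_source)
    zb = analytic_signal(fine_source)
    return [abs(a) * math.cos(cmath.phase(b)) for a, b in zip(za, zb)]


# Invented test sounds: an amplitude-modulated 500-Hz tone and a steady 1000-Hz tone.
fs, n = 8000, 256
envelope = [1.0 + 0.5 * math.sin(2 * math.pi * 62.5 * t / fs) for t in range(n)]
sound_a = [e * math.cos(2 * math.pi * 500 * t / fs) for t, e in enumerate(envelope)]
sound_b = [math.cos(2 * math.pi * 1000 * t / fs) for t in range(n)]
mix = chimaera(sound_a, sound_b)   # 1000-Hz fine structure carrying sound_a's envelope
```

With a single broad band, as in this sketch, the envelope of the mix can be partly recovered from the fine structure after cochlear filtering, which is the first caveat raised in the abstract.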