Why is studying speech acoustics important?
|
-It is an objective, physical measure of speech disorders.
-It can be used to identify specific parameters of disordered speech.
-It is non-invasive, inexpensive, and readily available. |
|
What is the Source Filter Model of speech acoustics?
|
Source X Filter X Radiation = Output
|
|
What are Source effects?
|
Source includes...
-glottal vibration and harmonics
-these have a low-frequency emphasis, with a -12 dB/octave slope |
|
What are radiation effects?
|
Radiation is...
-the radiation of sound at the lips, which adds a +6 dB/octave slope |
|
What is the total output of voice at the lips?
|
(source effects) + (radiation effects) = total output
(-12 dB/octave) + (+6 dB/octave) = -6 dB/octave |
|
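The slope arithmetic on this card can be checked with a short sketch. It assumes only the two slopes given on the cards; the function name and the 100 Hz to 800 Hz example range are illustrative choices, not part of the card set.

```python
import math

SOURCE_SLOPE_DB_PER_OCTAVE = -12.0    # glottal source slope, from the card
RADIATION_SLOPE_DB_PER_OCTAVE = 6.0   # lip radiation slope, from the card

def level_change_db(f_low_hz, f_high_hz, slope_db_per_octave):
    """Level change between two frequencies for a given dB/octave slope."""
    octaves = math.log2(f_high_hz / f_low_hz)  # one octave = doubling of frequency
    return slope_db_per_octave * octaves

# Slopes expressed in dB add, which is why the card sums them.
net_slope = SOURCE_SLOPE_DB_PER_OCTAVE + RADIATION_SLOPE_DB_PER_OCTAVE

# Hypothetical example: 100 Hz to 800 Hz spans three octaves.
change_100_to_800 = level_change_db(100.0, 800.0, net_slope)
print(net_slope, change_100_to_800)
```

With a net slope of -6 dB/octave, a component three octaves higher is expected to be about 18 dB weaker before any vocal tract filtering is applied.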
What are Filter effects?
|
-the vocal tract acts as a resonating tube and filters/reshapes the source function of speech
-the resonance of the vocal tract is represented by FORMANTS--not harmonics |
|
What do formants depend upon?
|
1) Vocal tract length
2) Cross-sectional shape of the vocal tract |
|
How does vocal tract length impact formants?
|
-The vocal tract acts as a tube, and tube length determines its formant frequencies: the longer the tube, the lower the formants.
|
|
Vocal tract length: what is the "odd quarter wavelength relationship"?
|
-ODD: a uniform tube that is closed at one end will have formants at odd multiples of its lowest resonance frequency (e.g. 500 x 1, 500 x 3, 500 x 5, etc.)
-QUARTER: the lowest resonance frequency is the first formant, and it corresponds to a wavelength that is 4 times the length of the tube. In other words, the tube is 1/4 of the wavelength of the first formant |
|
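The odd quarter wavelength relationship can be written as F_n = (2n - 1) x c / (4 x L). A minimal sketch, assuming a speed of sound of 35,000 cm/s and a 17.5 cm tube (both values are assumptions chosen to reproduce the 500 Hz example on the card):

```python
SPEED_OF_SOUND_CM_PER_S = 35000.0  # assumed value for warm, moist air

def tube_resonances_hz(length_cm, count=3, c=SPEED_OF_SOUND_CM_PER_S):
    """Resonances of a uniform tube closed at one end and open at the other."""
    # (2n - 1) gives the odd multiples; 4 * length is the quarter wavelength rule.
    return [(2 * n - 1) * c / (4.0 * length_cm) for n in range(1, count + 1)]

formants = tube_resonances_hz(17.5)      # assumed adult vocal tract length
short_tube = tube_resonances_hz(10.0)    # hypothetical shorter tract
print(formants)
print(short_tube)
```

A shorter tube gives higher resonances at every odd multiple, which is the length effect described on the previous card.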
Why does the quarter wavelength relationship work best?
|
-a tube resonates best when there is a volume velocity maximum at the open end and a minimum at the closed end
-the quarter wavelength satisfies these 2 conditions |
|
How does the cross-sectional shape of the vocal tract impact formants?
|
1) constricting a tube at a volume velocity maximum causes a reduction in the frequency of the related formant
2) constricting a tube at a volume velocity minimum causes an increase in the frequency of the related formant |
|
What are the 3 characteristics of Nasal Resonance?
|
1) Nasal formant: lower first formant around 300 Hz
2) Dampening of the rest of the formants
3) Antiformants or zeros that appear as large valleys (approaching 0) in the spectrum on spectrograms: /m/ = 750-1250 Hz; /n/ = 1450-2200 Hz; /eng/ = above 3000 Hz |
|
What is Turbulent Noise?
|
This is a characteristic of FRICATIVES, AFFRICATES AND STOPS.
-air passing through a narrow constriction generates noise
-this stream of air generates eddies (irregular rotations of air pressure) whose fluctuations are referred to as TURBULENCE. |
|
What is Reynolds number?
|
Reynolds number: a dimensionless measure of airflow that rises with flow velocity and constriction size; turbulence occurs once it exceeds a critical value.
-1800 is the critical value for turbulent speech noise |
|
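A rough sketch of how the critical value could be applied, using the standard fluid-dynamics form Re = v x h / nu. The kinematic viscosity of air and both example flows are assumed, illustrative numbers; only the 1800 threshold comes from the card.

```python
CRITICAL_REYNOLDS = 1800.0             # critical value from the card
AIR_KINEMATIC_VISCOSITY_CM2_S = 0.15   # assumed, approximate room-temperature value

def reynolds_number(velocity_cm_s, dimension_cm, nu=AIR_KINEMATIC_VISCOSITY_CM2_S):
    """Re = flow velocity * characteristic dimension / kinematic viscosity."""
    return velocity_cm_s * dimension_cm / nu

def is_turbulent(velocity_cm_s, dimension_cm):
    return reynolds_number(velocity_cm_s, dimension_cm) > CRITICAL_REYNOLDS

# Hypothetical flows: fast air through a narrow constriction vs. slow air
# through an open tract. These are not measured values.
fast_narrow = reynolds_number(2000.0, 0.3)
slow_open = reynolds_number(100.0, 1.0)
print(fast_narrow, is_turbulent(2000.0, 0.3))
print(slow_open, is_turbulent(100.0, 1.0))
```

Under these assumptions the fast, narrow flow lands above the threshold and the slow, open flow stays below it.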
What things affect fricative resonance?
|
-shape of oral cavity: for /s/ the resonator acts like a very short tube and there is higher resonance.
-shorter tube = higher F1
-more posterior place of constriction = decreased resonance
-/sh/ has a lower F1 than /s/ |
|
What are the 4 main considerations related to digitizing speech?
|
1) Analog vs. digital, 2) sampling rate, 3) Filtering, 4) Quantization
|
|
What's the difference between analog and digital?
|
Analog=continuous, infinite recording
Digital=discrete, samples at certain intervals |
|
What is sampling rate and why is it important?
|
Sampling rate: Sampling chops up the analog signal into discrete time intervals.
-HIGHER = BETTER
-You must choose a sampling rate that is at least 2 times the highest frequency of interest.
-Recommended sampling rate is 20,000 Hz or MORE |
|
What is low-pass filtering?
|
Low-pass filtering: retains all frequencies below the filter frequency (i.e. higher frequencies are removed from the signal).
|
|
What is the Nyquist rule?
|
Nyquist rule: Before you digitize, low-pass filter your signal at 1/2 the sampling frequency
-this is to prevent aliasing (i.e. the introduction of false frequencies, or "ghost signals", into your signal) |
|
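The aliasing described above can be illustrated with a little arithmetic: any component above half the sampling rate folds back into the range below it. This is a sketch of the frequency bookkeeping only (no actual filtering is done), and the function names are made up for illustration.

```python
def minimum_sampling_rate_hz(highest_frequency_hz):
    """Sampling rate must be at least twice the highest frequency of interest."""
    return 2.0 * highest_frequency_hz

def alias_frequency_hz(tone_hz, sampling_rate_hz):
    """Apparent frequency of a pure tone after sampling without an anti-alias filter."""
    folded = tone_hz % sampling_rate_hz
    if folded > sampling_rate_hz / 2.0:
        folded = sampling_rate_hz - folded  # fold back around the Nyquist frequency
    return folded

rate = 20000.0                               # recommended minimum from the cards
below = alias_frequency_hz(4000.0, rate)     # below 10,000 Hz: unchanged
ghost = alias_frequency_hz(15000.0, rate)    # above 10,000 Hz: shows up as a ghost
print(below, ghost)
```

A 15,000 Hz component sampled at 20,000 Hz appears at 5,000 Hz, which is why the signal is low-pass filtered at half the sampling rate before digitizing.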
What is Quantizing?
|
Quantizing: the coding of the signal's amplitude at each point in time
-based on a computer's binary coding (0 or 1)
-each binary level is referred to as a "bit"
-the more "bits", the better
-AT LEAST 12 BITS RECOMMENDED, BUT 24 IS BEST |
|
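Why more bits are better can be seen by counting amplitude levels: each extra bit doubles them. The sketch below assumes a simple uniform quantizer over amplitudes from -1 to 1; that scheme, the function names, and the 0.3 sample value are illustrative assumptions rather than a description of any particular recording system.

```python
import math

def quantization_levels(bits):
    """Number of distinct amplitude codes available with a given bit depth."""
    return 2 ** bits

def dynamic_range_db(bits):
    """Approximate ratio of largest to smallest step, in dB (about 6 dB per bit)."""
    return 20.0 * math.log10(quantization_levels(bits))

def quantize(sample, bits):
    """Round a sample in [-1, 1] to the nearest level of a uniform quantizer."""
    step = 2.0 / quantization_levels(bits)
    return round(sample / step) * step

levels_12 = quantization_levels(12)
levels_24 = quantization_levels(24)
error_12 = abs(quantize(0.3, 12) - 0.3)
error_24 = abs(quantize(0.3, 24) - 0.3)
print(levels_12, levels_24, dynamic_range_db(12), dynamic_range_db(24))
```

With 12 bits the rounding error is bounded by half of a step of 2/4096; with 24 bits the step is 4096 times smaller.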
What is an oscillogram and what does it show?
|
Oscillogram: amplitude vs. time
SHOWS:
-VOT
-segment durations
-F0 (pitch) |
|
What is a spectrogram and what does it show?
|
Spectrogram: frequency vs. time
WIDE BAND: like a bank of 300 Hz parallel filters; good for resolving formant frequencies
SHOWS:
-VOT
-segment durations
-fricative/stop frequency
-formant frequency
-formant contours
NARROW BAND: 45 Hz filter; good for resolving harmonics and F0
SHOWS:
-F0 (pitch)
-F0 contours
-Harmonics |
|
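One way to see why the two filter widths show different things: harmonics are spaced F0 apart, so a filter can only separate them if it is narrower than F0. The sketch below uses that comparison plus a rough reciprocal rule of thumb for analysis window length; the rule of thumb, the 120 Hz F0, and the function names are all assumptions for illustration.

```python
WIDE_BAND_HZ = 300.0    # from the card
NARROW_BAND_HZ = 45.0   # from the card

def resolves_harmonics(bandwidth_hz, f0_hz):
    """Harmonics sit F0 apart, so the filter must be narrower than F0 to separate them."""
    return bandwidth_hz < f0_hz

def approx_window_ms(bandwidth_hz):
    """Rough reciprocal rule of thumb; the exact value depends on window shape."""
    return 1000.0 / bandwidth_hz

f0_hz = 120.0  # hypothetical speaking F0
narrow_ok = resolves_harmonics(NARROW_BAND_HZ, f0_hz)
wide_ok = resolves_harmonics(WIDE_BAND_HZ, f0_hz)
print(narrow_ok, wide_ok)
print(approx_window_ms(WIDE_BAND_HZ), approx_window_ms(NARROW_BAND_HZ))
```

The narrow band filter separates harmonics but needs a longer stretch of signal, while the wide band filter smears harmonics together into formant bands and keeps finer timing detail, which fits its use for VOT and segment durations.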
What is formant tracking and what does it show us?
|
Formant tracking: the red dots we see on spectrograms, usually based on spectral peaks
SHOWS:
-Fricative/stop frequency
-formant frequency
-formant contours |
|
What are spectra and what can they show us?
|
Spectrum: amplitude vs. frequency over a single slice in time
Fast Fourier Transform (FFT): more detailed
SHOWS:
-fricative/stop frequency
-formant frequency
-F0 (pitch)
-harmonics
Linear Predictive Coding (LPC): less susceptible to noise but less detailed
SHOWS:
-fricative/stop frequency
-formant frequency |
|
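To make "amplitude vs. frequency over a single slice in time" concrete, here is a minimal sketch that computes a spectrum with a plain discrete Fourier transform, which is the quantity an FFT computes more quickly. The synthetic signal (harmonics at 100, 200, and 300 Hz), the 8000 Hz sampling rate, and the 80-sample slice are assumptions chosen so each harmonic falls exactly on an analysis bin.

```python
import cmath
import math

SAMPLE_RATE_HZ = 8000
N_SAMPLES = 80                                          # one 10 ms slice of signal
HARMONICS = [(100.0, 1.0), (200.0, 0.5), (300.0, 0.25)]  # (frequency, amplitude)

def dft_magnitudes(samples):
    """Magnitude spectrum from 0 Hz up to half the sampling rate."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        total = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        magnitudes.append(abs(total))
    return magnitudes

signal = [sum(a * math.sin(2 * math.pi * f * t / SAMPLE_RATE_HZ)
              for f, a in HARMONICS)
          for t in range(N_SAMPLES)]

magnitudes = dft_magnitudes(signal)
bin_width_hz = SAMPLE_RATE_HZ / N_SAMPLES               # spacing between bins
peak_bin = max(range(1, len(magnitudes)), key=lambda k: magnitudes[k])
peak_hz = peak_bin * bin_width_hz
print(peak_hz, [round(m, 2) for m in magnitudes[:5]])
```

The strongest peak falls at the 100 Hz bin (F0 for this synthetic signal), with progressively weaker peaks at the 200 Hz and 300 Hz harmonics.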
What are some VOICE MEASURES for dysarthric speech?
|
1. Jitter
2. Shimmer
3. Harmonic-to-noise ratio
4. Voice breaks (spasms)
5. Tremor of F0 and intensity |
|
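Jitter and shimmer are both cycle-to-cycle perturbation measures: jitter for period (frequency) and shimmer for amplitude. A minimal sketch, assuming the common "local" definition (mean absolute difference between consecutive cycles divided by the mean, as a percentage). The cards do not specify a formula, and the period and amplitude values are made-up illustrative data.

```python
def relative_perturbation_percent(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean value."""
    differences = [abs(a - b) for a, b in zip(values, values[1:])]
    mean_difference = sum(differences) / len(differences)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_difference / mean_value

# Hypothetical glottal cycle measurements (not real patient data).
periods_ms = [10.0, 10.2, 9.9, 10.1, 10.0]       # duration of each cycle
peak_amplitudes = [0.80, 0.84, 0.78, 0.82, 0.80]  # peak amplitude of each cycle

jitter_percent = relative_perturbation_percent(periods_ms)
shimmer_percent = relative_perturbation_percent(peak_amplitudes)
steady_percent = relative_perturbation_percent([10.0, 10.0, 10.0])
print(jitter_percent, shimmer_percent, steady_percent)
```

A perfectly regular voice would score 0 on both measures; larger values indicate more cycle-to-cycle irregularity.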
What are vowel formants like for individuals with dysarthria?
|
1. F1 range: restrictions in elevation and/or lowering
2. F2 range: restrictions in retraction and/or advancement
3. Centralized F1 & F2: formants converge toward a central schwa
4. F1 & F2 variability: variable formants, especially during prolonged vowels; irregular (e.g. chorea/dystonia) or regular variation (e.g. tremor)
5. Diphthong formants restricted and/or slow (reduced F2 slope) |
|
What are stops like for individuals with dysarthria?
|
1. Spirantization: prolonged fricative-like noise replaces transient burst
2. Voicing throughout closure: continuous voicing replaces stop gap
3. Nasalized stops (e.g. /b/ -> /m/)
4. VOT variability
5. Maximum plosive repetition rate reduced and/or variable |
|
What are fricatives like for individuals with dysarthria?
|
1. Fricative noise is weak or absent
2. Peak and average frequency of ‘s’ and/or ‘sh’ is abnormal
3. Difference between average frequency of ‘s’ and ‘sh’ is restricted |
|
What are some notable prosodic measures for individuals with dysarthria?
|
1. Abnormal utterance durations (fast or slow rate of speech; words per minute)
2. Difference in duration of stressed and unstressed syllables is restricted (scanning speech)
3. Reduced average utterance intensity (quiet/hypophonic speech)
4. Reduced intensity variability across sentences (monoloudness)
5. Reduced fundamental frequency variability across sentences AND reduced declination of F0 across sentences (monopitch) |