ECE 769 - SPECIAL TOPICS IN SIGNAL PROCESSING:
SPEECH AND AUDIO PROCESSING
COURSE OBJECTIVES:
To provide an introduction to basic concepts and methodologies for the
analysis, modeling, synthesis and coding of speech and music. To provide a
foundation for developing applications and for further study in the field.
To introduce software tools for the analysis and manipulation of speech and
music, and to gain practical experience in the design and implementation of
speech and music processing algorithms.
INSTRUCTOR:
Dr. Ian Bruce, CRL 229, Ext. 26984
ibruce@mail.ece.mcmaster.ca
PREREQUISITE: Senior undergraduate or graduate-level DSP course
TEXTBOOK:
- Quatieri, Discrete-Time Speech Signal Processing: Principles and Practice, 2001, Prentice Hall
REFERENCES:
- Morgan and Gold, Speech and Audio Signal Processing: Processing and Perception of Speech and Music, 1999, John Wiley & Sons
DETAILED COURSE OUTLINE:
Introduction to the Production and Classification of Speech Sounds (1 lecture)
- Introduction to the Speech Communication Pathway
- Anatomy and Physiology of Speech Production
- Spectrographic Analysis of Speech
- Categorization of Speech Sounds
- Prosody
- Speech Perception
Acoustics of Music Production (1 lecture)
- Physics of Sound
- Vibration of strings
- Resonance of tubes
- Pitch and Timbre
Acoustics of Speech Production (1 lecture)
- Uniform Tube Model
- Discrete-Time Modelling Based on Tube Concatenation
- Vocal Fold/Vocal Tract Interaction
Room Acoustics and Digital Effects (1 lecture)
- Sound Waves in Rooms
- Room Acoustics as a Component in Speech Systems
- Digital Effects
Analysis and Synthesis of Pole-Zero Speech Models (1 lecture)
- Time-Dependent Processing
- All-Pole Modeling of Deterministic Signals
- Linear Prediction Analysis of Stochastic Speech Sounds
- Synthesis Based on All-Pole Modeling
- Pole-Zero Estimation
Short-Time Fourier Transform Analysis and Synthesis (1 lecture)
- Short-Time Analysis and Synthesis
- Signal Estimation from the Modified STFT
- Time-Scale Modification and Enhancement of Speech
Filter-Bank Analysis/Synthesis (1 lecture)
- Phase Vocoder
- Constant-Q Analysis/Synthesis
- Auditory Modeling
Sinusoidal Analysis/Synthesis (1 lecture)
- Sinusoidal Speech Model
- Estimation of Sinewave Parameters
- Synthesis
- Source/Filter Phase Model
- Additive Deterministic-Stochastic Model
Homomorphic Signal Processing (2 lectures)
- Homomorphic Systems for Convolution
- Complex Cepstrum of Speech-Like Sequences
- Spectral Root Homomorphic Filtering
- Short-Time Homomorphic Analysis of Periodic Sequences
- Short-Time Speech Analysis
- Analysis/Synthesis Structures
Speech Coding (2 lectures)
- Statistical Models of Speech
- Scalar Quantization
- Vector Quantization (VQ)
- Frequency-Domain Coding
- Model-Based Coding
- LPC Residual Coding
(Total Course = 12 x 3-hour lectures)
ASSESSMENT:
Assignments (20%); Project and presentation (30%); Midterm (20%); Final (30%).
TERM: