ECE 769 - Speech and Audio Processing

ECE 769 - SPECIAL TOPICS IN SIGNAL PROCESSING:
SPEECH AND AUDIO PROCESSING

COURSE OBJECTIVES:

To provide an introduction to basic concepts and methodologies for the analysis, modeling, synthesis and coding of speech and music. To provide a foundation for developing applications and for further study in the field. To introduce software tools for the analysis and manipulation of speech and music, and to give students practical experience in the design and implementation of speech and music processing algorithms.

INSTRUCTOR: Dr. Ian Bruce
CRL 229, Ext. 26984
ibruce@mail.ece.mcmaster.ca

PREREQUISITE: Senior undergraduate- or graduate-level DSP course

TEXTBOOK:

Quatieri, Discrete-Time Speech Signal Processing: Principles and Practice, 2001, Prentice Hall

REFERENCES:

Gold and Morgan, Speech and Audio Signal Processing: Processing and Perception of Speech and Music, 1999, John Wiley & Sons

DETAILED COURSE OUTLINE:

Introduction to the Production and Classification of Speech Sounds (1 lecture)

Introduction to the Speech Communication Pathway
Anatomy and Physiology of Speech Production
Spectrographic Analysis of Speech
Categorization of Speech Sounds
Prosody
Speech Perception

Acoustics of Music Production (1 lecture)

Physics of Sound
Vibration of strings
Resonance of tubes
Pitch and Timbre

Acoustics of Speech Production (1 lecture)

Uniform Tube Model
Discrete-Time Modeling Based on Tube Concatenation
Vocal Fold/Vocal Tract Interaction

Room Acoustics and Digital Effects (1 lecture)

Sound Waves in Rooms
Room Acoustics as a Component in Speech Systems
Digital Effects

Analysis and Synthesis of Pole-Zero Speech Models (1 lecture)

Time-Dependent Processing
All-Pole Modeling of Deterministic Signals
Linear Prediction Analysis of Stochastic Speech Sounds
Synthesis Based on All-Pole Modeling
Pole-Zero Estimation

Short-Time Fourier Transform Analysis and Synthesis (1 lecture)

Short-Time Analysis and Synthesis
Signal Estimation from the Modified STFT
Time-Scale Modification and Enhancement of Speech

Filter-Bank Analysis/Synthesis (1 lecture)

Phase Vocoder
Constant-Q Analysis/Synthesis
Auditory Modeling

Sinusoidal Analysis/Synthesis (1 lecture)

Sinusoidal Speech Model
Estimation of Sinewave Parameters
Synthesis
Source/Filter Phase Model
Additive Deterministic-Stochastic Model

Homomorphic Signal Processing (2 lectures)

Homomorphic Systems for Convolution
Complex Cepstrum of Speech-Like Sequences
Spectral Root Homomorphic Filtering
Short-Time Homomorphic Analysis of Periodic Sequences
Short-Time Speech Analysis
Analysis/Synthesis Structures

Speech Coding (2 lectures)

Statistical Models of Speech
Scalar Quantization
Vector Quantization (VQ)
Frequency-Domain Coding
Model-Based Coding
LPC Residual Coding

(Total Course = 12 x 3-hour lectures)

ASSESSMENT:

Assignments (20%); Project and presentation (30%); Midterm (20%); Final (30%).

TERM:

II.