E6620 Applied Signal Recognition (2019)

Instructor: Prof. Homayoon Beigi

Textbook: H. Beigi, "Fundamentals of Speaker Recognition," Springer, New York, 2011.

Grading:

Midterm (40%):
- Implementation of a signal recognition homework project, including data selection/acquisition.
- 2-page extended abstract describing the results and proposing modifications to one specific part of the engine to increase performance (accuracy, speed, or both).
- 10-minute presentation of the above.
- Scripts and results.

Final (60%):
- Discussion and implementation of an improvement in one aspect of the signal recognition engine.
- 6-page IEEE conference-style paper describing the system and the results obtained from the modification.
- Code and results.
- 10-minute presentation of the results.

Course Description:

Applied Signal Recognition is a comprehensive course covering all aspects of Signal Recognition, from theory to practice. The following topics are covered in some detail: Time and Spatial Signals (such as Audio, Image, and Vibration signals), Signal Representation, Signal Processing and Feature Extraction, Probability Theory and Statistics, Information Theory, Metrics and Divergences, Decision Theory, Parameter Estimation, Clustering and Learning, Transformation, Hidden Markov Modeling, Search Techniques, Deep Neural Networks, Support Vector Machines, and other recent machine learning techniques used in signal recognition. Applications in Machine and Structural Health Analysis/Prognosis, Object Detection and Recognition, Audio Event Detection, Multimodal Analysis, Image Recognition, and Video Analysis are also covered in detail. Several open source software packages are introduced, with detailed hands-on projects using Kaldi, Darknet, and Caffe to produce a fully functional signal recognition engine. The lectures cover the theoretical aspects as well as practical coding techniques. The course is graded based on a project.
The midterm (40% of the grade) is in the form of a two-page proposal for the project, and the final (60% of the grade) is an oral presentation of the project plus a 6-page conference-style paper describing the results of the research project. The instructor uses his own textbook for the course: Homayoon Beigi, "Fundamentals of Speaker Recognition," Springer-Verlag, New York, 2011. Every week, the slides of the lecture are made available to the students.

Topics to be covered:

- Introduction (Overview of Speaker Recognition and its history)
- Audio, Image, Vibration, and Brain-Wave signals; applications include human biometrics, imaging, geophysics, machinery, electronics, networking, languages, communications, and finance
- Signal Representation of time-dependent signals
    Sampling, Quantization and Amplitude Errors
    Practical Sampling and Associated Errors
- Signal Processing of time-dependent signals and Feature Extraction
    The Sampling Process
    Integral Transforms
    Spectral Analysis and Direct Method Features
    Linear Predictive Cepstral Coefficients (LPCC)
    Perceptual Linear Predictive (PLP) Analysis
    Alternative Cepstral-Based Features
    Other Features
    Signal Enhancement and Pre-Processing
- Audio, Image, Video, Vibration, and Natural Language Processing
    Audio Event Detection
    Machine Health Analysis/Prognosis
    Structural Health Analysis/Prognosis
    Object and Face Detection and Recognition
    Emotion Analysis
    Multimodal Corpora
    Natural Language Processing
- Recognition Software
    Creating a complete Recognition System
    Training
    Kaldi
    Darknet
    Caffe
- Probability Theory and Statistics
    Set Theory
    Measure Theory
    Probability Measure
    Integration
    Functions
    Statistical Moments
    Discrete Random Variables
    Moment Estimation
    Multi-Variate Normal Distribution
- Information Theory
    Sources
    The Relation between Uncertainty and Choice
    Discrete Sources
    Discrete Channels
    Continuous Sources
    Relative Entropy
    Fisher Information
- Metrics and Divergences
    Distance (Metric)
    Divergences and Directed Divergences
- Decision Theory
    Hypothesis Testing
    Bayesian Decision Theory
    Bayesian Classifier
    Decision Trees
- Parameter Estimation
    Maximum Likelihood Estimation
    Maximum A-Posteriori (MAP) Estimation
    Maximum Entropy Estimation
    Minimum Relative Entropy Estimation
    Maximum Mutual Information Estimation (MMIE)
    Model Selection (AIC and BIC)
- Unsupervised Clustering and Learning
    Vector Quantization (VQ)
    Basic Clustering Techniques
    Estimation using Incomplete Data
- Transformation
    Principal Component Analysis (PCA)
    Generalized Eigenvalue Problem
    Nonlinear Component Analysis
    Linear Discriminant Analysis (LDA)
    Factor Analysis
    Probabilistic Linear Discriminant Analysis (PLDA)
- Hidden Markov Modeling (HMM)
    Memoryless Models
    Discrete Markov Chains
    Markov Models
    Hidden Markov Models
    Model Design and States
    Training and Decoding
    Gaussian Mixture Models (GMM)
    Practical Issues
- Search Techniques
    Practical Issues
    Finite State Transducers
- Deep Neural Networks
    Perceptron
    Feedforward Networks
    Convolutional Neural Networks (CNN)
    Recurrent Neural Networks (RNN)
    Time-Delay Neural Networks (TDNN)
    Long Short-Term Memory Networks (LSTM)
    End-to-End Sequence (Encoder/Decoder) Neural Networks
    Hierarchical Mixtures of Experts (HME)
    Deep Learning Network and Practical Issues
    Transfer Learning
- Support Vector Machines
    Risk Minimization
    The Two-Class Problem
    Kernel Mapping
    Positive Semi-Definite Kernels
    Non Positive Semi-Definite Kernels
    Kernel Normalization
    Kernel Principal Component Analysis (Kernel PCA)
    Nuisance Attribute Projection (NAP)
    The Multiclass (Γ-Class) Problem