Ph.D. Thesis

Ph.D. Thesis: Multi-resolution and Source Separation for Improved Sound Event Detection based on Deep Neural Networks

🔗 Link to Ph.D. Dissertation
Defense Date: October 9th, 2023
Research Group: Audias - Audio, Data Intelligence and Speech
Institution: Universidad Autónoma de Madrid
Grade: Cum Laude

The slides for the dissertation are available here.

During my Ph.D., I focused on advancing Sound Event Detection (SED) with deep learning. My research developed new ways to enhance audio signal representations, in particular a multi-resolution approach that analyzes the signal at several time-frequency resolutions so that both short, transient events and long, stationary ones are well represented. I took part in the DCASE Challenge over successive years, improving SED performance with each edition. I also explored source separation neural networks as a preprocessing step for audio mixtures, which produced cleaner signals and better SED results.

I also proposed a new distractor measure, studied the effect of mel-spectrogram resolution, created a synthetic dataset with severe event overlap, and analyzed semi-supervised training methods. These contributions target challenging SED scenarios, such as acoustic degradation and event overlap, across various sound event categories.
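
To make the multi-resolution idea mentioned above more concrete, here is a minimal sketch of the time/frequency trade-off behind it. The sampling rate, window lengths, hop sizes and the `Resolution` helper are illustrative assumptions, not the configurations used in the thesis, and the frame count assumes centered framing as in common STFT implementations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resolution:
    """Hypothetical analysis setting for one spectrogram resolution."""
    name: str
    n_fft: int       # analysis window / FFT length, in samples
    hop_length: int  # step between consecutive frames, in samples

    def bin_spacing_hz(self, sample_rate: int) -> float:
        # Spacing between adjacent FFT bins: longer windows give finer frequency detail.
        return sample_rate / self.n_fft

    def frame_step_ms(self, sample_rate: int) -> float:
        # Time between consecutive frames: shorter hops give finer time detail.
        return 1000.0 * self.hop_length / sample_rate

    def n_frames(self, n_samples: int) -> int:
        # Number of frames, assuming centered (padded) framing.
        return 1 + n_samples // self.hop_length


# Illustrative values only (assumptions, not taken from the thesis).
SAMPLE_RATE = 16_000
CLIP_SAMPLES = 10 * SAMPLE_RATE  # a 10-second clip

RESOLUTIONS = [
    Resolution("fine time", n_fft=1024, hop_length=128),
    Resolution("balanced", n_fft=2048, hop_length=256),
    Resolution("fine frequency", n_fft=4096, hop_length=512),
]

for res in RESOLUTIONS:
    print(
        f"{res.name:>14}: "
        f"{res.bin_spacing_hz(SAMPLE_RATE):7.3f} Hz per bin, "
        f"{res.frame_step_ms(SAMPLE_RATE):5.1f} ms per frame, "
        f"{res.n_frames(CLIP_SAMPLES)} frames"
    )
```

No single setting is best on both axes: the short window tracks onsets of brief events more precisely, while the long window separates nearby frequency components better. Feeding several such views to the detector is one way to cover sound events with different temporal and spectral characteristics.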