Diego de Benito

AI Engineer & Researcher

Madrid, Spain
Ph.D. in Computer Science and Telecommunications
AI Specialist at EVO Banco



Ph.D. Thesis: Multi-resolution and Source Separation for Improved Sound Event Detection based on Deep Neural Networks

The slides for the dissertation are available here.


During my Ph.D., I worked on improving Sound Event Detection (SED) with deep learning. I developed methods to enhance audio signal representations, chiefly a multi-resolution approach that captures the diverse time and frequency characteristics of sound events. I participated in the DCASE Challenge, improving SED performance each year. I also used source separation neural networks to preprocess audio mixtures, which produced cleaner signals and better SED results.

My work also included proposing a new distractor measure, studying mel-spectrogram resolutions, creating a synthetic dataset for severe event overlap, and analyzing semi-supervised training methods. These contributions aimed to address challenging SED scenarios, such as acoustic degradation and event overlap, across various sound event categories.
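
As a rough illustration of the multi-resolution idea described above, the sketch below computes magnitude spectrograms of the same signal with several window lengths. It is not the implementation from the thesis: the function names, the window and hop sizes, and the 440 Hz test tone are all assumptions chosen for the example, and the mel filterbank is left out to keep it short. Short windows give many frames with coarse frequency bins, while long windows give few frames with fine frequency bins.

```python
import numpy as np


def stft_magnitude(signal, win_length, hop_length):
    """Magnitude spectrogram using a Hann window.

    Returns an array of shape (n_frames, win_length // 2 + 1).
    """
    window = np.hanning(win_length)
    n_frames = 1 + (len(signal) - win_length) // hop_length
    frames = np.stack([
        signal[i * hop_length: i * hop_length + win_length] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))


def multi_resolution_features(signal, resolutions):
    """Compute one spectrogram per (win_length, hop_length) pair."""
    return {res: stft_magnitude(signal, *res) for res in resolutions}


# Hypothetical input: one second of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440.0 * t)

# Illustrative resolutions (assumed values): short, medium, and long windows.
resolutions = [(256, 128), (1024, 512), (4096, 2048)]
features = multi_resolution_features(signal, resolutions)

for (win, hop), spec in features.items():
    # More frames means finer time resolution; more bins means finer frequency resolution.
    print(f"win={win:5d} hop={hop:5d} -> frames={spec.shape[0]:4d}, bins={spec.shape[1]:5d}")
```

In a multi-resolution SED setup, each of these representations could feed a separate model or input branch, so that short events and long, tonal events are each seen at a suitable resolution.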