Thursday January 30, 2025 9:30am - 11:30am IST
Virtual Room E, Pune, India

Authors - Sathiyapriya K, S Bharath, Rohith Sundharamurthy, Prithivi Raaj K, Rakesh Kumar S, Rakkul Pravesh M, N Arun Eshwer
Abstract - The convenience and security offered by voice-based authentication systems have led to their increasing adoption in sectors such as banking, e-commerce, and telecommunications. However, these systems are vulnerable to voice spoofing attacks, including replay, speech synthesis, and voice conversion. This work combines Mel-Frequency Cepstral Coefficients (MFCC) and the Constant-Q Transform (CQT) with a Res2Net deep learning model in a framework that classifies voices as genuine or spoofed: MFCC and CQT provide the feature extraction, and the Res2Net model performs the classification. The system was evaluated on the ASVspoof 2021 dataset, chosen both for its diverse collection of nearly 180,000 audio samples and for its wide recognition in the research community. Our system achieved a low Equal Error Rate (EER) of 0.0332 and a Tandem Detection Cost Function (t-DCF) of 0.2246. This framework contributes to the advancement of secure voice authentication systems, addressing critical challenges in modern cybersecurity.
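
As a rough illustration of the pipeline's two measurable ends, the sketch below extracts MFCC and CQT features with librosa and computes an EER from detection scores with scikit-learn. All parameter values (sample rate, coefficient counts, CQT bins) and function names are illustrative assumptions rather than the authors' configuration, and the Res2Net classifier itself is omitted.

```python
# Minimal sketch of MFCC/CQT feature extraction and EER scoring.
# Parameters are illustrative assumptions; the abstract does not specify
# the authors' configuration, and the Res2Net model is omitted.
import numpy as np
import librosa
from sklearn.metrics import roc_curve

def extract_features(path, sr=16000, n_mfcc=20, n_cqt_bins=84):
    """Load one utterance and compute MFCC and CQT feature matrices."""
    y, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, frames)
    cqt = np.abs(librosa.cqt(y, sr=sr, n_bins=n_cqt_bins))  # (n_bins, frames)
    return mfcc, cqt

def equal_error_rate(labels, scores):
    """EER: the point where false-accept and false-reject rates cross.

    labels: 1 for genuine, 0 for spoofed; scores: higher means more genuine.
    """
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1.0 - tpr
    i = np.nanargmin(np.abs(fnr - fpr))
    return (fpr[i] + fnr[i]) / 2.0
```

In a full system, the feature maps would be fed to the Res2Net classifier, whose scores over a labeled evaluation set yield the reported EER; the t-DCF metric additionally folds in the error costs of the downstream speaker-verification system.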

