Authors - A. Kousar Nikhath, Aanchal Jain, Ananya D, Ramana Teja

Abstract - This project focuses on building an advanced visual speech recognition system that performs lipreading at the sentence level. Traditional approaches, limited to word-level recognition, often lacked sufficient contextual understanding and real-world usability. This work aims to overcome those limitations by employing deep learning models, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and hybrid architectures, to process visual input and generate coherent sentence predictions. Development follows a systematic approach, beginning with a review of existing solutions and their shortcomings. The proposed framework captures both the spatial and temporal dynamics of lip movements using specialized neural networks, significantly improving the accuracy of sentence-level predictions. Extensive testing on diverse datasets validates the system's efficiency, scalability, and practical applicability. The study underscores the critical roles of robust feature extraction, sequential data modeling, and hierarchical processing in effective sentence-level lipreading, and the results show notable improvements across performance metrics. Finally, the project outlines future directions, including optimization for real-time processing and resource-constrained environments, paving the way for practical deployment in multiple fields.
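The spatial-then-temporal pipeline named in the abstract (CNN features per frame, RNN across frames) can be illustrated with a minimal toy sketch. Everything below is an illustrative assumption, not the authors' implementation: a single hand-written 2-D convolution stands in for the CNN stage, and a one-unit Elman recurrence stands in for the RNN stage.

```python
import math

def conv2d_feature(frame, kernel):
    """CNN stand-in: apply one 2-D convolution (valid padding) to a
    grayscale lip-region frame and return the mean activation as a
    scalar 'spatial feature' for that frame."""
    fh, fw = len(frame), len(frame[0])
    kh, kw = len(kernel), len(kernel[0])
    acts = []
    for i in range(fh - kh + 1):
        for j in range(fw - kw + 1):
            s = sum(frame[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            acts.append(s)
    return sum(acts) / len(acts)

def rnn_over_time(features, w_x=0.5, w_h=0.9):
    """RNN stand-in: a one-unit Elman recurrence whose hidden state
    accumulates temporal context over the per-frame features.
    Weights w_x and w_h are arbitrary illustrative values."""
    h = 0.0
    hidden_states = []
    for x in features:
        h = math.tanh(w_x * x + w_h * h)
        hidden_states.append(h)
    return hidden_states

# A toy 'video': three identical 4x4 grayscale lip-region frames.
video = [[[0.0, 0.1, 0.1, 0.0],
          [0.2, 0.8, 0.8, 0.2],
          [0.2, 0.9, 0.9, 0.2],
          [0.0, 0.1, 0.1, 0.0]] for _ in range(3)]
edge_kernel = [[1.0, -1.0], [1.0, -1.0]]  # crude vertical-edge detector

features = [conv2d_feature(f, edge_kernel) for f in video]  # spatial stage
states = rnn_over_time(features)                            # temporal stage
print(len(states))  # one hidden state per frame
```

In a real system each stage would of course be a learned, multi-layer network (and the final hidden states would feed a decoder that emits words), but the sketch shows the division of labor the abstract describes: per-frame spatial feature extraction followed by sequential modeling across the sentence.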