Friday January 31, 2025 9:30am - 11:30am IST

Authors - Sindhu C, Taruni Mamidipaka, Yoga Sreedhar Reddy Kakanuru, Summia Parveen, Saradha S
Abstract - India has a rich historical legacy, and much of its cultural and linguistic knowledge is preserved in stone inscriptions. Extracting text from these inscriptions and translating it into a widely understood language is challenging because of script variations, natural wear, and the uneven, degraded surfaces of stone carvings. We propose a model that extracts text from stone inscriptions written in Telugu and translates it into other Indian local languages. A Region-Based Convolutional Neural Network (R-CNN) integrated with Tesseract OCR is trained on a custom dataset of 30,000 labelled images of Telugu script, encompassing Achulu (vowels), Hallulu (consonants), and Vathulu. The model achieves 96% accuracy in character detection, indicating that it can reliably recognize Telugu characters in degraded and complex inscriptions. Data augmentation techniques, including rotations, flips, and shifts, were used to improve the model’s robustness to the different orientations and environmental conditions encountered in historical artifacts. The extracted text is then translated into Indian local languages using an API-based translation module, so that the ancient content can be read directly. This research contributes an automated, scalable method for digitizing historical inscriptions that are integral to Telugu heritage and linguistic history, making them accessible to everyone.
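
The abstract names rotations, flips, and shifts as augmentation steps but gives no parameters. The following is a minimal illustrative sketch of those three operations, not the authors' pipeline. It assumes a grayscale image stored as nested lists, a 90-degree rotation, and a one-pixel shift; the function names and all parameter values are hypothetical, and a real pipeline would more likely use an image library and smaller rotation angles.

```python
from typing import List

# Assumption: a grayscale image is a list of rows of pixel intensities.
Image = List[List[int]]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def rotate_90_clockwise(img: Image) -> Image:
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def shift(img: Image, dx: int, dy: int, fill: int = 0) -> Image:
    """Translate by dx columns (right) and dy rows (down), padding with fill."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            # Pixels pushed outside the frame are dropped.
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = img[y][x]
    return out


def augment(img: Image) -> List[Image]:
    """Return the original plus one flipped, one rotated, and one shifted copy."""
    return [img, flip_horizontal(img), rotate_90_clockwise(img), shift(img, 1, 0)]
```

Calling `augment` on each labelled character image would yield four training samples per original under these assumed settings; the actual augmentation factor used in the paper is not stated in the abstract.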
Virtual Room B Pune, India
