Bengaluru-based Gnani AI has unveiled its latest speech-to-text model, Prisma v2.5, which the company says outperforms competitors in eight of nine Indian languages. The model was trained on a large dataset that captures Indian accents and dialects, as well as the ambient noise typical of telephone calls. With a reported word error rate of 10% on critical utterances, Prisma v2.5 targets a pressing problem in sectors such as BFSI (banking, financial services, and insurance) and healthcare, where a misheard word can lead to substantial financial discrepancies.
The model arrives as demand for reliable voice AI surges across India. Ananth Nagaraj, co-founder of Gnani AI, highlighted the business risks of inaccurate speech recognition, particularly in high-stakes settings such as loan origination calls. Accurately transcribing short utterances and domain-specific vocabulary could reduce the costly errors that arise from misinterpretation.
The launch has drawn a positive response from industry. Akshay Singhal of WeRize noted that Prisma v2.5's out-of-the-box accuracy meets the specific needs of Indian enterprises, something many existing models have failed to achieve. The launch sets a new standard for voice AI in India and underscores the importance of localized training data in building effective AI systems.



