YouTube’s move to add Hindi auto-captioning is said to finally benefit hearing-impaired individuals, who can now follow Hindi-language content through captions.
Some creators have added Hindi subtitles to their videos themselves, but because transcribing content into Hindi costs money, many have chosen not to.
Until this point, English auto-captioning had been enabled for all videos since 2010, but the feature was not available for Hindi content. It is surprising that Hindi captions took so long to arrive, considering that Hindi-speaking Indians make up a significant percentage of YouTube's viewers. Speakers of India's other regional languages also make up a large section of YouTube's audience.
What does the inclusion of Hindi captions mean?
Hindi captioning, especially auto-captioning for videos, has become more common thanks to platforms like Google. The availability of Hindi transcription on Google Translate was an early hint of this.
The introduction of auto-captioning for Hindi videos now suggests that there is enough Hindi speech data to produce accurate captions. This reflects a broader trend of increasing language data availability for Indian languages.
Even before the rise of generative artificial intelligence, platforms like YouTube used voice recognition for accessibility. However, implementing this technology is challenging for languages with limited online representation.
Why did it take so long for Hindi auto-captioning to become available on YouTube?
According to Mayuresh Nirhali from Reverie, a company addressing issues related to Indian languages on the Internet, solving the speech-to-text problem requires a substantial amount of Hindi speech data along with accurate transcripts. Artificial intelligence models then learn from this data to perform speech-to-text tasks effectively.
Developing AI-enabled services like speech recognition for Indian languages is particularly difficult because of several challenges, including inconsistent encoding of text online and regional variations in spelling and pronunciation.
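One concrete form of inconsistent encoding is that the same Devanagari letter can be stored as different sequences of Unicode code points. The sketch below is only an illustration and does not describe how YouTube or Reverie process text. The helper name `normalize_text` is hypothetical, and the choice of NFC normalization is an assumption about one reasonable way to clean transcripts before training.

```python
import unicodedata

# The Devanagari letter "qa" can appear online in two encodings:
precomposed = "\u0958"        # one code point for the whole letter
decomposed = "\u0915\u093c"   # the letter "ka" followed by the nukta dot

def normalize_text(text: str) -> str:
    """Hypothetical helper: map equivalent encodings to one canonical form."""
    return unicodedata.normalize("NFC", text)

# The raw strings look identical on screen but do not compare equal.
print(precomposed == decomposed)

# After normalization, both spellings become the same sequence.
print(normalize_text(precomposed) == normalize_text(decomposed))
```

Without a step like this, a model could treat two visually identical words in its transcripts as unrelated.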
Even auto-captions on English YouTube videos contain notable errors. After 13 years, the technology has still not been perfected: many words are misrecognized, and captions for songs are often wrong. Accuracy issues persist even in the most popular language. When the speech-to-text AI fails to catch the right word, it either substitutes a similar-sounding word or leaves a gap. This highlights the need for better captioning, particularly for hearing-impaired individuals, since auto-captions still cannot be relied upon.
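Caption accuracy of this kind is commonly measured with word error rate (WER), which counts wrong, missing, and extra words against a human transcript. The sketch below is a minimal illustration, not a description of how YouTube evaluates its captions. The function name `word_error_rate` and the sample sentences are made up for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # a reference word was dropped
                curr[j - 1] + 1,     # an extra word was inserted
                prev[j - 1] + cost,  # words match or one was substituted
            ))
        prev = curr
    return prev[-1] / len(ref)

# Invented example: one similar-sounding substitution and one dropped word.
human_transcript = "the captions were mostly right"
auto_caption = "the caption were right"
print(word_error_rate(human_transcript, auto_caption))
```

In this invented example, two of the five reference words are wrong or missing, so the error rate is 0.4.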
The accuracy issue
The accuracy issue exists because the technology is fed too little data on colloquial speech, lesser-known dialects, and slang. Speech-to-text AI models are not well versed in languages as they are actually spoken on the ground. They do not capture the nuances of mixed-language speech, in which words from other languages and dialects are intermingled, and so they fail to caption it accurately.
The accuracy issues in widely spoken languages like English underline the continuous effort required to refine captioning technologies so that they represent diverse linguistic expression in online content accurately and inclusively.