IIIT Hyderabad Researchers’ ML Model To Revolutionise Movie Watching Experience

Can perform face-to-face translation that is in sync with the original

Watching the South Korean movie “Parasite” automatically translated into Hindi, without subtitles, may soon be a reality. A team of researchers from the International Institute of Information Technology, Hyderabad (IIIT-H) has developed a machine learning model that can automatically translate a video of any person speaking in one language into another language.

Using machine learning (ML), the model takes a video of a person speaking in one language and delivers a video of the same speaker in another language. For instance, a video of a person speaking in Telugu can automatically be translated into another language such as Bangla, Hindi or Gujarati, in such a manner that the voice and lip movements match the language of one’s choice.

The research paper, “Towards Automatic Face-to-Face Translation”, was presented at the ACM International Conference on Multimedia in Nice, France, in October 2019. A comparative analysis of IIIT Hyderabad’s tool vis-à-vis Google Translate for English-Hindi machine translation found the in-house tool to be more accurate.

Presently, translation systems for videos generate either translated speech or textual subtitles, and the often out-of-sync dubbing of a movie or other video content ruins the viewer’s experience.

To automate the translation of videos, the team of researchers, led by Professor CV Jawahar, dean (Research & Development), developed an ML model that can perform face-to-face translation, reports Telangana Today.

LipGAN

To obtain a fully translated video with accurate lip synchronization, the researchers introduced a visual module called LipGAN. The module can also correct lip movements in an original video to match the translated speech. For example, badly dubbed movies with out-of-sync lip movements can be corrected with LipGAN, which has been trained on large video datasets, making it possible to work for any language, identity or voice.

Apart from making content such as movies, educational videos, TV news and interviews available to diverse audiences in various languages, there are potential futuristic applications such as cross-lingual video calls, that is, the ability to have real-time video calls with someone speaking a different language.

IIIT Hyderabad