Multilingual Speech-To-Text Translator

Run as part of the Data Science Club at Georgia Tech, this project aims to build a full-stack React app that uses existing speech-to-text APIs to convert multilingual speech input into text output in a single language. The project was started as hearing-aid software to help individuals from multilingual backgrounds. Coming from a multilingual family (I actually speak 5 languages!), I strongly identified with this project and its rationale.

My role:
I am on both the audio-processing and web app development teams for this project. On the audio-processing team, our focus is to build a neural-network speech-to-text model that ‘catches’ the audio input and converts it into plain text. We are currently working on audio preprocessing techniques for noise removal and on language classification. On the web app development team, our focus is to build a minimum viable product with translation capabilities and a clean, easy-to-use GUI.
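
To give a feel for what a noise-removal preprocessing step can look like, here is a minimal sketch of an energy-based noise gate: the audio is split into short frames, and any frame whose RMS energy falls below a threshold is silenced. This is only an illustration and not our actual pipeline; the function names, the frame size, and the threshold are all assumptions chosen for the example.

```python
import math
from typing import List


def frame_rms(frame: List[float]) -> float:
    """Root-mean-square energy of one frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def noise_gate(samples: List[float],
               frame_size: int = 4,       # hypothetical; real audio would use far longer frames
               threshold: float = 0.1) -> List[float]:
    """Silence every frame whose RMS energy is below `threshold`.

    The output has the same length as the input, so timing is preserved
    for whatever speech-to-text step comes next.
    """
    gated: List[float] = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        if frame_rms(frame) >= threshold:
            gated.extend(frame)                  # loud enough: keep as-is
        else:
            gated.extend([0.0] * len(frame))     # quiet: treat as background noise
    return gated


# Toy signal: one quiet frame of background hiss, then one loud "speech" frame.
samples = [0.01, -0.02, 0.01, 0.0, 0.5, -0.6, 0.4, -0.5]
cleaned = noise_gate(samples)
```

In this toy input, the first four samples are zeroed out and the last four pass through unchanged. A gate like this is deliberately crude: it removes quiet stretches between words but does nothing about noise that overlaps speech, which is where more involved techniques come in.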