AICTE’s AI-ML-based tool, built by BTech interns, has been used to translate over 1,300 textbooks into Indian languages.
R. Radhika | May 4, 2023 | 11:45 AM IST
NEW DELHI: In his final year of BTech in information technology, Karan Singh Garkel is spending his days dabbling in machine learning algorithms to build an Indian version of Google Lens. The tool, he claims, will not only be as accurate but also ensure safety of user data.
As a software development intern for the apex technical education regulator, the All India Council for Technical Education (AICTE), Garkel is helping build and improve an artificial intelligence and machine learning (AI-ML) Indian language translation tool – Anuvadini.
Over four months, Garkel hasn’t just learnt how AI and machine learning can be used to build tools for public good but also helped steer the education ministry’s plan to facilitate education in regional languages.
“It is an in-house language translation tool that is being used to translate books, texts, audio and even video in 14 different Indian and 13 foreign languages. As a deep learning tool based on disruptive technology, it provides real-time translation of even the lengthiest books along with formatting – an exact replica of the source but in a different language,” said Garkel, a student of Manipal Institute of Technology, Jaipur.
The tool, first launched in 2021 with limited features, is being upgraded by a team of college students. Since launch, it has been used to create more than 1,320 undergraduate and diploma textbooks; 1,625 translations of trade courses for the ministry of skill development; and SWAYAM content. Apart from Indian languages, content in foreign languages such as Russian, Spanish, French, Arabic and German, among others, will be translated to allow access to foreign academic research.
Beyond academics, the tool may find application in healthcare, tourism, railways and airports, and the media.
As a part of the team, Garkel has been trained to build dictionaries and glossaries that can help make translation, especially of technical terms, easier. The team collaborates with various ministries to translate documents and information into Indian languages. For example, a document in English that a ministry wants to disseminate in various languages can be uploaded on the platform and translated in seconds. This is made possible with the help of in-built dictionaries and glossaries curated by the interns.
“As a Python developer, my task is to build all the features in the backend. I am handling tasks like building a dictionary from skill ministry documents. I have extracted technical terms from the document by writing code, cleaned the data and I will get it translated through our machine learning model into different Indian languages. We import this dictionary to our tool so that we can make translations better. While the usual text is easily translated, it is the technical terms that have to be translated correctly,” explained Garkel.
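The workflow Garkel describes (extract terms, clean them, translate them, import the result) can be illustrated with a minimal Python sketch. It is not Anuvadini's actual code: the stopword list, the frequency threshold, the sample sentence and the placeholder `translate` callable are all assumptions made for illustration, and a real pipeline would call a trained translation model instead.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real pipeline would use a much fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "with"}


def extract_terms(text, min_count=2):
    """Return cleaned candidate terms that occur at least min_count times."""
    # Lowercase the text and keep alphabetic tokens of two or more characters.
    words = re.findall(r"[A-Za-z][A-Za-z\-]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return sorted(w for w, n in counts.items() if n >= min_count)


def build_dictionary(terms, translate):
    """Map each term to its translation using the supplied callable."""
    return {term: translate(term) for term in terms}


# Invented sample text standing in for a skill ministry document.
sample = (
    "The lathe operator checks the lathe spindle. "
    "Spindle speed is set on the lathe."
)
terms = extract_terms(sample)

# Placeholder translator: wraps the term instead of calling a real model.
glossary = build_dictionary(terms, translate=lambda t: f"<{t}>")
print(glossary)
```

In this sketch the frequency threshold is only a crude stand-in for term selection; the article does not say how the interns decide which words count as technical terms.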
The terms and their translations also go through rigorous vetting by validators or subject-matter experts. The AICTE currently has 30 discipline-wise validators who check the accuracy of translated terms. Plus, the translated material – in text, audio or image format – is also cross-checked by validators at specific ministries.
The tool is also being upgraded with glossaries and definitional dictionaries for humanities and social sciences, commerce, agriculture, veterinary sciences, homoeopathy, architecture, administration and more areas in all regional languages.
“Technical words have different meanings and context in different disciplines. For example, ‘accounting’ can mean different things in business studies, law, science, economics or English literature. For this, our deep learning tool has in-built glossaries and dictionaries that can help,” explained AICTE’s chief coordinating officer, Buddha Chandrashekhar, who leads the team of interns.
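One simple way to picture a discipline-aware glossary is a lookup keyed on both the discipline and the term, with a fallback to general translation when no entry exists. The sketch below is an assumption about how such a lookup could work, not a description of the tool's internals; the glossary entries are placeholders rather than real translations, and `fallback` stands in for a general-purpose translation model.

```python
# Hypothetical glossary: (discipline, term) -> preferred rendering.
# The values are placeholders, not real translations.
GLOSSARY = {
    ("business studies", "accounting"): "<accounting: business sense>",
    ("law", "accounting"): "<accounting: legal sense>",
}


def lookup(term, discipline, fallback):
    """Prefer the discipline-specific entry; otherwise use the fallback."""
    key = (discipline.lower(), term.lower())
    if key in GLOSSARY:
        return GLOSSARY[key]
    return fallback(term)


# str.upper is a stand-in for a call to a general translation model.
print(lookup("Accounting", "Law", fallback=str.upper))
print(lookup("accounting", "science", fallback=str.upper))
```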
In March 2022, the council entered into an agreement with the Commission for Scientific and Technical Terminology – an autonomous body under the education ministry which produces the definitive glossaries of scientific and technical terminology – to develop teaching and learning processes, faculty development, content creation, and textbooks.
As a project driven only by interns, it has been an excellent opportunity for hands-on learning and expanding knowledge, according to Garkel.
“When I had applied for the internship, I had no hands-on knowledge of how AI-ML worked in production systems. I had developed a model or two in the university but that was just to test the accuracy of the data set – beginner-level stuff,” he said. “At the college, we experimented with simple machine learning algorithms but here we had to make a specific encoding-decoding architecture. To deploy production code such that it handles all types of requests is a bigger challenge. The tool is open for public access. How to manage these applications and scale up when the user base expands, all of that can be learnt when you are part of a project like this.”
“Coming up with tech-based solutions is not always rocket science. It is mostly about thinking out-of-the-box to solve a problem. We always look for creative thinking to solve any problem. When I select my interns, I just assess them based on how they can write a lengthy code in just two lines. A creative solution is always superior to PhDs and stacks of degrees. Anyone is capable of it as long as they are creative,” Chandrashekhar explained.
The AICTE internship for engineering students has also harnessed skills in other fields, such as management. Malla Sainadh Ram Narasimha, now employed by the AICTE, was one of the first interns to join the project in 2021. Peer-learning is an essential aspect of the internship and it allowed Sainadh to map his career.
“As an MBA student, I had applied for the online digital marketing internship and worked for six months in digital marketing projects. Most of my team members were web developers with specialisation in AI. Interactions with my peers piqued my interest in the area and I interned for another six months as web developer in the Anuvadini project in the final year of the MBA programme,” said Sainadh, who graduated with an integrated BTech and MBA from Amity University, Gurgaon.
“I slowly learnt a few basics from my team members. Having studied electronics and communications, I had some knowledge of web development. I also had a certificate in machine learning which helped me match up with this team and learn AI,” he recalled.
Officially, India recognises 22 languages with 12 scripts. Translation at scale for such a diverse set of languages of lakhs of speakers is only possible through the application of deep tech – machine translation. With time, according to Chandrashekhar, the tool will adapt to more Indian languages and even regional dialects.
“We have plans to give access to five to 10 crore students and teachers who can test it and make it better with each use,” he said. “The project is currently run on 40 servers that will be eventually scaled up nation-wide. We are also looking into the possibility of a national language competition which will allow anyone across India to feed in nuanced information like words and its translation in various dialects. This will help us to create a large data set that can improve the tool further.”
If you want to share your experience at work, write to us at email@example.com. To know more about The Workplace itself here's a handy note: Let’s talk work…