SALIDlab (South Asian Languages, Images, and Data Lab) is an interdisciplinary research lab in South Asian Studies. In collaboration with the National Center for Supercomputing Applications (NCSA) and various Grainger College of Engineering departments, SALIDlab conducts computational research with data, methods, and texts/artifacts related to South Asia. The PI of SALIDlab is Prof. Rini B. Mehta (rbhttchr@illinois.edu).

Current Projects (2024-25)

Sanskrit Grammar Engine


SALIDlab is developing tools and methods for encoding Sanskrit grammar rules derived from Panini’s treatise. Interested students should know Sanskrit and be fluent in Python (preferably with Django, Flask, or FastAPI) or JavaScript.

Contact Prof. Mehta if you are interested in joining the Sanskrit research group.

 

Sanskrit Group
  
 Pari Kulkarni is a freshman majoring in Computer Science & Economics with a keen interest in programming, languages, and cultures. She loves traveling and photography. Pari has studied Sanskrit since 5th grade and is captivated by its logical structure. She is excited to merge her passion for technology and Sanskrit in this project, building a grammar engine based on Panini’s rules.
 Rohan Kapur is a Computer Engineering major with a strong interest in the intersection of language and computation. Previously, he led the development of NLP pipelines at BAM Money, working on a variety of language models. Rohan is passionate about Hindustani Classical Music and enjoys Underwater Hockey.

 

Cinema, Language, and Democracy: South Asian Cinema Data Project

South Asia produces one-third of global cinema. Bollywood, the Mumbai (Bombay)-based Hindi cinema of India, is the best known, but it is only one of many industries that produce hundreds of films yearly. The Cinema, Language, and Democracy project aims to build resources to study South Asia's multilingual and multilocal cinemas, including Lahore-based Lollywood, Bangladeshi cinema, and Sri Lankan cinema. Telugu, Tamil, and other Indian industries will be researched in their historical contexts. Familiarity with IMDb datasets, SPARQL, PHP+MySQL, or NoSQL databases is preferred. Students interested in participating in this project should write to Prof. Rini B. Mehta (rbhttchr@illinois.edu).

Transliteration and Corpus-building

Adarsh Krishnan (SPIN Intern, NCSA) is working with Prof. Mehta on the problem of transliterating Bengali into Roman and Devanagari scripts. They are also building a Bengali literary corpus based on texts from the late 19th and early 20th centuries.

Script Animation

Jewel Domingo is creating a digital alphabet with animated strokes for use in SALIDcamp instruction of South Asian languages.