Pankaj Choudhury


View My GitHub Profile

Research Scholar

Centre for Linguistic Science and Technology

Indian Institute of Technology, Guwahati

📧 pankajchoudhury[AT]iitg.ac.in

LinkedIn | HuggingFace | Google Scholar


👤 About Me

I am a Ph.D. scholar at the Centre for Linguistic Science and Technology, Indian Institute of Technology Guwahati, specializing in the intersection of Natural Language Processing (NLP) and Computer Vision (CV), also known as Vision-Language Understanding (VLU).

My research primarily focuses on developing Automatic Image Captioning systems for low-resource languages, with a special emphasis on the Assamese language. While mainstream AI technologies are often optimized for resource-rich languages like English, my work seeks to bridge this gap by designing models that are linguistically aware, data-efficient, and culturally inclusive.

Over the course of my doctoral work, I have contributed to:

I have collaborated on multiple projects in LLM training, multimodal AI, and dataset creation. Beyond technical contributions, I have also served as a faculty resource person and industry trainer, delivering sessions on Generative AI, Machine Learning, and Deep Learning for academic institutions and professional training programs.

My long-term vision is to build scalable, inclusive AI frameworks that not only advance research but also create meaningful real-world impact by enabling AI for everyone, regardless of language or resource availability.


🎓 Education


💼 Experience


🗣️ Invited Talks


💡 Research Interests

My research focuses on the intersection of Natural Language Processing (NLP) and Computer Vision (CV), known as Vision–Language Understanding (VLU), with an emphasis on low-resource Indian languages.

My doctoral thesis focuses on automatic image caption generation for the low-resource Assamese language. Automatic image caption generation is the task of producing natural language descriptions of images. My primary research goal is to extend this task to low-resource languages, where models must not only understand visual content but also generate semantically and syntactically accurate descriptions in a linguistically rich setting. Through my work on image captioning, I aim to design computationally efficient multimodal AI systems that are linguistically aware and culturally relevant.

Key areas of interest


📖 Publications

📜 Journal Publications

1. P. Choudhury, S. Nair, P. Guha, S. Nandi
Image Captioning in Low Resource Assamese Language with Semantic Information Prior and Spatially Encoded Transformer Model
Expert Systems with Applications, 2025, Vol. 297, p.129479
Link

2. P. Choudhury, P. Guha, S. Nandi
Exploring Semantic Attributes for Image Caption Synthesis in Low-Resource Assamese Language
ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), 2025
Link

3. P. Choudhury, P. Guha, S. Nandi
Impact of Language-Specific Training on Image Caption Synthesis: A Case Study on Low-Resource Assamese Language
International Journal of Asian Language Processing (IJALP), 2024
Link


🎤 Conference Publications

1. P. Choudhury, P. Guha, S. Nandi
Relevance of Language-Specific Training on Image Caption Synthesis for Low Resource Assamese Language
International Conference on Asian Language Processing (IALP), Singapore, 2023, pp. 13–18
Link

2. P. Choudhury, P. Guha, S. Nandi
Image Caption Synthesis for Low Resource Assamese Language using Bi-LSTM with Bilinear Attention
Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation (PACLIC), 2023, pp. 743–752
Link

3. P. Choudhury, Y. Aggarwal, P. Jadhav, P. Guha, S. Nandi
AC-Lite: A Lightweight Image Captioning Model for Low-Resource Assamese Language
Accepted at CVIP-2025 (to appear)
Link

4. N. Rahman, P. Choudhury, P. Guha, A. Anand, S. Nandi
Visual Question Answering in Low-Resource Assamese Language – Datasets and Evaluation
9th International Conference on Computer Vision and Image Processing (CVIP), Springer LNCS, 2024, pp. 159–174
Link

5. Y. Aggarwal, P. Choudhury, P. Guha
Face Detection in Challenging Scenes with a Customized Backbone
8th International Conference on Computer Vision & Image Processing (CVIP-2023), pp. 468–482
Link

6. C. Kirti, P. Choudhury, A. Anand, P. Guha
An Annotated Corpus for Realis Event Detection in Short Stories Written in English and Low Resource Assamese Language
20th International Conference on Natural Language Processing (ICON), 2023, pp. 72–81
Link

7. M. P. Lahkar, A. Gogoi, P. Choudhury
AsCul: Annotated Dataset and a Deep Learning based Framework for Assamese Cultural Object Detection
International Conference on Intelligent Systems, Advanced Computing and Communication (ISACC), 2023, pp. 1–5
Link