I am an undergraduate student majoring in Computer Science and Engineering at the National Institute of Technology Rourkela, India. I am broadly interested in deep learning and machine learning research, specifically multi-modal learning and interpretability. Currently, I am working on concept bottleneck models (CBMs) in vision and language, particularly for medical imaging tasks, under Prof. Vineeth Balasubramanian at Lab1055 (AI & Vision-Language Group), IIT Hyderabad.

Previously, I had the privilege of spending a wonderful summer as a Research Intern at the Indian Institute of Technology Hyderabad, India, under the guidance of Dr. Konda Reddy Mopuri. During that time, and over the subsequent months of remote collaboration, I conducted research on explainability in Vision Transformers. My work involved analyzing post-hoc explanation techniques and exploring token pruning methods to improve the interpretability of image classification models.

For more details, refer to my CV or drop me an email.

News & Honors

Publications

CapsoNet: A CNN-Transformer Ensemble for Multi-Class Abnormality Detection in Video Capsule Endoscopy
Arnav Samal, Ranya Batsyas
arXiv preprint | Oct 2024
pdf | abstract | code

Selected Projects

SketchWarp
Developed a self-supervised learning framework in PyTorch for dense photo-to-sketch correspondences, enabling automatic image-to-sketch warping. Designed and implemented training and evaluation pipelines inspired by the “Learning Dense Correspondences between Photos and Sketches” paper.
code | paper
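
As a rough illustration of the warping step, here is a minimal sketch that resamples a photo with a dense correspondence field via PyTorch's grid_sample; the function name, tensor shapes, and identity-field demo are illustrative assumptions, not the project's actual pipeline.

```python
import torch
import torch.nn.functional as F

def warp_with_field(photo, field):
    """Warp a photo given a dense correspondence field.

    photo: (1, C, H, W) tensor; field: (1, H, W, 2) holding, for every
    target (sketch) pixel, the normalized [-1, 1] photo coordinate it
    corresponds to (the format grid_sample expects).
    """
    return F.grid_sample(photo, field, mode="bilinear", align_corners=True)

# Sanity check: an identity field should leave the photo unchanged.
H = W = 64
ys, xs = torch.meshgrid(
    torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij"
)
identity = torch.stack([xs, ys], dim=-1).unsqueeze(0)  # (1, H, W, 2), (x, y) order
photo = torch.rand(1, 3, H, W)
assert torch.allclose(warp_with_field(photo, identity), photo, atol=1e-5)
```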

NeurIPS Ariel Data Challenge 2024
Developed a pipeline for predicting spectral values in the NeurIPS Ariel Data Challenge 2024 using time-series calibration, spatial aggregation, and gradient-based phase detection. Ranked 257/1,152 by applying Nelder-Mead optimization and cubic polynomial fitting to model planetary transits from raw sensor data.
code | kaggle
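
To make the fitting idea concrete, below is a minimal, hypothetical sketch on a synthetic light curve: a cubic polynomial models the out-of-transit baseline, a box-shaped dip stands in for the transit, gradient extrema seed the ingress/egress guesses, and Nelder-Mead refines all parameters. The box model, synthetic data, and parameter names are illustrative assumptions, not the competition code.

```python
import numpy as np
from scipy.optimize import minimize

def transit_model(t, coeffs, depth, t_in, t_out):
    """Cubic baseline multiplied by a box-shaped transit dip."""
    baseline = np.polyval(coeffs, t)
    dip = np.where((t >= t_in) & (t <= t_out), 1.0 - depth, 1.0)
    return baseline * dip

def loss(params, t, flux):
    c3, c2, c1, c0, depth, t_in, t_out = params
    pred = transit_model(t, [c3, c2, c1, c0], depth, t_in, t_out)
    return np.mean((flux - pred) ** 2)

# Synthetic light curve for demonstration only.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 500)
flux = transit_model(t, [0.02, -0.03, 0.01, 1.0], 0.002, 0.4, 0.6)
flux += rng.normal(0.0, 1e-4, t.size)

# Gradient-based phase detection: the steepest drop/rise marks ingress/egress
# (on real data this would be done on a smoothed curve).
grad = np.gradient(flux)
x0 = [0.0, 0.0, 0.0, 1.0, 1e-3, t[np.argmin(grad)], t[np.argmax(grad)]]

res = minimize(loss, x0, args=(t, flux), method="Nelder-Mead")
print("estimated transit depth:", res.x[4])
```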

Paper Implementations
Implemented influential AI and machine learning research papers, including transformer architectures (GPT variants, BERT, ViT) as well as LoRA and neural style transfer. I actively implement new papers and continuously update this repository.
code
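
As one example of the kind of technique implemented there, below is a minimal LoRA-style adapter around a frozen nn.Linear; the class name and hyperparameters are illustrative, not taken from the repository.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Low-rank adaptation: freeze the pretrained weight, learn a rank-r update."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# Usage: wrap an existing projection, then train only A and B.
layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(2, 768))
```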

Measuring Patch Importance in ViTs (Vanilla & Attention Rollout)
Analyzed patch importance in Vision Transformers using the attention scores of the [CLS] token across the MHSA mechanisms in all blocks, visualizing the distribution of top-k patch tokens. Implemented Attention Rollout to propagate attention through the layers, creating interpretable visualizations of information flow and deepening understanding of self-attention mechanisms.
code
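
For reference, here is a compact sketch of the rollout computation (following Abnar & Zuidema, 2020); the input format, a list of per-block attention maps already averaged over the batch, is an assumption about how the maps were collected.

```python
import torch

def attention_rollout(attentions):
    """Propagate attention through all blocks via Attention Rollout.

    attentions: list of per-block tensors of shape (num_heads, tokens, tokens),
    ordered from the first block to the last (shape assumed for illustration).
    """
    result = torch.eye(attentions[0].size(-1))
    for attn in attentions:
        attn = attn.mean(dim=0)                       # average over heads
        attn = attn + torch.eye(attn.size(-1))        # add identity for residual connections
        attn = attn / attn.sum(dim=-1, keepdim=True)  # re-normalize rows
        result = attn @ result                        # compose with earlier layers
    # Row 0 is the [CLS] token; columns 1: give the patch-token importances.
    return result[0, 1:]
```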

Affiliations

Lab1055 | 2025
ACM-IKDD | Summer 2025
IIT Hyderabad | Summer 2024 & Summer 2025
RespAI Lab | 2025
DiL Lab | 2024
NIT Rourkela | 2022 - Present