I am an undergraduate student majoring in Computer Science and Engineering at the National Institute of Technology Rourkela, India. I am broadly interested in deep learning and machine learning research, specifically multi-modal learning and interpretability. Currently, I am working on concept bottleneck models (CBMs) in vision and language, particularly for medical imaging tasks, under Prof. Vineeth Balasubramanian at Lab1055 (AI & Vision-Language Group), IIT Hyderabad.

Previously, I had the privilege of spending a wonderful summer as a Research Intern at the Indian Institute of Technology Hyderabad, India, under the guidance of Dr. Konda Reddy Mopuri. During that time, and over the subsequent months of remote collaboration, I conducted research on explainability in Vision Transformers. My work involved analyzing post-hoc explanation techniques and exploring token pruning methods to improve the interpretability of image classification models.

For more details, refer to my CV or drop me an email.

News & Honors

Publications

CapsoNet: A CNN-Transformer Ensemble for Multi-Class Abnormality Detection in Video Capsule Endoscopy
Arnav Samal, Ranya Batsyas
arXiv preprint | Oct 2024
pdf | abstract | code

Selected Projects

SketchWarp
Developed a self-supervised learning framework in PyTorch for dense photo-to-sketch correspondences, enabling automatic image-to-sketch warping. Designed and implemented training and evaluation pipelines inspired by the “Learning Dense Correspondences between Photos and Sketches” paper.
code | paper
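
As a rough illustration of the warping step, here is a minimal sketch that resamples a photo with a dense correspondence field via PyTorch's grid_sample; the function name, tensor shapes, and identity-field demo are illustrative assumptions, not the project's actual pipeline.

```python
import torch
import torch.nn.functional as F

def warp_with_field(photo, field):
    """Warp a photo given a dense correspondence field.

    photo: (1, C, H, W) tensor; field: (1, H, W, 2) holding, for every
    target (sketch) pixel, the normalized [-1, 1] photo coordinate it
    corresponds to (the format grid_sample expects).
    """
    return F.grid_sample(photo, field, mode="bilinear", align_corners=True)

# Sanity check: an identity field should leave the photo unchanged.
H = W = 64
ys, xs = torch.meshgrid(
    torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij"
)
identity = torch.stack([xs, ys], dim=-1).unsqueeze(0)  # (1, H, W, 2), (x, y) order
photo = torch.rand(1, 3, H, W)
assert torch.allclose(warp_with_field(photo, identity), photo, atol=1e-5)
```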

NeurIPS Ariel Data Challenge 2024
Developed a pipeline for predicting spectral values in the NeurIPS Ariel Data Challenge 2024 using time-series calibration, spatial aggregation, and gradient-based phase detection. Ranked 257/1,152 by applying Nelder-Mead optimization and cubic polynomial fitting to model planetary transits from raw sensor data.
code | kaggle
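
To make the fitting idea concrete, below is a minimal, hypothetical sketch on a synthetic light curve: a cubic polynomial models the out-of-transit baseline, a box-shaped dip stands in for the transit, gradient extrema seed the ingress/egress guesses, and Nelder-Mead refines all parameters. The box model, synthetic data, and parameter names are illustrative assumptions, not the competition code.

```python
import numpy as np
from scipy.optimize import minimize

def transit_model(t, coeffs, depth, t_in, t_out):
    """Cubic baseline multiplied by a box-shaped transit dip."""
    baseline = np.polyval(coeffs, t)
    dip = np.where((t >= t_in) & (t <= t_out), 1.0 - depth, 1.0)
    return baseline * dip

def loss(params, t, flux):
    c3, c2, c1, c0, depth, t_in, t_out = params
    pred = transit_model(t, [c3, c2, c1, c0], depth, t_in, t_out)
    return np.mean((flux - pred) ** 2)

# Synthetic light curve for demonstration only.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 500)
flux = transit_model(t, [0.02, -0.03, 0.01, 1.0], 0.002, 0.4, 0.6)
flux += rng.normal(0.0, 1e-4, t.size)

# Gradient-based phase detection: the steepest drop/rise marks ingress/egress
# (on real data this would be done on a smoothed curve).
grad = np.gradient(flux)
x0 = [0.0, 0.0, 0.0, 1.0, 1e-3, t[np.argmin(grad)], t[np.argmax(grad)]]

res = minimize(loss, x0, args=(t, flux), method="Nelder-Mead")
print("estimated transit depth:", res.x[4])
```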

Paper Implementations
Implemented influential AI and machine learning research papers, including transformer architectures (GPT variants, BERT, ViT) as well as LoRA and neural style transfer. I actively implement new papers and continuously update this repository.
code
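
As one example of the kind of technique implemented there, below is a minimal LoRA-style adapter around a frozen nn.Linear; the class name and hyperparameters are illustrative, not taken from the repository.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Low-rank adaptation: freeze the pretrained weight, learn a rank-r update."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

# Usage: wrap an existing projection, then train only A and B.
layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(2, 768))
```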

Measuring Patch Importance in ViTs (Vanilla & Attention Rollout)
Analyzed patch importance in Vision Transformers using the attention scores of the [CLS] token across the MHSA mechanisms in all blocks, visualizing the distribution of top-k patch tokens. Implemented Attention Rollout to propagate attention through the layers, creating interpretable visualizations of information flow and deepening understanding of self-attention mechanisms.
code
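
For reference, here is a compact sketch of the rollout computation (following Abnar & Zuidema, 2020); the input format, a list of per-block attention maps already averaged over the batch, is an assumption about how the maps were collected.

```python
import torch

def attention_rollout(attentions):
    """Propagate attention through all blocks via Attention Rollout.

    attentions: list of per-block tensors of shape (num_heads, tokens, tokens),
    ordered from the first block to the last (shape assumed for illustration).
    """
    result = torch.eye(attentions[0].size(-1))
    for attn in attentions:
        attn = attn.mean(dim=0)                       # average over heads
        attn = attn + torch.eye(attn.size(-1))        # add identity for residual connections
        attn = attn / attn.sum(dim=-1, keepdim=True)  # re-normalize rows
        result = attn @ result                        # compose with earlier layers
    # Row 0 is the [CLS] token; columns 1: give the patch-token importances.
    return result[0, 1:]
```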

Affiliations

Lab1055 | 2025
ACM-IKDD | Summer 2025
IIT Hyderabad | Summer 2024 & Summer 2025
RespAI Lab | 2025
DiL Lab | 2024
NIT Rourkela | 2022 - Present