Abhishek Jha

I'm a computer science graduate student at the Courant Institute, New York University. I completed my bachelor's degree at Delhi Technological University in New Delhi. My interests span computer vision, robotics, Bayesian machine learning, and reinforcement learning, with a focus on dexterous manipulation, multi-agent systems, and building reliable, interpretable decision-making systems across modalities.

Research

I work on building robots and learning agents that can act in the real world. My current focus is dexterous manipulation: I train robots to perform in-hand skills using reinforcement learning, combining vision and touch with 3D scene understanding and object pose estimation, and I spend a lot of time getting these policies to work on real hardware. I also work on vision-language-action (VLA) models as planners for visual navigation, where the robot follows open-vocabulary language instructions to move through unstructured environments without a pre-built map. Earlier, I worked on multi-agent coordination, focusing on safe and deadlock-free navigation of large groups of agents without centralized communication. On the theory side, I study reinforcement learning policies and try to understand what makes them sample-efficient, robust under distribution shift, and easier to transfer from simulation to real hardware. I am especially interested in improvements to regret and convergence bounds, exploration in high-dimensional spaces, stability of policy gradient methods, and how representation learning shapes the policy space.

Publications

2026
	Can K Heads Explore Better Than One in Online Reinforcement Learning? Abhishek Jha, Satyapragnya Kar, Kishlay Kumar, Stephanie Milani, Rajesh Ranganath ICML 2026 (DEMO Workshop) An ensemble framework for generative RL policies that pairs each generative head with its own Q-network to prevent head collapse under Q-weighted losses, together with an ensemble disagreement action-selection rule delivering consistent multi-seed gains on MuJoCo locomotion.
2025
	Multi-Robot Navigation in Social Mini-Games: Definitions, Taxonomy, and Algorithms Rohan Chandra, Shubham Singh, Abhishek Jha, Dannon Andrade, Hriday Sainathuni, Katia Sycara arXiv 2025 (preprint) A survey of Social Mini-Games in multi-robot navigation, proposing a unified taxonomy and evaluation framework to classify existing methods and guide future research.
	PV-S3: Advancing Automatic Photovoltaic Defect Detection using Semi-Supervised Semantic Segmentation of Electroluminescence Images Abhishek Jha, Yogesh Rawat, Shruti Vyas Engineering Applications of Artificial Intelligence, Elsevier 2025 project page A semi-supervised segmentation model (PV-S3) that detects defects in photovoltaic EL images using only 20% labeled data, outperforming state-of-the-art supervised methods and reducing annotation costs by 80%.
	Decentralized Safe and Scalable Multi-Agent Control under Limited Actuation Vrushabh Zinage, Abhishek Jha, Rohan Chandra, Efstathios Bakolas ICRA 2025 project page A single decentralized control algorithm using neural ICBFs and gradient optimization that ensures safe, input-constrained, deadlock-free control of 1000+ agents in cluttered environments.
2024
	Enhancing ASD Diagnosis with Contrastive and Non-Contrastive Models from Neuroimaging Data Abhishek Jha, Ishita Mehta, Kainat Khan, Rahul Katarya ICMNWC 2024 A fine-tuned transfer learning model built on SimCLR and SwAV that predicts autism from resting-state fMRI scans, showcasing the potential of contrastive and non-contrastive models for robust neuroimaging analysis.
	Strategic Pseudo-Goal Perturbation for Deadlock-Free Multi-Agent Navigation in Social Mini-Games Abhishek Jha, Tanishq Gupta, Sumit Singh Rawat, Girish Kumar ICCRE 2024 Introduces Strategic Pseudo-Goal Perturbation (SPGP), which resolves deadlocks in multi-agent navigation by guiding agents through strategic pseudo-goals, enhancing safety and efficiency in complex scenarios.
2023
	Diagnosis support model for Autism spectrum disorder using Neuroimaging data and Xception Abhishek Jha, Kainat Khan, Rahul Katarya ELEXCOM 2023 A transfer learning model using the Xception ConvNet predicts autism from resting-state fMRI scans, demonstrating the feasibility of early diagnosis through deep learning on brain imaging data.
2022
	Real Time Analysis of Material Removal Rate and Surface Roughness for Turning of Al-6061 using ANN and GA Abhishek Jha, Baibhav Kumar, Ashok Kumar Madan IJRESM 2022 An integrated ANN and Genetic Algorithm model that predicts and optimizes material removal rate and surface roughness in Al-6061 turning operations, enhancing machining precision through simulation-based methods.

Projects

	Ensemble Soft Actor Critic Supervisor: Prof. Rajesh Ranganath Code An ensemble framework for generative RL policies that pairs each generative head with its own Q-network to prevent head collapse under Q-weighted losses, together with an ensemble disagreement action-selection rule delivering consistent multi-seed gains on MuJoCo locomotion.
	Visual Subgoal Planning for Long Horizon Robot Navigation Supervisor: Prof. David Fouhey Code A hierarchical visual navigation system that plans long-horizon indoor goals from natural-language instructions, using a small vision-language model to encode the current observation and goal text into query tokens, a ColBERT-style late-interaction re-ranker to pick the next visual subgoal from a precomputed topological graph, and a frozen NoMaD diffusion policy to execute it. Benchmarked across vision-language backbones from 256M to 10B parameters.
	Self-Supervised Learning Using VICReg Supervisor: Prof. Alfredo Canziani, Prof. Yann LeCun Code / Project Report / Demo Video Pretrained VICReg-based self-supervised models on a 700K-image custom dataset using ResNet-50×2 with large-batch LARS training (batch size 1024) for stable non-contrastive SSL. Tuned augmentations and the variance/invariance/covariance loss coefficients to prevent representation collapse, then evaluated representations via linear probing and fine-tuning on downstream image classification under ImageNet-style constraints.
	Benchmarking Deadlock Resolution in Social Mini-Games Supervisor: Prof. Rohan Chandra Code A benchmark and survey of deadlock resolution in multi-robot navigation in Social Mini-Games.

Education

	New York University, Courant Institute M.S. in Computer Science 2025.09 - Present
	Delhi Technological University B.Tech in Mechanical Engineering | GPA: 4.0/4.0 2020.12 - 2024.05

This website is built on Jon Barron's template.