I'm a computer science graduate student at New York University (Courant). I completed my bachelor's degree at Delhi Technological University in New Delhi. My interests span computer vision, robotics, Bayesian machine learning, and reinforcement learning, with a focus on dexterous manipulation, multi-agent systems, and building reliable, interpretable decision-making systems across modalities.
I work on building robots and learning agents that can act in the real world. My current focus is dexterous manipulation: I train robots to perform in-hand skills using reinforcement learning, combining vision and touch with 3D scene understanding and object pose estimation, and I spend a lot of time getting these policies to work on real hardware. I also work on vision-language-action (VLA) models as planners for visual navigation, where the robot follows open-vocabulary language instructions to move through unstructured environments without a pre-built map. Earlier, I worked on multi-agent coordination, focusing on safe and deadlock-free navigation of large groups of agents without centralized communication. On the theory side, I study reinforcement learning policies to understand what makes them sample-efficient, robust under distribution shift, and easier to transfer from simulation to real hardware. I am especially interested in regret and convergence bounds, exploration in high-dimensional spaces, the stability of policy gradient methods, and how representation learning shapes the policy space.
An ensemble framework for generative RL policies that pairs each generative head with its own Q-network to prevent head collapse under Q-weighted losses, together with an ensemble-disagreement action-selection rule that delivers consistent multi-seed gains on MuJoCo locomotion.
A survey of Social Mini-Games in multi-robot navigation, proposing a unified taxonomy and evaluation framework to classify existing methods and guide future research.
A single decentralized control algorithm using neural ICBFs and gradient optimization that ensures safe, input-constrained, deadlock-free control of 1000+ agents in cluttered environments.
PV-S3: Advancing Automatic Photovoltaic Defect Detection using Semi-Supervised Semantic Segmentation of Electroluminescence Images
Abhishek Jha,
Yogesh Rawat,
Shruti Vyas
Engineering Applications of Artificial Intelligence, Elsevier
project page
/
arXiv
A semi-supervised segmentation model (PV-S3) that detects defects in photovoltaic EL images using only 20% labeled data, outperforming state-of-the-art supervised methods and reducing annotation costs by 80%.
A fine-tuned transfer learning model built on SimCLR and SwAV that predicts autism from resting-state fMRI scans, showcasing the potential of contrastive and non-contrastive models for robust neuroimaging analysis.
Strategic Pseudo-Goal Perturbation (SPGP), a method that resolves deadlocks in multi-agent navigation by guiding agents through strategic pseudo-goals, enhancing safety and efficiency in complex scenarios.
Diagnosis support model for Autism spectrum disorder using Neuroimaging data and Xception
Abhishek Jha,
Kainat Khan,
Rahul Katarya
ELEXCOM, 2023
paper
A transfer learning model using the Xception ConvNet that predicts autism from resting-state fMRI scans, demonstrating the feasibility of early diagnosis through deep learning on brain imaging data.
Real Time Analysis of Material Removal Rate and Surface Roughness for Turning of Al-6061 using ANN and GA
Abhishek Jha,
Baibhav Kumar,
Ashok Kumar Madan
IJRESM, 2022
paper
An integrated ANN and genetic algorithm model that predicts and optimizes material removal rate and surface roughness in Al-6061 turning operations, enhancing machining precision through simulation-based methods.
Visual Subgoal Planning for Long Horizon Robot Navigation
Supervisor: Prof. David Fouhey
/
Code
A hierarchical visual navigation system that plans long-horizon indoor goals from natural-language instructions, using a small vision-language model to encode the current observation and goal text into query tokens, a ColBERT-style late-interaction re-ranker to pick the next visual subgoal from a precomputed topological graph, and a frozen NoMaD diffusion policy to execute it. Benchmarked across vision-language backbones from 256M to 10B parameters.
Benchmarking Deadlock Resolution in Social Mini-Games
Supervisor: Prof. Rohan Chandra
/
Code
A Benchmark and Survey of Deadlock Resolution in Multi-Robot Navigation in Social Mini-Games
Autonomous navigation of TurtleBot using SLAM
Code
Autonomous navigation and trajectory planning of a robot using the Robot Operating System (ROS). A maze is created in Gazebo, and the robot determines the best possible trajectory through it with collision avoidance. A probabilistic localization method is used for navigation: the Adaptive Monte Carlo Localization (AMCL) node and the slam_gmapping package are used for localization and mapping of the robot. The RViz interface is used for the simulation of the robot and for creating the cost map for its travel.
This website is built on Jon Barron's template.