Research Projects
ESCORT Framework
Status: Completed
Efficient Stein-variational and Sliced Consistency-Optimized Temporal belief Representation for POMDPs. A particle-based framework for capturing complex, multi-modal distributions in high-dimensional belief spaces.
Key Contributions:
- Correlation-aware projections modeling state dependencies
- Temporal consistency constraints for stable updates
- Superior performance on multi-modal distributions
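For readers unfamiliar with the Stein-variational idea behind particle-based belief representations, the following is a minimal sketch of a generic Stein variational gradient descent (SVGD) update in one dimension. It is not ESCORT itself: it omits the correlation-aware projections and temporal consistency constraints listed above, and the function name `svgd_step`, the fixed RBF bandwidth, and the Gaussian target are assumptions chosen purely for illustration.

```python
import math


def svgd_step(particles, score, step_size=0.1, bandwidth=1.0):
    """One generic SVGD update in one dimension (illustrative only).

    score(x) is the derivative of the log target density at x.
    """
    n = len(particles)
    h2 = bandwidth ** 2
    updated = []
    for xi in particles:
        phi = 0.0
        for xj in particles:
            # RBF kernel k(xj, xi) with a fixed (assumed) bandwidth.
            k = math.exp(-((xj - xi) ** 2) / (2.0 * h2))
            attract = k * score(xj)      # pulls xi toward high-density regions
            repel = k * (xi - xj) / h2   # d/dxj k(xj, xi): keeps particles spread out
            phi += attract + repel
        updated.append(xi + step_size * phi / n)
    return updated


def gaussian_score(x, mean=2.0):
    """Score of a unit-variance Gaussian target (an assumed toy target)."""
    return -(x - mean)


# Start all particles far from the target and move them with repeated updates.
particles = [-3.0 + 0.1 * i for i in range(20)]
for _ in range(500):
    particles = svgd_step(particles, gaussian_score)

particle_mean = sum(particles) / len(particles)
particle_range = max(particles) - min(particles)
```

The attraction term moves particles toward regions the target considers likely, while the repulsion term stops them from collapsing onto a single point, which is what lets a particle set cover more than one mode when the target has several.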
NS-Gym
Status: Completed
First simulation toolkit designed explicitly for non-stationary MDPs, integrated with OpenAI Gymnasium. Provides standardized benchmarks and environments for testing adaptive algorithms.
Features:
- Modular environment parameter evolution
- 6+ benchmark environments
- Compatible with standard RL algorithms
- Comprehensive evaluation metrics
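To make "environment parameter evolution" concrete, here is a minimal sketch of a non-stationary decision problem in plain Python. It does not use NS-Gym's or Gymnasium's API; the class `DriftingBandit`, its `drift` parameter, and the linear drift schedule are all hypothetical and serve only to show an environment parameter changing between steps.

```python
import random


class DriftingBandit:
    """Toy two-armed bandit whose payoff probability drifts over time.

    Hypothetical illustration only; this is not NS-Gym's interface.
    """

    def __init__(self, p_start=0.9, drift=-0.01, seed=0):
        self.p = p_start        # success probability of arm 0
        self.drift = drift      # per-step change applied to p
        self.rng = random.Random(seed)
        self.t = 0

    def step(self, action):
        # Arm 0 pays with probability p, arm 1 with probability 1 - p.
        p_success = self.p if action == 0 else 1.0 - self.p
        reward = 1.0 if self.rng.random() < p_success else 0.0
        # Evolve the environment parameter, clipped to [0, 1].
        self.p = min(1.0, max(0.0, self.p + self.drift))
        self.t += 1
        return reward


def run_fixed_policy(action, horizon=100):
    """Total reward from always playing the same arm."""
    env = DriftingBandit()
    return sum(env.step(action) for _ in range(horizon))
```

In this toy setting the arm that is best at the start becomes the worse choice once `p` has drifted far enough, so a policy that never adapts loses reward over time. That is the kind of behavior a non-stationary benchmark is meant to expose.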
Shrinking POMCP
Status: Completed
Real-time UAV search and rescue framework combining advanced simulation with novel POMDP planning. Addresses time constraints by guiding agents toward non-sparse belief regions.
Applications:
- UAV search and rescue operations
- 3D AirSim-ROS2 integration
- Neuro-symbolic navigation
AIROAS
Status: Completed
Annealed Importance Resampling for Observation Adaptation Search. Addresses particle degeneracy and sample impoverishment in POMDP belief updating.
Innovations:
- Sigmoid-based tempering for tree search
- Target inefficiency ratio mechanism
- Superior performance in highly observable settings
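The particle degeneracy mentioned above can be illustrated with the effective sample size (ESS) of a set of importance weights, together with the general idea of tempering the likelihood. This is a generic sketch, not the AIROAS algorithm: it uses a single tempering exponent `beta` rather than a sigmoid-based schedule, and the helper names and log-likelihood values are made up for illustration.

```python
import math


def normalize(weights):
    """Scale non-negative weights so they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2) for normalized weights w_i."""
    w = normalize(weights)
    return 1.0 / sum(x * x for x in w)


def tempered_weights(log_likelihoods, beta):
    """Importance weights proportional to likelihood ** beta, with 0 <= beta <= 1."""
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_likelihoods)
    return normalize([math.exp(beta * (ll - m)) for ll in log_likelihoods])


# A sharply peaked observation model: one particle explains the observation
# far better than the rest (values are invented for this example).
log_liks = [0.0, -8.0, -9.0, -10.0, -12.0]

ess_full = effective_sample_size(tempered_weights(log_liks, beta=1.0))
ess_tempered = effective_sample_size(tempered_weights(log_liks, beta=0.25))
```

With the full likelihood (`beta=1.0`) nearly all weight lands on one particle, so the ESS is close to 1 and resampling would duplicate that particle almost exclusively. A smaller `beta` flattens the weights and raises the ESS. Annealed schemes move through a sequence of such intermediate targets instead of applying the full likelihood in one step.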
Vehicle-to-Building Optimization
Status: Completed
Online decision-making system for V2B energy management using Monte Carlo Tree Search. Deployed with Nissan Advanced Technology Center.
Impact:
- 30% reduction in peak power demand
- Real-world EV testbed validation
- Handles heterogeneous charger configurations
I-TAP
Status: Active
In-Context Latent Temporal Abstraction Planner, combining in-context adaptation with online planning in learned temporal abstraction spaces.
Benchmarks:
- MuJoCo locomotion tasks
- High-dimensional Adroit manipulation
- Effective under partial observability
PA-MCTS
Status: Completed
Policy-Augmented Monte Carlo Tree Search for non-stationary environments. Combines offline learning with online search for robust decision-making.
Results:
- Outperforms AlphaZero in non-stationary settings
- Theoretical convergence guarantees
- Validated on OpenAI Gym environments
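One simple way to combine an offline-learned value estimate with online search returns is a convex combination of the two. The sketch below shows that idea only; the description above does not state PA-MCTS's exact combination rule, so the function `combined_action`, the mixing weight `alpha`, and the example values are assumptions for illustration.

```python
def combined_action(q_offline, q_search, alpha):
    """Pick the action maximizing alpha * q_offline + (1 - alpha) * q_search.

    q_offline and q_search map each action to a value estimate.
    alpha in [0, 1] controls how much the offline estimate is trusted.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    scores = {
        a: alpha * q_offline[a] + (1.0 - alpha) * q_search[a]
        for a in q_offline
    }
    return max(scores, key=scores.get)


# Invented values: the offline estimate was learned before the environment
# changed, while the search estimate reflects the current dynamics.
q_offline = {"left": 1.0, "right": 0.2}
q_search = {"left": 0.1, "right": 0.8}

trust_offline = combined_action(q_offline, q_search, alpha=1.0)
trust_search = combined_action(q_offline, q_search, alpha=0.0)
```

With `alpha=1.0` the agent follows the stale offline estimate, and with `alpha=0.0` it relies entirely on search. Intermediate values trade off the two sources, which matters when the environment has drifted away from the one used for offline training.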
Adaptive MCTS
Status: Completed
Adaptive Monte Carlo Tree Search that learns updated dynamics while maintaining safe exploration through dual-phase sampling strategies.
Key Features:
- Bayesian uncertainty quantification
- Risk-averse exploration
- Online adaptation to environment changes