Pace-and-Path Correction decomposes a quadratic cost minimization into orthogonal pace and path channels to correct chunked actions in VLA models, raising success rates by up to 28.8% in dynamic settings.
Hirt: Enhancing robotic control with hierarchical robot transformers.arXiv preprint arXiv:2410.05273
7 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 7representative citing papers
ROSA introduces shared GPU-pool serving, robotics-aware abstractions for multi-model pipelines, and factory-productivity scheduling that improves output by up to 12.06x over dedicated per-robot systems.
ROAD-VLA constructs an advantage-perturbed proximal teacher in action space to convert sparse rewards into dense supervision for online VLA adaptation and reports outperformance versus PPO across seven manipulation environments.
UniFS achieves 98.3% success on LIBERO with 2.1x lower latency than prior fast-slow VLA models by stratifying VLM layer update frequencies, inverting latent interactions, and applying multi-level supervision.
Veo-3 video predictions enable approximate task-level robot trajectories in zero-shot settings but require hierarchical integration with low-level VLA policies for reliable manipulation performance.
A controllable world model trained on the DROID dataset generates consistent multi-view robot trajectories for over 20 seconds and improves generalist policy success rates by 44.7% via imagined trajectory fine-tuning.
Video Prediction Policy conditions robot action learning on future-frame predictions inside fine-tuned video diffusion models, yielding 18.6% relative gains on Calvin ABC-D and 31.6% higher real-world success rates.
citing papers explorer
-
Overcoming Dynamics-Blindness: Training-Free Pace-and-Path Correction for VLA Models
Pace-and-Path Correction decomposes a quadratic cost minimization into orthogonal pace and path channels to correct chunked actions in VLA models, raising success rates by up to 28.8% in dynamic settings.
-
ROSA: A Robotics Foundation Model Serving System for Robot Factories
ROSA introduces shared GPU-pool serving, robotics-aware abstractions for multi-model pipelines, and factory-productivity scheduling that improves output by up to 12.06x over dedicated per-robot systems.
-
ROAD-VLA: Robust Online Adaptation via Self-Distillation for Vision-Language-Action Models
ROAD-VLA constructs an advantage-perturbed proximal teacher in action space to convert sparse rewards into dense supervision for online VLA adaptation and reports outperformance versus PPO across seven manipulation environments.
-
UniFS: Unified Fast-to-Slow Hierarchical Architecture for Vision-Language-Action Models
UniFS achieves 98.3% success on LIBERO with 2.1x lower latency than prior fast-slow VLA models by stratifying VLM layer update frequencies, inverting latent interactions, and applying multi-level supervision.
-
Veo-Act: How Far Can Frontier Video Models Advance Generalizable Robot Manipulation?
Veo-3 video predictions enable approximate task-level robot trajectories in zero-shot settings but require hierarchical integration with low-level VLA policies for reliable manipulation performance.
-
Ctrl-World: A Controllable Generative World Model for Robot Manipulation
A controllable world model trained on the DROID dataset generates consistent multi-view robot trajectories for over 20 seconds and improves generalist policy success rates by 44.7% via imagined trajectory fine-tuning.
-
Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations
Video Prediction Policy conditions robot action learning on future-frame predictions inside fine-tuned video diffusion models, yielding 18.6% relative gains on Calvin ABC-D and 31.6% higher real-world success rates.