LLM4Drive: A survey of large language models for autonomous driving. arXiv, abs/2311.01043
Canonical reference. 100% of citing Pith papers cite this work as background.
Citation roles: background (5)
Citation polarities: background (5)
citing papers explorer
- Demystifying the Silence of Correctness Bugs in PyTorch Compiler
  First empirical study of correctness bugs in torch.compile characterizes their patterns and proposes AlignGuard, which found 23 confirmed new bugs via LLM-guided test mutation.
- TPS-Drive: Task-Guided Representation Purification for VLM-based Autonomous Driving
  TPS-Drive uses an agent-centric tokenizer supervised by a frozen 3D detection head to purify VLM spatial representations, enabling better scene forecasting and lower collision rates on nuScenes and NAVSIM benchmarks.
- Towards Robustness against Typographic Attack with Training-free Concept Localization
  Training-free mechanistic interpretability locates lexical-encoding attention heads in ViT and shows that targeted interventions on them improve robustness to typographic attacks in CLIP and downstream LVLMs.
- Neuro-Symbolic Drive: Rule-Grounded Faithful Reasoning for Driving VLAs
  The paper fine-tunes Qwen3.5-4B as a driving VLA using serialized decision traces from rule-based planners, reporting reduced ADE and miss rate on a simulator benchmark with camera inputs.
- Large Language Model based Interactive Decision-Making for Autonomous Driving
  An LLM framework using Object-Process Methodology for scene understanding and intent-aware interaction outperforms baselines in simulator tests for safety, comfort, and efficiency in mixed-traffic autonomous driving.
- PALM: Progress-Aware Policy Learning via Affordance Reasoning for Long-Horizon Robotic Manipulation
  PALM improves long-horizon robotic manipulation success by distilling affordance representations for object interaction and predicting within-subtask progress in a VLA model.
- ReSim: Reliable World Simulation for Autonomous Driving
  ReSim is a controllable video world model trained on heterogeneous real and simulated driving data that achieves higher fidelity and controllability for both expert and non-expert actions, plus a Video2Reward module for estimating action quality from simulated futures.
- LiloDriver: A Lifelong Learning Framework for Closed-loop Motion Planning in Long-tail Autonomous Driving Scenarios
  LiloDriver uses LLMs and memory-augmented planning in a four-stage pipeline to outperform rule-based and learning-based methods on both common and rare scenarios in the nuPlan benchmark.
- DriveMoE: Mixture-of-Experts for Vision-Language-Action Model in End-to-End Autonomous Driving
  DriveMoE applies scene-specialized Vision MoE and skill-specialized Action MoE to a VLA baseline to achieve SOTA closed-loop performance on Bench2Drive.
- DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models
  DriveVLM adds vision-language models with scene description, analysis, and hierarchical planning modules to autonomous driving, paired with a hybrid DriveVLM-Dual system tested on nuScenes and SUP-AD datasets and deployed on a production vehicle.
- A Large-Language-Model Supported Personalized Driving Framework for Lane Change in Highway Scenarios
  An LLM-supported framework maps natural-language commands to distinguishable Apollo lane-change parameters for three driving styles via clustering and RAG, with experiments showing improved interpretation of implicit preferences.
- On-Policy Distillation of Language Models for Autonomous Vehicle Motion Planning
  On-policy GKD trains 5x smaller student LLMs to nearly match large teacher performance in AV motion planning on nuScenes while beating a dense-feedback RL baseline.
- Scaling Laws for Moral Machine Judgment in Large Language Models
  Moral alignment in LLMs improves with model size according to the power law D ∝ S^{-0.10} (R²=0.50).
- Structured Labeling Enables Faster Vision-Language Models for End-to-End Autonomous Driving
  Introduces structured NuScenes-S dataset and 0.9B FastDrive VLM claiming 20% higher decision accuracy and over 10x inference speedup versus larger unstructured VLMs.
- A Survey on the Memory Mechanism of Large Language Model based Agents
  A systematic review of memory designs, evaluation methods, applications, limitations, and future directions for LLM-based agents.
- Large Language Models in Transportation Systems Management and Operations: From Text Reasoning to Multi-modal Decision Support
  A survey synthesizing LLM and MM-LLM uses in transportation operations, mobility services, and decision support while noting challenges like data heterogeneity and real-time needs.