ECCV'26 Autonomous Driving Fast-Slow LLM Planning

ASSCG: Just-Right Gating over Chattering for Fast-Slow LLM Planning in Autonomous Driving

Sining Ang, Yuan Chen, Haiyan Liu, Xuanyao Mao, Jason Bao, Xuliang, Bingchuan Sun, Yan Wang

Institute for AI Industry Research, Tsinghua University · University of Science and Technology of China · Beihang University · Lenovo Group Limited

A lightweight gate for deciding when a fast-slow autonomous driving planner should query an LLM, reuse cached guidance, or drop unreliable slow-system outputs under a compute budget.

67.28 nuPlan Hard20 score (+2.28 absolute)
~60% latency reduction
91.4 NAVSIM PDMS (+0.6 absolute)
~25% average speed gain

Demo

Adaptive querying in closed-loop driving

Problem

Slow reasoning is useful, but not always worth querying

Fixed intervals miss temporal variation

Triggering the slow module at a fixed frequency wastes compute in easy segments and may miss the moments where slow reasoning is most useful.

Difficulty proxies can chatter

Heuristic complexity estimates do not necessarily align with the marginal utility of slow guidance, causing unnecessary oscillation and mis-timed queries.
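
To make the chattering failure mode concrete, the sketch below thresholds a per-frame difficulty proxy and counts how often the trigger decision flips. The proxy values, the 0.5 threshold, and the function names are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def threshold_trigger(difficulty, threshold=0.5):
    """Query the slow module whenever the difficulty proxy exceeds the threshold."""
    return [d > threshold for d in difficulty]


def count_flips(decisions):
    """Number of frame-to-frame changes in the trigger decision."""
    return sum(a != b for a, b in zip(decisions, decisions[1:]))


# Hypothetical proxy values hovering around the threshold.
difficulty = [0.48, 0.52, 0.49, 0.51, 0.47, 0.53]
decisions = threshold_trigger(difficulty)
flips = count_flips(decisions)
print(decisions, flips)
```

In this toy trace the scene barely changes, yet the trigger toggles on every frame, which is the kind of oscillation a learned gate is meant to avoid.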

ASSCG learns the invocation policy

We cast slow-module control as sequence generation and train a frame-level controller to choose Query, Cache, or Drop actions.

Method

Adaptive Slow-System Control Gate

Architecture of the Adaptive Slow-System Control Gate
ASSCG makes frame-level Query, Cache, and Drop decisions, allowing a fast planner to selectively refresh, reuse, or suppress slow LLM guidance.
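
The control flow implied by the three actions can be sketched as a simple closed loop. This is a minimal illustration, not the authors' implementation: `gate`, `fast_planner`, and `slow_module` are hypothetical callables, and the sketch assumes that Drop withholds guidance for the current frame while leaving the cache untouched.

```python
from enum import Enum


class GateAction(Enum):
    QUERY = "query"   # run the slow module and refresh the cache
    CACHE = "cache"   # reuse the most recent slow output
    DROP = "drop"     # withhold slow guidance for this frame (assumed semantics)


def run_episode(frames, gate, fast_planner, slow_module):
    """Apply one gate decision per frame; return (plans, number of slow queries)."""
    cached = None
    plans, queries = [], 0
    for frame in frames:
        action = gate(frame)
        if action is GateAction.QUERY:
            cached = slow_module(frame)
            queries += 1
            guidance = cached
        elif action is GateAction.CACHE:
            guidance = cached          # None if nothing has been queried yet
        else:                          # GateAction.DROP
            guidance = None
        plans.append(fast_planner(frame, guidance))
    return plans, queries


# Toy stand-ins so the loop can be exercised end to end.
script = [GateAction.QUERY, GateAction.CACHE, GateAction.DROP, GateAction.CACHE]
plans, queries = run_episode(
    range(4),
    gate=lambda frame: script[frame],
    fast_planner=lambda frame, guidance: (frame, guidance),
    slow_module=lambda frame: f"guidance@{frame}",
)
print(plans, queries)
```

The fast planner runs on every frame regardless of the gate's choice; only the guidance it receives changes.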

Abstract

Resource-aware slow-system invocation

Large language models can improve autonomous driving planning, but querying them online is expensive. Existing fast-slow planners often rely on hand-designed triggering rules that either over-call the slow system or call it at the wrong times.

We formulate slow-system invocation as a resource-aware sequential decision problem and propose the Adaptive Slow-System Control Gate (ASSCG), which makes frame-level Query, Cache, and Drop decisions. ASSCG uses an RWKV backbone for efficient long-horizon gating and is trained with supervised fine-tuning followed by GRPO-style compute-aware reinforcement fine-tuning.

ASSCG improves AsyncDriver on nuPlan Hard20 to 67.28 (+2.28) while reducing average end-to-end inference latency by approximately 60%. On a RecogDrive-based dual system evaluated on NAVSIM, ASSCG achieves 91.4 PDMS (+0.6) and increases average speed by approximately 25%.

Training and Evaluation

From supervised labels to compute-aware reinforcement fine-tuning

Sequence-generation gate

Gating is cast as token prediction: the controller emits one gating token (Query, Cache, or Drop) per frame, which enables long-horizon temporal modeling with an RWKV backbone.

SFT to GRPO-style optimization

ASSCG starts from supervised pseudo-labels and is then fine-tuned with a compute-aware objective that balances planning quality against slow-query cost.
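
One way to read "compute-aware" is a reward that subtracts a per-query penalty from planning quality, followed by the group-relative normalization used in GRPO-style training. The sketch below assumes a linear penalty; the `query_cost` value, the rollout numbers, and the function names are hypothetical, and the paper's actual reward may differ.

```python
from statistics import mean, pstdev


def compute_aware_reward(quality, num_queries, query_cost=0.1):
    """Planning quality minus a linear penalty per slow query (assumed form)."""
    return quality - query_cost * num_queries


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of rollouts (GRPO-style)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group of three rollouts of the same scene: (quality, slow queries).
rollouts = [(0.70, 4), (0.68, 1), (0.55, 0)]
rewards = [compute_aware_reward(q, n) for q, n in rollouts]
advantages = group_relative_advantages(rewards)
print(rewards, advantages)
```

With these toy numbers the rollout that queries once ranks highest: it gives up a little quality relative to the four-query rollout but pays far less query cost, and it still beats never querying.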

Two fast-slow instantiations

We evaluate on AsyncDriver with nuPlan Hard20 closed-loop testing and on a RecogDrive-based dual system with NAVSIM PDMS and speed measurements.

Analysis

Equivalent, failure, and effective intervals

Equivalent Interval

Re-querying the slow system yields negligible closed-loop change, so cached guidance remains sufficient.

Failure Interval

Slow guidance can actively hurt the trajectory. ASSCG can choose Drop to suppress the cached slow feature.

Effective Interval

The current slow guidance improves behavior and should be reused until its value decays or context changes.

Interval analysis of fast-slow autonomous driving planning
Slow guidance is not uniformly useful across time. We analyze intervals where it is equivalent to the fast planner, intervals where it hurts, and intervals where it improves closed-loop behavior.
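
The three interval types can be illustrated by thresholding the change in closed-loop score that slow guidance causes over a segment. This is a sketch under stated assumptions: the tolerance `eps`, the score values, and the function name are hypothetical, and the paper may label intervals differently.

```python
def label_interval(score_with_guidance, score_reference, eps=0.01):
    """Classify a segment by the score change attributable to slow guidance."""
    delta = score_with_guidance - score_reference
    if abs(delta) <= eps:
        return "equivalent"   # negligible change
    return "effective" if delta > 0 else "failure"


# Hypothetical (with guidance, reference) closed-loop scores for three segments.
segments = [(0.80, 0.80), (0.62, 0.75), (0.90, 0.78)]
labels = [label_interval(w, r) for w, r in segments]
print(labels)
```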

Introduction

Why gating matters

Motion planning remains a core challenge in autonomous driving, especially in complex, dynamic, and long-tail interactions. LLMs provide commonsense reasoning and scenario priors, but their latency and compute cost make per-frame deployment difficult.

The fast-slow paradigm offers a practical compromise: a real-time fast planner runs every frame while a slower LLM module provides high-level guidance at a lower frequency. However, fixed schedules ignore temporal variability, and heuristic difficulty triggers often fail to capture the marginal utility of slow reasoning.

ASSCG addresses this by learning when to consult the slow module, when to reuse cached guidance, and when to drop unreliable slow outputs. This turns slow-module coordination into a sequential decision problem optimized for closed-loop quality under a compute budget.

Contributions

  1. We introduce effective and failure intervals to characterize when slow guidance helps or hurts over time.
  2. We present ASSCG, a frame-level gate trained with GRPO-style reinforcement fine-tuning under query costs.
  3. We evaluate ASSCG on AsyncDriver with nuPlan Hard20 and a RecogDrive-based fast-slow planner on NAVSIM.

Results

Improved performance-efficiency trade-offs

nuPlan Hard20 closed-loop evaluation

Method | Score | Latency (s/frame)
AsyncDriver | 65.00 | 0.80
AsyncDriver, 5-frame interval | 64.27 | 0.32
AdaptiveAsyncDriver (ASSCG) | 67.28 | 0.32

ASSCG raises the score by 2.28 points over AsyncDriver (65.00 to 67.28) while matching the 0.32 s/frame latency of the fixed 5-frame interval baseline, which instead loses 0.73 points relative to AsyncDriver.
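
The headline figures follow directly from the nuPlan Hard20 table above. The short check below reproduces the arithmetic; the helper name `relative_reduction` is introduced here only for illustration.

```python
def relative_reduction(baseline, new):
    """Fractional reduction of `new` relative to `baseline`."""
    return (baseline - new) / baseline


# Values copied from the nuPlan Hard20 table above.
latency_reduction = relative_reduction(0.80, 0.32)  # AsyncDriver vs. ASSCG, s/frame
score_gain = 67.28 - 65.00                          # ASSCG vs. AsyncDriver
gain_over_fixed = 67.28 - 64.27                     # ASSCG vs. fixed 5-frame interval
print(f"{latency_reduction:.0%}, {score_gain:+.2f}, {gain_over_fixed:+.2f}")
```

At equal latency, the learned gate is therefore about 3 points ahead of the fixed-interval schedule.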

NAVSIM fast-slow dual system

Method | PDMS | Latency (ms)
ReCogDrive | 90.8 | ~350
RecogDrive-based dual system + ASSCG | 91.4 | ~270

The same gating principle transfers to NAVSIM, improving PDMS by 0.6 (90.8 to 91.4) while reducing wall-clock inference latency from about 350 ms to about 270 ms.

Ablations

Why the design choices matter

GRPO improves closed-loop score

Starting from supervised pseudo-labels gives a usable gate, while GRPO-style fine-tuning optimizes the policy directly for closed-loop reward under query cost.

RWKV keeps long-horizon gating efficient

Compared with Transformer-style attention, RWKV provides a compact recurrent state and lower last-frame gate latency for long scenes.
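
The efficiency argument rests on how much history each design must carry per frame. The sketch below is not RWKV: it uses a scalar exponential moving average as a stand-in for a fixed-size recurrent state and a growing list as a stand-in for an attention cache. The decay value and function names are hypothetical.

```python
def recurrent_state_trace(frames, decay=0.9):
    """Fold each frame feature into a fixed-size running state."""
    state = 0.0
    state_sizes = []
    for x in frames:
        state = decay * state + (1.0 - decay) * x
        state_sizes.append(1)  # one number, regardless of history length
    return state, state_sizes


def attention_cache_sizes(frames):
    """An attention-style model keeps every past frame available to attend over."""
    cache, sizes = [], []
    for x in frames:
        cache.append(x)
        sizes.append(len(cache))
    return sizes


frames = [1.0] * 5
state, recurrent_sizes = recurrent_state_trace(frames)
cache_sizes = attention_cache_sizes(frames)
print(recurrent_sizes, cache_sizes)
```

The recurrent stand-in does constant work and holds constant memory at the last frame, while the cache grows linearly with scene length.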

Drop is not just a speed trick

The Drop action suppresses cached slow guidance during failure intervals, improving robustness beyond simply reducing query frequency.