
ProAct: A Proactive AI Assistant Architecture for Anticipatory Computing

Introducing ProAct, a novel agent architecture that transforms idle intervals into structured cycles of anticipation and learning to enhance user experience and efficiency.

May 27, 2026

#Academic #Agents #LLM #Memory

This article summarizes ProAct, a proactive AI assistant designed to anticipate user needs and acquire information during idle time. By shifting computation from peak interaction periods to idle intervals, ProAct aims to reduce user effort, accelerate task completion, and improve factual grounding through a closed loop of prediction, acquisition, and utility-aware delivery.

The Reactive Status Quo and a Proactive Alternative

Today’s AI assistants remain fundamentally reactive: they compute responses only after explicit user prompts, leaving the idle time between interactions unused. This contrasts with the psychological concept of proactive coping, where individuals anticipate future demands and prepare resources in advance.

The paper introduces ProAct, a proactive agent architecture that transforms idle intervals into structured cycles of anticipation and learning. Instead of waiting for a request, ProAct analyzes dialogue history and persistent memory to predict likely upcoming user needs, then acquires supporting evidence during idle windows. A value-aware delivery gate ensures that prepared content is surfaced only when it is genuinely useful, avoiding irrelevant interruptions.

Figure 1: Current assistants wait for explicit requests and leave idle-time compute unused.
ProAct instead uses dialogue history and persistent memory to predict likely future needs, explores high-value candidates during idle windows, and feeds the resulting knowledge back into later interactions.

This paradigm shifts substantial computation from interaction peaks to off-peak periods, aiming to reduce user effort, accelerate task completion, and improve factual grounding.

ProAct Architecture: Prediction, Acquisition, and Delivery

ProAct operates through a closed loop that couples foreground interactions with background preparation. After each user turn, the system updates its persistent memory, which stores user profiles, conversation summaries, entity facts, and previously acquired artifacts.
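To make the memory update concrete, here is a minimal sketch of a store holding the four kinds of content listed above. The class name, field names, and the `update_after_turn` method are hypothetical; the paper's actual memory schema is not described in this article.

```python
from dataclasses import dataclass, field


@dataclass
class PersistentMemory:
    """Hypothetical container for the four kinds of content the text lists."""
    user_profile: dict = field(default_factory=dict)
    conversation_summaries: list = field(default_factory=list)
    entity_facts: dict = field(default_factory=dict)   # entity -> list of facts
    artifacts: list = field(default_factory=list)      # previously acquired artifacts

    def update_after_turn(self, summary: str, facts: dict) -> None:
        """Fold one finished user turn into memory (assumed update rule)."""
        self.conversation_summaries.append(summary)
        for entity, new_facts in facts.items():
            known = self.entity_facts.setdefault(entity, [])
            for fact in new_facts:
                # Keep facts unique while preserving insertion order.
                if fact not in known:
                    known.append(fact)
```

Keeping summaries, entity facts, and artifacts in separate fields is one simple way to let a predictor query each kind of content independently.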

During the subsequent idle interval, two tightly integrated modules take over:

Future-State Prediction generates a compact set of candidate future needs by extrapolating from the recent dialogue and expanding into related topics grounded in memory. It also incorporates signals from memory maintenance, converting stale or missing knowledge into prediction targets.
Idle-Time Acquisition scores each candidate using a value function that balances user relevance, knowledge gaps, incremental value, and timeliness. Only high-scoring candidates receive search budget. The module then retrieves or reuses evidence, generates compact knowledge artifacts with provenance, and commits them to memory.

A utility-aware delivery policy decides whether each artifact should be pushed immediately to the user, queued for integration into a later response, or stored silently for future use. This gate prevents proactive work from overwhelming the user with low-value content.
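The three delivery outcomes can be pictured as a simple gate. The sketch below assumes the policy reduces to a scalar utility estimate compared against two thresholds; the function name, the thresholds, and their default values are illustrative assumptions, not the paper's policy.

```python
from enum import Enum


class Delivery(Enum):
    PUSH = "push"    # surface to the user immediately
    QUEUE = "queue"  # integrate into a later response
    STORE = "store"  # keep silently in memory for future use


def delivery_decision(utility: float,
                      push_threshold: float = 0.8,
                      queue_threshold: float = 0.5) -> Delivery:
    """Map an estimated utility to one of the three delivery outcomes.

    Both thresholds are hypothetical tuning knobs.
    """
    if push_threshold < queue_threshold:
        raise ValueError("push_threshold must be >= queue_threshold")
    if utility >= push_threshold:
        return Delivery.PUSH
    if utility >= queue_threshold:
        return Delivery.QUEUE
    return Delivery.STORE
```

Storing silently is the default outcome here, which matches the stated goal of not overwhelming the user with low-value content.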

Figure 2: ProAct overview.
After each foreground interaction, the agent updates persistent memory, predicts likely future needs during idle intervals, and acquires evidence for high-value candidates.
A utility-aware delivery policy then handles the resulting artifacts for future use.

Formalizing Proactive Agent Behavior

The proactive interaction is formulated as a closed-loop decision problem. Let $H_t$ be the dialogue history up to turn $t$ and $M_t$ the persistent memory state. During an idle window with budget $B_t$, the predictor generates a set of candidate future needs:

$\mathcal{Z}_t = f_{\mathrm{pred}}(H_t, M_t)$

Each candidate $z$ is represented as $(q_z, e_z, c_z, \rho_z)$: the anticipated need, grounding rationale, confidence, and retrieval plan.
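As a data structure, the candidate tuple might look like the sketch below. The field types are assumptions: the article does not say how the retrieval plan is encoded (a tuple of search queries is used here) or that confidence lies in $[0, 1]$.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Candidate:
    """One predicted future need z = (q_z, e_z, c_z, rho_z)."""
    need: str                        # q_z: the anticipated need
    rationale: str                   # e_z: why memory or dialogue suggests it
    confidence: float                # c_z: assumed to lie in [0, 1]
    retrieval_plan: Tuple[str, ...]  # rho_z: assumed to be a tuple of queries

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
```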

The proactive policy $\pi$ selects candidates, allocates budget, and assigns delivery decisions $d_z$ to maximize expected future utility under interruption, budget, and hallucination constraints. Because downstream utility is unobservable at idle time, ProAct uses a candidate-level value score for acquisition gating:

$S(z) = w_r r_z + w_g g_z + w_v v_z + w_\tau \tau_z$

where $r_z$ is user relevance, $g_z$ the knowledge gap, $v_z$ incremental value, and $\tau_z$ timeliness. Only candidates with $S(z) \ge \theta_{\mathrm{val}}$ proceed to evidence acquisition. This scoring mechanism ties prediction directly to resource allocation, ensuring idle-time compute is spent only on high-value preparation.
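The scoring and gating step can be sketched in a few lines. The weighted sum and the threshold test follow the formula above; the equal default weights, the interpretation of the budget as a maximum number of candidates, and all names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    relevance: float   # r_z
    gap: float         # g_z
    value: float       # v_z
    timeliness: float  # tau_z


def value_score(s: Signals, weights=(0.25, 0.25, 0.25, 0.25)) -> float:
    """S(z) = w_r r_z + w_g g_z + w_v v_z + w_tau tau_z (weights are placeholders)."""
    w_r, w_g, w_v, w_tau = weights
    return w_r * s.relevance + w_g * s.gap + w_v * s.value + w_tau * s.timeliness


def select_for_acquisition(candidates: dict, theta_val: float, budget: int) -> list:
    """Keep candidates with S(z) >= theta_val, best first, up to `budget` items.

    Treating the budget as a candidate count is an assumption of this sketch.
    """
    scored = {name: value_score(sig) for name, sig in candidates.items()}
    passing = [name for name, score in scored.items() if score >= theta_val]
    passing.sort(key=lambda name: scored[name], reverse=True)
    return passing[:budget]
```

Applying the threshold before the budget cut means a generous budget never pulls in low-scoring candidates, which reflects the point that idle-time compute should go only to high-value preparation.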

ProActEval: A Benchmark for Proactive Assistance

Evaluating proactive agents requires more than testing reactive question-answering. The authors introduce ProActEval, a comprehensive benchmark with 200 scenarios across 40 domains. Each scenario contains a self-contained fact sheet of fictional entities and an ordered sequence of user needs with explicit predictability annotations.

Key design features:

Needs are organized into reveal groups with predictable_after links, forming a user-needs graph that the assistant never sees.
Scenarios span five cognitive archetypes (e.g., Foundational Memory, Trace and Dependency Reasoning) to cover diverse anticipatory demands.
A user simulator traverses the need sequence, skipping needs already covered proactively, thereby translating anticipation into reduced user effort.
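A simplified picture of the hidden needs graph and the simulator's skipping behavior is sketched below. It ignores reveal groups and models only the `predictable_after` links; the `Need` class and both functions are hypothetical and do not reproduce the benchmark's schema.

```python
from dataclasses import dataclass, field


@dataclass
class Need:
    need_id: str
    # Ids of needs that must be revealed before this one becomes predictable.
    predictable_after: list = field(default_factory=list)


def predictable_now(needs: list, revealed: set) -> list:
    """Unrevealed needs whose prerequisites have all been revealed."""
    return [
        n.need_id for n in needs
        if n.need_id not in revealed
        and n.predictable_after
        and all(p in revealed for p in n.predictable_after)
    ]


def simulate_user(needs: list, proactively_covered: set) -> list:
    """Ids the simulated user still has to ask about, in sequence order."""
    return [n.need_id for n in needs if n.need_id not in proactively_covered]
```

Every need the assistant covers proactively drops out of the list returned by `simulate_user`, which is how anticipation turns into fewer explicit user turns in this toy model.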

Evaluation metrics include $T_{80}$ and $T_{100}$ (turns to reach 80% and 100% must-have coverage), User Effort (explicit user turns), Fact Accuracy, Hallucination Rate, and Anticipation Recall. An LLM-based judge assesses factual correctness and coverage without access to gold metadata.
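The turn-based coverage metrics can be computed as below. This is a sketch under assumptions: coverage is taken to be cumulative over turns, turns are 1-indexed, and an unreached target returns `None`; the benchmark's exact handling of such cases is not stated here.

```python
from typing import List, Optional, Set


def turns_to_coverage(covered_per_turn: List[Set[str]],
                      must_have: Set[str],
                      fraction: float) -> Optional[int]:
    """First turn at which cumulative must-have coverage reaches `fraction`."""
    if not must_have:
        raise ValueError("must_have must not be empty")
    target = fraction * len(must_have) - 1e-9  # tolerance for float rounding
    seen: Set[str] = set()
    for turn, covered in enumerate(covered_per_turn, start=1):
        seen |= covered & must_have  # ignore items outside the must-have set
        if len(seen) >= target:
            return turn
    return None  # target never reached in this conversation
```

With this helper, $T_{80}$ corresponds to `fraction=0.8` and $T_{100}$ to `fraction=1.0`.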

Proactive Gains: Efficiency, Coverage, and Factual Integrity

On ProActEval, the full Directed Idle configuration (prediction-guided idle-time compute) outperforms both a reactive baseline and an undirected idle variant on every reported metric.

| Metric | Reactive | Undirected Idle | Directed Idle | $\Delta$ vs. Reactive |
|--------|----------|-----------------|---------------|------------------------|
| $T_{100} \downarrow$ | 8.110 | 8.040 | 6.910 | –14.8% |
| User Effort $\downarrow$ | 9.140 | 9.040 | 8.075 | –11.7% |
| Hallucination Rate $\downarrow$ | 0.132 | 0.124 | 0.095 | –28.1% |
| Anticipation Recall $\uparrow$ | 0.000 | 0.000 | 0.428 | +0.428 |
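The relative changes in the last column can be checked from the Reactive and Directed Idle columns. The helper below is a plain percentage-change calculation, not code from the paper.

```python
def relative_change(baseline: float, value: float) -> float:
    """Percentage change of `value` relative to `baseline`."""
    return 100.0 * (value - baseline) / baseline


# (Reactive, Directed Idle) pairs copied from the table above.
rows = {
    "T100": (8.110, 6.910),
    "User Effort": (9.140, 8.075),
    "Hallucination Rate": (0.132, 0.095),
}

for name, (reactive, directed) in rows.items():
    print(f"{name}: {relative_change(reactive, directed):+.1f}%")
```

Recomputing from the rounded table entries reproduces the –14.8% and –11.7% figures and gives about –28.0% for Hallucination Rate, marginally different from the reported –28.1%, presumably because the reported value was derived from unrounded rates. Anticipation Recall is reported as an absolute difference because its reactive baseline is zero.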

The ablation reveals that undirected background search alone yields negligible improvements, while predictive direction drives the gains. In a comparison with an adapted ProactiveAgent baseline, ProAct anticipates 703 of 1,572 predictable needs (recall 0.447), whereas the baseline anticipates only 32 (recall 0.020). This suggests that proactive behavior reduces user effort only when it is targeted at the needs the user will actually raise.
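The two recall figures follow directly from the reported counts over the same pool of 1,572 predictable needs:

$\frac{703}{1572} \approx 0.447, \qquad \frac{32}{1572} \approx 0.020$

In other words, ProAct anticipates roughly 22 times as many predictable needs as the adapted baseline ($703 / 32 \approx 22$).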

Memory Backbone and the Cost of Idle-Time Search

ProAct’s memory layer achieves state-of-the-art reflective accuracy on MemBench: 84.3% at 10k tokens and 86.3% at 100k tokens, surpassing prior systems like MemGPT and MemoryBank. This robust long-term memory is essential for grounding future-state predictions.

A search-budget analysis on a 50-scenario subset reveals a clear cost–efficiency trade-off. Increasing the idle-search budget $k$ from 4 to 16 raises Anticipation Recall from 0.253 to 0.432, but $T_{100}$ and User Effort do not improve monotonically. Once the main predictable needs are covered, additional searches chase lower-marginal needs and can alter the closed-loop conversation trajectory, sometimes even degrading end-to-end efficiency.

Figure 3: Search-budget analysis on a matched 50-scenario subset.
Panels (a)–(c) compare Directed Idle and Undirected Idle under the same budget k, with gray segments denoting matched-budget gaps.
Panel (d) reports active-token cost in thousands.

At every matched budget, Directed Idle outperforms Undirected Idle, confirming that predictive direction improves the utility of idle-time compute beyond raw search volume. The budget should be treated as an operating point, not a parameter to maximize.

Conclusion and Outlook

ProAct demonstrates that idle-time compute, when guided by future-state prediction and grounded in persistent memory, can significantly improve proactive assistance: reducing interaction turns, lowering user effort, and cutting hallucinations. The ProActEval benchmark provides a rigorous framework for measuring these capabilities across diverse, predictable need chains.

The work also highlights important limitations. Results are obtained on a closed-world synthetic benchmark; real-world deployments would require user controls, rate limits, and privacy safeguards. Proactive preparation can occasionally backfire, competing with reactive answers or pushing low-value content. The budget analysis underscores that more idle-time search does not guarantee better outcomes: efficient proactive assistance depends on accurate need prediction and value-aware delivery gating.

When applied with appropriate controls, proactive agents could reduce repetitive information-seeking, help users prepare for foreseeable follow-ups, and improve factual grounding by acquiring evidence before rushed responses are needed. This research opens a path toward AI assistants that actively anticipate and learn, rather than merely react.
