
feat(planning): enhance planning configuration and observation handling#5913

Open
lorenzejay wants to merge 1 commit into main from lorenze/fix/agent-executor-planning

Conversation

@lorenzejay
Collaborator

@lorenzejay lorenzejay commented May 23, 2026

  • Introduced observe_steps attribute in PlanningConfig to control LLM calls after each step.
  • Updated Agent to set default max_attempts to 1 when planning is enabled without explicit config.
  • Modified PlannerObserver to support heuristic observations when LLM calls are disabled.
  • Adjusted AgentExecutor to respect reasoning_effort and observe_steps settings for step observations.
  • Added tests to verify behavior of new configurations and ensure correct observation handling across different reasoning efforts.

Note

Medium Risk
Changes default planning behavior by making bare planning=True bounded and skipping per-step observation LLM calls at reasoning_effort="low", which can affect plan adaptation/replanning decisions and execution logs.

Overview
Planning is now more explicitly configurable and cheaper by default. PlanningConfig adds observe_steps to control whether per-step PlannerObserver LLM calls run, and updates the docs/semantics of reasoning_effort="low" to default to a heuristic (no extra LLM).

Agent/executor defaults were adjusted to avoid unbounded or chatty planning. Agent now auto-creates a bounded PlanningConfig when using deprecated reasoning=True and sets a conservative default config for bare planning=True (reasoning_effort="low", max_attempts=1). AgentExecutor routes step observation through a new helper that chooses LLM vs heuristic observation, records whether an LLM was used in the execution log, and reuses step success metadata when available.
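The default resolution described above can be illustrated with a minimal sketch. It does not import crewai: PlanningConfigSketch and resolve_planning_config are hypothetical stand-ins, and the max_attempts default of None is an assumption. Only the field names and the bare planning=True defaults (reasoning_effort="low", max_attempts=1) come from this PR's description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlanningConfigSketch:
    """Hypothetical stand-in for PlanningConfig; not the real crewai class."""

    reasoning_effort: str = "medium"      # default reported in the PR's test summary
    max_attempts: Optional[int] = None    # assumption: None means no explicit bound
    observe_steps: Optional[bool] = None  # None defers to reasoning_effort


def resolve_planning_config(planning, planning_config=None):
    """Return the effective config, or None when planning is disabled."""
    if not planning:
        return None
    if planning_config is not None:
        # An explicitly supplied config is used as given.
        return planning_config
    # Bare planning=True: the conservative, bounded defaults described in the PR.
    return PlanningConfigSketch(reasoning_effort="low", max_attempts=1)


default_cfg = resolve_planning_config(planning=True)
explicit_cfg = resolve_planning_config(
    planning=True,
    planning_config=PlanningConfigSketch(reasoning_effort="high"),
)
```

Under these assumptions, default_cfg is bounded to a single attempt at low effort, while explicit_cfg keeps whatever the caller passed.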

Tests were added/updated to cover the new observe_steps gating, heuristic observation behavior, and the new bounded defaults for planning=True.

Reviewed by Cursor Bugbot for commit b546c0a. Bugbot is set up for automated code reviews on this repo.

Summary by CodeRabbit

  • New Features

    • Added observe_steps configuration option for customizable observation behavior during planning execution.
  • Improvements

    • Enhanced planning configuration handling with improved defaults when planning is enabled.
    • Optimized observation execution at low reasoning effort levels using lightweight heuristic approach.
    • Refined step observation behavior with conditional LLM calls based on reasoning effort settings.

@coderabbitai

coderabbitai Bot commented May 23, 2026

📝 Walkthrough


This PR implements configurable, effort-aware step observation for agent planning pipelines. It introduces an observe_steps configuration option to control when per-step LLM analysis runs, adds a lightweight heuristic observation fallback for low-effort reasoning, and updates agent initialization to conditionally set plan constraints based on provided values.
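As a rough illustration of the routing this walkthrough describes, the sketch below decides between the LLM observer and the heuristic and records the choice in a log. It is not the executor's code: should_observe_steps and observe_completed_step are simplified stand-ins for the private helpers named in the walkthrough, the llm_observe callable is a hypothetical placeholder for PlannerObserver.observe, and treating an explicit observe_steps=True as an override at every effort level is an assumption drawn from the field's docstring.

```python
from typing import Callable, Optional


def should_observe_steps(reasoning_effort: str, observe_steps: Optional[bool]) -> bool:
    """Decide whether the per-step LLM observer should run."""
    if observe_steps is not None:
        # An explicit True/False overrides the effort-based default (assumption).
        return observe_steps
    # observe_steps=None: LLM observation only for medium and high effort.
    return reasoning_effort in ("medium", "high")


def observe_completed_step(
    reasoning_effort: str,
    observe_steps: Optional[bool],
    step_success: bool,
    llm_observe: Callable[[], dict],
    log: list,
) -> dict:
    """Route to LLM or heuristic observation and record which one was used."""
    use_llm = should_observe_steps(reasoning_effort, observe_steps)
    if use_llm:
        observation = llm_observe()
    else:
        # Heuristic path: no extra LLM call, fields as listed in the walkthrough.
        observation = {
            "step_completed_successfully": step_success,
            "key_information_learned": "",
            "remaining_plan_still_valid": True,
            "needs_full_replan": False,
        }
    log.append(
        {
            "type": "observation",
            "reasoning_effort": reasoning_effort,
            "llm_observation": use_llm,
        }
    )
    return observation
```

Passing the observer as a callable keeps the sketch free of any LLM dependency, so the gating logic can be exercised on its own.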

Changes

Configurable Step Observation in Plan-Execute Pipeline

| Layer / File(s) | Summary |
|---|---|
| Planning Configuration Schema and Documentation (lib/crewai/src/crewai/agent/planning_config.py) | PlanningConfig adds new observe_steps: bool \| None field (default None) to control when per-step PlannerObserver LLM calls run (medium/high efforts by default, heuristic for low, disabled if explicitly False). Documentation clarifies how reasoning_effort levels affect the decide/replan/refine pipeline and observation frequency. |
| Heuristic Observation Helper (lib/crewai/src/crewai/agents/planner_observer.py) | PlannerObserver adds static heuristic_observation(step_success: bool, result: str = "") -> StepObservation method that constructs a lightweight observation without an LLM: step_completed_successfully from step_success, empty key_information_learned, remaining_plan_still_valid=True, needs_full_replan=False. Updated class docs clarify that observation triggers on step completion when observe_steps is enabled. |
| Agent Initialization and Config Fallback (lib/crewai/src/crewai/agent/core.py, lib/crewai/src/crewai/utilities/reasoning_handler.py) | Agent.post_init_setup and AgentReasoning._get_planning_config now conditionally set max_attempts only when max_reasoning_attempts is not None (instead of always passing a value). When planning=True without explicit planning_config, Agent creates a default with reasoning_effort="low" and max_attempts=1. |
| AgentExecutor Conditional Observation Routing (lib/crewai/src/crewai/experimental/agent_executor.py) | AgentExecutor introduces three internal helpers: _should_observe_steps() determines if LLM observation should run based on planning_config.reasoning_effort and observe_steps; _step_success_from_log() infers step success from the audit log; _observe_completed_step() routes to heuristic or LLM observation based on _should_observe_steps(). Both sequential and parallel plan execution now call _observe_completed_step() instead of always invoking PlannerObserver.observe(). Execution log extended with an llm_observation field indicating the observation source (LLM vs heuristic). |
| Formatting Adjustment (lib/crewai/src/crewai/utilities/agent_utils.py) | Line spacing adjustment between extract_task_section and _executor_stop_words with no behavioral changes. |
| Test Coverage for Observation Logic (lib/crewai/tests/agents/test_agent_executor.py, lib/crewai/tests/agents/test_agent_reasoning.py) | New tests validate: (1) heuristic_observation() builds correct StepObservation from step success alone; (2) _should_observe_steps() respects reasoning_effort and observe_steps configuration; (3) reasoning_effort="low" bypasses PlannerObserver LLM calls, using the heuristic path instead; (4) Agent(planning=True) creates a bounded planning_config with max_attempts=1 and reasoning_effort="low"; (5) PlanningConfig defaults include observe_steps=None and reasoning_effort="medium". Existing low-effort test updated to validate the heuristic observation path and the llm_observation=False flag. |

Sequence Diagram

sequenceDiagram
  participant Step as Step Execution
  participant Executor as AgentExecutor
  participant Check as _should_observe_steps
  participant Heuristic as heuristic_observation
  participant LLM as PlannerObserver.observe
  participant Log as Execution Log
  
  Step->>Executor: step completed
  Executor->>Check: check reasoning_effort + observe_steps
  alt should observe
    Check->>LLM: reasoning_effort=medium/high
    LLM->>Log: llm_observation=True
  else skip observation
    Check->>Heuristic: reasoning_effort=low or observe_steps=False
    Heuristic->>Log: llm_observation=False
  end
  Log->>Executor: continue/replan based on observation

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

enhancement, size/M

Suggested reviewers

  • greysonlalonde

Poem

🐰 Hop along this path of reason fine,
Where steps observe in measured time,
Light and swift, or deep and wide,
Let the reasoning config be your guide!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately summarizes the main changes: it introduces new planning configuration attributes (observe_steps), enhances observation handling with heuristic observations, and updates the behavior across multiple planning-related components. The title directly reflects the primary purpose of the changeset. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 91.67%, which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/crewai/tests/agents/test_agent_executor.py (1)

1484-1490: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assert that observation logs exist before validating flags.

Right now the loop can pass without checking anything if observation_logs is empty, which weakens the regression coverage for llm_observation.

Suggested test hardening
             observation_logs = [
                 log for log in executor.state.execution_log
                 if log.get("type") == "observation"
             ]
+            assert observation_logs, "Expected at least one observation log entry"
             for log in observation_logs:
                 assert log.get("reasoning_effort") == "low"
                 assert log.get("llm_observation") is False
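To see why the extra assertion matters, here is a small self-contained sketch of the vacuous-pass problem. It does not use the real executor: check_observation_logs is a hypothetical helper, and the "step_started" entry type is invented for illustration. Only the "observation" type and the reasoning_effort / llm_observation keys come from the test shown above.

```python
def check_observation_logs(execution_log, require_non_empty):
    """Apply the test's per-entry assertions; return how many entries were checked."""
    observation_logs = [
        log for log in execution_log if log.get("type") == "observation"
    ]
    if require_non_empty:
        # The guard proposed in the review comment.
        assert observation_logs, "Expected at least one observation log entry"
    for log in observation_logs:
        assert log.get("reasoning_effort") == "low"
        assert log.get("llm_observation") is False
    return len(observation_logs)


# A log with no observation entries ("step_started" is a made-up entry type).
empty_log = [{"type": "step_started"}]

# Without the guard the loop body never runs, so nothing is actually verified.
unguarded_checked = check_observation_logs(empty_log, require_non_empty=False)

# With the guard the same input fails loudly instead of passing silently.
try:
    check_observation_logs(empty_log, require_non_empty=True)
    guarded_failed = False
except AssertionError:
    guarded_failed = True
```

Here the unguarded call checks zero entries and still succeeds, which is the regression gap the review comment points at.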
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/tests/agents/test_agent_executor.py` around lines 1484 - 1490, The
test currently iterates over observation_logs (derived from
executor.state.execution_log) without asserting any were found, so add a
pre-check to ensure observation_logs is non-empty (e.g., assert observation_logs
or assert len(observation_logs) > 0) before the for loop, then keep the existing
assertions that each log has reasoning_effort == "low" and llm_observation is
False to harden the regression for the llm_observation flag.

📥 Commits

Reviewing files that changed from the base of the PR and between c3e2001 and b546c0a.

📒 Files selected for processing (8)
  • lib/crewai/src/crewai/agent/core.py
  • lib/crewai/src/crewai/agent/planning_config.py
  • lib/crewai/src/crewai/agents/planner_observer.py
  • lib/crewai/src/crewai/experimental/agent_executor.py
  • lib/crewai/src/crewai/utilities/agent_utils.py
  • lib/crewai/src/crewai/utilities/reasoning_handler.py
  • lib/crewai/tests/agents/test_agent_executor.py
  • lib/crewai/tests/agents/test_agent_reasoning.py
💤 Files with no reviewable changes (1)
  • lib/crewai/src/crewai/utilities/agent_utils.py

Comment on lines +30 to +33
observe_steps: When True, run PlannerObserver LLM calls after each step.
When False, use a lightweight heuristic (no extra LLM call).
When None (default), LLM observation runs for "medium" and "high"
only; "low" uses the heuristic path.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Unify observe_steps=False semantics in docs.

Line 31 says False uses a heuristic path, while Line 94 says it disables observation at any effort level. Those behaviors are not equivalent and can mislead configuration decisions.

✏️ Suggested wording fix
     observe_steps: bool | None = Field(
         default=None,
         description=(
             "Run PlannerObserver LLM calls after each step. "
             "None (default): LLM observation for 'medium' and 'high' only; "
             "'low' uses a heuristic (no extra LLM). "
-            "Set False to disable observation at any effort level."
+            "Set False to disable per-step LLM observation at any effort level "
+            "(heuristic observation is used instead)."
         ),
     )

Also applies to: 91-95

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/src/crewai/agent/planning_config.py` around lines 30 - 33, Unify
the documentation for the observe_steps parameter so False has a single, clear
semantic: update the observe_steps docstring (the block describing observe_steps
in planning_config.py and the duplicate wording around the other description at
the same logical spot) to say that observe_steps=False disables PlannerObserver
LLM calls for all effort levels (i.e., forces the lightweight heuristic path
always), observe_steps=True forces LLM observation after each step, and
observe_steps=None (default) enables LLM observation only for "medium" and
"high" effort levels while "low" uses the heuristic; reference the observe_steps
parameter name and the PlannerObserver concept when editing so both descriptions
match exactly.


@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


key_information_learned="",
remaining_plan_still_valid=True,
needs_full_replan=False,
)

Heuristic observation makes low-effort failure detection unreachable

Medium Severity

heuristic_observation always returns needs_full_replan=False regardless of step_success. The handle_step_observed_low handler only marks a step as failed when both step_completed_successfully is False and needs_full_replan is True. Since the heuristic can never satisfy the second condition, failed steps in the low-effort sequential path are always incorrectly marked as completed via mark_completed. This contradicts the handler's own comment about not ignoring hard failures, and is inconsistent with the parallel execution path which correctly checks only step_completed_successfully.
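The interaction reported here can be reproduced in isolation. The sketch below is not the project's code: StepObservationSketch and both status functions are hypothetical stand-ins, the handler condition is taken from the description in this comment rather than from the source file, and the success-only variant is just one possible alternative that mirrors the check attributed to the parallel path.

```python
from dataclasses import dataclass


@dataclass
class StepObservationSketch:
    """Hypothetical stand-in using the field names quoted in this review."""

    step_completed_successfully: bool
    key_information_learned: str = ""
    remaining_plan_still_valid: bool = True
    needs_full_replan: bool = False


def heuristic_observation(step_success: bool) -> StepObservationSketch:
    # needs_full_replan is always False, as the review comment describes.
    return StepObservationSketch(step_completed_successfully=step_success)


def low_effort_status_as_reported(obs: StepObservationSketch) -> str:
    # Condition as described in the review: failure requires BOTH flags.
    if not obs.step_completed_successfully and obs.needs_full_replan:
        return "failed"
    return "completed"


def low_effort_status_success_only(obs: StepObservationSketch) -> str:
    # Possible alternative (assumption): look only at step success.
    return "completed" if obs.step_completed_successfully else "failed"


failed_obs = heuristic_observation(step_success=False)
status_reported = low_effort_status_as_reported(failed_obs)
status_alternative = low_effort_status_success_only(failed_obs)
```

With a failed step, the two-flag condition still yields "completed" because the heuristic never sets needs_full_replan, whereas the success-only check yields "failed".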

