feat(planning): enhance planning configuration and observation handling #5913
lorenzejay wants to merge 1 commit into
- Introduced the `observe_steps` attribute in `PlanningConfig` to control LLM calls after each step.
- Updated `Agent` to set the default `max_attempts` to 1 when planning is enabled without explicit config.
- Added support for heuristic observations when LLM calls are disabled.
- Adjusted step observation to respect the `reasoning_effort` and `observe_steps` settings.
- Added tests to verify the behavior of the new configurations and ensure correct observation handling across different reasoning efforts.
📝 Walkthrough
This PR implements configurable, effort-aware step observation for agent planning pipelines. It introduces an …
Changes
Configurable Step Observation in Plan-Execute Pipeline
Sequence Diagram

    sequenceDiagram
        participant Step as Step Execution
        participant Executor as AgentExecutor
        participant Check as _should_observe_steps
        participant Heuristic as heuristic_observation
        participant LLM as PlannerObserver.observe
        participant Log as Execution Log
        Step->>Executor: step completed
        Executor->>Check: check reasoning_effort + observe_steps
        alt should observe
            Check->>LLM: reasoning_effort=medium/high
            LLM->>Log: llm_observation=True
        else skip observation
            Check->>Heuristic: reasoning_effort=low or observe_steps=False
            Heuristic->>Log: llm_observation=False
        end
        Log->>Executor: continue/replan based on observation
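The flow in the diagram can be sketched in Python roughly as follows. This is an illustration under assumptions, not the PR's implementation: `observe_step`, the `llm_observe` callable, and the dictionary shapes are hypothetical. Only `reasoning_effort`, `observe_steps`, `llm_observation`, and the `"observation"` log type come from the review text.

```python
from typing import Callable, Optional

Observation = dict  # hypothetical shape for an observation result


def heuristic_observation(step_success: bool) -> Observation:
    """Stand-in for the heuristic path: no LLM call is made."""
    return {"step_completed_successfully": step_success}


def observe_step(
    step_success: bool,
    reasoning_effort: str,
    observe_steps: Optional[bool],
    llm_observe: Callable[[bool], Observation],
    execution_log: list,
) -> Observation:
    # An explicit True/False wins; None falls back to the effort level.
    if observe_steps is None:
        use_llm = reasoning_effort in ("medium", "high")
    else:
        use_llm = observe_steps

    if use_llm:
        observation = llm_observe(step_success)
    else:
        observation = heuristic_observation(step_success)

    # Record which path produced the observation.
    execution_log.append(
        {
            "type": "observation",
            "reasoning_effort": reasoning_effort,
            "llm_observation": use_llm,
        }
    )
    return observation
```

Passing the LLM observer in as a callable keeps the sketch testable without a model; the real executor presumably calls `PlannerObserver.observe` directly.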
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ Passed checks (5 passed)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
lib/crewai/tests/agents/test_agent_executor.py (1)
1484-1490: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Assert that observation logs exist before validating flags.
Right now the loop can pass without checking anything if `observation_logs` is empty, which weakens the regression coverage for `llm_observation`.
Suggested test hardening

     observation_logs = [
         log
         for log in executor.state.execution_log
         if log.get("type") == "observation"
     ]
    +assert observation_logs, "Expected at least one observation log entry"
     for log in observation_logs:
         assert log.get("reasoning_effort") == "low"
         assert log.get("llm_observation") is False

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/crewai/tests/agents/test_agent_executor.py` around lines 1484 - 1490, The test currently iterates over observation_logs (derived from executor.state.execution_log) without asserting any were found, so add a pre-check to ensure observation_logs is non-empty (e.g., assert observation_logs or assert len(observation_logs) > 0) before the for loop, then keep the existing assertions that each log has reasoning_effort == "low" and llm_observation is False to harden the regression for the llm_observation flag.
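Why the extra assertion matters: a `for` loop over an empty list runs zero iterations, so the per-log assertions pass vacuously. A small self-contained sketch, using a hypothetical `check_low_effort_logs` helper and made-up log entries rather than the real test fixture:

```python
def check_low_effort_logs(execution_log: list) -> int:
    """Validate observation entries and return how many were checked."""
    observation_logs = [
        log for log in execution_log if log.get("type") == "observation"
    ]
    # Without this guard, an empty list lets the loop below pass silently.
    assert observation_logs, "Expected at least one observation log entry"
    for log in observation_logs:
        assert log.get("reasoning_effort") == "low"
        assert log.get("llm_observation") is False
    return len(observation_logs)
```

With the guard in place, a log that contains no observation entries fails the check instead of passing by default.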
📒 Files selected for processing (8)
- lib/crewai/src/crewai/agent/core.py
- lib/crewai/src/crewai/agent/planning_config.py
- lib/crewai/src/crewai/agents/planner_observer.py
- lib/crewai/src/crewai/experimental/agent_executor.py
- lib/crewai/src/crewai/utilities/agent_utils.py
- lib/crewai/src/crewai/utilities/reasoning_handler.py
- lib/crewai/tests/agents/test_agent_executor.py
- lib/crewai/tests/agents/test_agent_reasoning.py
💤 Files with no reviewable changes (1)
- lib/crewai/src/crewai/utilities/agent_utils.py
    observe_steps: When True, run PlannerObserver LLM calls after each step.
        When False, use a lightweight heuristic (no extra LLM call).
        When None (default), LLM observation runs for "medium" and "high"
        only; "low" uses the heuristic path.
Unify observe_steps=False semantics in docs.
Line 31 says False uses a heuristic path, while Line 94 says it disables observation at any effort level. Those behaviors are not equivalent and can mislead configuration decisions.
✏️ Suggested wording fix

     observe_steps: bool | None = Field(
         default=None,
         description=(
             "Run PlannerObserver LLM calls after each step. "
             "None (default): LLM observation for 'medium' and 'high' only; "
             "'low' uses a heuristic (no extra LLM). "
    -        "Set False to disable observation at any effort level."
    +        "Set False to disable per-step LLM observation at any effort level "
    +        "(heuristic observation is used instead)."
         ),
     )

Also applies to: 91-95
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@lib/crewai/src/crewai/agent/planning_config.py` around lines 30 - 33, Unify
the documentation for the observe_steps parameter so False has a single, clear
semantic: update the observe_steps docstring (the block describing observe_steps
in planning_config.py and the duplicate wording around the other description at
the same logical spot) to say that observe_steps=False disables PlannerObserver
LLM calls for all effort levels (i.e., forces the lightweight heuristic path
always), observe_steps=True forces LLM observation after each step, and
observe_steps=None (default) enables LLM observation only for "medium" and
"high" effort levels while "low" uses the heuristic; reference the observe_steps
parameter name and the PlannerObserver concept when editing so both descriptions
match exactly.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit b546c0a.
        key_information_learned="",
        remaining_plan_still_valid=True,
        needs_full_replan=False,
    )
Heuristic observation makes low-effort failure detection unreachable
Medium Severity
heuristic_observation always returns needs_full_replan=False regardless of step_success. The handle_step_observed_low handler only marks a step as failed when both step_completed_successfully is False and needs_full_replan is True. Since the heuristic can never satisfy the second condition, failed steps in the low-effort sequential path are always incorrectly marked as completed via mark_completed. This contradicts the handler's own comment about not ignoring hard failures, and is inconsistent with the parallel execution path which correctly checks only step_completed_successfully.
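The interaction described above can be reproduced in a few lines. This is a simplified sketch, not the code under review: the `Observation` dataclass and both status functions are hypothetical stand-ins, and the "success only" variant is just one possible fix that mirrors what the report says the parallel path does.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    step_completed_successfully: bool
    needs_full_replan: bool = False


def heuristic_observation(step_success: bool) -> Observation:
    # As described in the report: needs_full_replan is always False here.
    return Observation(
        step_completed_successfully=step_success,
        needs_full_replan=False,
    )


def low_effort_status_as_reported(obs: Observation) -> str:
    # Marks failure only when BOTH conditions hold, so a heuristic
    # observation can never reach the "failed" branch.
    if not obs.step_completed_successfully and obs.needs_full_replan:
        return "failed"
    return "completed"


def low_effort_status_success_only(obs: Observation) -> str:
    # Possible fix (an assumption): check only step success.
    return "completed" if obs.step_completed_successfully else "failed"
```

Feeding `heuristic_observation(False)` into the first function yields `"completed"`, which is the misclassification the report describes; the second function yields `"failed"` for the same input.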


Note
Medium Risk
Changes default planning behavior by making bare `planning=True` bounded and skipping per-step observation LLM calls at `reasoning_effort="low"`, which can affect plan adaptation/replanning decisions and execution logs.

Overview
Planning is now more explicitly configurable and cheaper by default. `PlanningConfig` adds `observe_steps` to control whether per-step `PlannerObserver` LLM calls run, and updates the docs/semantics of `reasoning_effort="low"` to default to a heuristic (no extra LLM).

Agent/executor defaults were adjusted to avoid unbounded or chatty planning. `Agent` now auto-creates a bounded `PlanningConfig` when using deprecated `reasoning=True` and sets a conservative default config for bare `planning=True` (`reasoning_effort="low"`, `max_attempts=1`). `AgentExecutor` routes step observation through a new helper that chooses LLM vs heuristic observation, records whether an LLM was used in the execution log, and reuses step success metadata when available.

Tests were added/updated to cover the new `observe_steps` gating, heuristic observation behavior, and the new bounded defaults for `planning=True`.
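The bounded default for bare `planning=True` might look like the sketch below. The `PlanningConfig` here is a simplified dataclass stand-in rather than the crewai class, `resolve_planning_config` is a hypothetical helper, the `"medium"` field default is invented for the sketch, and the rule that an explicit config takes precedence is an assumption drawn from the phrase "without explicit config". Only `reasoning_effort="low"` and `max_attempts=1` come from the summary above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlanningConfig:  # simplified stand-in, not the crewai class
    reasoning_effort: str = "medium"  # hypothetical default for this sketch
    max_attempts: Optional[int] = None
    observe_steps: Optional[bool] = None


def resolve_planning_config(
    planning: bool,
    planning_config: Optional[PlanningConfig],
) -> Optional[PlanningConfig]:
    if planning_config is not None:
        # Assumption: an explicitly supplied config is used unchanged.
        return planning_config
    if planning:
        # Bare planning=True: conservative, bounded defaults.
        return PlanningConfig(reasoning_effort="low", max_attempts=1)
    return None
```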
Summary by CodeRabbit
New Features
`observe_steps` configuration option for customizable observation behavior during planning execution.
Improvements