feat(planning): enhance planning configuration and observation handling by lorenzejay · Pull Request #5913 · crewAIInc/crewAI

lorenzejay · 2026-05-23T01:18:46Z

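Introduced the `observe_steps` attribute in `PlanningConfig` to control LLM calls after each step.
Updated `Agent` to set the default `max_attempts` to 1 when planning is enabled without explicit config.
Modified `PlannerObserver` to support heuristic observations when LLM calls are disabled.
Adjusted `AgentExecutor` to respect the `reasoning_effort` and `observe_steps` settings for step observations.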
Added tests to verify behavior of new configurations and ensure correct observation handling across different reasoning efforts.

Note

Medium Risk
Changes default planning behavior by making bare planning=True bounded and skipping per-step observation LLM calls at reasoning_effort="low", which can affect plan adaptation/replanning decisions and execution logs.

Overview
Planning is now more explicitly configurable and cheaper by default. PlanningConfig adds observe_steps to control whether per-step PlannerObserver LLM calls run, and updates the docs/semantics of reasoning_effort="low" to default to a heuristic (no extra LLM).

Agent/executor defaults were adjusted to avoid unbounded or chatty planning. Agent now auto-creates a bounded PlanningConfig when using deprecated reasoning=True and sets a conservative default config for bare planning=True (reasoning_effort="low", max_attempts=1). AgentExecutor routes step observation through a new helper that chooses LLM vs heuristic observation, records whether an LLM was used in the execution log, and reuses step success metadata when available.
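The bounded default described above can be sketched as follows. This is a minimal stand-in, not the crewAI source: `PlanningConfigSketch` and `resolve_planning_config` are hypothetical names, only the fields named in this overview are modeled, and the `None` default for `max_attempts` is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlanningConfigSketch:
    """Hypothetical stand-in for PlanningConfig (not the real class)."""

    reasoning_effort: str = "medium"
    max_attempts: Optional[int] = None  # assumption: None means "no explicit bound"
    observe_steps: Optional[bool] = None


def resolve_planning_config(
    planning: bool,
    planning_config: Optional[PlanningConfigSketch] = None,
) -> Optional[PlanningConfigSketch]:
    """Pick the effective config for an agent (illustrative only)."""
    if planning_config is not None:
        # An explicit config always wins.
        return planning_config
    if planning:
        # Bare planning=True gets the conservative, bounded default.
        return PlanningConfigSketch(reasoning_effort="low", max_attempts=1)
    return None
```

Under these assumptions, `resolve_planning_config(True)` yields a low-effort config capped at one attempt, while an explicitly passed config is returned unchanged.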

Tests were added/updated to cover the new observe_steps gating, heuristic observation behavior, and the new bounded defaults for planning=True.

Reviewed by Cursor Bugbot for commit b546c0a. Bugbot is set up for automated code reviews on this repo.

Summary by CodeRabbit

New Features
- Added observe_steps configuration option for customizable observation behavior during planning execution.
Improvements
- Enhanced planning configuration handling with improved defaults when planning is enabled.
- Optimized observation execution at low reasoning effort levels using lightweight heuristic approach.
- Refined step observation behavior with conditional LLM calls based on reasoning effort settings.

coderabbitai · 2026-05-23T01:19:05Z

Walkthrough

This PR implements configurable, effort-aware step observation for agent planning pipelines. It introduces an observe_steps configuration option to control when per-step LLM analysis runs, adds a lightweight heuristic observation fallback for low-effort reasoning, and updates agent initialization to conditionally set plan constraints based on provided values.

Changes

Configurable Step Observation in Plan-Execute Pipeline

| Layer / File(s) | Summary |
| --- | --- |
| Planning Configuration Schema and Documentation: `lib/crewai/src/crewai/agent/planning_config.py` | PlanningConfig adds new `observe_steps: bool \| None` field (default None) to control when per-step PlannerObserver LLM calls run (medium/high efforts by default, heuristic for low, disabled if explicitly False). Documentation clarifies how reasoning_effort levels affect the decide/replan/refine pipeline and observation frequency. |
| Heuristic Observation Helper: `lib/crewai/src/crewai/agents/planner_observer.py` | PlannerObserver adds static `heuristic_observation(step_success: bool, result: str = "") -> StepObservation` method that constructs a lightweight observation without an LLM: step_completed_successfully from step_success, empty key_information_learned, remaining_plan_still_valid=True, needs_full_replan=False. Updated class docs clarify that observation triggers on step completion when observe_steps is enabled. |
| Agent Initialization and Config Fallback: `lib/crewai/src/crewai/agent/core.py`, `lib/crewai/src/crewai/utilities/reasoning_handler.py` | Agent.post_init_setup and AgentReasoning._get_planning_config now set max_attempts only when max_reasoning_attempts is not None (instead of always passing a value). When planning=True without explicit planning_config, Agent creates a default with reasoning_effort="low" and max_attempts=1. |
| AgentExecutor Conditional Observation Routing: `lib/crewai/src/crewai/experimental/agent_executor.py` | AgentExecutor introduces three internal helpers: _should_observe_steps() determines whether LLM observation should run based on planning_config.reasoning_effort and observe_steps; _step_success_from_log() infers step success from the audit log; _observe_completed_step() routes to heuristic or LLM observation based on _should_observe_steps(). Both sequential and parallel plan execution now call _observe_completed_step() instead of always invoking PlannerObserver.observe(). Execution log extended with an llm_observation field indicating the observation source (LLM vs heuristic). |
| Formatting Adjustment: `lib/crewai/src/crewai/utilities/agent_utils.py` | Line spacing adjustment between extract_task_section and _executor_stop_words with no behavioral changes. |
| Test Coverage for Observation Logic: `lib/crewai/tests/agents/test_agent_executor.py`, `lib/crewai/tests/agents/test_agent_reasoning.py` | New tests validate: (1) heuristic_observation() builds the correct StepObservation from step success alone; (2) _should_observe_steps() respects reasoning_effort and observe_steps configuration; (3) reasoning_effort="low" bypasses PlannerObserver LLM calls, using the heuristic path instead; (4) Agent(planning=True) creates a bounded planning_config with max_attempts=1 and reasoning_effort="low"; (5) PlanningConfig defaults include observe_steps=None and reasoning_effort="medium". Existing low-effort test updated to validate the heuristic observation path and the llm_observation=False flag. |
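The heuristic helper summarized in the table can be sketched as below. `StepObservationSketch` is a hypothetical stand-in that models only the four fields named in the summary, and this sketch ignores `result`; whether the real method inspects it is not stated here.

```python
from dataclasses import dataclass


@dataclass
class StepObservationSketch:
    """Hypothetical stand-in for StepObservation (fields taken from the summary)."""

    step_completed_successfully: bool
    key_information_learned: str
    remaining_plan_still_valid: bool
    needs_full_replan: bool


def heuristic_observation(step_success: bool, result: str = "") -> StepObservationSketch:
    """Build an observation without any LLM call."""
    # `result` is accepted for signature parity but not inspected in this sketch.
    return StepObservationSketch(
        step_completed_successfully=step_success,
        key_information_learned="",
        remaining_plan_still_valid=True,
        needs_full_replan=False,
    )
```

Note that only `step_completed_successfully` varies with the input; the other three fields are constants, so this path can never request a replan on its own.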

Sequence Diagram

sequenceDiagram
  participant Step as Step Execution
  participant Executor as AgentExecutor
  participant Check as _should_observe_steps
  participant Heuristic as heuristic_observation
  participant LLM as PlannerObserver.observe
  participant Log as Execution Log
  
  Step->>Executor: step completed
  Executor->>Check: check reasoning_effort + observe_steps
  alt should observe
    Check->>LLM: reasoning_effort=medium/high
    LLM->>Log: llm_observation=True
  else use heuristic
    Check->>Heuristic: reasoning_effort=low or observe_steps=False
    Heuristic->>Log: llm_observation=False
  end
  Log->>Executor: continue/replan based on observation
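The routing in the diagram can be approximated with a small function. This is a sketch under stated assumptions: `observe_completed_step` here is a free function with a hypothetical signature (the PR describes a private executor method), observations are plain dicts, and the log entry carries only the keys mentioned in this PR (`type`, `reasoning_effort`, `llm_observation`).

```python
from typing import Any, Callable, Dict, List


def observe_completed_step(
    use_llm: bool,
    step_success: bool,
    reasoning_effort: str,
    llm_observe: Callable[[], Dict[str, Any]],
    execution_log: List[Dict[str, Any]],
) -> Dict[str, Any]:
    """Route to LLM or heuristic observation and record which one ran."""
    if use_llm:
        # Stand-in for the PlannerObserver LLM call.
        observation = llm_observe()
    else:
        # Heuristic path: no LLM call, constants except for the success flag.
        observation = {
            "step_completed_successfully": step_success,
            "key_information_learned": "",
            "remaining_plan_still_valid": True,
            "needs_full_replan": False,
        }
    execution_log.append(
        {
            "type": "observation",
            "reasoning_effort": reasoning_effort,
            "llm_observation": use_llm,
        }
    )
    return observation
```

Passing the LLM call as a callable keeps the sketch testable without a model: the heuristic branch never invokes it.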

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

enhancement, size/M

Suggested reviewers

greysonlalonde

Poem

🐰 Hop along this path of reason fine,
Where steps observe in measured time,
Light and swift, or deep and wide,
Let the reasoning config be your guide!

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately summarizes the main changes: it introduces new planning configuration attributes (`observe_steps`), enhances observation handling with heuristic observations, and updates the behavior across multiple planning-related components. The title directly reflects the primary purpose of the changeset. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 91.67% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

lib/crewai/tests/agents/test_agent_executor.py (1)

1484-1490: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assert that observation logs exist before validating flags.

Right now the loop can pass without checking anything if observation_logs is empty, which weakens the regression coverage for llm_observation.

Suggested test hardening

             observation_logs = [
                 log for log in executor.state.execution_log
                 if log.get("type") == "observation"
             ]
+            assert observation_logs, "Expected at least one observation log entry"
             for log in observation_logs:
                 assert log.get("reasoning_effort") == "low"
                 assert log.get("llm_observation") is False
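The gap this suggestion closes is easy to reproduce in isolation: a `for` loop over an empty list runs zero assertions and therefore passes. The sketch below uses a hypothetical `check_flags` helper, not code from the test suite, to contrast the two variants.

```python
from typing import Any, Dict, List


def check_flags(execution_log: List[Dict[str, Any]], require_entries: bool) -> int:
    """Validate observation log flags; return how many entries were checked."""
    observation_logs = [
        log for log in execution_log if log.get("type") == "observation"
    ]
    if require_entries:
        # The hardening step: fail loudly when nothing was logged.
        assert observation_logs, "Expected at least one observation log entry"
    for log in observation_logs:
        assert log.get("reasoning_effort") == "low"
        assert log.get("llm_observation") is False
    return len(observation_logs)
```

With `require_entries=False`, an empty log returns `0` without raising, which is the vacuous pass the review flags; with `require_entries=True`, the same input raises `AssertionError`.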

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/tests/agents/test_agent_executor.py` around lines 1484 - 1490, The
test currently iterates over observation_logs (derived from
executor.state.execution_log) without asserting any were found, so add a
pre-check to ensure observation_logs is non-empty (e.g., assert observation_logs
or assert len(observation_logs) > 0) before the for loop, then keep the existing
assertions that each log has reasoning_effort == "low" and llm_observation is
False to harden the regression for the llm_observation flag.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/crewai/src/crewai/agent/planning_config.py`:
- Around line 30-33: Unify the documentation for the observe_steps parameter so
False has a single, clear semantic: update the observe_steps docstring (the
block describing observe_steps in planning_config.py and the duplicate wording
around the other description at the same logical spot) to say that
observe_steps=False disables PlannerObserver LLM calls for all effort levels
(i.e., forces the lightweight heuristic path always), observe_steps=True forces
LLM observation after each step, and observe_steps=None (default) enables LLM
observation only for "medium" and "high" effort levels while "low" uses the
heuristic; reference the observe_steps parameter name and the PlannerObserver
concept when editing so both descriptions match exactly.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 1238c887-0259-49b0-a977-eccf9d6eeb7c

📥 Commits

Reviewing files that changed from the base of the PR and between c3e2001 and b546c0a.

📒 Files selected for processing (8)

lib/crewai/src/crewai/agent/core.py
lib/crewai/src/crewai/agent/planning_config.py
lib/crewai/src/crewai/agents/planner_observer.py
lib/crewai/src/crewai/experimental/agent_executor.py
lib/crewai/src/crewai/utilities/agent_utils.py
lib/crewai/src/crewai/utilities/reasoning_handler.py
lib/crewai/tests/agents/test_agent_executor.py
lib/crewai/tests/agents/test_agent_reasoning.py

💤 Files with no reviewable changes (1)

lib/crewai/src/crewai/utilities/agent_utils.py

coderabbitai · 2026-05-23T01:23:40Z

+        observe_steps: When True, run PlannerObserver LLM calls after each step.
+            When False, use a lightweight heuristic (no extra LLM call).
+            When None (default), LLM observation runs for "medium" and "high"
+            only; "low" uses the heuristic path.


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Unify observe_steps=False semantics in docs.

Line 31 says False uses a heuristic path, while Line 94 says it disables observation at any effort level. Those behaviors are not equivalent and can mislead configuration decisions.

✏️ Suggested wording fix

     observe_steps: bool | None = Field(
         default=None,
         description=(
             "Run PlannerObserver LLM calls after each step. "
             "None (default): LLM observation for 'medium' and 'high' only; "
             "'low' uses a heuristic (no extra LLM). "
-            "Set False to disable observation at any effort level."
+            "Set False to disable per-step LLM observation at any effort level "
+            "(heuristic observation is used instead)."
         ),
     )

Also applies to: 91-95
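The unified semantics proposed here amount to a three-state switch. A minimal sketch, assuming the decision depends only on these two inputs (the function name mirrors the private helper named in the walkthrough, but this body is illustrative, not the actual implementation):

```python
from typing import Optional


def should_observe_steps(reasoning_effort: str, observe_steps: Optional[bool]) -> bool:
    """Return True when the per-step LLM observation should run."""
    if observe_steps is not None:
        # Explicit True forces LLM observation; explicit False forces the heuristic.
        return observe_steps
    # Default (None): LLM observation for "medium" and "high" only.
    return reasoning_effort in ("medium", "high")
```

Under this reading, `observe_steps=False` never disables observation outright; it only swaps the LLM call for the heuristic, which is the wording the suggested fix adopts.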

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

cursor · 2026-05-23T01:25:53Z

+            key_information_learned="",
+            remaining_plan_still_valid=True,
+            needs_full_replan=False,
+        )


Heuristic observation makes low-effort failure detection unreachable

Medium Severity

heuristic_observation always returns needs_full_replan=False regardless of step_success. The handle_step_observed_low handler only marks a step as failed when both step_completed_successfully is False and needs_full_replan is True. Since the heuristic can never satisfy the second condition, failed steps in the low-effort sequential path are always incorrectly marked as completed via mark_completed. This contradicts the handler's own comment about not ignoring hard failures, and is inconsistent with the parallel execution path which correctly checks only step_completed_successfully.

Additional Locations (1)

lib/crewai/src/crewai/experimental/agent_executor.py#L595-L618
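The finding can be checked with two predicates. This is a sketch of the conditions as the report describes them, not the handler code itself; the function names are hypothetical.

```python
def low_effort_marks_failed(step_completed_successfully: bool, needs_full_replan: bool) -> bool:
    """Sequential low-effort condition per the report: both must hold to mark failure."""
    return (not step_completed_successfully) and needs_full_replan


def parallel_marks_failed(step_completed_successfully: bool) -> bool:
    """Parallel-path condition per the report: success flag alone decides."""
    return not step_completed_successfully


# The heuristic observation always reports needs_full_replan=False.
HEURISTIC_NEEDS_FULL_REPLAN = False


def failed_step_outcomes() -> dict:
    """Evaluate both conditions for a failed step observed heuristically."""
    return {
        "sequential_low": low_effort_marks_failed(False, HEURISTIC_NEEDS_FULL_REPLAN),
        "parallel": parallel_marks_failed(False),
    }
```

For a failed step, the sequential low-effort condition evaluates to `False` (the step would be treated as completed) while the parallel condition evaluates to `True`, which is the inconsistency reported above.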

github-actions Bot added the size/L label May 23, 2026

coderabbitai Bot requested changes May 23, 2026

cursor Bot reviewed May 23, 2026

feat(planning): enhance planning configuration and observation handling #5913
lorenzejay wants to merge 1 commit into main from lorenze/fix/agent-executor-planning
