The Thought-Action-Observation Pattern

Context

LLM agents need to both reason about tasks and take actions in environments. Pure reasoning hallucinates. Pure action-taking lacks strategic coordination.

Problem

How do you structure agent behavior so that reasoning informs action and action informs reasoning, without the overhead of explicit reasoning at every step?

Solution

Interleave three types of content in the agent’s trajectory:

Thought: Internal reasoning that doesn’t affect the environment. Used for decomposing goals, tracking progress, handling exceptions, synthesizing observations, and determining next steps. Thoughts update the agent’s working context without triggering environment responses.

Action: Task-specific operations that affect the external environment. Search queries, navigation commands, object manipulations. Actions produce observable consequences.

Observation: Environment feedback from actions. Search results, state descriptions, error messages. Observations provide grounding that prevents reasoning from drifting into hallucination.

The key insight: thoughts are “free” (they don’t cost environment interactions), but they update the context that conditions future action generation.
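
A minimal sketch of the interleaved loop, assuming a caller-supplied generate function (the LLM call) and step function (the environment); both names are illustrative stand-ins, not any particular framework’s API. It shows how a thought is appended to the working context without an environment call, while an action triggers one and its observation is fed back.

```python
# Minimal thought-action-observation loop (illustrative sketch; `generate`
# and `step` are hypothetical stand-ins for an LLM call and an environment).

from typing import Callable


def run_agent(
    task: str,
    generate: Callable[[str], str],  # LLM: prompt -> next line of the trajectory
    step: Callable[[str], str],      # environment: action -> observation
    max_steps: int = 10,
) -> list[str]:
    # The trajectory is the agent's working context: a flat list of
    # "Thought:", "Action:", and "Observation:" lines.
    trajectory = [f"Task: {task}"]
    for _ in range(max_steps):
        # Thought: internal reasoning only; no environment call is made,
        # but the thought becomes part of the context for what follows.
        thought = generate("\n".join(trajectory) + "\nThought:")
        trajectory.append(f"Thought: {thought}")

        # Action: generated conditioned on the full context, thought included.
        action = generate("\n".join(trajectory) + "\nAction:")
        trajectory.append(f"Action: {action}")
        if action.strip().lower().startswith("finish"):
            break

        # Observation: environment feedback that grounds the next thought.
        observation = step(action)
        trajectory.append(f"Observation: {observation}")
    return trajectory
```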

When to Use

  • Dense pattern (thought-action-observation every cycle): Knowledge tasks requiring verification, multi-hop reasoning, fact synthesis
  • Sparse pattern (thoughts only at decision points): Sequential decision tasks with many routine actions, where strategic reasoning matters at inflection points
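
One way to tune reasoning density is to gate the thought step with a scheduling policy. A sketch under the same assumptions as the loop above; dense_policy and sparse_policy are hypothetical names, and the inflection-point heuristics are examples only.

```python
# Thought-scheduling policies for the loop above (hypothetical names).

def dense_policy(trajectory: list[str]) -> bool:
    """Dense pattern: emit a thought before every action."""
    return True


def sparse_policy(trajectory: list[str]) -> bool:
    """Sparse pattern: emit a thought only at inflection points,
    e.g. the start of the task or right after a surprising observation."""
    if len(trajectory) <= 1:  # task start
        return True
    last = trajectory[-1]
    surprising = "error" in last.lower() or "nothing happens" in last.lower()
    return last.startswith("Observation:") and surprising


# In run_agent, guard the thought step with the chosen policy:
#     if should_think(trajectory):
#         trajectory.append("Thought: " + generate(...))
```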

Consequences

Positive:

  • Interpretable trajectories: humans can inspect the reasoning and distinguish internal knowledge from external evidence
  • Reduced hallucination through external grounding
  • Flexible reasoning density based on task structure
  • Human-editable behavior through thought modification
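
Because thoughts are plain text in the trajectory, a human can steer the agent by editing or injecting one before the next action is generated. A small sketch in the same trajectory format; the task and contents are invented for illustration.

```python
# Human-in-the-loop steering by thought injection (illustrative only).

def inject_thought(trajectory: list[str], correction: str) -> list[str]:
    """Append a human-written thought; it conditions the next generated
    action but never touches the environment."""
    return trajectory + [f"Thought: {correction}"]


trajectory = [
    "Task: find the year the suspension bridge opened",
    "Thought: I should search for the bridge.",
    "Action: search[the bridge]",
    "Observation: No results found.",
]

# Redirect the agent before its next action is generated.
trajectory = inject_thought(
    trajectory,
    "The query was too vague; search using the bridge's full name and city.",
)
```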

Negative:

  • Structural constraints can cause repetitive reasoning loops (reported as 47% of failures)
  • Requires well-designed action spaces
  • Prompt engineering for thought placement is task-specific

Related: 07-molecule—act-to-reason-reason-to-act, 05-atom—reasoning-grounding-tradeoff, 07-molecule—ui-as-ultimate-guardrail