
Customize thinking in agent #1547

Open
tkattkat wants to merge 10 commits into main from customize-thinking-in-agent

Conversation

@tkattkat (Collaborator) commented Jan 14, 2026

Provider Options for Agent Thinking/Reasoning

Addresses #1524

Added providerOptions configuration for agent.execute() that allows users to pass provider-specific thinking/reasoning options directly.

Usage

Pass provider-specific options directly to enable thinking capabilities:

Google Gemini:

const result = await agent.execute({
  instruction: "Solve this complex problem",
  providerOptions: {
    google: {
      thinkingConfig: {
        includeThoughts: true,
        thinkingBudget: 10000
      }
    }
  }
});

Anthropic Claude:

const result = await agent.execute({
  instruction: "Solve this complex problem",
  providerOptions: {
    anthropic: {
      thinking: {
        type: "enabled",
        budgetTokens: 10000  // required, min 1024, max 64000
      }
    }
  }
});

OpenAI:

const result = await agent.execute({
  instruction: "Solve this complex problem",
  providerOptions: {
    openai: {
      reasoningSummary: "detailed",  // "auto" | "detailed" | "concise"
      reasoningEffort: "high"        // "low" | "medium" | "high"
    }
  }
});

Available Options Per Provider

| Provider | Allowed Options |
| --- | --- |
| Google | thinkingConfig: { includeThoughts?, thinkingBudget? } |
| Anthropic | thinking: { type: "enabled", budgetTokens } or { type: "disabled" } |
| OpenAI | reasoningSummary?, reasoningEffort? |

Type Safety

The providerOptions field uses TypeScript Pick types to only allow thinking-related options:

  • GoogleThinkingOptions - only thinkingConfig
  • AnthropicThinkingOptions - only thinking
  • OpenAIThinkingOptions - only reasoningSummary and reasoningEffort

Attempting to pass other provider options (e.g., temperature, mediaResolution) will result in a compile-time error.
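
A minimal sketch of how Pick can narrow a provider's options to the thinking-related keys. The three `...Sketch` interfaces are hypothetical stand-ins for the full provider option types (their `otherOption` and `mediaResolution` fields are only there to have something to exclude), and `AgentProviderOptionsSketch` is an assumed shape rather than the definition in agent.ts.

```typescript
// Hypothetical stand-ins for the full provider option types.
// The real types come from the provider packages and have more fields.
interface GoogleOptionsSketch {
  thinkingConfig?: { includeThoughts?: boolean; thinkingBudget?: number };
  mediaResolution?: string;
}
interface AnthropicOptionsSketch {
  thinking?: { type: "enabled"; budgetTokens: number } | { type: "disabled" };
  otherOption?: unknown;
}
interface OpenAIOptionsSketch {
  reasoningSummary?: "auto" | "detailed" | "concise";
  reasoningEffort?: "low" | "medium" | "high";
  otherOption?: unknown;
}

// Pick keeps only the named keys, so every other option is rejected.
type GoogleThinkingOptions = Pick<GoogleOptionsSketch, "thinkingConfig">;
type AnthropicThinkingOptions = Pick<AnthropicOptionsSketch, "thinking">;
type OpenAIThinkingOptions = Pick<
  OpenAIOptionsSketch,
  "reasoningSummary" | "reasoningEffort"
>;

// Assumed shape of the providerOptions field for this sketch.
interface AgentProviderOptionsSketch {
  google?: GoogleThinkingOptions;
  anthropic?: AnthropicThinkingOptions;
  openai?: OpenAIThinkingOptions;
}

const ok: AgentProviderOptionsSketch = {
  google: { thinkingConfig: { includeThoughts: true, thinkingBudget: 10000 } },
};

// Uncommenting this should fail to compile, because mediaResolution
// is not one of the picked keys:
// const bad: AgentProviderOptionsSketch = {
//   google: { mediaResolution: "MEDIA_RESOLUTION_HIGH" },
// };
```

The rejection relies on TypeScript's excess property check for object literals, so it applies when options are written inline as in the usage examples above.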

Notes

  • Experimental: Requires experimental: true on Stagehand init
  • Not supported in CUA mode: Throws StagehandInvalidArgumentError if used with mode: "cua"
  • Anthropic: budgetTokens is required when type: "enabled" (min 1024, max 64000)
  • Type-safe: only the thinking-related options listed above are accepted
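
The first two notes could be enforced roughly as follows. This is a sketch under stated assumptions: the error classes are empty stand-ins for the Stagehand errors of the same names, `ExecuteContextSketch` and `validateProviderOptionsUsage` are hypothetical, and checking CUA mode before the experimental flag is a guess about ordering.

```typescript
// Stand-in error classes; the real ones are defined in Stagehand.
class StagehandInvalidArgumentError extends Error {}
class ExperimentalNotConfiguredError extends Error {}

// Hypothetical input shape, not the real Stagehand types.
interface ExecuteContextSketch {
  experimental: boolean;
  mode?: string;
  providerOptions?: Record<string, unknown>;
}

function validateProviderOptionsUsage(ctx: ExecuteContextSketch): void {
  // Nothing to validate when the caller did not pass providerOptions.
  if (ctx.providerOptions === undefined) return;

  if (ctx.mode === "cua") {
    throw new StagehandInvalidArgumentError(
      "providerOptions is not supported in CUA mode",
    );
  }
  if (!ctx.experimental) {
    throw new ExperimentalNotConfiguredError(
      "providerOptions requires experimental: true on Stagehand init",
    );
  }
}
```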

@changeset-bot (bot) commented Jan 14, 2026

🦋 Changeset detected

Latest commit: 83f5bb9

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages
| Name | Type |
| --- | --- |
| @browserbasehq/stagehand | Patch |
| @browserbasehq/stagehand-evals | Patch |
| @browserbasehq/stagehand-server | Patch |


@greptile-apps (bot, Contributor) commented Jan 14, 2026

Greptile Summary

Added standardized thinking configuration to agent.execute() that maps to provider-specific thinking/reasoning options (Google thinkingConfig, Anthropic thinking, OpenAI reasoningSummary/reasoningEffort).

Key changes:

  • New ThinkingConfig interface with enableThinking, thinkingLevel, and budgetTokens fields
  • buildProviderOptions() method that translates thinking config to provider-specific formats
  • AI SDK warning suppression for Google and Anthropic (both incorrectly warn about features that work correctly)
  • Updated reasoning extraction logic to use reasoningText field from AI SDK
  • Validation that blocks thinking config in CUA mode and requires experimental flag
  • Test coverage for CUA validation

Issues found:

  • Gemini 3 model detection (gemini-3) doesn't match any models in the registry (only gemini-2.0-* and gemini-2.5-* exist)
  • Reasoning extraction fallback may capture non-reasoning text when thinking isn't enabled
  • Missing comma in TypeScript example causes syntax error
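
One possible way to address the fallback concern is to make the text fallback opt-in, so plain response text is never reported as reasoning by default. The sketch below assumes a step object exposing `reasoningText` and `text`; `StepLike` and the `allowTextFallback` parameter are hypothetical, and this is not the handler's actual implementation.

```typescript
// Hypothetical minimal view of a step result: only the two fields used here.
interface StepLike {
  reasoningText?: string;
  text?: string;
}

function extractReasoningFromStep(
  step: StepLike,
  allowTextFallback = false,
): string | undefined {
  // Prefer the dedicated reasoning field when it has content.
  const reasoning = step.reasoningText?.trim();
  if (reasoning) return reasoning;

  // Fall back to the plain text only when the caller explicitly allows it.
  if (allowTextFallback) {
    const text = step.text?.trim();
    if (text) return text;
  }
  return undefined;
}
```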

Confidence Score: 4/5

  • Safe to merge with minor fixes needed for model detection and documentation syntax
  • Implementation is well-structured with proper validation and test coverage, but has a model detection issue (gemini-3 check doesn't match registry) and a fallback logic concern that may capture non-reasoning text. The syntax error in docs needs fixing before merge.
  • packages/core/lib/v3/handlers/v3AgentHandler.ts needs attention for model detection logic (line 114) and reasoning extraction fallback (lines 268-278)

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/core/lib/v3/types/public/agent.ts | Added ThinkingConfig interface and AgentProviderOptions type with comprehensive documentation |
| packages/core/lib/v3/handlers/v3AgentHandler.ts | Implemented provider-specific thinking configuration mapping and warning suppression logic |
| packages/core/lib/v3/agent/utils/validateExperimentalFeatures.ts | Added validation to reject thinking config in CUA mode and require experimental flag |

Sequence Diagram

sequenceDiagram
    participant User
    participant Agent
    participant Validator
    participant Handler
    participant AI_SDK
    participant Provider as LLM Provider

    User->>Agent: execute({ thinking: { enableThinking: true, ... } })
    Agent->>Validator: validateExperimentalFeatures()
    
    alt CUA mode enabled
        Validator-->>Agent: throw StagehandInvalidArgumentError
    else experimental: false
        Validator-->>Agent: throw ExperimentalNotConfiguredError
    else valid config
        Validator-->>Agent: validation passed
    end

    Agent->>Handler: buildProviderOptions(modelId, thinkingConfig)
    
    alt Google (gemini)
        Handler->>Handler: build google.thinkingConfig
        Handler-->>Agent: { google: { thinkingConfig: {...} }, suppressWarnings: true }
    else Anthropic (claude)
        alt budgetTokens missing
            Handler-->>Agent: throw StagehandInvalidArgumentError
        else budgetTokens provided
            Handler->>Handler: build anthropic.thinking
            Handler-->>Agent: { anthropic: { thinking: {...} }, suppressWarnings: true }
        end
    else OpenAI (o1/o3/o4/gpt)
        Handler->>Handler: build openai.reasoningSummary/reasoningEffort
        Handler-->>Agent: { openai: {...}, suppressWarnings: false }
    end

    Agent->>Handler: suppressAiSdkWarnings() if needed
    Handler->>Handler: set AI_SDK_LOG_WARNINGS = false
    
    Agent->>AI_SDK: generateText/streamText(providerOptions)
    AI_SDK->>Provider: API call with thinking config
    Provider-->>AI_SDK: response with reasoning
    AI_SDK-->>Agent: StepResult with reasoningText
    
    Agent->>Handler: extractReasoningFromStep()
    Handler->>Handler: check reasoningText, fallback to text
    Handler-->>Agent: reasoning string
    
    Agent->>Handler: restoreWarnings()
    Handler->>Handler: restore AI_SDK_LOG_WARNINGS
    
    Agent-->>User: AgentResult with reasoning
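
The suppress/restore pair at the end of the diagram can be sketched as a single wrapper that always restores the previous value, even if the model call throws. This assumes the flag is read from `globalThis.AI_SDK_LOG_WARNINGS`; the helper name `withAiSdkWarningsSuppressed` is hypothetical and is not the handler's actual method.

```typescript
// View of globalThis that exposes the flag; the location is an assumption.
type WarningFlagHolder = { AI_SDK_LOG_WARNINGS?: unknown };

async function withAiSdkWarningsSuppressed<T>(
  fn: () => Promise<T>,
): Promise<T> {
  const holder = globalThis as unknown as WarningFlagHolder;
  // Remember whether the flag existed so it can be restored exactly.
  const hadFlag = "AI_SDK_LOG_WARNINGS" in holder;
  const previous = holder.AI_SDK_LOG_WARNINGS;

  holder.AI_SDK_LOG_WARNINGS = false;
  try {
    return await fn();
  } finally {
    // Runs on success and on failure, so the flag never stays suppressed.
    if (hadFlag) {
      holder.AI_SDK_LOG_WARNINGS = previous;
    } else {
      delete holder.AI_SDK_LOG_WARNINGS;
    }
  }
}
```

Because the flag is process-wide in this sketch, two overlapping agent runs would affect each other's warning output; that trade-off is worth keeping in mind if agents run concurrently.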

@greptile-apps (bot, Contributor) left a comment

5 files reviewed, 3 comments


@cubic-dev-ai (bot, Contributor) left a comment

3 issues found across 6 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="packages/core/lib/v3/handlers/v3AgentHandler.ts">

<violation number="1" location="packages/core/lib/v3/handlers/v3AgentHandler.ts:149">
P2: Missing range validation for Anthropic `budgetTokens`. The PR description and types document that Anthropic requires `budgetTokens` between 1024-64000, but only existence is validated. Add bounds checking to provide a helpful error message.</violation>
</file>

<file name="packages/core/lib/v3/types/public/agent.ts">

<violation number="1" location="packages/core/lib/v3/types/public/agent.ts:360">
P3: Missing comma in JSDoc example. Developers copying this example will get a syntax error.</violation>
</file>

<file name="packages/core/tests/public-api/public-types.test.ts">

<violation number="1" location="packages/core/tests/public-api/public-types.test.ts:199">
P2: Missing `ThinkingConfig` entry in `ExpectedExportedTypes` manifest. Since this type is now used in the test cases and is publicly exported, it should be added to the type manifest at the top of the file for consistency (e.g., `ThinkingConfig: Stagehand.ThinkingConfig;`).</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
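
The bounds check requested in the first violation might look like the following. The 1024 and 64000 limits come from the PR description; the function name `checkAnthropicBudget`, the string-or-null return style, and the integer requirement are assumptions made for this sketch.

```typescript
// Limits as stated in the PR description.
const MIN_BUDGET_TOKENS = 1024;
const MAX_BUDGET_TOKENS = 64000;

type AnthropicThinking =
  | { type: "enabled"; budgetTokens: number }
  | { type: "disabled" };

// Returns an error message, or null when the config is acceptable.
function checkAnthropicBudget(thinking: AnthropicThinking): string | null {
  if (thinking.type !== "enabled") return null;

  const { budgetTokens } = thinking;
  // Requiring an integer is an assumption of this sketch.
  if (
    !Number.isInteger(budgetTokens) ||
    budgetTokens < MIN_BUDGET_TOKENS ||
    budgetTokens > MAX_BUDGET_TOKENS
  ) {
    return (
      `budgetTokens must be an integer between ${MIN_BUDGET_TOKENS} ` +
      `and ${MAX_BUDGET_TOKENS}, got ${budgetTokens}`
    );
  }
  return null;
}
```

A caller could wrap a non-null result in StagehandInvalidArgumentError so the user sees the allowed range instead of a provider-side failure.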

Resolve conflict in v3AgentHandler.ts:
- Use 'done' tool name (renamed from 'close')
- Keep thinkingConfig provider options from branch

@pirate (Member) left a comment

I would recommend just passing the config params through to the LLM providers without trying to normalize the names/levels. As more LLM providers appear over time, with more drift between them, we don't want to create a never-ending stream of normalization work for ourselves. Better to just document the recommended configs, but pass through 100% of the provider config options without trying to interpret them.

@pirate (Member) left a comment

really really want this to be generic without special handling for individual providers as much as possible, if you can figure that out then LGTM

it's fine to patch known stuff like mediaResolution: "MEDIA_RESOLUTION_HIGH", but for everything else it should just pass-through what the user provides


// Pass through OpenAI reasoning options directly
if (userOptions?.openai) {
  return { openai: userOptions.openai };

Member

is there any way we can do these generically based on provider name without having to hardcode specific providers here?

e.g. just return userOptions?
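
A provider-agnostic version of that idea could look like the sketch below: user options are passed through untouched for any provider key, and known patches (such as the mediaResolution example mentioned in the review) are merged in underneath. The `ProviderOptionsSketch` type, the `patches` parameter, and the choice to let user values win over patches are all assumptions for illustration.

```typescript
// Provider name -> arbitrary options object; no provider is hardcoded.
type ProviderOptionsSketch = Record<string, Record<string, unknown>>;

function buildProviderOptions(
  userOptions?: ProviderOptionsSketch,
  patches: ProviderOptionsSketch = {},
): ProviderOptionsSketch | undefined {
  const merged: ProviderOptionsSketch = {};

  // Start from the known patches, copied so the inputs are not mutated.
  for (const [name, opts] of Object.entries(patches)) {
    merged[name] = { ...opts };
  }
  // Layer user options on top; user-supplied values win on conflicts.
  for (const [name, opts] of Object.entries(userOptions ?? {})) {
    merged[name] = { ...merged[name], ...opts };
  }

  return Object.keys(merged).length > 0 ? merged : undefined;
}

// Usage: a provider this code has never heard of passes through unchanged.
const example = buildProviderOptions(
  {
    google: { thinkingConfig: { thinkingBudget: 10000 } },
    someNewProvider: { reasoning: "on" },
  },
  { google: { mediaResolution: "MEDIA_RESOLUTION_HIGH" } },
);
```

With this shape, supporting a new provider needs no code change; only the documentation of recommended configs has to keep up.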

const stepReasoning = this.extractReasoningFromStep(event);
if (stepReasoning) {
  state.collectedReasoning.push(stepReasoning);
  this.logger({

Member

can you check if flowLogger is capturing reasoning correctly as well?


3 participants