
Conversation


@mudler mudler commented Jan 20, 2026

Description

This PR adds support for reasoning blocks to the Open Responses API, including:

  1. Open Responses API Integration - Full support for reasoning blocks in the Open Responses API specification
  2. Reasoning Configuration System - Flexible configuration for controlling reasoning extraction behavior and override defaults
  3. Strip Reasoning Only Mode - Option to remove reasoning tags without storing the reasoning content
  4. Custom Thinking Start Tokens - Support for user-defined thinking start tokens
  5. Custom Tag Pairs - Support for user-defined reasoning tag pairs for extraction

Added a comprehensive `reasoning` configuration section that allows overriding LocalAI's defaults with the following options:

  • disable (bool): Completely disable reasoning extraction
  • disable_reasoning_tag_prefill (bool): Disable automatic prepending of thinking start tokens
  • strip_reasoning_only (bool): Extract and remove reasoning tags but discard the reasoning text
  • thinking_start_tokens (array of strings): Custom thinking start tokens to detect in prompts (checked before defaults)
  • tag_pairs (array of objects): Custom tag pairs with start and end fields for reasoning extraction (checked before defaults)

The options in detail:

disable: When true, completely disables reasoning extraction. Original content is returned unchanged.

disable_reasoning_tag_prefill: When true, disables automatic prepending of thinking start tokens. Use this when your model already includes reasoning tags in its output format.
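
For example, a model whose chat template already opens its own thinking block could be configured like this (model name is hypothetical):

```yaml
name: my-reasoning-model   # hypothetical model name
reasoning:
  disable_reasoning_tag_prefill: true
```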

strip_reasoning_only: When enabled:

  • Extracts reasoning tags from content (same as normal extraction)
  • Removes the tags from the cleaned content
  • Sets the extracted reasoning to an empty string

Use case: useful for models that include reasoning tags in their output, when you want those tags removed from the final response but don't need to store or process the reasoning content separately.
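
The behavior can be sketched as follows (an illustrative Python sketch, not LocalAI's actual implementation; the function name and tag pair are assumptions):

```python
import re

def extract_reasoning(content, tag_pairs, strip_reasoning_only=False):
    """Extract reasoning wrapped in tag pairs; optionally discard the text."""
    parts = []
    cleaned = content
    for start, end in tag_pairs:
        pattern = re.escape(start) + r"(.*?)" + re.escape(end)
        parts += re.findall(pattern, cleaned, flags=re.DOTALL)
        cleaned = re.sub(pattern, "", cleaned, flags=re.DOTALL)
    # With strip_reasoning_only, the tags are removed from the content
    # but the extracted reasoning is set to an empty string.
    reasoning = "" if strip_reasoning_only else "\n".join(parts)
    return cleaned.strip(), reasoning

cleaned, reasoning = extract_reasoning(
    "<think>step by step...</think>The answer is 42.",
    [("<think>", "</think>")],
    strip_reasoning_only=True,
)
print(cleaned)          # The answer is 42.
print(repr(reasoning))  # ''
```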

thinking_start_tokens: Allows users to specify custom tokens that indicate reasoning will start in the model output. Custom tokens are checked before default tokens, giving them priority.
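
The priority rule can be illustrated like this (a sketch; the default tokens shown are assumptions, not necessarily LocalAI's built-in list):

```python
DEFAULT_THINKING_START_TOKENS = ["<think>", "<thinking>"]  # assumed defaults

def detect_thinking_start(output, custom_tokens=None):
    # Custom tokens are checked before the defaults, so they take priority.
    for token in (custom_tokens or []) + DEFAULT_THINKING_START_TOKENS:
        if output.lstrip().startswith(token):
            return token
    return None

print(detect_thinking_start("<custom:think>...", ["<custom:think>"]))  # <custom:think>
print(detect_thinking_start("<think>..."))                             # <think>
```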

tag_pairs: Allows users to define custom start/end tag pairs for reasoning extraction. Custom pairs are checked before default pairs, giving them priority.

Example:

```yaml
name: deepseek-model
backend: llama-cpp
parameters:
  model: deepseek.gguf

reasoning:
  disable: false
  disable_reasoning_tag_prefill: false
  strip_reasoning_only: false
  thinking_start_tokens:
    - "<custom:think>"
  tag_pairs:
    - start: "<custom:think>"
      end: "</custom:think>"
```

Notes for Reviewers

Depends on #8132

Signed commits

  • Yes, I signed my commits.

@netlify

netlify bot commented Jan 20, 2026

Deploy Preview for localai ready!

🔨 Latest commit: 3043572
🔍 Latest deploy log: https://app.netlify.com/projects/localai/deploys/696ff5db6adca00007777549
😎 Deploy Preview: https://deploy-preview-8133--localai.netlify.app

@mudler mudler added the enhancement New feature or request label Jan 20, 2026
@mudler mudler force-pushed the feat/openresponses-reasoning branch from e7244bc to 5f4b0da on January 20, 2026 at 20:34
@mudler mudler merged commit c491c6c into master Jan 20, 2026
36 of 37 checks passed
@mudler mudler deleted the feat/openresponses-reasoning branch January 20, 2026 23:11
