fix(core): process all URLs in web_fetch instead of only the first by bdmorgan · Pull Request #22212 · google-gemini/gemini-cli

bdmorgan · 2026-03-12T19:42:51Z

Summary

The web_fetch tool accepts up to 20 URLs but only processed urls[0] in both execute() and executeFallback() paths
Refactored executeFallback() to iterate all valid URLs via a new executeFallbackForUrl() helper
Updated execute() to rate-limit-check and validate (private IP) all URLs, not just the first
Each URL now receives a fair share of the content budget (MAX_CONTENT_LENGTH / urls.length) rather than the full limit
Abort signal is now propagated to retry logic in fallback mode

Changes

packages/core/src/tools/web-fetch.ts
- Extracted single-URL fetch logic into executeFallbackForUrl(url, perUrlContentBudget, signal)
- executeFallback() now iterates all URLs, collects content from each, and sends combined content to the fallback LLM
- execute() rate-limit and private-IP checks now iterate all URLs
- Partial failures are tolerated: if some URLs fail but others succeed, the successful content is still processed

Test plan

Pre-commit hooks (lint + prettier) passing
Preflight (npm run preflight) passing
All 52 web-fetch tests passing
Manual test with multi-URL prompts to verify all URLs are fetched

The web_fetch tool accepts up to 20 URLs but only processed urls[0] in both execute and fallback paths. Now iterates all URLs for rate-limit checks and private IP validation in execute(), and fetches all URLs in fallback mode via a new executeFallbackForUrl() helper. Each URL receives a fair share of the content budget (MAX_CONTENT_LENGTH / urls.length) rather than the full limit. Abort signal is now propagated to retry logic in fallback mode.

gemini-code-assist · 2026-03-12T19:43:06Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the web_fetch tool's capability to handle multiple URLs by ensuring that all provided URLs are processed, rather than just the initial one. It introduces robust error handling, distributes content budget fairly across URLs, and extends critical security and performance checks like rate limiting and private IP validation to cover all requests. This improves the tool's reliability and utility when dealing with multi-source web content.

Highlights

Multi-URL Processing: The web_fetch tool now correctly processes all provided URLs (up to 20) instead of only the first one, addressing a previous limitation.
Fallback Refactoring: The executeFallback() method was refactored to iterate through all valid URLs, collecting content from each and combining it for the fallback LLM.
Comprehensive Checks: Rate limiting and private IP validation are now applied to all URLs in the execute() path, enhancing security and preventing abuse.
Content Budget Distribution: The maximum content length is now divided among all fetched URLs, ensuring fair allocation and preventing a single large URL from consuming the entire budget.
Abort Signal Propagation: The abort signal is now correctly propagated to the retry logic within the fallback mechanism, allowing for proper cancellation.
Partial Failure Tolerance: The fallback mechanism now tolerates partial failures, processing content from successfully fetched URLs even if others fail, improving robustness.

Changelog

packages/core/src/tools/web-fetch.ts
- Extracted single-URL fetching logic into a new private helper method executeFallbackForUrl.
- Modified executeFallback to iterate over all valid URLs, calling executeFallbackForUrl for each, and aggregating their content.
- Implemented a per-URL content budget for executeFallbackForUrl based on the total MAX_CONTENT_LENGTH and the number of URLs.
- Updated the retryWithBackoff call in executeFallbackForUrl to propagate the signal parameter.
- Adjusted the fallbackPrompt to include content from multiple URLs and report errors for failed fetches.
- Updated the returnDisplay message in executeFallback to list all successfully fetched URLs.
- Modified the error message in executeFallback to be generic for multiple URLs.
- Changed execute to loop through all URLs for rate limit checks.
- Modified execute to check for private IPs across all URLs using urls.some().

Activity

Pre-commit hooks (lint + prettier) passing.
Preflight (npm run preflight) passing.
All 52 web-fetch tests passing.
Manual test with multi-URL prompts to verify all URLs are fetched is pending.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request refactors the web_fetch tool to correctly handle multiple URLs, iterating over all of them for rate limiting and private IP checks, and dividing the content budget. However, a high-severity indirect prompt injection vulnerability was identified in the executeFallback method, where untrusted content from fetched URLs is directly concatenated into a prompt for the fallback LLM, potentially allowing an attacker to manipulate its behavior. This issue aligns with the rule to avoid including untrusted input in LLM content. Additionally, consider improving performance by fetching URLs in parallel instead of sequentially to enhance responsiveness when multiple URLs are provided.

gemini-code-assist · 2026-03-12T19:45:44Z

packages/core/src/tools/web-fetch.ts

      const fallbackPrompt = `The user requested the following: "${this.params.prompt}".

-I was unable to access the URL directly. Instead, I have fetched the raw content of the page. Please use the following content to answer the request. Do not attempt to access the URL again.
+I was unable to access the URL(s) directly. Instead, I have fetched the raw content. Please use the following content to answer the request. Do not attempt to access the URLs again.

---
-${textContent}
---
-`;
+${contentParts.join('\n\n')}
+${errors.length > 0 ? `\nNote: Some URLs could not be fetched: ${errors.join('; ')}` : ''}`;
      const result = await geminiClient.generateContent(
        { model: 'web-fetch-fallback' },
        [{ role: 'user', parts: [{ text: fallbackPrompt }] }],


The executeFallback method is vulnerable to indirect prompt injection. It constructs a prompt for the LLM by concatenating untrusted data—specifically the content fetched from external URLs and error messages from failed fetch attempts—directly into the prompt string. An attacker who controls the content of a fetched URL or can manipulate the HTTP response (e.g., the status text) can inject malicious instructions that the LLM might follow. This could lead to the LLM outputting misleading information, performing unauthorized actions if the output is used by other tools, or exfiltrating data.

To remediate this, consider the following:

Use Structured Delimiters: Wrap untrusted content in clear, hard-to-spoof delimiters and instruct the LLM to treat everything within those delimiters as data, not instructions.

Sanitize Input: Sanitize the fetched content and error messages to remove or escape potential injection sequences.

Constrain the LLM: Use a separate, highly-constrained LLM call to summarize or extract information from the untrusted content before including it in the main prompt.

Escape User Input: Ensure that this.params.prompt is properly escaped when included in the fallbackPrompt to prevent direct injection if the user prompt contains quotes.

References

To prevent prompt injection, avoid including user-provided input in content passed to the LLM (llmContent). This principle extends to any untrusted external data, which should be handled with returnDisplay if needed for display, or sanitized/constrained if used in prompts.

gemini-code-assist · 2026-03-12T19:45:44Z

packages/core/src/tools/web-fetch.ts

+    for (const rawUrl of urls) {
+      try {
+        const textContent = await this.executeFallbackForUrl(
+          rawUrl,
+          perUrlContentBudget,
+          signal,
+        );
+        contentParts.push(
+          `--- Content from ${rawUrl} ---\n${textContent}\n---`,
+        );
+        fetchedUrls.push(rawUrl);
+      } catch (e) {
+        // eslint-disable-next-line @typescript-eslint/no-unsafe-type-assertion
+        const error = e as Error;
+        errors.push(`Error fetching ${rawUrl}: ${error.message}`);
      }
+    }


While the logic to handle multiple URLs in the fallback is correct, fetching them sequentially in a for...of loop can be inefficient and slow, especially when many URLs are provided. To improve performance, these independent network requests should be executed in parallel.

You can use Promise.allSettled to fire off all fetch requests concurrently and then process the results, which aligns well with the existing logic for handling both successful fetches and errors.

const fetchPromises = urls.map((rawUrl) => this.executeFallbackForUrl(rawUrl, perUrlContentBudget, signal), ); const results = await Promise.allSettled(fetchPromises); results.forEach((result, index) => { const rawUrl = urls[index]; if (result.status === 'fulfilled') { contentParts.push( `--- Content from ${rawUrl} ---\n${result.value}\n---`, ); fetchedUrls.push(rawUrl); } else { // eslint-disable-next-line @typescript-eslint/no-unsafe-type-assertion const error = result.reason as Error; errors.push(`Error fetching ${rawUrl}: ${error.message}`); } });

github-actions · 2026-03-12T19:46:54Z

Size Change: +1.26 kB (0%)

Total Size: 26.6 MB

Filename	Size	Change
`./bundle/gemini.js`	26.1 MB	+1.26 kB (0%)

ℹ️ View Unchanged

Filename	Size
`./bundle/node_modules/@google/gemini-cli-devtools/dist/client/main.js`	221 kB
`./bundle/node_modules/@google/gemini-cli-devtools/dist/src/_client-assets.js`	227 kB
`./bundle/node_modules/@google/gemini-cli-devtools/dist/src/index.js`	11.5 kB
`./bundle/node_modules/@google/gemini-cli-devtools/dist/src/types.js`	132 B
`./bundle/sandbox-macos-permissive-open.sb`	890 B
`./bundle/sandbox-macos-permissive-proxied.sb`	1.31 kB
`./bundle/sandbox-macos-restrictive-open.sb`	3.36 kB
`./bundle/sandbox-macos-restrictive-proxied.sb`	3.56 kB
`./bundle/sandbox-macos-strict-open.sb`	4.82 kB
`./bundle/sandbox-macos-strict-proxied.sb`	5.02 kB

_{compressed-size-action}

gemini-code-assist bot reviewed Mar 12, 2026

View reviewed changes

gemini-cli bot added the status/need-issue Pull requests that need to have an associated issue. label Mar 12, 2026

bdmorgan mentioned this pull request Mar 12, 2026

feat(core): implement Stage 1 improvements for webfetch tool #21313

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(core): process all URLs in web_fetch instead of only the first#22212

fix(core): process all URLs in web_fetch instead of only the first#22212
bdmorgan wants to merge 1 commit intomainfrom
fix/web-fetch-process-all-urls

bdmorgan commented Mar 12, 2026

Uh oh!

gemini-code-assist bot commented Mar 12, 2026

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Mar 12, 2026

Uh oh!

gemini-code-assist bot Mar 12, 2026

Uh oh!

github-actions bot commented Mar 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

bdmorgan commented Mar 12, 2026

Summary

Changes

Test plan

Uh oh!

gemini-code-assist bot commented Mar 12, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Mar 12, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Mar 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant