Skip to content

Allow filter_input to work again#1658

Open
withinboredom wants to merge 6 commits intomainfrom
fix/1657
Open

Allow filter_input to work again#1658
withinboredom wants to merge 6 commits intomainfrom
fix/1657

Conversation

@withinboredom
Copy link
Member

There was a small logic bug in frankenphp.c where if should_filter_var is true, it would never execute the next part of the if-statement, and thus never call sapi_module.input_filter(). This PR simply adjusts the logic to ensure if should_filter_var is true, then call sapi_module.input_filter(), otherwise, only enter the body if should_filter_var is false.

closes #1657

@AlliBalliBaba
Copy link
Contributor

AlliBalliBaba commented Jun 18, 2025

filter_input in combination with $_SERVER and $_ENV is disabled by default since 1.3 if nothing is explicitly set in the php.ini. It makes global registration very inefficient, even though it's probably not used 99% of the time with $_SERVER and $_ENV.

It still works if any filter is defined explicitly in the php.ini (also filter.default=unsafe_raw), it probably makes sense to document this.

with any filter_vars:
image

without filter_vars:
image

The slowdown comes from the fact that filter_var allocates the whole $_SERVER array 2 additional times if I remember correctly.

I imagine there will be some kind of optimization coming in PHP 9, since the ini directive is also apparently deprecated.

@withinboredom
Copy link
Member Author

withinboredom commented Jun 18, 2025

Hmmm. I was going to look into that. filter.default is deprecated and going away in php 9, so I'm not sure if that's a good idea to rely on it. (And if filter.default is not set, there is still a default filter IIRC).

@AlliBalliBaba
Copy link
Contributor

Yeah the default filter is unsafe_raw, which doesn't do anything without flags (it will just copy $_SERVER in case filter_input(INPUT_SERVER) is called during the request).

The slowdown is probably not worth it, and I'd recommend using filter_var($_SERVER["REQUEST_METHOD"] ?? "" directly if this stays unoptimized in PHP 9. But hard to say, maybe PHP 9 will even have a request object.

@withinboredom
Copy link
Member Author

withinboredom commented Jun 18, 2025

Hmmm. So, if you don’t set a default right now, filter_input doesn’t do anything in FrankenPHP? It just returns an empty string? I’m not sure if that’s better or worse for libraries/frameworks that expect it to at least return unsafe-raw if the extension is available.

filter_input is the only way to get around hackers/frameworks/libraries messing with $_SERVER (and other super globals) in memory. It doesn’t access them at all, is immutable, and should return the "raw/filtered" data.

@AlliBalliBaba
Copy link
Contributor

filter_input still works with all globals, only for $_SERVER and $_ENV filter.default has to be explicitly defined in the php.ini, else it will always return NULL.

I think this is definitely the correct default for worker mode, since worker mode necessarily has to register $_SERVER on every request. Production grade $_SERVER arrays often end up being multiple KB in size and there would be no way to avoid those additional allocations, even though all frameworks supporting a worker mode don't make use of filter_input(INPUT_SERVER).

That being said, I would be fine with enabling it in 'regular mode', if you think the performance hit is worth being more compatible with FPM defaults.

As a sidenote:
Searching for $_SERVER on github yields 3.4 million results, while searching for filter_input(INPUT_SERVER only yields 6300. There's also an (unrelated) note of caution in the manual against using it with $_SERVER.

@withinboredom
Copy link
Member Author

I have strong opinions about neutering built-in extensions just to eek out a little extra performance. If users want greater performance, they can build php with --disable-filter or we should build it without that extension available, so users have to install it themselves.

Otherwise, if the extension is available, we should behave just like any SAPI, even if it hurts performance (or make changes to php-src so that the performance impact is mitigated).

Searching for $_SERVER on github yields 3.4 million results, while searching for filter_input(INPUT_SERVER only yields 6300.

As mentioned earlier, filter_input is the only way to ensure nothing has modified input data (handy in untrusted environments like WordPress plugins). Just because most code is built to run in trusted environments (i.e. an application), doesn’t make the point moot.

There's also an (unrelated) note of caution in the manual against using it with $_SERVER.

That bug was fixed ~15 years ago. However, many frameworks also inject things into $_ENV and $_SERVER, such as .env file loaders, making it somewhat still relevant in those frameworks.

wdyt @dunglas?

@dunglas
Copy link
Member

dunglas commented Jun 21, 2025

I agree with @withinboredom rationale. However it would be nice to fix that in php-src.

@AlliBalliBaba
Copy link
Contributor

I guess I disagree, at least for worker mode.

I think the defaults for worker mode should be fast, performance is the whole point of its existence.
Building PHP without --filter isn't really an option, many libraries and applications still rely on the filter extension (it's a good extension).

Only the filter_input(INPUT_SERVER combination is niche enough to not be worth the slower default. I also think this should be optimized in php-src, first the deprecated global filters need to be removed entirely in PHP 9 though.

@withinboredom
Copy link
Member Author

withinboredom commented Jun 22, 2025

However it would be nice to fix that in php-src.

@AlliBalliBaba Can you give this branch+fork of php-src + this branch of FrankenPHP a go? (you should be able to apply the diff to any branch of PHP as well).

I think the defaults for worker mode should be fast, performance is the whole point of its existence.

It shouldn't sacrifice correctness for performance. The whole point is to make it faster while still being correct. Otherwise, we could just delete half PHP to make it faster. (edit to add: though I actually think I agree with you -- I was mostly just arguing on principles. Worker mode is already quirky.)

@AlliBalliBaba
Copy link
Contributor

Looks like this branch would at least avoid 1 $_SERVER copy, I'll give it a go 👍

@AlliBalliBaba
Copy link
Contributor

Hard to say from looking at the flamegraphs alone, but I think your branch makes this a bit more efficient. Last time I looked at it, it was actually copying the whole memory twice over, I think your branch only copies it once. Haven't tested it for correctness, but this probably also speeds up all other SAPIs.

I still think it would be fine to break with FPM default behavior in worker mode.

your fork + this branch
image

main
image

@dunglas
Copy link
Member

dunglas commented Jun 27, 2025

Should we merge this one?

@AlliBalliBaba
Copy link
Contributor

filter_input is also on the chopping block for deprecations in 8.5 (not yet approved though)

@withinboredom
Copy link
Member Author

Let's see how that RFC shakes out then. I'm just getting back from a short vacation and have a bajillion emails to read through -- I hope it doesn't get removed, but rather it may be better to replace it with something else.

@withinboredom
Copy link
Member Author

Looks like none of the filter deprecations passed.

@henderkes
Copy link
Contributor

given that they are not deprecated, we should probably merge this. @dunglas ?

@AlliBalliBaba
Copy link
Contributor

IMO we have the superior defaults. You can still enable the global filter, but you have to explicitly set it in the php.ini.

filter.default=unsafe_raw # only unsafe_raw is not deprecated

Enabling global filters will create 2 additional deep-copies of each superglobal, which seems like it is too pricey for a feature that 99% of people don't use.

If you really think it should be enabled by default, we should at least keep the current logic in worker mode, where very likely not a single soul is using filter_input(INPUT_SERVER, ...).

@henderkes
Copy link
Contributor

Perhaps, instead of changing it in the code, we could change the default Caddyfile to include the php_ini setting then. The only thing important is to stay compatible with fpm applications by default. Would you be happy with that instead?

size_t new_val_len = val_len;
if (!should_filter_var ||
sapi_module.input_filter(PARSE_SERVER, key, &val, new_val_len,
if (sapi_module.input_filter(PARSE_SERVER, key, &val, new_val_len,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would also be fine with just keeping the current logic in worker mode:

Suggested change
if (sapi_module.input_filter(PARSE_SERVER, key, &val, new_val_len,
if ((is_worker_thread && !should_filter_var) || sapi_module.input_filter(PARSE_SERVER, key, &val, new_val_len,

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

filter_input(INPUT_SERVER, ...) not populated

4 participants