Skip to content

[CORE-13202] config: introduce sasl mechanisms override#27458

Merged
IoannisRP merged 2 commits intoredpanda-data:devfrom
IoannisRP:CORE-13202/sasl-mechanisms-override
Sep 16, 2025
Merged

[CORE-13202] config: introduce sasl mechanisms override#27458
IoannisRP merged 2 commits intoredpanda-data:devfrom
IoannisRP:CORE-13202/sasl-mechanisms-override

Conversation

@IoannisRP
Copy link
Copy Markdown
Contributor

@IoannisRP IoannisRP commented Sep 4, 2025

implements: CORE-13202

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v25.2.x
  • v25.1.x
  • v24.3.x

Release Notes

Features

  • Introduce new "sasl_mechanisms_override" configuration that allows per-listener controls over the sasl mechanisms.

@IoannisRP IoannisRP self-assigned this Sep 4, 2025
@IoannisRP IoannisRP force-pushed the CORE-13202/sasl-mechanisms-override branch from c23baa3 to 826d0c2 Compare September 4, 2025 10:48
@IoannisRP IoannisRP marked this pull request as ready for review September 4, 2025 10:50
@IoannisRP IoannisRP requested a review from a team as a code owner September 4, 2025 10:50
@IoannisRP IoannisRP requested review from a team, michael-redpanda and pgellert and removed request for a team September 4, 2025 10:50
"sasl_mechanisms_overrides",
"A list of sasl mechanisms overrides, defined by listener. Same "
"limitations "
"apply as in `sasl_mechanisms`.",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"apply as in `sasl_mechanisms`.",
"apply as for `sasl_mechanisms`.",

Should this also mention enterprise_sasl_mechanism?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also mention enterprise_sasl_mechanism?

We don't even describe what is an enterprise value for the sasl_mechanisms 😅
Probably something to deal with in a follow up pr

Copy link
Copy Markdown
Contributor

@asimms41 asimms41 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few suggestions.

@vbotbuildovich
Copy link
Copy Markdown
Collaborator

vbotbuildovich commented Sep 4, 2025

CI test results

test results on build#71775
test_class test_method test_arguments test_kind job_url test_status passed reason
RandomNodeOperationsTest test_node_operations {"cloud_storage_type": 1, "compaction_mode": "adjacent_merge", "enable_failures": true, "mixed_versions": true, "with_iceberg": false} integration https://buildkite.com/redpanda/redpanda/builds/71775#01991495-fa31-4597-8733-5f7005a76590 FLAKY 20/21 upstream reliability is '97.77070063694268'. current run reliability is '95.23809523809523'. drift is 2.53261 and the allowed drift is set to 50. The test should PASS
RandomNodeOperationsTest test_node_operations {"cloud_storage_type": 2, "compaction_mode": "chunked_sliding_window", "enable_failures": true, "mixed_versions": true, "with_iceberg": false} integration https://buildkite.com/redpanda/redpanda/builds/71775#01991495-fa32-4a24-988e-49275316b117 FLAKY 19/21 upstream reliability is '99.21875'. current run reliability is '90.47619047619048'. drift is 8.74256 and the allowed drift is set to 50. The test should PASS
test results on build#71804
test_class test_method test_arguments test_kind job_url test_status passed reason
CloudStorageScrubberTest test_scrubber {"cloud_storage_type": 2} integration https://buildkite.com/redpanda/redpanda/builds/71804#0199193d-0195-4b01-b829-46ce56a0fc4c FLAKY 18/21 upstream reliability is '99.3975903614458'. current run reliability is '85.71428571428571'. drift is 13.6833 and the allowed drift is set to 50. The test should PASS
JavaCompressionTest test_upgrade_java_compression {"compression_type": "gzip"} integration https://buildkite.com/redpanda/redpanda/builds/71804#01991937-83fb-4d25-a296-0be94afbba92 FAIL 0/21 The test has failed across all retries
JavaCompressionTest test_upgrade_java_compression {"compression_type": "gzip"} integration https://buildkite.com/redpanda/redpanda/builds/71804#0199193d-0195-4b01-b829-46ce56a0fc4c FAIL 0/21 The test has failed across all retries
JavaCompressionTest test_upgrade_java_compression {"compression_type": "lz4"} integration https://buildkite.com/redpanda/redpanda/builds/71804#01991937-83fc-4d3e-8179-edc8e2f7147e FAIL 0/21 The test has failed across all retries
JavaCompressionTest test_upgrade_java_compression {"compression_type": "lz4"} integration https://buildkite.com/redpanda/redpanda/builds/71804#0199193d-0196-45fa-88af-8c53c5b468c6 FAIL 0/21 The test has failed across all retries
JavaCompressionTest test_upgrade_java_compression {"compression_type": "snappy"} integration https://buildkite.com/redpanda/redpanda/builds/71804#01991937-83fd-4bc8-bbb1-0b34f4a7773f FAIL 0/21 The test has failed across all retries
JavaCompressionTest test_upgrade_java_compression {"compression_type": "snappy"} integration https://buildkite.com/redpanda/redpanda/builds/71804#0199193d-0197-4140-81cf-d1689f21e590 FAIL 0/21 The test has failed across all retries
JavaCompressionTest test_upgrade_java_compression {"compression_type": "zstd"} integration https://buildkite.com/redpanda/redpanda/builds/71804#01991937-83fd-4226-8c2e-6214a17ae885 FAIL 0/21 The test has failed across all retries
JavaCompressionTest test_upgrade_java_compression {"compression_type": "zstd"} integration https://buildkite.com/redpanda/redpanda/builds/71804#0199193d-0197-49ac-bfd5-9a8e3b9d0bed FAIL 0/21 The test has failed across all retries
RandomNodeOperationsTest test_node_operations {"cloud_storage_type": 2, "compaction_mode": "adjacent_merge", "enable_failures": true, "mixed_versions": true, "with_iceberg": false} integration https://buildkite.com/redpanda/redpanda/builds/71804#01991937-83fd-4bc8-bbb1-0b34f4a7773f FLAKY 20/21 upstream reliability is '99.19028340080972'. current run reliability is '95.23809523809523'. drift is 3.95219 and the allowed drift is set to 50. The test should PASS
test results on build#71893
test_class test_method test_arguments test_kind job_url test_status passed reason
DatalakeE2ETests test_json_schema_unicode {"catalog_type": "rest_hadoop", "cloud_storage_type": 1, "query_engine": "trino"} integration https://buildkite.com/redpanda/redpanda/builds/71893#01992dfb-36ff-43f1-96d0-bb9b77ad6711 FLAKY 20/21 upstream reliability is '99.64912280701755'. current run reliability is '95.23809523809523'. drift is 4.41103 and the allowed drift is set to 50. The test should PASS
RandomNodeOperationsTest test_node_operations {"cloud_storage_type": 2, "compaction_mode": "adjacent_merge", "enable_failures": true, "mixed_versions": true, "with_iceberg": false} integration https://buildkite.com/redpanda/redpanda/builds/71893#01992e00-073b-4aad-9703-2f7769805a6e FLAKY 19/21 upstream reliability is '99.06976744186046'. current run reliability is '90.47619047619048'. drift is 8.59358 and the allowed drift is set to 50. The test should PASS
test results on build#72162
test_class test_method test_arguments test_kind job_url test_status passed reason
PartitionBalancerTest test_recovery_mode_rebalance_finish null integration https://buildkite.com/redpanda/redpanda/builds/72162#01993eb1-1eb6-4e10-89d0-7dc60112966a FLAKY 14/21 upstream reliability is '94.81865284974094'. current run reliability is '66.66666666666666'. drift is 28.15199 and the allowed drift is set to 50. The test should PASS
SimpleEndToEndTest test_relaxed_acks {"write_caching": false} integration https://buildkite.com/redpanda/redpanda/builds/72162#01993eb1-1eb3-43b2-91fe-1da449021db6 FLAKY 20/21 upstream reliability is '98.83720930232558'. current run reliability is '95.23809523809523'. drift is 3.59911 and the allowed drift is set to 50. The test should PASS
test results on build#72245
test_class test_method test_arguments test_kind job_url test_status passed reason
SchemaEvolutionE2ETests test_legal_schema_evolution {"catalog_type": "rest_hadoop", "cloud_storage_type": 1, "produce_mode": "avro", "query_engine": "trino", "test_case": "add_column"} integration https://buildkite.com/redpanda/redpanda/builds/72245#01994ce8-4890-4d08-81f4-a55b79247d77 FLAKY 20/21 upstream reliability is '99.64285714285714'. current run reliability is '95.23809523809523'. drift is 4.40476 and the allowed drift is set to 50. The test should PASS
SchemaEvolutionE2ETests test_old_schema_writer {"catalog_type": "rest_hadoop", "cloud_storage_type": 1, "produce_mode": "avro", "query_engine": "trino", "test_case": "promote_column"} integration https://buildkite.com/redpanda/redpanda/builds/72245#01994ce8-4890-4d08-81f4-a55b79247d77 FLAKY 19/21 upstream reliability is '100.0'. current run reliability is '90.47619047619048'. drift is 9.52381 and the allowed drift is set to 50. The test should PASS

@IoannisRP IoannisRP force-pushed the CORE-13202/sasl-mechanisms-override branch from 826d0c2 to 6aa412a Compare September 5, 2025 08:59
@IoannisRP
Copy link
Copy Markdown
Contributor Author

Changes in force-push:

  • update config description

@vbotbuildovich
Copy link
Copy Markdown
Collaborator

vbotbuildovich commented Sep 5, 2025

Retry command for Build#71804

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/tests/compatibility/java_compression_test.py::JavaCompressionTest.test_upgrade_java_compression@{"compression_type":"lz4"}
tests/rptest/tests/compatibility/java_compression_test.py::JavaCompressionTest.test_upgrade_java_compression@{"compression_type":"zstd"}
tests/rptest/tests/compatibility/java_compression_test.py::JavaCompressionTest.test_upgrade_java_compression@{"compression_type":"snappy"}
tests/rptest/tests/compatibility/java_compression_test.py::JavaCompressionTest.test_upgrade_java_compression@{"compression_type":"gzip"}

@IoannisRP IoannisRP requested a review from asimms41 September 8, 2025 09:44
asimms41
asimms41 previously approved these changes Sep 8, 2025
Copy link
Copy Markdown
Contributor

@asimms41 asimms41 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Contributor

@michael-redpanda michael-redpanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Can we maybe expand that PLAIN test to validate that if PLAIN is in the global list but not in the overrides that client attempting to connect via PLAIN fails?

inline constexpr auto enterprise_sasl_mechanisms
= std::to_array<std::string_view>({gssapi, oauthbearer});

bool enterprise_sasl_mechanism(const ss::sstring& sasl_mech);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two nits:

  • docs
  • maybe should be is_enterprise_sasl_mechanism

bool enterprise_sasl_mechanism(const ss::sstring& sasl_mech);

// Checks if `sasl_mech` is enabled in sasl_mechanisms config
bool has_sasl_mechanism(const std::string_view sasl_mech);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: to make this easier to use in tests, maybe also supply a reference to the config store you wish to check against rather than always using config::shard_local_cfg()

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config_store deals in base_property which doesn't know anything about the concrete types of the property values. We would have to serialize the property (which is the only way to inspect a property value through a base_property) and do string manipulations to see if sasl_mech exists or not in it.

@IoannisRP
Copy link
Copy Markdown
Contributor Author

changes in force-push:

  • update config/sasl_mechanisms.h api (naming + comments)
  • extend PLAIN sasl dt test to check disabling PLAIN through an override

bool is_enterprise_sasl_mechanism(const ss::sstring& sasl_mech);

// Checks if `sasl_mech` is enabled in sasl_mechanisms config
bool has_sasl_mechanism(const std::string_view sasl_mech);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why const? string_view is already intended to be read-only.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same difference between a const ptr* and const ptr* const.

It is not about the underlying data that the string_view is pointing at. It is about the string_view itself.
I am a big fan of "const-everything", in general, even when it doesn't make a significant (any?) difference.

Following the same logic, i tend to prefer void foo(const int) over void foo(int) (assuming the value doesn't change during the function execution).

pgellert
pgellert previously approved these changes Sep 10, 2025
Copy link
Copy Markdown
Contributor

@pgellert pgellert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Comment on lines +29 to +31
const auto overrides = config::shard_local_cfg().sasl_mechanisms_overrides()
| std::views::transform(
&sasl_mechanisms_override::sasl_mechanisms);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check here if a particular listener exists at all? I think it's fine not to, but wondering if others have an opinion here.

@IoannisRP IoannisRP force-pushed the CORE-13202/sasl-mechanisms-override branch from 014229b to 9ed3043 Compare September 12, 2025 15:04
@IoannisRP
Copy link
Copy Markdown
Contributor Author

changes in force-push

  • rebase to dev (includes security report feature)

changes in force-push:

  • implement changes to security report to include the override behavior

This new config is a way to control the cluster's sasls_mechanisms
on a per-listener basis. For a given kafka listener, if there is
an entry in the sasl_mechanisms_overrides config, the value from
sasl_mechanisms will be ignored and the override will be used in its
place.

For any functionality that needs to know if a sasl mechanisms is active,
like for the enterprise features check, it is considered active if it
is used in any listener override.
@IoannisRP IoannisRP force-pushed the CORE-13202/sasl-mechanisms-override branch from 9ed3043 to 5a79084 Compare September 15, 2025 09:23
@IoannisRP
Copy link
Copy Markdown
Contributor Author

IoannisRP commented Sep 15, 2025

changes in force-push:

  • update config descriptions
  • change config visibility from tunable to user

Copy link
Copy Markdown
Contributor

@pgellert pgellert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@IoannisRP IoannisRP merged commit bb7f9af into redpanda-data:dev Sep 16, 2025
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants