kafka: Implement Topic IDs for Metadata, Fetch, CreateTopics #26968
michael-redpanda merged 11 commits into redpanda-data:dev
Pull Request Overview
This PR implements Kafka Fetch API version 13 (v13) support to enable Topic ID functionality in Redpanda. Topic IDs provide an alternative mechanism to identify topics that is more robust than topic names. The changes span multiple components including handlers, test infrastructure, protocol types, and client libraries.
Key changes implemented:
- Extends the metadata and fetch handlers to support API versions up to 12 and 13, respectively
- Implements Topic ID to topic name resolution in metadata and fetch operations
- Updates fetch sessions and caching to use kafka-internal topic partition identifiers (kitp)
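The Topic ID to topic name resolution listed above can be pictured roughly as follows. This is only a minimal sketch, not the code from the PR: `topic_id`, `topic_id_index`, and `resolved_topic` are hypothetical stand-ins, and the one detail taken from the file summary is that `unknown_topic_id` carries error code 100.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a 128-bit topic UUID; not Redpanda's actual type.
using topic_id = std::array<std::uint8_t, 16>;

enum class error_code : std::int16_t {
    none = 0,
    unknown_topic_id = 100, // value listed for errors.h in the file summary
};

// Result of a lookup: either a name, or an error with an empty name.
struct resolved_topic {
    error_code error;
    std::string name;
};

// Hypothetical index from topic ID to topic name.
class topic_id_index {
public:
    void add(const topic_id& id, std::string name) {
        assert(!name.empty());
        _names[id] = std::move(name);
    }

    // A request that carries only an ID is resolved to a name before the
    // name-based code path runs; an ID that is not known maps to an error.
    resolved_topic resolve(const topic_id& id) const {
        auto it = _names.find(id);
        if (it == _names.end()) {
            return {error_code::unknown_topic_id, {}};
        }
        return {error_code::none, it->second};
    }

private:
    std::map<topic_id, std::string> _names;
};
```

A handler built this way would call `resolve()` once per requested topic and report `unknown_topic_id` for that topic only, leaving the rest of the request unaffected.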
Reviewed Changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/rptest/tests/compatibility/sarama_test.py | Adds version 2.6.0 to compatibility test matrix for broader protocol coverage |
| tests/rptest/tests/compatibility/sarama_produce_test.py | Adds version 2.6.0 to produce test matrix |
| src/v/pandaproxy/error.cc | Maps unknown_topic_id error to partition_not_found for proxy error handling |
| src/v/kafka/server/tests/metadata_test.cc | Adds comprehensive tests for metadata v12+ including topic ID resolution and auto-creation |
| src/v/kafka/server/tests/fetch_test.cc | Updates fetch tests to support topic IDs and adds leader epoch validation tests |
| src/v/kafka/server/tests/fetch_session_test.cc | Updates fetch session tests to use new kitp (kafka internal topic partition) types |
| src/v/kafka/server/tests/fetch_bench.cc | Adds performance benchmarks across different API versions |
| src/v/kafka/server/tests/BUILD | Adds cluster test fixture dependency for enhanced test capabilities |
| src/v/kafka/server/handlers/metadata.h | Bumps metadata handler max supported version to 12 |
| src/v/kafka/server/handlers/metadata.cc | Implements topic ID lookup and validation in metadata requests |
| src/v/kafka/server/handlers/fetch.h | Bumps fetch handler max supported version to 13 and enhances read_result |
| src/v/kafka/server/handlers/fetch.cc | Implements comprehensive fetch v13 support with topic ID resolution |
| src/v/kafka/server/fetch_session_cache.cc | Updates fetch session cache to use kitp identifiers |
| src/v/kafka/server/fetch_session.h | Introduces kitp types for topic partition identification with topic IDs |
| src/v/kafka/protocol/types.h | Updates uuid type to use underlying uuid_t |
| src/v/kafka/protocol/types.cc | Updates uuid string conversion to work with new underlying type |
| src/v/kafka/protocol/errors.h | Adds unknown_topic_id error code (100) |
| src/v/kafka/protocol/errors.cc | Implements unknown_topic_id error handling and retriability |
| src/v/kafka/client/errors.h | Marks unknown_topic_id as retriable error |
| src/v/kafka/client/direct_consumer/fetcher.h | Documents topic identifier support |
| src/v/kafka/client/direct_consumer/fetcher.cc | Caps client fetch API version at 12 for compatibility |
| src/v/cluster/types.h | Adds comparison operator to leader_term for testing |
| src/v/cluster/tests/cluster_test_fixture.h | Fixes namespace collision with ss::do_until |
| src/v/cluster/metadata_cache.h | Adds topic ID to name lookup method |
Comments suppressed due to low confidence (1)
src/v/kafka/server/handlers/fetch.cc:857
- [nitpick] The variable 'errored_partitions' now contains leader information but the name doesn't reflect this change. Consider renaming to 'errored_partitions_with_leaders' or 'partition_errors'.
std::vector<std::tuple<
IoannisRP
left a comment
What about the linearizable_barrier that you mentioned in the previous PR?
    wait_for(10s, [&] {
        auto [app_ptr, _] = get_leader(ntp);
        return app_ptr != nullptr;
    });

    auto [app_ptr, partition] = get_leader(ntp);
nth (nice to have):
It looks like our testing utils are missing a wait_for variant that returns the result of its last invocation (like the wait_until variant in rptests).
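One possible shape for such a helper is sketched below. It is an illustration only: `wait_for_result` is a hypothetical name, and it polls synchronously with `std::this_thread::sleep_for`, whereas the real test utilities are presumably future-based. The probe returns a `std::optional`, and the helper hands back the value from the last successful invocation, so the caller does not need to repeat the lookup.

```cpp
#include <cassert>
#include <chrono>
#include <optional>
#include <stdexcept>
#include <thread>
#include <type_traits>
#include <utility>

// Hypothetical helper: calls `probe` until it returns an engaged
// std::optional, then returns the contained value. Throws on timeout.
template<typename Probe>
auto wait_for_result(
  std::chrono::milliseconds timeout,
  Probe probe,
  std::chrono::milliseconds backoff = std::chrono::milliseconds(10)) ->
  typename std::invoke_result_t<Probe&>::value_type {
    assert(backoff.count() > 0);
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;) {
        if (auto result = probe()) {
            // The value from the last (successful) invocation is returned.
            return std::move(*result);
        }
        if (std::chrono::steady_clock::now() >= deadline) {
            throw std::runtime_error("wait_for_result: timed out");
        }
        std::this_thread::sleep_for(backoff);
    }
}
```

With a helper like this, the snippet under review could collapse into a single call whose probe returns the leader pair once `app_ptr` is non-null, instead of waiting and then calling `get_leader(ntp)` a second time.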
dotnwat
left a comment
If the kitp type is good to go, you can merge like the first 10 commits in a separate PR as-is, I think. Even ignoring kitp, quite a few of the early commits can go in. Please consider plucking out uncontroversial commits into separate PRs to keep the flow going and reduce the size of PRs.
Good point: #27249
Performance change detected in https://buildkite.com/redpanda/redpanda/builds/72070#019938b2-abf0-48c3-8d91-0656490fbf2c: See https://redpandadata.atlassian.net/wiki/x/LQAqLg for docs
Instead of constructing a default ktp and then returning its members, resulting in a stack-use-after-scope, return a reference to *this, via as_ktp(), which was always the intention. Signed-off-by: Ben Pope <ben@redpanda.com>
This fix is required: twmb/franz-go#1091 Signed-off-by: Ben Pope <ben@redpanda.com>
This includes many fixes related to metadata handling. Signed-off-by: Ben Pope <ben@redpanda.com>
Signed-off-by: Ben Pope <ben@redpanda.com>
Signed-off-by: Ben Pope <ben@redpanda.com>
Populate the current leader id and epoch under certain error conditions. LastFetchedEpoch is only for IBP, so no change required. Signed-off-by: Ben Pope <ben@redpanda.com>
Topic Identifiers need explicit support. Signed-off-by: Ben Pope <ben@redpanda.com>
Signed-off-by: Ben Pope <ben@redpanda.com>
Signed-off-by: Ben Pope <ben@redpanda.com>
This reverts (some of) commit 22f55b9.
Clients expect all brokers in the cluster to advertise the same API version support. To avoid confusing clients during rolling upgrades, we must ensure that API version bumps are applied consistently across all brokers via feature flags.
The relevant features are:
- topic_ids: Enables assigning unique IDs to topics.
- topic_ids_api: Holds back API versions that depend on topic_ids until the feature is fully enabled.
Signed-off-by: Ben Pope <ben@redpanda.com>
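The gating described in this commit message could look roughly like the sketch below. It is not taken from the PR: `feature_state` and `advertised_fetch_max` are hypothetical names, requiring both flags is an assumption, and the fallback of 12 is assumed from the Kafka protocol, where Fetch v13 is the first version that addresses topics by ID.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical view of cluster-wide feature state; not Redpanda's API.
struct feature_state {
    bool topic_ids_active = false;     // models the "topic_ids" feature
    bool topic_ids_api_active = false; // models the "topic_ids_api" feature
};

// Fetch handler maximum from the PR overview.
constexpr std::int16_t fetch_max_version = 13;
// Assumption: the last Fetch version that does not depend on topic IDs.
constexpr std::int16_t fetch_max_version_without_topic_ids = 12;
static_assert(fetch_max_version_without_topic_ids < fetch_max_version);

// Highest Fetch version a broker advertises. Every broker derives it from
// the same cluster-wide flags, so all brokers switch together.
// Assumption: both features must be active before the bump is exposed.
inline std::int16_t advertised_fetch_max(const feature_state& f) {
    const bool enabled = f.topic_ids_active && f.topic_ids_api_active;
    return enabled ? fetch_max_version : fetch_max_version_without_topic_ids;
}

// Whether a request at `version` is accepted under the current flags.
inline bool is_fetch_version_supported(
  const feature_state& f, std::int16_t version) {
    return version >= 0 && version <= advertised_fetch_max(f);
}
```

Because the advertised maximum depends only on feature state that activates cluster-wide, a client should not see one broker offering v13 while another still stops at v12 in the middle of a rolling upgrade.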
The test failures: it took over 99s, but it did start. There are no timestamps in the log, but it did start and was then sent SIGTERM. These are unrelated to this PR.
It's a bit tricky to untangle Topic ID support, so this combines previous PRs:
Closes #26235
Closes #26926
Fixes CORE-9879
Fixes CORE-10028
Fixes CORE-9880
Fixes CORE-9881
Backports Required
Release Notes
Features