Skip to content

CORE-12937 schema_registry: core import mode support#27010

Merged
pgellert merged 7 commits intoredpanda-data:devfrom
pgellert:sr_import/core-support
Jul 31, 2025
Merged

CORE-12937 schema_registry: core import mode support#27010
pgellert merged 7 commits intoredpanda-data:devfrom
pgellert:sr_import/core-support

Conversation

@pgellert
Copy link
Copy Markdown
Contributor

@pgellert pgellert commented Jul 28, 2025

Support enabling import mode in the Schema Registry.

In import mode, users can register schemas with the following differences from read write mode:

  • users can specify the id and version of the schema
  • re-inserting a schema definition with a different schema id is allowed
  • no compatibility checks are performed

Fixes https://redpandadata.atlassian.net/browse/CORE-12937

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v25.2.x
  • v25.1.x
  • v24.3.x
  • v24.2.x

Release Notes

  • none

@pgellert pgellert requested a review from a team July 28, 2025 10:30
@pgellert pgellert self-assigned this Jul 28, 2025
@pgellert pgellert requested review from michael-redpanda and removed request for a team July 28, 2025 10:30
@pgellert pgellert requested review from a team, kbatuigas and r-vasquez as code owners July 28, 2025 10:30
@pgellert pgellert requested a review from BenPope July 28, 2025 10:30
Copy link
Copy Markdown
Member

@BenPope BenPope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great.

I didn't see tests that our fake IMPORT in READ_WRITE mode has changed at all? I seem to recall that the reference implementation has some restrictions on specifying id/version during READ_WRITE.

Does this PR completely replace #23750?

I seem to recall an error code being produced from the reference implementation about version needing to be sequential? Perhaps about preventing version gaps? There was a specific error code for it, if I recall.

assert result_raw.status_code == requests.codes.ok

self.logger.debug("Set global mode to IMPORT")
self.logger.debug("Set global mode to FORWARD")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This confused me a bit, since FORWARD is a supported config for compatibility. I'd be tempted to name it something like UNSUPPORTED.

result_raw = self.sr_client.get_schemas_ids_id_versions(id=1)
assert result_raw.status_code == 200

@cluster(num_nodes=3)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth adding a check that with global IMPORT, subject READ_WRITE behaves as intended (override?)?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm planning to have a follow up PR that removes our pseudo-import-mode behaviour in READ_WRITE mode. Let me defer adding testing for that to the follow up PR as well. I think it makes sense to test that behaviour both in global and subject-level READ_WRITE mode (with global IMPORT perhaps).

@vbotbuildovich
Copy link
Copy Markdown
Collaborator

vbotbuildovich commented Jul 28, 2025

Retry command for Build#69746

please wait until all jobs are finished before running the slash command

/ci-repeat 1
tests/rptest/tests/rpk_registry_test.py::RpkRegistryTest.test_registry_mode

@vbotbuildovich
Copy link
Copy Markdown
Collaborator

vbotbuildovich commented Jul 28, 2025

CI test results

test results on build#69746
test_class test_method test_arguments test_kind job_url test_status passed reason
EndToEndCloudTopicsTxTest test_write null integration https://buildkite.com/redpanda/redpanda/builds/69746#019850c4-b48e-41b2-8ada-add8af132313 FLAKY 20/21 upstream reliability is '94.77726574500768'. current run reliability is '95.23809523809523'. drift is -0.46083 and the allowed drift is set to 50. The test should PASS
RandomNodeOperationsTest test_node_operations {"cloud_storage_type": 2, "compaction_mode": "sliding_window", "enable_failures": false, "mixed_versions": true, "with_iceberg": false} integration https://buildkite.com/redpanda/redpanda/builds/69746#019850c4-b48d-4650-be2e-2a917ce54997 FLAKY 19/21 upstream reliability is '99.38837920489296'. current run reliability is '90.47619047619048'. drift is 8.91219 and the allowed drift is set to 50. The test should PASS
RpkRegistryTest test_registry_mode null integration https://buildkite.com/redpanda/redpanda/builds/69746#019850c4-b48b-4bc6-a61a-41fec1a0eecf FAIL 0/21 The test has failed across all retries
RpkRegistryTest test_registry_mode null integration https://buildkite.com/redpanda/redpanda/builds/69746#019850c4-bf8f-4fb2-853e-38c1ad8ddcaa FAIL 0/21 The test has failed across all retries
test results on build#69925
test_class test_method test_arguments test_kind job_url test_status passed reason
RandomNodeOperationsTest test_node_operations {"cloud_storage_type": 2, "compaction_mode": "adjacent_merge", "enable_failures": false, "mixed_versions": true, "with_iceberg": false} integration https://buildkite.com/redpanda/redpanda/builds/69925#01985a69-c4de-4ca9-9bf8-f48d9115d846 FLAKY 20/21 upstream reliability is '99.24242424242425'. current run reliability is '95.23809523809523'. drift is 4.00433 and the allowed drift is set to 50. The test should PASS

@pgellert pgellert changed the title schema_registry: core import mode support CORE-12937 schema_registry: core import mode support Jul 29, 2025
@r-vasquez
Copy link
Copy Markdown
Contributor

rpk bits lgtm, we need to change this though: https://github.com/redpanda-data/redpanda/blob/dev/tests/rptest/tests/rpk_registry_test.py#L654-L657 😅

We wanted to test this because rpk was the one who made the check because the underlying library already accepted the IMPORT mode.

pgellert added 7 commits July 29, 2025 17:39
Don't reject IMPORT mode when parsing a request, instead, allow enabling
it.

The preconditions to enable import mode are:
- globally: the schema registry must be empty
- subject-level: the subject must be empty

Soft-deleted schema versions are also considered. The reference
implementation has a feature where it allows moving to import mode even
if there are soft-deleted schemas, and it hard-deletes these schemas
while moving into import mode. We can consider implementing this later
on (left as a TODO).
Test that the preconditions of enabling import mode are correctly
checked by the `PUT /mode` and `PUT /mode/{subject}` endpoints.
The error "Overwrite new schema with id 3 is not permitted." should use
error code 42205 to match the reference implementation.
Support IMPORT mode when handling `POST /subjects/{subject}/versions`
requests, that is, when registering schemas.

The expected behaviour is that in import mode:
* users can specify the id and version of the schema
* re-inserting a schema definition with a different schema id is allowed
* no compatibility checks are performed
Test the behaviour of `POST /subjects/{subject}/versions` in import
mode.
@pgellert pgellert force-pushed the sr_import/core-support branch from 66fedfb to 43c01e7 Compare July 30, 2025 07:25
@pgellert
Copy link
Copy Markdown
Contributor Author

force-push:

  • address feedback
  • fix rpk test failure
  • add 6bd36ae -- change error status from 42207 to 42205 when the error is "Overwrite new schema with id {} is not permitted."

@pgellert
Copy link
Copy Markdown
Contributor Author

I didn't see tests that our fake IMPORT in READ_WRITE mode has changed at all? I seem to recall that the reference implementation has some restrictions on specifying id/version during READ_WRITE.

I have a separate ticket for changing the READ_WRITE mode to align with the reference implementation, so that will be the next PR.

Does this PR completely replace #23750?

Yes. Thanks for linking to it, now I noticed that we should change the error code from 42207 to 42205 when returning an error on overwriting a schema, so I've included that in this PR as well (6bd36ae).

I seem to recall an error code being produced from the reference implementation about version needing to be sequential? Perhaps about preventing version gaps? There was a specific error code for it, if I recall.

In IMPORT mode the version can be arbitrary, the requirement on the version is only a requirement in READ_WRITE mode. The error you're thinking of is {"error_code":42201,"message":"Version is not one more than previous version"}. I'll implement that in the next PR.

@pgellert pgellert requested a review from BenPope July 30, 2025 07:33
Comment on lines +418 to +420
// TODO: relax the above restrictions to
// 1. Allow soft-deleted schemas to exist, but
// 2. Hard delete them before moving to import mode
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this would be conditional on force?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code would come here where the todo is, so it would be skipped if force=true.

co_await check_mutable(sub);

const auto mode = co_await _store.get_mode(sub, default_to_global::yes);
if (schema.id < 0 && mode != mode::read_write) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any significance of the id being negative but not invalid_schema_id?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, any negative number is treated the same as invalid_schema_id / not specified.

@pgellert pgellert merged commit 4bd6246 into redpanda-data:dev Jul 31, 2025
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants