Skip to content

Block Offboarding of Storage Consumer with OBCs [hold]#3696

Closed
shirady wants to merge 1 commit intored-hat-storage:mainfrom
shirady:block-offboarding-for-remote-obc
Closed

Block Offboarding of Storage Consumer with OBCs [hold]#3696
shirady wants to merge 1 commit intored-hat-storage:mainfrom
shirady:block-offboarding-for-remote-obc

Conversation

@shirady
Copy link
Copy Markdown
Contributor

@shirady shirady commented Feb 16, 2026

Describe the problem

Part of RHSTOR-6230
Expanding the offboarding to return an error when an storage client has OBC.

Explain the changes

  1. Implement in OffboardConsumer when a storage client has OBCs (by listing with the UUID consumer).
    Note: to avoid getting the consumer in a call and then checking its namespace, I used the list on all the namespaces in the provider-cluster.
  2. Auto-generated unit test in the file services/provider/server/server_test.go.

Testing Instructions:

Unit Tests

Please run:

  1. go test -v -run TestOffboardConsumer ./services/provider/server/ -count=1
=== RUN   TestOffboardConsumer
=== RUN   TestOffboardConsumer/block_offboarding_when_OBCs_exist
I0218 10:12:09.424464   58517 server.go:437] "StorageConsumer has OBCs, will not continue with offboarding" logger="OffboardConsumer" consumer="consumer-uid-123"
=== RUN   TestOffboardConsumer/succeed_offboarding_when_no_OBCs
I0218 10:12:09.425818   58517 server.go:441] "Starting OffboardConsumer RPC" logger="OffboardConsumer" consumer="consumer-uid-456" request="storageConsumerUUID:\"consumer-uid-456\""
I0218 10:12:09.426712   58517 server.go:447] "Successfully Offboarded Client from StorageConsumer with the provided UUID" logger="OffboardConsumer" consumer="consumer-uid-456"
--- PASS: TestOffboardConsumer (0.05s)
    --- PASS: TestOffboardConsumer/block_offboarding_when_OBCs_exist (0.05s)
    --- PASS: TestOffboardConsumer/succeed_offboarding_when_no_OBCs (0.00s)
PASS
ok  	github.com/red-hat-storage/ocs-operator/v4/services/provider/server	1.552s

Manual Test

  1. With the provider-consumer setup after onboarding, I used a script to create OBC for the storage client.
  2. Used the web console of the client cluster: installed operators -> click on OpenShift Data Foundation Client -> Storage Client tab.
  3. From the storage client list, click on the option to delete the storage client. Notice the status is Offboarding.
  4. Delete the OBC and look at the logs to see the failure when we had the OBC and the success after it was deleted.

From ocs-operator logs:

I0216 14:49:02.908867 1 server.go:437] "StorageConsumer has OBCs, will not continue with offboarding" logger="OffboardConsumer" consumer="32040621-4c56-4de7-9b43-ee3fe3ee7cef"

I0216 14:50:00.218008 1 server.go:447] "Successfully Offboarded Client from StorageConsumer with the provided UUID" logger="OffboardConsumer" consumer="32040621-4c56-4de7-9b43-ee3fe3ee7cef"

From ocs-client-operator logs:

2026-02-16T14:45:56Z ERROR Reconciler error {"controller": "storageclient", "controllerGroup": "ocs.openshift.io", "controllerKind": "StorageClient", "StorageClient": {"name":"test-storage-client"}, "namespace": "", "name": "test-storage-client", "reconcileID": "f0e59125-7d9d-446d-8aac-4d79b513b2d4", "error": "failed to offboard consumer for storageclient test-storage-client: failed to offboard consumer: rpc error: code = FailedPrecondition desc = storageConsumer has OBCs, will not continue with offboarding"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler
/src/remote_source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:353
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem
/src/remote_source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:300
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.1
/src/remote_source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:202

2026-02-16T14:50:00Z INFO StorageClient is offboarded {"controller": "storageclient", "controllerGroup": "ocs.openshift.io", "controllerKind": "StorageClient", "StorageClient": {"name":"test-storage-client"}, "namespace": "", "name": "test-storage-client", "reconcileID": "330d0030-a4d3-4f3a-97a3-4747f7bb4ea9", "StorageClient": "test-storage-client"}

@openshift-ci openshift-ci Bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 16, 2026
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Feb 16, 2026

Hi @shirady. Thanks for your PR.

I'm waiting for a red-hat-storage member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@shirady
Copy link
Copy Markdown
Contributor Author

shirady commented Feb 16, 2026

I think I found a bug not related to the suggested changes.
I saw that the storage client was deleted, but the storage consumer still exists (I could see that the last heartbeat was 20 minutes ago).

@shirady shirady changed the title RHSTOR-6230 | OffBoarding of storage consumer with OBCs RHSTOR-6230 | Offboarding of Storage Consumer with OBCs Feb 16, 2026
@shirady shirady marked this pull request as ready for review February 18, 2026 08:21
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 18, 2026
@shirady
Copy link
Copy Markdown
Contributor Author

shirady commented Feb 18, 2026

@rewantsoni , @leelavg, please take a look
Please notice that I saw a bug that I would want to dicuss on (see link).

@rewantsoni
Copy link
Copy Markdown
Member

I saw that the storage client was deleted, but the storage consumer still exists (I could see that the last heartbeat was 20 minutes ago).

@shirady This is the expected behaviour, the user creates the day 2 storage consumers, and they are responsible for deleting them.

Comment thread services/provider/server/server.go Outdated
Comment thread services/provider/server/server.go
@shirady
Copy link
Copy Markdown
Contributor Author

shirady commented Feb 19, 2026

I saw that the storage client was deleted, but the storage consumer still exists (I could see that the last heartbeat was 20 minutes ago).

@shirady This is the expected behaviour, the user creates the day 2 storage consumers, and they are responsible for deleting them.

This would mean that the storage admin would need to use force deletion?

@rewantsoni
Copy link
Copy Markdown
Member

This would mean that the storage admin would need to use force deletion?

Not force deletion but deletion of storageconsumer would work in this case

Comment thread services/provider/server/server.go Outdated
@shirady
Copy link
Copy Markdown
Contributor Author

shirady commented Feb 25, 2026

Not force deletion but deletion of storageconsumer would work in this case

I used the deletion, and it worked, thanks.

@shirady shirady force-pushed the block-offboarding-for-remote-obc branch from 826e688 to e167d67 Compare March 8, 2026 12:50
@shirady shirady changed the title RHSTOR-6230 | Offboarding of Storage Consumer with OBCs Offboarding of Storage Consumer with OBCs Mar 8, 2026
@rewantsoni
Copy link
Copy Markdown
Member

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Mar 9, 2026
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Mar 9, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: rewantsoni, shirady
Once this PR has been reviewed and has the lgtm label, please assign obnoxxx for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@rewantsoni rewantsoni removed the lgtm Indicates that a PR is ready to be merged. label Mar 9, 2026
@shirady shirady force-pushed the block-offboarding-for-remote-obc branch from e167d67 to aa32eba Compare March 9, 2026 14:15
Signed-off-by: shirady <57721533+shirady@users.noreply.github.com>
@shirady shirady force-pushed the block-offboarding-for-remote-obc branch from aa32eba to d970971 Compare March 9, 2026 14:31
@shirady
Copy link
Copy Markdown
Contributor Author

shirady commented Mar 9, 2026

@nb-ohad please take a look

Comment on lines +449 to +460
obcList := &nbv1.ObjectBucketClaimList{}
if err := s.client.List(
ctx,
obcList,
&client.MatchingLabels{
storageConsumerUUIDLabelKey: req.StorageConsumerUUID,
},
client.Limit(1),
); err != nil {
logger.Error(err, "failed to list OBCs for offboarding")
return nil, status.Errorf(codes.Internal, "failed to offboard storageConsumer with the provided UUID. %v", err)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe it make sense implemented here.
The storage admin has no course of action, or even understanding if the obcs are needed or not, as the owner of these obcs is the application owner.

It make more sense to have code on the client side that prevent the call to offboard in case the client does not match the right conditions. And as a matter of fact we already have an established pattern for that in the client operator:

https://github.com/red-hat-storage/ocs-client-operator/blob/main/internal/controller/storageclient_controller.go#L540-L559

Copy link
Copy Markdown
Contributor Author

@shirady shirady Mar 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that the source of truth related to the OBC is the provider so I created it as part of the check of offboarding (under the assumption that I'm blocking it).
Anyway, I like the suggestion to do it in ocs-client-operator to prevent the offboarding (fail fast).
I raised a PR red-hat-storage/ocs-client-operator#531

After reviewing PRs red-hat-storage/ocs-client-operator#520 and red-hat-storage/ocs-client-operator#531 we can discuss if it might be needed if not, I'll close this PR.

@nb-ohad
Copy link
Copy Markdown
Contributor

nb-ohad commented Mar 9, 2026

This PR should not get merged unless @shirady (or anyone else) can provide a good reply on: #3696 (comment)

in the meantime putting this PR on hold
/hold

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 9, 2026
@shirady shirady changed the title Offboarding of Storage Consumer with OBCs Block Offboarding of Storage Consumer with OBCs [hold] Mar 10, 2026
@shirady
Copy link
Copy Markdown
Contributor Author

shirady commented Mar 25, 2026

The PR to prevent offboarding when there are OBCs related to the storage client was merged - red-hat-storage/ocs-client-operator#531. Closing this PR.

@shirady shirady closed this Mar 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants