feat: Implement MultipartUploadBackend on TieredStorage by lcian · Pull Request #458 · getsentry/objectstore

lcian · 2026-05-04T15:36:25Z

Last implementation of MultipartUploadBackend in objectstore-service.
Now TieredStorage requires a MultipartUploadBackend in its long_term slot.
All methods just pretty much forward to long_term, with the exception of complete_multipart which needs to use the Changelog for consistency and deal with the tombstones, with logic very similar to put_object except that we need an additional metadata read call.

Close FS-339

codecov · 2026-05-04T15:37:58Z

Codecov Report

❌ Patch coverage is 87.91946% with 36 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.95%. Comparing base (24b35be) to head (c21fe73).

Files with missing lines	Patch %	Lines
objectstore-service/src/backend/testing.rs	0.00%	20 Missing ⚠️
objectstore-service/src/backend/mod.rs	0.00%	8 Missing ⚠️
objectstore-test/src/server.rs	0.00%	5 Missing ⚠️
objectstore-service/src/backend/tiered.rs	98.85%	3 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #458      +/-   ##
==========================================
+ Coverage   86.94%   86.95%   +0.01%     
==========================================
  Files          77       77              
  Lines       11394    11685     +291     
==========================================
+ Hits         9906    10161     +255     
- Misses       1488     1524      +36

Components	Coverage Δ
Rust Backend	`91.73% <89.41%> (-0.08%)`	⬇️
Rust Client	`80.10% <ø> (ø)`
Python Client	`86.36% <ø> (ø)`

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

linear-code · 2026-05-05T10:57:59Z

FS-339 Implement multipart upload for Tiered backend

lcian · 2026-05-07T08:51:01Z

+    fn try_into(self) -> Result<UploadId, Self::Error> {
+        use base64::Engine;
+        let json =
+            serde_json::to_vec(&self).map_err(|e| Error::serde("encoding multipart token", e))?;


Maybe there's a smarter way to concatenate the 2 with less size overhead than JSON...
But any separator could theoretically also appear in the ID returned by the inner backend.

cursor · 2026-05-07T09:07:04Z

+            .await?
+            .ok_or_else(|| {
+                Error::generic("completed multipart object not found in long-term storage")
+            })?;


Transient metadata read failure causes irrecoverable upload loss

Medium Severity

After the inner complete_multipart succeeds (consuming the multipart upload irreversibly), the guard is advanced to Written, and then get_metadata is called. If this call fails (transient network error or returns None), the ? propagates an error and the guard drops in Written phase. The cleanup logic then deletes the successfully assembled object from long-term storage. Unlike put_long_term — where the metadata is available as a parameter and no extra call exists between the LT write and CAS commit — here the client cannot retry complete_multipart because the upload has been consumed by the inner backend. They must re-upload all parts from scratch, which can be very costly for large uploads. Encoding the expiration_policy in the TieredUploadId at initiation time would eliminate this failure point entirely.

^{Reviewed by Cursor Bugbot for commit 409be20. Configure here.}

It's very unlikely that this fails. I don't want to encode more information in the token.

another option is to move the inner.long_term.get_metadata() call earlier in the function before we create our change guard. that way if it fails we can at least 5XX and the user can retry before we've made any persistent changes

looking at the change guard impl

objectstore/objectstore-service/src/backend/changelog.rs

Lines 304 to 330 in 1615f71

async fn cleanup(self) {

let current = match self.phase {

// For `Recovered`, we must first check the state of the tombstone.

ChangePhase::Recovered => self.read_tombstone().await,

ChangePhase::Recorded => self.change.old.clone(),

// For `Written`, the CAS outcome is unknown — read HV to determine it.

ChangePhase::Written => self.read_tombstone().await,

ChangePhase::Lost => self.change.old.clone(),

ChangePhase::Updated => self.change.new.clone(),

ChangePhase::Completed => return, // unreachable

};

if current != self.change.old

&& let Some(ref old) = self.change.old

{

self.cleanup_lt(old).await;

}

if current != self.change.new

&& let Some(ref new) = self.change.new

{

self.cleanup_lt(new).await;

}

self.cleanup_log().await;

self.mark_completed();

}

thinking through the remaining possible errors and how the change guard would handle them:

if inner.long_term.complete_multipart() returns an error:

the guard will drop in the Recorded without cleaning anything up. the user can retry, unless the underlying backend is in an unretryable state

if inner.high_volume.compare_and_write() returns an error and didn't persist anything:

the guard will drop in the Written state. it will fetch the current tombstone ID, see that our new ID doesn't match, and delete the new ID from LT storage.

the user has to redo the whole multipart upload, but the system should be sane. we need to make sure the error communicates this

this is the same issue as what the bot describes, but this instance is unavoidable

if inner.high_volume.compare_and_write() returns an error but persisted the tombstone:

the guard will drop in the Written state. it will fetch the current tombstone ID, see that our new ID does match, and exit without cleaning anything up.

incidentally i see our change log doesn't yet have a durable backend implemented. i guess that's what FS-310 is still open for

incidentally i see our change log doesn't yet have a durable backend implemented. i guess that's what FS-310 is still open for

Exactly.

another option is to move the inner.long_term.get_metadata() call earlier in the function before we create our change guard. that way if it fails we can at least 5XX and the user can retry before we've made any persistent changes

We cannot call get_metadata yet because the object doesn't exist. S3 and GCS don't have APIs to retrieve the metadata for an incomplete multipart upload.
That's why we need to materialize the object first. And therefore, we need to create the guard beforehand, to avoid leaving that blob orphaned.

The alternative would be to include the expiration policy in the UploadId we give back to the user, but that introduces all kinds of conditions you have to handle, what if the user passes a different expiration policy at creation/completion? Also, we would need to validate the expiration policy in two places, etc. etc. -- in my opinion the first call should be the source of truth for this.

But you're right that this warrants some more thought.
With the current implementation, the client can basically never retry complete_multipart effectivly, because the dropped ChangeGuard will probably delete the new blob even if it completed successfully 🤔

The core issue is that ChangeGuard::Recorded assumes that every LT write uses a unique key, while in this case we would retry multiple times on the same key (revision).

matt-codecov

stamp w comments

matt-codecov · 2026-05-11T19:04:08Z

 use tokio_util::task::task_tracker::TaskTrackerToken;

-use crate::backend::common::{Backend, HighVolumeBackend, TieredMetadata};
+use crate::backend::common::{HighVolumeBackend, MultipartUploadBackend, TieredMetadata};


is every long-term storage backend required to impl MultipartUploadBackend? should we rename it LongTermBackend trait to parallel the HighVolumeBackend and make sure both have a trait bound requiring that Backend be implemented too?

Yes exactly, it's implemented as follows:

objectstore/objectstore-service/src/backend/common.rs

Line 69 in 1615f71

pub trait MultipartUploadBackend: Backend + fmt::Debug + Send + Sync + 'static {

So, it already has a trait bound on Backend.

I guess a more abstracted name for it could be LongTermBackend, yeah. Let me change it in a follow-up.

matt-codecov · 2026-05-11T19:20:14Z

+            .await?;
+
+        if error.is_some() {
+            return Ok(error);


is Ok appropriate here? is this an endpoint that returns 200 OK always and you have to read the response body to interpret errors?

Yes that's intended, that's also how S3/GCS implement this and I mentioned it in the design doc.
That's because the complete call can take a while (I guess it's assembling the object in the backend), so S3 takes exactly this approach of a 200 and error in the body, therefore the most general solution is for us to mirror it.
Otherwise, the client might very easily run into a timeout if we don't return a response/headers within X s.

But probably this warrants a comment explaining how it works.

matt-codecov · 2026-05-11T20:14:14Z

+            .await?
+            .ok_or_else(|| {
+                Error::generic("completed multipart object not found in long-term storage")
+            })?;


another option is to move the inner.long_term.get_metadata() call earlier in the function before we create our change guard. that way if it fails we can at least 5XX and the user can retry before we've made any persistent changes

looking at the change guard impl

objectstore/objectstore-service/src/backend/changelog.rs

Lines 304 to 330 in 1615f71

async fn cleanup(self) {

let current = match self.phase {

// For `Recovered`, we must first check the state of the tombstone.

ChangePhase::Recovered => self.read_tombstone().await,

ChangePhase::Recorded => self.change.old.clone(),

// For `Written`, the CAS outcome is unknown — read HV to determine it.

ChangePhase::Written => self.read_tombstone().await,

ChangePhase::Lost => self.change.old.clone(),

ChangePhase::Updated => self.change.new.clone(),

ChangePhase::Completed => return, // unreachable

};

if current != self.change.old

&& let Some(ref old) = self.change.old

{

self.cleanup_lt(old).await;

}

if current != self.change.new

&& let Some(ref new) = self.change.new

{

self.cleanup_lt(new).await;

}

self.cleanup_log().await;

self.mark_completed();

}

thinking through the remaining possible errors and how the change guard would handle them:

if inner.long_term.complete_multipart() returns an error:

the guard will drop in the Recorded without cleaning anything up. the user can retry, unless the underlying backend is in an unretryable state

if inner.high_volume.compare_and_write() returns an error and didn't persist anything:

the guard will drop in the Written state. it will fetch the current tombstone ID, see that our new ID doesn't match, and delete the new ID from LT storage.

the user has to redo the whole multipart upload, but the system should be sane. we need to make sure the error communicates this

this is the same issue as what the bot describes, but this instance is unavoidable

if inner.high_volume.compare_and_write() returns an error but persisted the tombstone:

the guard will drop in the Written state. it will fetch the current tombstone ID, see that our new ID does match, and exit without cleaning anything up.

incidentally i see our change log doesn't yet have a durable backend implemented. i guess that's what FS-310 is still open for

Move ChangeGuard creation before get_metadata so that if the metadata read fails after complete_multipart succeeds, the guard cleans up the assembled blob. Previously, Manual-expiration objects could be orphaned permanently. Also fix redundant rustdoc link.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

^{❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.}

^{Reviewed by Cursor Bugbot for commit d6eb979. Configure here.}

cursor · 2026-05-12T11:44:53Z

+
+        if error.is_some() {
+            return Ok(error);
+        }


Cleanup guard can race with retries on fixed revision key

Medium Severity

The change guard is recorded before calling complete_multipart on the long-term backend. If completion fails (body error or transport error), the guard drops in Recorded phase and spawns a background cleanup that deletes the physical key via delete_object. Unlike put_long_term — where each call generates a fresh revision key via new_long_term_revision — here the physical key is fixed (baked into the TieredUploadId at initiation time). If a client retries complete_multipart with the same upload_id and the retry succeeds before the old cleanup task executes delete_object, the cleanup deletes the newly completed object, leaving the subsequently written HV tombstone pointing to a non-existent LT blob (data loss).

Additional Locations (1)

objectstore-service/src/backend/changelog.rs#L303-L330

^{Reviewed by Cursor Bugbot for commit d6eb979. Configure here.}

lcian · 2026-05-15T09:49:02Z

This is pending resolution of #458 (comment)

lcian requested a review from a team as a code owner May 4, 2026 15:36

lcian changed the title ~~feat(gcs): implement MultipartUploadBackend using GCS XML API~~ feat: Implement MultipartUploadBackend on TieredStorage May 4, 2026

lcian changed the base branch from main to lcian/feat/multipart-upload-other-backends May 4, 2026 15:37

sentry Bot reviewed May 4, 2026

View reviewed changes

Comment thread objectstore-service/src/backend/tiered.rs Outdated

This comment was marked as outdated.

Sign in to view

lcian marked this pull request as draft May 4, 2026 15:50

lcian commented May 7, 2026

View reviewed changes

Comment thread objectstore-service/src/backend/tiered.rs

lcian commented May 7, 2026

View reviewed changes

lcian marked this pull request as ready for review May 7, 2026 08:54

This comment was marked as off-topic.

Sign in to view

lcian requested a review from matt-codecov May 7, 2026 08:55

sentry Bot reviewed May 7, 2026

View reviewed changes

Comment thread objectstore-service/src/backend/tiered.rs

cursor Bot reviewed May 7, 2026

View reviewed changes

lcian mentioned this pull request May 7, 2026

feat(server): Add multipart upload endpoints #463

Open

lcian commented May 8, 2026

View reviewed changes

Comment thread objectstore-service/src/backend/tiered.rs Outdated

matt-codecov approved these changes May 11, 2026

View reviewed changes

lcian force-pushed the lcian/feat/multipart-upload-other-backends branch from e8a193d to 803c658 Compare May 12, 2026 10:43

Base automatically changed from lcian/feat/multipart-upload-other-backends to main May 12, 2026 11:22

lcian added 8 commits May 12, 2026 13:30

feat: Implement MultipartUploadBackend on TieredStorage

8713145

fix: resolve broken rustdoc link to MultipartUploadBackend

4159e84

better config

e087dae

rearrange

7bbf0fe

uniformize hooks implementation

ae731e1

improvements

0cffb7c

lints

d6eb979

lcian force-pushed the lcian/feat/multipart-upload-tiered branch from 409be20 to d6eb979 Compare May 12, 2026 11:32

cursor Bot reviewed May 12, 2026

View reviewed changes

fix

c21fe73

	async fn cleanup(self) {
	let current = match self.phase {
	// For `Recovered`, we must first check the state of the tombstone.
	ChangePhase::Recovered => self.read_tombstone().await,
	ChangePhase::Recorded => self.change.old.clone(),
	// For `Written`, the CAS outcome is unknown — read HV to determine it.
	ChangePhase::Written => self.read_tombstone().await,
	ChangePhase::Lost => self.change.old.clone(),
	ChangePhase::Updated => self.change.new.clone(),
	ChangePhase::Completed => return, // unreachable
	};

	if current != self.change.old
	&& let Some(ref old) = self.change.old
	{
	self.cleanup_lt(old).await;
	}

	if current != self.change.new
	&& let Some(ref new) = self.change.new
	{
	self.cleanup_lt(new).await;
	}

	self.cleanup_log().await;
	self.mark_completed();
	}

Uh oh!

Conversation

lcian commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov Bot commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

This comment was marked as outdated.

linear-code Bot commented May 5, 2026

Uh oh!

Uh oh!

lcian May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

This comment was marked as off-topic.

Uh oh!

cursor Bot May 7, 2026

Choose a reason for hiding this comment

Transient metadata read failure causes irrecoverable upload loss

Uh oh!

lcian May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

matt-codecov left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lcian May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cursor Bot May 12, 2026

Choose a reason for hiding this comment

Cleanup guard can race with retries on fixed revision key

Uh oh!

lcian commented May 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

lcian commented May 4, 2026 •

edited

Loading

codecov Bot commented May 4, 2026 •

edited

Loading

lcian May 7, 2026 •

edited

Loading

lcian May 7, 2026 •

edited

Loading

lcian May 12, 2026 •

edited

Loading