
aws: support multipart copy for objects larger than 5GB#561

Closed
james-rms wants to merge 9 commits into apache:main from james-rms:jrms/aws-multipart-copy

Conversation

Contributor

@james-rms commented Dec 3, 2025

Which issue does this PR close?

Rationale for this change

Today, users that attempt to copy a >5GB object in S3 using object_store will see this error:

Server returned non-2xx status code: 400 Bad Request: 
<Error><Code>InvalidRequest</Code><Message>
The specified copy source is larger than the maximum allowable size for a copy source: 5368709120
</Message></Error>

Per AWS's docs, the way around this limit is to perform the copy in several parts using multipart copies (UploadPartCopy). This PR adds that functionality to the AWS client.

It adds two additional configuration parameters:

    /// The size threshold above which copy uses multipart copies under the hood. Defaults to 5GB.
    multipart_copy_threshold: u64
    /// When using multipart copies, the part size used. Defaults to 5GB.
    multipart_copy_part_size: u64

The defaults are chosen to minimise surprise: if people are used to copies not requiring several requests, we don't switch to that method until it's absolutely necessary, and when necessary, we use as few parts as possible.
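As a sketch of the part splitting this implies (illustrative code, not the PR's actual implementation): each range below would map to one UploadPartCopy request via its `x-amz-copy-source-range: bytes=first-last` header. The `copy_part_ranges` helper is hypothetical.

```rust
/// Split an object of `size` bytes into inclusive byte ranges of at most
/// `part_size` bytes each. Hypothetical helper, not the PR's actual code.
fn copy_part_ranges(size: u64, part_size: u64) -> Vec<(u64, u64)> {
    assert!(part_size > 0);
    let mut ranges = Vec::new();
    let mut start = 0;
    while start < size {
        // End is inclusive, matching the `bytes=first-last` range format;
        // saturating_add guards against overflow near u64::MAX.
        let end = start.saturating_add(part_size).min(size) - 1;
        ranges.push((start, end));
        start = end + 1;
    }
    ranges
}

fn main() {
    const GIB: u64 = 1024 * 1024 * 1024;
    // With 5 GiB parts, a 12 GiB object needs three UploadPartCopy calls.
    for (i, (start, end)) in copy_part_ranges(12 * GIB, 5 * GIB).iter().enumerate() {
        println!("part {}: bytes={start}-{end}", i + 1);
    }
}
```

Under the stated defaults the multipart path only kicks in above 5GB, and when it does, as few parts as possible are used.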

What changes are included in this PR?

See above.

Are there any user-facing changes?

Yes - these configuration parameters should be covered by the docstring changes.

@james-rms force-pushed the jrms/aws-multipart-copy branch 3 times, most recently from 4719ef4 to 09cef9b on December 4, 2025 12:11
@james-rms force-pushed the jrms/aws-multipart-copy branch from 09cef9b to acc8cc4 on December 4, 2025 12:14
@james-rms marked this pull request as ready for review on December 4, 2025 12:26
@tustvold
Contributor

tustvold commented Dec 4, 2025

I think this probably warrants a higher-level ticket to discuss how we should support this. As a start, it would be good to understand how other stores (e.g. GCS and Azure) handle this, so that we can develop an abstraction that makes sense.

In particular I wonder if adding this functionality would make more sense as part of the multipart upload functionality? This of course depends on what other stores support.

In general, filing an issue first to get consensus on an approach is a good idea before jumping in on an implementation.

@james-rms
Contributor Author

Great, created #563.

@james-rms
Contributor Author

@tustvold updated to pass CI, plus a small tweak to avoid overflow.

@alamb (Contributor) left a comment

Thanks @james-rms and @tustvold -- the high-level idea seems reasonable to me, but I think this code needs tests (maybe unit tests?); otherwise we may break the functionality inadvertently in some future refactor.

@james-rms force-pushed the jrms/aws-multipart-copy branch from a8d535c to 1fdf7d6 on January 6, 2026 03:16
@james-rms
Contributor Author

@tustvold I've refactored slightly for unit tests and added a couple of integration tests as well. Please take another look when you have some time.

Comment thread src/aws/mod.rs Outdated
mode,
extensions: _,
} = options;
// Determine source size to decide between single CopyObject and multipart copy
Contributor

Is there some way we can avoid this, e.g. try CopyObject normally and fall back to multipart on error? Otherwise this adds an additional S3 round trip to every copy request.

Comment thread src/aws/mod.rs Outdated
@tustvold (Contributor) left a comment

Looks good and well tested. Sorry for the delay in reviewing; I've been absolutely swamped.

I think we need to find a way to avoid regressing the common case of files smaller than 5GB, e.g. by first attempting CopyObject and then falling back if it errors (I am presuming S3 gives a sensible error here).

@james-rms
Contributor Author

james-rms commented Jan 14, 2026

I think we need to find a way to avoid regressing the common case of files smaller than 5GB, e.g. by first attempting CopyObject and then falling back if it errors (I am presuming S3 gives a sensible error here).

This is what we get back from S3:

<Error>
  <Code>InvalidRequest</Code>
  <Message>The specified copy source is larger than the maximum allowable size for a copy source: 5368709120</Message>
  <RequestId>...</RequestId>
  <HostId>...</HostId>
</Error>

As explained by AWS:

This error might occur for the following reasons:

An unpaginated ListBuckets request is made from an account that has an approved general purpose bucket quota higher than 10,000. You must make paginated requests to list the buckets in an account with more than 10,000 buckets.

The request is using the wrong signature version. Use AWS4-HMAC-SHA256 (Signature Version 4).

An access point can be created only for an existing bucket.

The access point is not in a state where it can be deleted.

An access point can be listed only for an existing bucket.

The next token is not valid.

At least one action must be specified in a lifecycle rule.

At least one lifecycle rule must be specified.

The number of lifecycle rules must not exceed the allowed limit of 1000 rules.

The range for the MaxResults parameter is not valid.

SOAP requests must be made over an HTTPS connection.

Amazon S3 Transfer Acceleration is not supported for buckets with non-DNS compliant names.

Amazon S3 Transfer Acceleration is not supported for buckets with periods (.) in their names.

The Amazon S3 Transfer Acceleration endpoint supports only virtual style requests.

Amazon S3 Transfer Acceleration is not configured on this bucket.

Amazon S3 Transfer Acceleration is disabled on this bucket.

Amazon S3 Transfer Acceleration is not supported on this bucket. For assistance, contact [Support](https://aws.amazon.com/contact-us/).

Amazon S3 Transfer Acceleration cannot be enabled on this bucket. For assistance, contact [Support](https://aws.amazon.com/contact-us/).

Conflicting values provided in HTTP headers and query parameters.

Conflicting values provided in HTTP headers and POST form fields.

CopyObject request made on objects larger than 5GB in size.

So the real question is: if we make a CopyObject call, can we assume that any InvalidRequest that comes back is because the object was >5GB in size? Given the documentation I think that's OK, but I'm not really clear that it will remain OK going forward. I'll push a commit that does this and let you decide on the approach.
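A minimal sketch of what that fallback check could look like (the function name and shape are hypothetical, not the PR's code; as noted, matching the message text is brittle, since S3 doesn't guarantee the wording and InvalidRequest covers many unrelated conditions):

```rust
/// Sketch: decide whether a failed CopyObject should trigger the multipart
/// fallback. Matching the message text narrows the check beyond the
/// ambiguous InvalidRequest code, at the cost of depending on wording that
/// S3 may change.
fn is_copy_source_too_large(status: u16, body: &str) -> bool {
    status == 400
        && body.contains("<Code>InvalidRequest</Code>")
        && body.contains("larger than the maximum allowable size for a copy source")
}

fn main() {
    let body = "<Error><Code>InvalidRequest</Code><Message>The specified copy source \
                is larger than the maximum allowable size for a copy source: \
                5368709120</Message></Error>";
    println!("fall back to multipart: {}", is_copy_source_too_large(400, body));
}
```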

@tustvold
Contributor

I guess another option would be to make this disabled by default and therefore opt-in... How do other clients handle this?

@james-rms
Contributor Author

james-rms commented Jan 15, 2026

The Go S3 SDK leaves this up to the caller to figure out: https://pkg.go.dev/github.com/aws/aws-sdk-go-v2/service/s3#Client.CopyObject

The Go Cloud SDK doesn't appear to support copies >5GB: copy calls are forwarded directly to the CopyObject API, and it doesn't expose a multipart copy method: https://github.com/google/go-cloud/blob/a52bb6614a70209265758ad7a795a4a3931fbe0b/blob/s3blob/s3blob.go#L856

james-rms and others added 2 commits January 15, 2026 20:00
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
@james-rms
Contributor Author

@tustvold I've implemented this as disabled by default (>5GB copies will just fail) and opt-in. By default there is no pre-copy HEAD request.

@james-rms
Contributor Author

@tustvold any more on this one?

@tustvold
Contributor

I will try to find some time to take a look at the weekend; I'm afraid I'm currently absolutely swamped at work.

@james-rms
Contributor Author

No worries, and thanks for putting up with my regular pings on this one!

@alamb
Contributor

alamb commented Jan 28, 2026

@crepererum and I were talking about this today and maybe he will have some time to review as well

Comment thread src/aws/mod.rs
.await?;

let mut parts = Vec::new();
for (idx, payload) in self
Contributor

Technically we could perform these requests concurrently (at least up to a certain amount), but I think the serial implementation is "good enough" for now. It's certainly better than the status quo (i.e. not being able to copy the object at all).
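For reference, that bounded concurrency could be sketched with a thread-based stand-in (illustrative only; an async client would more likely use something like `futures::stream::buffer_unordered`, and `bounded_map` is a hypothetical name):

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Run `f` over `items` with at most `limit` worker threads, returning
/// results paired with their input index, in input order. A hypothetical
/// stand-in for async bounded concurrency, not the PR's code.
fn bounded_map<T, R, F>(items: Vec<T>, limit: usize, f: F) -> Vec<(usize, R)>
where
    T: Send + 'static,
    R: Send + 'static,
    F: Fn(T) -> R + Send + Sync + 'static,
{
    assert!(limit > 0);
    let f = Arc::new(f);
    let queue: Arc<Mutex<Vec<(usize, T)>>> =
        Arc::new(Mutex::new(items.into_iter().enumerate().collect()));
    let (tx, rx) = mpsc::channel();
    let mut handles = Vec::new();
    for _ in 0..limit {
        let (f, queue, tx) = (Arc::clone(&f), Arc::clone(&queue), tx.clone());
        handles.push(thread::spawn(move || loop {
            // Take the next pending item, or stop once the queue is drained.
            let next = queue.lock().unwrap().pop();
            match next {
                Some((idx, item)) => tx.send((idx, (*f)(item))).unwrap(),
                None => break,
            }
        }));
    }
    drop(tx); // close the channel once the workers hold the only senders
    let mut results: Vec<(usize, R)> = rx.into_iter().collect();
    for h in handles {
        h.join().unwrap();
    }
    results.sort_by_key(|r| r.0);
    results
}

fn main() {
    // Simulate copying four parts with at most two requests in flight.
    for (idx, msg) in bounded_map(vec![0u64, 1, 2, 3], 2, |p| format!("copied part {p}")) {
        println!("{idx}: {msg}");
    }
}
```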

Comment thread src/aws/builder.rs
/// Threshold (bytes) above which copy uses multipart copy. If not set, all copies are performed
/// as single requests.
multipart_copy_threshold: Option<ConfigValue<u64>>,
/// Preferred multipart copy part size (bytes). If not set, defaults to 5 GiB.
Contributor

That's a question I have, but one I also think should be answered in the doc-string: if the object was created using a multipart upload, are there any requirements on the alignment of the copy parts, or is this irrelevant?

Contributor Author

It's not explicitly documented (though there is a note around range requests). There's every chance it's faster to line up the part boundaries exactly, but I haven't tested this directly.

I hadn't thought about it and this bothers me. It doesn't feel good to be opaquely putting part boundaries into a copied object that the user has no visibility into. This also points to the right API being one where the user creates parts directly and keeps track of the boundaries.

Contributor

FWIW #121

Comment thread src/aws/mod.rs
Comment on lines +404 to +405
self.copy_multipart(from, to, size, CompleteMultipartMode::Overwrite)
.await?;
Contributor

Suggested change (add a debug log before the call):

    tracing::debug!(%from, %to, "multipart copy");
    self.copy_multipart(from, to, size, CompleteMultipartMode::Overwrite)
        .await?;

(you may need to check if Path implements Display or not)

Reasoning: we are changing the client behavior dynamically based on data size, and if a server misbehaves because of that, it would be nice to at least leave some breadcrumbs.

Contributor Author

Seems reasonable, I'll take a look.

@james-rms james-rms closed this Mar 8, 2026


Development

Successfully merging this pull request may close these issues.

Enable AWS client to copy objects >5GB in size

4 participants