@cliff-openai cliff-openai commented Aug 14, 2025

What is the purpose of this change?

This change adds first-class metadata support to the Azure Blob backend, including headers, user metadata, tags, and modtime overrides, and wires it through uploads and server-side copies.

There is a behavior change: rclone will now set the "mtime" custom metadata when doing server-side copies to Azure if the --metadata flag is given.

Highlights

  • Map standard headers: cache-control, content-disposition, content-encoding, content-language, content-type → corresponding x-ms-blob-* HTTP headers.
  • Map user metadata: any non-reserved keys (excluding x-ms-*) are sent as blob user metadata. Keys are normalized to lowercase for consistency.
  • Support tags: parse x-ms-tags as a comma-separated list of key=value pairs and apply them on uploads and copies.
  • Support mtime override: accept mtime in metadata (RFC3339/RFC3339Nano) to override the stored modtime persisted in user metadata.
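
The mapping described in the highlights above could be sketched roughly as follows. This is a minimal illustration, not the PR's code: the names `mapped`, `parseTags`, and `mapMetadata` are hypothetical, headers are represented as plain strings rather than SDK types, and `mtime` is not treated specially here.

```go
package main

import (
	"fmt"
	"strings"
)

// headerKeys lists the metadata keys the PR description maps to
// x-ms-blob-* HTTP headers.
var headerKeys = map[string]bool{
	"cache-control":       true,
	"content-disposition": true,
	"content-encoding":    true,
	"content-language":    true,
	"content-type":        true,
}

// mapped holds the three destinations named in the description.
// The type and its field names are hypothetical.
type mapped struct {
	Headers  map[string]string
	UserMeta map[string]string
	Tags     map[string]string
}

// parseTags parses a comma-separated list of key=value pairs.
func parseTags(s string) (map[string]string, error) {
	tags := map[string]string{}
	for _, pair := range strings.Split(s, ",") {
		pair = strings.TrimSpace(pair)
		if pair == "" {
			continue
		}
		k, v, ok := strings.Cut(pair, "=")
		if !ok || strings.TrimSpace(k) == "" {
			return nil, fmt.Errorf("invalid tag %q: want key=value", pair)
		}
		tags[strings.TrimSpace(k)] = strings.TrimSpace(v)
	}
	return tags, nil
}

// mapMetadata splits metadata into headers, user metadata, and tags.
func mapMetadata(meta map[string]string) (mapped, error) {
	out := mapped{
		Headers:  map[string]string{},
		UserMeta: map[string]string{},
		Tags:     map[string]string{},
	}
	for k, v := range meta {
		key := strings.ToLower(k) // keys are normalized to lowercase
		switch {
		case headerKeys[key]:
			out.Headers["x-ms-blob-"+key] = v
		case key == "x-ms-tags":
			tags, err := parseTags(v)
			if err != nil {
				return mapped{}, err
			}
			out.Tags = tags
		case strings.HasPrefix(key, "x-ms-"):
			// Reserved prefix: not sent as user metadata.
		default:
			out.UserMeta[key] = v
		}
	}
	return out, nil
}

func main() {
	m, err := mapMetadata(map[string]string{
		"Content-Type": "text/plain",
		"Owner":        "alice",
		"x-ms-tags":    "env=prod,team=storage",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(m.Headers, m.UserMeta, m.Tags)
}
```

In this sketch a malformed tag pair is reported as an error rather than silently dropped; whether the backend does the same is not stated in the description.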

Upload integration

  • Both singlepart and multipart uploads call a common mapper to translate --metadata, --metadata-set, and --metadata-mapper output into Azure HTTP headers, user metadata, and tags.
  • Ensures mtime is recorded in user metadata, honoring overrides.
  • Preserves and merges computed MD5 when enabled.

Tests

  • Add backend/azureblob/azureblob_internal_test.go covering:
    • Singlepart/multipart uploads with metadata mapping.
    • Singlepart/multipart copies with metadata mapping and Content-Type fallback behavior.
    • Validation helpers asserting headers and user metadata on the blob.

Was the change discussed in an issue or in the forum before?

Not directly. There is #8399, which seems abandoned (a smaller, less complete version of this change).

I am willing to create an issue for this if it would be helpful.

Checklist

  • I have read the contribution guidelines.
  • I have added tests for all changes in this PR if appropriate. (although I haven't tested every corner case, let me know if I should add more testing)
  • I have added documentation for the changes if appropriate.
  • All commit messages are in house style.
  • I'm done, this Pull Request is ready for review :-)

@ncw
Member

ncw commented Aug 15, 2025

I gave this a quick look - looks like good work - thank you :-)

It doesn't seem to set any flags in Features, e.g.

rclone/backend/s3/s3.go

Lines 3974 to 3976 in d9a36ef

ReadMetadata: true,
WriteMetadata: true,
UserMetadata: true,

It also isn't setting MetadataInfo in the backend initialization:

rclone/backend/s3/s3.go

Lines 228 to 230 in d9a36ef

MetadataInfo: &fs.MetadataInfo{
System: systemMetadataInfo,
Help: `User metadata is stored as x-amz-meta- keys. S3 metadata keys are case insensitive and are always returned in lower case.`,

Can you set those things, then run the integration tests and check that they pass? The integration tests won't exercise metadata at all unless the feature flags are set.

@cliff-openai
Author

Thanks for the early feedback @ncw !
Sorry that I didn't notice that you had sent feedback until just now.

I think that maybe I did as you asked, and that it passes tests.

@cliff-openai cliff-openai marked this pull request as ready for review September 3, 2025 05:53
@roucc
Member

roucc commented Oct 30, 2025

Hi @cliff-openai, I have run the test_all -remotes TestAzureBlob: command (after upgrading it to V2) and got 1 test failure:

"go test -v -timeout 1h0m0s -remote TestAzureBlob: -test.run '^TestIntegration$/^FsMkdir$/^FsPutFiles$'" - Starting (try 5/5)
=== RUN   TestIntegration
    fstests.go:438: Using remote "TestAzureBlob:"
2025/10/30 16:19:57 NOTICE: TestAzureBlob: Starting server
=== RUN   TestIntegration/FsMkdir
=== RUN   TestIntegration/FsMkdir/FsPutFiles
    fstests.go:146: 
        	Error Trace:	/home/dougal/rclone/fstest/fstests/fstests.go:146
        	            				/home/dougal/rclone/fstest/fstests/fstests.go:162
        	            				/home/dougal/rclone/fstest/fstests/fstests.go:224
        	            				/home/dougal/rclone/fstest/fstests/fstests.go:955
        	Error:      	Received unexpected error:
        	            	PUT https://rclone.blob.core.windows.net/rclone-test-wagukaf7yefi/file name.txt
        	            	--------------------------------------------------------------------------------
        	            	RESPONSE 400: 400 The metadata specified is invalid. It has characters that are not permitted.
        	            	ERROR CODE: InvalidMetadata
        	            	--------------------------------------------------------------------------------
        	            	<?xml version="1.0" encoding="utf-8"?><Error><Code>InvalidMetadata</Code><Message>The metadata specified is invalid. It has characters that are not permitted.
        	            	RequestId:835c1d3b-601e-0060-07b9-49c062000000
        	            	Time:2025-10-30T16:19:57.1785915Z</Message></Error>
        	            	--------------------------------------------------------------------------------
        	Test:       	TestIntegration/FsMkdir/FsPutFiles
        	Messages:   	Put
=== NAME  TestIntegration/FsMkdir
    fstests.go:2817: Warning: this should produce fs.ErrorDirNotFound
--- FAIL: TestIntegration (0.20s)
    --- FAIL: TestIntegration/FsMkdir (0.12s)
        --- FAIL: TestIntegration/FsMkdir/FsPutFiles (0.08s)
FAIL
exit status 1
FAIL	github.com/rclone/rclone/backend/azureblob	0.203s
"go test -v -timeout 1h0m0s -remote TestAzureBlob: -test.run '^TestIntegration$/^FsMkdir$/^FsPutFiles$'" - Finished ERROR in 692.895239ms (try 5/5): exit status 1: Failed [TestIntegration/FsMkdir/FsPutFiles]

Any ideas?
Other than this it is ready to merge, thank you!

Specifically, Azure Blob Storage only allows metadata keys to contain
letters, numbers, and underscores.
@cliff-openai
Author

Gah, this took me too long to figure out.

The problem is that I had the fix in my local worktree and I had just never committed/pushed it.

You can see the fix/description in 488cfe0, but ultimately the problem is that the test was using a metadata key of rclone-test, and Azure Blob Storage does not support hyphens in metadata keys.
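
A check like the following would catch such keys before the request is sent. It is a minimal sketch, not the fix in 488cfe0: `validMetadataKey` is a hypothetical helper, and it assumes ASCII-only keys made of letters, digits, and underscores that do not start with a digit.

```go
package main

import "fmt"

// validMetadataKey reports whether key uses only ASCII letters, digits,
// and underscores, and does not begin with a digit (assumed rules).
func validMetadataKey(key string) bool {
	if key == "" {
		return false
	}
	for i := 0; i < len(key); i++ {
		c := key[i]
		switch {
		case c >= 'a' && c <= 'z', c >= 'A' && c <= 'Z', c == '_':
			// Always allowed.
		case c >= '0' && c <= '9':
			if i == 0 {
				return false // must not start with a digit
			}
		default:
			return false // hyphens, spaces, and so on
		}
	}
	return true
}

func main() {
	for _, k := range []string{"rclone-test", "rclone_test"} {
		fmt.Printf("%q valid: %v\n", k, validMetadataKey(k))
	}
}
```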

@ncw ncw left a comment

I put a note inline as to why the GitHub CI is failing - we just need to adjust the way your tests are called slightly

Thanks

}

// Standalone runner for metadata path tests to allow easy filtering with -run
func TestAzureMetadataPaths(t *testing.T) {

The unit tests are failing because there is no valid config for azure blob when running in the GitHub CI.

We do the integration tests on a separate machine with valid configs for all providers (and a whole suite of other tests).

So this test needs to Skip if there isn't a valid config.

However, I think you probably need to delete TestAzureMetadataPaths completely and just call testAzureMetadataPaths from InternalTest.

That will ensure it is only called when there is a valid config.

You should be able to filter these tests (with a longer path admittedly) with -run just fine.

@cliff-openai
Copy link
Author

Thanks! I’ve dropped the standalone TestAzureMetadataPaths helper and now call testMetadataPaths only from InternalTest, so the metadata suite stays within the integration harness and hopefully skips the CI config requirement.
