@alanshaw
Contributor

@alanshaw alanshaw commented Oct 3, 2025

The container is specified to be CBOR encoded, but the tokens within the container are themselves an array of bytes.

It means that you can add the same tokens to the container in a different order and get a different encoded representation.

This PR updates the wording to REQUIRE that the tokens are ordered bytewise, which, when followed, ensures that exactly one set of bytes corresponds to the container contents, i.e. it adds determinism.
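
To illustrate the problem and the fix, here is a minimal sketch in Python. It assumes "bytewise" ordering means lexicographic comparison of unsigned bytes, and it encodes only a bare CBOR array of byte strings, ignoring whatever envelope the real container wraps around it. The helper names (`cbor_head`, `encode_token_array`, `encode_sorted`) and the sample token bytes are hypothetical.

```python
import struct


def cbor_head(major: int, n: int) -> bytes:
    """Encode a CBOR head: major type plus an unsigned argument (shortest form)."""
    if n < 24:
        return bytes([(major << 5) | n])
    if n < 2**8:
        return bytes([(major << 5) | 24, n])
    if n < 2**16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", n)
    if n < 2**32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", n)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", n)


def encode_token_array(tokens: list[bytes]) -> bytes:
    """Encode tokens as a CBOR array (major type 4) of byte strings (major type 2)."""
    out = cbor_head(4, len(tokens))
    for token in tokens:
        out += cbor_head(2, len(token)) + token
    return out


def encode_sorted(tokens: list[bytes]) -> bytes:
    """Sort bytewise before encoding so insertion order no longer matters."""
    # Python compares bytes objects lexicographically by unsigned byte value.
    return encode_token_array(sorted(tokens))


# Placeholder token bytes, not real UCAN tokens.
first = [b"\x02token-b", b"\x01token-a"]
second = list(reversed(first))

# Same tokens, different insertion order: the unsorted encodings differ...
assert encode_token_array(first) != encode_token_array(second)
# ...while the sorted encodings are identical.
assert encode_sorted(first) == encode_sorted(second)
```

Note that sorting needs all tokens in hand before the array can be written.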

The container is specified to be CBOR encoded, but the tokens within the container are themselves an _array_ of bytes.

It means that you can add the _same_ tokens to the container in a _different_ order and get a _different encoded representation_.

This PR updates the wording to recommend that the tokens are ordered bytewise, which, when followed, ensures that exactly one set of bytes corresponds to the container contents, i.e. it adds determinism.

Signed-off-by: ash <[email protected]>
@cla-bot

cla-bot bot commented Oct 3, 2025

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have the users @alanshaw on file. In order for us to review and merge your code, please contact the project maintainers to get yourself added.

@MichaelMure
Member

One downside of this approach is that it forces the writer to accumulate the tokens in memory to sort before writing.

I understand this solves your "have a unique ID for a container regardless of ordering" problem, but I'm not sure it's the right solution. Have you considered my suggestion of taking a hash of each token's bytes and XORing them together?
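
For reference, the hash-and-XOR idea could look roughly like the sketch below. SHA-256 is an assumption (the comment does not name a hash function), and `order_independent_id` is a hypothetical name.

```python
import hashlib
from typing import Iterable


def order_independent_id(tokens: Iterable[bytes]) -> bytes:
    """XOR the SHA-256 digests of all tokens into one 32-byte identifier."""
    acc = bytes(32)  # XOR identity: 32 zero bytes
    for token in tokens:
        digest = hashlib.sha256(token).digest()
        acc = bytes(x ^ y for x, y in zip(acc, digest))
    return acc


# Placeholder token bytes, not real UCAN tokens.
tokens = [b"token-a", b"token-b", b"token-c"]

# XOR is commutative and associative, so the order of tokens does not matter.
assert order_independent_id(tokens) == order_independent_id(reversed(tokens))
```

Tokens can be folded in one at a time, so nothing has to be buffered or sorted. One caveat of plain XOR: a token that appears twice cancels itself out, so a container holding duplicates gets the same identifier as one without that pair.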

@MichaelMure
Member

IMHO it's not an important detail, but I'm just wondering if that's actually something that needs to be fixed.

Member

@hugomrdias hugomrdias left a comment

LGTM, especially with the SHOULD and not the MUST.

@MichaelMure @expede any comments?

@MichaelMure
Member

> One downside of this approach is that it forces the writer to accumulate the tokens in memory to sort before writing.

The Go version accumulates in memory, so it's not like it's going to change much over there.

@alanshaw
Contributor Author

alanshaw commented Oct 3, 2025

> One downside of this approach is that it forces the writer to accumulate the tokens in memory to sort before writing.

No force, it's a recommendation.

> I understand this solves your "have a unique ID for a container regardless of ordering" problem, but I'm not sure it's the right solution. Have you considered my suggestion of taking a hash of each token's bytes and XORing them together?

I'm not really looking for a workaround. I would prefer this was a requirement (but I guess here we are post fact), and I would prefer that a layer of the stack that does not know how to speak UCAN be able to determine if it has seen this set of bytes or not, i.e. it should not have to decode the bytes and do some custom computation.

Determinism is a desirable quality to have from a UCAN data structure. All other aspects of UCAN are deterministic, largely due to CBOR, and where they are not (e.g. the proofs array), we specify the order they should be in. It seems a shame to lose determinism at the last mile.
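
As a rough sketch of what a UCAN-agnostic layer could do once the container encoding is deterministic: hash the raw bytes and remember the digest, with no CBOR or UCAN decoding involved. The class name `SeenContainers` and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib


class SeenContainers:
    """Remembers containers by the digest of their raw encoded bytes."""

    def __init__(self) -> None:
        self._digests: set[bytes] = set()

    def check_and_add(self, container_bytes: bytes) -> bool:
        """Return True if these exact bytes were seen before, else record them."""
        digest = hashlib.sha256(container_bytes).digest()
        if digest in self._digests:
            return True
        self._digests.add(digest)
        return False


seen = SeenContainers()
payload = b"opaque container bytes"  # placeholder, treated as an opaque blob
assert seen.check_and_add(payload) is False  # first sighting
assert seen.check_and_add(payload) is True   # identical bytes recognised
```

This only deduplicates reliably if the same set of tokens always encodes to the same bytes, which is what the ordering rule is meant to guarantee.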

@MichaelMure
Member

> I guess here we are post fact

No objection from me to making it a MUST.

> I would prefer that a layer of the stack that does not know how to speak UCAN be able to determine if it has seen this set of bytes or not. i.e. it should not have to decode the bytes and do some custom computation.

Note that my solution doesn't mean decoding the UCAN, just the CBOR.

@cla-bot

cla-bot bot commented Dec 3, 2025

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have the users @alanshaw on file. In order for us to review and merge your code, please contact the project maintainers to get yourself added.

@alanshaw
Contributor Author

alanshaw commented Dec 3, 2025

OK, I have updated this to make it a requirement, i.e. a MUST.

@alanshaw alanshaw changed the title from "Add recommendation to sort tokens" to "Add requirement to sort tokens" Dec 3, 2025
Member

@MichaelMure MichaelMure left a comment

Approved, but let's wait a little bit to see if @expede has an opinion?

@hugomrdias
Member

@expede already said in the last UCAN call that she's OK with this.

@MichaelMure MichaelMure merged commit e2d733d into ucan-wg:main Dec 3, 2025
1 of 2 checks passed