
Conversation

Contributor

@jmayclin jmayclin commented Jul 8, 2025

Description of changes:

This PR adds a new crate, aws-kms-tls-auth. This will be used to provide the following two items.

  • KmsPskProvider: used by the client as a ConnectionInitializer to set PSKs on the connection.
  • KmsPskReceiver: used by the server as a ClientHelloCallback to decrypt the client PSKs and set them on the connection.

This PR lays the groundwork for that:

  • parsing functionality required to retrieve the PSK identities
  • CI setup to run tests/lint.
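For reviewers who don't have the wire format in their head, here is a minimal sketch of what "retrieving the PSK identities" involves. It follows the `OfferedPsks` layout from RFC 8446, section 4.2.11: a `u16` length-prefixed list of identities, each being a `u16` length-prefixed byte string plus a `u32` obfuscated ticket age. The names `PskIdentity`, `take`, and `parse_psk_identities` are made up for illustration and are not the crate's actual API; the binders that follow the identity list are ignored, and the RFC's minimum-length bounds are not enforced.

```rust
/// One entry of the `identities` list in the TLS 1.3 `pre_shared_key`
/// extension (RFC 8446, section 4.2.11). Hypothetical type for illustration.
#[derive(Debug, PartialEq, Eq)]
pub struct PskIdentity {
    pub identity: Vec<u8>,
    pub obfuscated_ticket_age: u32,
}

/// Splits `len` bytes off the front of `input`, or returns `None` if the
/// input is too short. Untrusted lengths never cause a panic.
fn take(input: &[u8], len: usize) -> Option<(&[u8], &[u8])> {
    if input.len() < len {
        None
    } else {
        Some(input.split_at(len))
    }
}

/// Reads a big-endian u16 from the front of `input`.
fn take_u16(input: &[u8]) -> Option<(u16, &[u8])> {
    let (b, rest) = take(input, 2)?;
    Some((u16::from_be_bytes([b[0], b[1]]), rest))
}

/// Reads a big-endian u32 from the front of `input`.
fn take_u32(input: &[u8]) -> Option<(u32, &[u8])> {
    let (b, rest) = take(input, 4)?;
    Some((u32::from_be_bytes([b[0], b[1], b[2], b[3]]), rest))
}

/// Parses the identity list at the start of a client `pre_shared_key`
/// extension body. Anything after the list (the binders) is ignored.
pub fn parse_psk_identities(extension: &[u8]) -> Option<Vec<PskIdentity>> {
    let (list_len, rest) = take_u16(extension)?;
    let (mut list, _binders) = take(rest, list_len as usize)?;

    let mut identities = Vec::new();
    while !list.is_empty() {
        let (id_len, rest) = take_u16(list)?;
        let (identity, rest) = take(rest, id_len as usize)?;
        let (age, rest) = take_u32(rest)?;
        identities.push(PskIdentity {
            identity: identity.to_vec(),
            obfuscated_ticket_age: age,
        });
        list = rest;
    }
    Some(identities)
}
```

Returning `None` on any short read is what makes a parser like this a reasonable fuzz target: every length comes from the wire, so every slice operation is bounds-checked first.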

Call-outs:

s2n-tls parser vs new parser: We could do some of the client hello parsing with s2n-tls, using methods like s2n_client_hello_get_extension_by_id. However, this doesn't remove the need for the codec traits (EncodeValue, DecodeValue), and we'd still have to parse the actual extension ourselves. Also, adding the PSK extension to the public s2n-tls API is a one-way door, and it is a feature that most customers should not use.
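To give a rough idea of why the codec traits are still needed, here is a sketch of what a pair of traits like this could look like. Only the trait names EncodeValue and DecodeValue come from this PR; the method names, signatures, and the `ExtensionHeader` struct are assumptions made for illustration and may not match the code in the crate.

```rust
use std::io;

/// Assumed shape: decode a value from the front of `buffer` and return it
/// together with the unconsumed remainder.
pub trait DecodeValue: Sized {
    fn decode_from(buffer: &[u8]) -> io::Result<(Self, &[u8])>;
}

/// Assumed shape: append the wire encoding of `self` to `buffer`.
pub trait EncodeValue {
    fn encode_to(&self, buffer: &mut Vec<u8>) -> io::Result<()>;
}

impl DecodeValue for u16 {
    fn decode_from(buffer: &[u8]) -> io::Result<(Self, &[u8])> {
        if buffer.len() < 2 {
            return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "need 2 bytes"));
        }
        let (bytes, rest) = buffer.split_at(2);
        // TLS integers are big-endian on the wire.
        Ok((u16::from_be_bytes([bytes[0], bytes[1]]), rest))
    }
}

impl EncodeValue for u16 {
    fn encode_to(&self, buffer: &mut Vec<u8>) -> io::Result<()> {
        buffer.extend_from_slice(&self.to_be_bytes());
        Ok(())
    }
}

/// Hypothetical composite type: the fixed header in front of every TLS
/// extension (a u16 type followed by a u16 payload length).
#[derive(Debug, PartialEq, Eq)]
pub struct ExtensionHeader {
    pub extension_type: u16,
    pub length: u16,
}

impl DecodeValue for ExtensionHeader {
    fn decode_from(buffer: &[u8]) -> io::Result<(Self, &[u8])> {
        // Composite types are decoded field by field, threading the
        // remaining buffer through each call.
        let (extension_type, buffer) = u16::decode_from(buffer)?;
        let (length, buffer) = u16::decode_from(buffer)?;
        Ok((ExtensionHeader { extension_type, length }, buffer))
    }
}

impl EncodeValue for ExtensionHeader {
    fn encode_to(&self, buffer: &mut Vec<u8>) -> io::Result<()> {
        self.extension_type.encode_to(buffer)?;
        self.length.encode_to(buffer)
    }
}
```

The point of the sketch is the composition: once the primitives implement the traits, every struct in the PSK extension can be decoded by chaining field decodes, which is work s2n-tls's extension lookup would not do for us.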

eventually parser stuff will be in a different crate: The entirety of client_hello_parser, codec, and prefixed_list should live in a different crate. In fact they already live in a different crate in one of my personal repos but we haven't gotten around to open sourcing them yet.

crate home: I expect this crate to live in its own repo under the awslabs org. For now I will be putting it alongside the standard and extended workspaces in s2n-tls/bindings/rust/. It must be in its own workspace because the AWS SDK requires an MSRV of 1.88, which breaks our standard "6 month" MSRV promise.

Testing:

  • fuzz: We add fuzz tests for the two methods that accept untrusted wire input. s2n-tls already parses this input so it's presumably not too toxic, but better safe than sorry.
  • unit: I added a basic unit test that parses the client hello and PSKs from s2n-tls.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@github-actions github-actions bot added the s2n-core team label Jul 8, 2025
@jmayclin jmayclin marked this pull request as ready for review July 8, 2025 17:03
@jmayclin jmayclin requested review from goatgoose and maddeleine July 8, 2025 17:03
@jmayclin jmayclin changed the title feat: add kms tls psk utilities feat: add aws-kms-tls-auth Jul 9, 2025
@jmayclin jmayclin requested review from lrstewart and removed request for maddeleine July 9, 2025 17:25
@jmayclin jmayclin changed the title feat: add aws-kms-tls-auth feat(aws-kms-tls-auth): add codec and parsing Jul 9, 2025
@jmayclin jmayclin requested a review from goatgoose July 9, 2025 18:55
- vec with capacity
- fix codec module documentation
- client hello/psk parsing unit test
- remove protocol
- fix fuzz test ci name
- remove blob new method
@jmayclin jmayclin requested a review from goatgoose July 9, 2025 20:37
jmayclin added 4 commits July 9, 2025 21:07
- update unit test name
- public export of client hello and other structs for fuzz testing
- use s2n-tls to retrieve PSK extension
Comment on lines +62 to +63
- name: run psk-identities fuzzer
  run: cargo fuzz run psk_client_hello -- -max_total_time=30
Contributor

What's our longest CI job rn?

Contributor Author

I think currently ~1 hour.

Comment on lines +55 to +59
//! 1. client -> [A], server -> [A]
//! 2. client -> [A], server -> [A, B]
//! 3. client -> [A, B], server -> [A, B]
//! 4. client -> [B], server -> [A, B]
//! 5. client -> [B], server -> [B]
Contributor

Does the client need [A,B] in this process? Couldn't it just be:

Suggested change
- //! 1. client -> [A], server -> [A]
- //! 2. client -> [A], server -> [A, B]
- //! 3. client -> [A, B], server -> [A, B]
- //! 4. client -> [B], server -> [A, B]
- //! 5. client -> [B], server -> [B]
+ //! 1. client -> [A], server -> [A]
+ //! 2. client -> [A], server -> [A, B]
+ //! 4. client -> [B], server -> [A, B]
+ //! 5. client -> [B], server -> [B]

Contributor Author

Yes, but for most fleet owners there is no way to atomically switch all clients from A to B, so I figured it was useful from that perspective.

Contributor

But you shouldn't need atomic. From my understanding, a client will only ever use one key (nothing prompts it to choose) so it'll either use A or B, and not care about any other values in a list. So the server needs the list to handle a mixed fleet of some clients using A and some using B, but each individual client only ever uses A or B?

Contributor Author

> So the server needs the list to handle a mixed fleet of some clients using A and some using B, but each individual client only ever uses A or B

Yup, this is totally correct.

What I wanted to communicate was:

The server must continue to hold A until no clients are holding A. Only then can it be safely dropped.

When someone is thinking about modelling a safe deployment of this, they absolutely must track how the client change is rolling out through a pipeline. There will always be some stage during which some portion of the clients are using A and some portion are using B. During this stage the server must hold both A and B.

Contributor

@lrstewart lrstewart Jul 10, 2025

Yeah I guess my problem is that client -> [A, B] represents clients with A OR B, but server -> [A, B] represents servers with A AND B :/ You're using the same representation to mean different things

Contributor Author

Ya, I was going for "the client population has A AND B", but can see how that is a bit funky.

I'll switch the representation to clients -> [A] [B], server -> [A & B] as a followup
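For anyone modelling a deployment, the invariant from this thread can be written down as a small check. This is only a sketch: `Stage`, `is_safe`, and `rotation_a_to_b` are hypothetical names, not part of the crate. `client_keys` means "keys still in use somewhere in the client fleet" (each client holds exactly one), while `server_keys` means "keys every server holds at once".

```rust
use std::collections::HashSet;

/// One stage of a rollout. Hypothetical type for illustration.
pub struct Stage {
    /// Keys still used by at least one client (each client uses exactly one).
    pub client_keys: &'static [&'static str],
    /// Keys the server holds simultaneously and will accept.
    pub server_keys: &'static [&'static str],
}

/// A stage is safe when every key a client might present is held by the server.
pub fn is_safe(stage: &Stage) -> bool {
    let accepted: HashSet<&str> = stage.server_keys.iter().copied().collect();
    stage.client_keys.iter().all(|key| accepted.contains(key))
}

/// The five-stage rotation from A to B listed in the doc comment under review.
pub fn rotation_a_to_b() -> Vec<Stage> {
    vec![
        Stage { client_keys: &["A"], server_keys: &["A"] },
        Stage { client_keys: &["A"], server_keys: &["A", "B"] },
        // Mixed fleet: some clients on A, some on B. Server must hold both.
        Stage { client_keys: &["A", "B"], server_keys: &["A", "B"] },
        Stage { client_keys: &["B"], server_keys: &["A", "B"] },
        Stage { client_keys: &["B"], server_keys: &["B"] },
    ]
}
```

Every stage in `rotation_a_to_b()` passes `is_safe`, whereas dropping A from the server while the fleet is still mixed (clients on A or B, server holding only B) fails it. That is the step a rollout pipeline has to rule out.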

@jmayclin jmayclin enabled auto-merge (squash) July 10, 2025 01:21
@jmayclin jmayclin merged commit dc99a17 into aws:main Jul 10, 2025
50 checks passed
3 participants