Clarify that FLOAT4E2M1 can be in int32_data#6640
Conversation
Signed-off-by: Justin Chu <[email protected]>
2275006 to
bf8266c
Compare
|
@yuanyao-nv Please review and verify that the changes are desired and accurate. Thanks! |
Codecov ReportAll modified and coverable lines are covered by tests ✅
✅ All tests successful. No failed tests found. Additional details and impacted files@@ Coverage Diff @@
## main #6640 +/- ##
==========================================
+ Coverage 57.44% 57.46% +0.02%
==========================================
Files 507 507
Lines 31612 31584 -28
Branches 3058 3046 -12
==========================================
- Hits 18158 18151 -7
+ Misses 12610 12607 -3
+ Partials 844 826 -18 ☔ View full report in Codecov by Sentry. |
Signed-off-by: Justin Chu <[email protected]>
|
Thanks for the clarification. The int4 packing description clarifies packing 2 int4 into 1 byte, but it doesn't clarify whether such bytes are then packed into int32 (4 each except for the last one) or each byte encoded as a single int32 or whether both are allowed? I guess the same question applies to float8 as well, but the previous description seems to suggest that a single float8 is represented as a single int32 ... |
|
I understand it as each byte encoded as a single int32. Please feel free to suggest wording. @yuanyao-nv do you have comments or further clarifications? |
I agree it's not clear from the proto docstring whether bytes are then packed into int32. But looking at the helper function implementations (eg |
Wondering about its efficiency. Looking at this description of protobuf encoding, I assume that this ends up getting represented as a Ok, so there is a difference (may be 5 bytes vs 6 bytes to represent 8 INT4s). Not huge, but 15% difference could be significant for GB-size tensors. So, maybe we should allow both? (What would likely be even better is packing 7 INT4s into an int32, which can always be represented using 4 bytes ... but that would impact the packing/unpacking logic more significantly than the above.) But, looking into it more, I guess raw_data would be the best anyway for representing the packed INT4? If so, maybe we don't have to worry about this, but remind users that raw_data preferable for this case |
|
From the protobuf docs, https://protobuf.dev/programming-guides/encoding/#packed yes it seems reasonable to say that raw_data is most reasonable, because that gives us 4bytes/8 int4s right? So that should be the most efficient? |
|
I agree with Rama's assessment as well, that raw should be more efficient since no extra byte is needed. |
That's reasonable. There's only one toggle in the helper apis. I think simplicity helps here. |
|
Please feel free to suggest wording. I tried to rephrase but wasn't happy with what I got. I was trying to convey that the all values should be reinterpreted as int32 without further packing. |
|
I used chat gpt's help in rephrasing it |
Signed-off-by: Justin Chu <[email protected]>
587231e to
00668ee
Compare
|
@yuanyao-nv take a final look? Thanks! |
|
LGTM. Thanks! |
### Description Clarify in spec proto that FLOAT4E2M1 can be in int32_data, according to test usage. Updated text in the spec for int32_data for better readability and accuracy. ### Motivation and Context Previously the spec was incomplete according to the added tests. FLOAT4E2M1 was not yet released so the change should not require a new IR version. --------- Signed-off-by: Justin Chu <[email protected]> Signed-off-by: titaiwangms <[email protected]>
### Description Clarify in spec proto that FLOAT4E2M1 can be in int32_data, according to test usage. Updated text in the spec for int32_data for better readability and accuracy. ### Motivation and Context Previously the spec was incomplete according to the added tests. FLOAT4E2M1 was not yet released so the change should not require a new IR version. --------- Signed-off-by: Justin Chu <[email protected]> Signed-off-by: titaiwangms <[email protected]>
### Description Clarify in spec proto that FLOAT4E2M1 can be in int32_data, according to test usage. Updated text in the spec for int32_data for better readability and accuracy. ### Motivation and Context Previously the spec was incomplete according to the added tests. FLOAT4E2M1 was not yet released so the change should not require a new IR version. --------- Signed-off-by: Justin Chu <[email protected]> Signed-off-by: titaiwangms <[email protected]>
### Description Clarify in spec proto that FLOAT4E2M1 can be in int32_data, according to test usage. Updated text in the spec for int32_data for better readability and accuracy. ### Motivation and Context Previously the spec was incomplete according to the added tests. FLOAT4E2M1 was not yet released so the change should not require a new IR version. --------- Signed-off-by: Justin Chu <[email protected]> Signed-off-by: titaiwangms <[email protected]>
### Description Clarify in spec proto that FLOAT4E2M1 can be in int32_data, according to test usage. Updated text in the spec for int32_data for better readability and accuracy. ### Motivation and Context Previously the spec was incomplete according to the added tests. FLOAT4E2M1 was not yet released so the change should not require a new IR version. --------- Signed-off-by: Justin Chu <[email protected]> Signed-off-by: titaiwangms <[email protected]>
### Description Clarify in spec proto that FLOAT4E2M1 can be in int32_data, according to test usage. Updated text in the spec for int32_data for better readability and accuracy. ### Motivation and Context Previously the spec was incomplete according to the added tests. FLOAT4E2M1 was not yet released so the change should not require a new IR version. --------- Signed-off-by: Justin Chu <[email protected]> Signed-off-by: Andreas Fehlner <[email protected]>
### Description Clarify in spec proto that FLOAT4E2M1 can be in int32_data, according to test usage. Updated text in the spec for int32_data for better readability and accuracy. ### Motivation and Context Previously the spec was incomplete according to the added tests. FLOAT4E2M1 was not yet released so the change should not require a new IR version. --------- Signed-off-by: Justin Chu <[email protected]> Signed-off-by: seungwoo-ji <[email protected]>
Description
Clarify in spec proto that FLOAT4E2M1 can be in int32_data, according to test usage.
Updated text in the spec for int32_data for better readability and accuracy.
Motivation and Context
Previously the spec was incomplete according to the added tests. FLOAT4E2M1 was not yet released so the change should not require a new IR version.