Description
Hiya. There have been a few PRs that let prost use the bytes::Bytes type for the protobuf bytes type, so decoded fields can just be slices of a bigger Bytes buffer. Strings, however, don't quite work that way. https://docs.rs/bytestring/latest/bytestring/ is just a wrapper around bytes::Bytes with added UTF-8 validity checks. It feels like it would be a good addition as a feature, the same way using bytes::Bytes instead of Vec<u8> for bytes fields is a feature. We should be able to reuse all the code from that feature.
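To make the idea concrete, here is a rough std-only sketch of what "a slice of a bigger buffer plus a UTF-8 check" means. `SharedBytes` and `SharedStr` are hypothetical stand-ins I made up for illustration; they are not the real `bytes::Bytes` or `bytestring::ByteString` types, and the real crates are implemented differently.

```rust
use std::ops::Range;
use std::str::Utf8Error;
use std::sync::Arc;

/// Hypothetical stand-in for a shared byte buffer:
/// a reference-counted allocation plus the range this view covers.
#[derive(Clone, Debug)]
pub struct SharedBytes {
    buf: Arc<[u8]>,
    range: Range<usize>,
}

impl SharedBytes {
    /// Copies `data` once into a shared allocation.
    pub fn new(data: &[u8]) -> Self {
        SharedBytes { buf: Arc::from(data), range: 0..data.len() }
    }

    /// Returns a sub-view that shares the same allocation (no byte copy).
    pub fn slice(&self, range: Range<usize>) -> Self {
        let start = self.range.start + range.start;
        let end = self.range.start + range.end;
        assert!(start <= end && end <= self.range.end, "slice out of bounds");
        SharedBytes { buf: Arc::clone(&self.buf), range: start..end }
    }

    pub fn as_slice(&self) -> &[u8] {
        &self.buf[self.range.clone()]
    }
}

/// Hypothetical stand-in for a UTF-8 checked view over shared bytes.
#[derive(Clone, Debug)]
pub struct SharedStr(SharedBytes);

impl SharedStr {
    /// Validates UTF-8 once and keeps the shared view on success.
    pub fn try_from_shared(bytes: SharedBytes) -> Result<Self, Utf8Error> {
        std::str::from_utf8(bytes.as_slice())?;
        Ok(SharedStr(bytes))
    }

    pub fn as_str(&self) -> &str {
        // Already validated in `try_from_shared`; checking again keeps
        // this sketch free of `unsafe`.
        std::str::from_utf8(self.0.as_slice()).expect("validated at construction")
    }
}
```

Taking a string out of a decoded buffer then costs a reference-count bump and a validity check instead of a fresh allocation per field.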
Hopefully we can go from something like
message Test {
  optional string s = 1;
  repeated string rs = 2;
}
to
struct Test {
    #[prost(string = "bytestring", tag = "1")]
    pub s: Option<ByteString>,
    #[prost(string = "bytestring", tag = "2")]
    pub rs: Vec<ByteString>,
}
and be able to decode this zero-copy when deserializing from Bytes.
Would that be something we are interested in?
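For reference, a minimal sketch of why zero-copy is possible here: on the wire, each string field is a key varint, a length varint, and then the raw UTF-8 bytes, so a decoder can record where the string sits in the input instead of copying it out. This is not prost's decoder; `read_varint` and `string_field_ranges` are hypothetical helpers, and the sketch only handles length-delimited fields (wire type 2).

```rust
use std::ops::Range;

/// Reads a base-128 varint starting at `pos`.
/// Returns the value and the position just past it.
fn read_varint(buf: &[u8], mut pos: usize) -> Option<(u64, usize)> {
    let mut value = 0u64;
    let mut shift = 0u32;
    while shift < 64 {
        let byte = *buf.get(pos)?;
        pos += 1;
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some((value, pos));
        }
        shift += 7;
    }
    None // varint longer than 10 bytes
}

/// Returns (field number, byte range) for each string field in `buf`.
/// Only ranges are recorded; no string data is copied.
pub fn string_field_ranges(buf: &[u8]) -> Option<Vec<(u32, Range<usize>)>> {
    let mut out = Vec::new();
    let mut pos = 0;
    while pos < buf.len() {
        let (key, next) = read_varint(buf, pos)?;
        if key & 0x7 != 2 {
            return None; // this sketch only understands wire type 2
        }
        let (len, start) = read_varint(buf, next)?;
        let end = start.checked_add(usize::try_from(len).ok()?)?;
        if end > buf.len() {
            return None; // truncated message
        }
        // Strings must be valid UTF-8; validate in place.
        std::str::from_utf8(&buf[start..end]).ok()?;
        out.push(((key >> 3) as u32, start..end));
        pos = end;
    }
    Some(out)
}
```

With a reference-counted input buffer, each recorded range can become a cheap sub-slice of that buffer. With a plain `&[u8]` input there is nothing to share ownership of, so the bytes have to be copied into an owned value either way.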
Edit: did a little benchmarking with a proto that contains a lot of strings.
When decoding from bytes::Bytes, much faster:
decode 1000000 items
stdstring: 6181ms
bytestring: 2922ms
When decoding from &[u8], somewhat slower:
decode 1000000 items
stdstring: 6349ms
bytestring: 6974ms
That's because in the &[u8] case we need to copy anyway, and then I guess all the extra indirection of using Bytes shows (about 10% slower here).
Seems to me like quite a big performance boost for string-heavy scenarios: roughly 2.1x faster when decoding from bytes::Bytes.