support zero-copy string deserialization via ByteString. #752

@VladimirBramstedt

Hiya. There have been a few PRs that allow prost to use the bytes::Bytes type for protobuf bytes fields, so that they can just be slices of a bigger Bytes buffer. Strings, however, don't quite work. https://docs.rs/bytestring/latest/bytestring/ is just a wrapper around bytes::Bytes with added UTF-8 validity checks. It feels like it would be a good addition as a feature, same as enabling bytes::Bytes instead of Vec<u8> for bytes fields. We should be able to reuse all the code from that feature.
Hopefully we can go from something like

message Test {
  optional string s = 1;
  repeated string rs = 2;
}

to

struct Test {
    #[prost(string = "bytestring", tag = "1")]
    pub s: Option<ByteString>,
    #[prost(string = "bytestring", tag = "2")]
    pub rs: Vec<ByteString>,
}

and be able to zero-copy this when deserializing from Bytes.
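To make "zero-copy" concrete, here is a minimal std-only sketch of the idea. It is not the bytestring crate's API or prost's decode path: `SharedStr` is a hypothetical stand-in for ByteString, and `Arc<[u8]>` plus a range stands in for a slice of bytes::Bytes. The string field is validated as UTF-8 once and then just points into the shared buffer.

```rust
use std::ops::{Deref, Range};
use std::str::Utf8Error;
use std::sync::Arc;

/// Hypothetical stand-in for a ByteString-like type: a UTF-8 validated
/// view into a reference-counted buffer (illustration only).
#[derive(Clone, Debug)]
pub struct SharedStr {
    buf: Arc<[u8]>,
    range: Range<usize>,
}

impl SharedStr {
    /// Validates `buf[range]` as UTF-8, then keeps a refcounted view of it.
    /// No string bytes are copied. Panics if `range` is out of bounds.
    pub fn from_shared(buf: &Arc<[u8]>, range: Range<usize>) -> Result<Self, Utf8Error> {
        std::str::from_utf8(&buf[range.clone()])?;
        Ok(SharedStr { buf: Arc::clone(buf), range })
    }
}

impl Deref for SharedStr {
    type Target = str;

    fn deref(&self) -> &str {
        // Re-validating keeps this sketch free of `unsafe`; a real
        // implementation would likely skip the second check.
        std::str::from_utf8(&self.buf[self.range.clone()]).expect("validated in from_shared")
    }
}

fn main() {
    // Wire bytes for `string s = 1;`: tag 0x0A, length 5, then "hello".
    let wire: Arc<[u8]> = Arc::from(&b"\x0a\x05hello"[..]);
    let s = SharedStr::from_shared(&wire, 2..7).unwrap();

    assert_eq!(&*s, "hello");
    // The string data lives inside the original buffer, not in a new allocation.
    assert_eq!(s.as_ptr(), wire[2..].as_ptr());
}
```

The only per-field cost here is the UTF-8 check and a refcount increment, which is what would replace the allocation and copy that a String field needs.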

Would that be something we are interested in?

Edit: I did a little benchmarking with a proto that contains a lot of strings.
When decoding from bytes::Bytes, it is much faster:

decode 1000000 items
stdstring: 6181ms
bytestring: 2922ms

When decoding from &[u8], it is somewhat slower:
decode 1000000 items
stdstring: 6349ms
bytestring: 6974ms

This is because in the &[u8] case we need to copy anyway, and then I guess all the extra indirection of using Bytes shows.
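A rough std-only sketch of why the two inputs behave differently, with `Arc<[u8]>` again standing in for bytes::Bytes and both function names being hypothetical: a refcounted input can be shared by bumping a counter, while a borrowed slice has to be copied into a new allocation before a field can own it.

```rust
use std::sync::Arc;

/// Refcounted input: the field shares the existing allocation,
/// so only the reference count changes.
fn field_from_shared(input: &Arc<[u8]>) -> Arc<[u8]> {
    Arc::clone(input)
}

/// Borrowed input: the field must own its data, so the bytes are
/// copied into a fresh allocation.
fn field_from_slice(input: &[u8]) -> Arc<[u8]> {
    Arc::from(input)
}

fn main() {
    let shared: Arc<[u8]> = Arc::from(&b"hello"[..]);
    let a = field_from_shared(&shared);
    assert!(Arc::ptr_eq(&a, &shared)); // same allocation, no copy

    let borrowed: &[u8] = b"hello";
    let b = field_from_slice(borrowed);
    assert_ne!(b.as_ptr(), borrowed.as_ptr()); // fresh copy
    assert_eq!(&b[..], borrowed);
}
```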

Seems to me like quite a big performance boost for string-heavy scenarios: roughly 2.1x faster when decoding from Bytes, at a cost of roughly 10% when decoding from &[u8].
