Conversation

stevevls (Contributor) commented Nov 1, 2022

The Position field gives a means to index into the Tokenizer's underlying byte slice. This enables use cases where the caller plans to make edits to the JSON document but wants to leverage the copy func to optimize data movement, and/or to copy the remaining bytes if the caller exits the tokenizing loop early.
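For illustration, a minimal sketch of the early-exit use case (the stopping condition here is arbitrary, and the sketch assumes only NewTokenizer, Next, and the Position field added by this PR):

package main

import (
	"fmt"

	"github.com/segmentio/encoding/json"
)

func main() {
	input := []byte(`{"keep":1,"rest":[2,3]}`)

	t := json.NewTokenizer(input)
	for t.Next() {
		// Hypothetical early exit partway through the document.
		if t.Position > len(input)/2 {
			break
		}
	}

	// Position indexes into the original slice, so the untokenized
	// tail can be copied without re-scanning it.
	remaining := make([]byte, len(input)-t.Position)
	copy(remaining, input[t.Position:])
	fmt.Printf("remaining: %s\n", remaining)
}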

Comment on lines +142 to +151
		t.Position += i
	}

	if len(t.json) == 0 {
		t.Reset(nil)
		return false
	}

	lenBefore := len(t.json)

Contributor:
Maybe a more robust implementation would be to use a defer?

lenBefore := len(t.json)
defer func() { t.Position += lenBefore - len(t.json) }()

Unless there's a measurable cost in the benchmarks, I'd go with this.
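For context, roughly how that would sit if hoisted to the top of Next (a sketch; field names come from the diff excerpt above, and the tokenizing body is elided):

func (t *Tokenizer) Next() bool {
	if len(t.json) == 0 {
		t.Reset(nil)
		return false
	}

	// Advance Position by however many bytes this call consumes,
	// regardless of which return path is taken later.
	lenBefore := len(t.json)
	defer func() { t.Position += lenBefore - len(t.json) }()

	// ... scan t.json, shrinking it as tokens are consumed ...
	return true
}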

Contributor Author:

Makes sense. From this spot on, I didn't see any early returns. Are you thinking more along the lines of future-proofing the code?

Contributor Author:

Here are the benchmark results. The impact of the defer statement is small but measurable on my local machine. Not sure why all the percent changes are showing up as zero. :( Thoughts?

name                                                   old time/op   new time/op   delta
Tokenizer/github.com/segmentio/encoding/json/null-8     25.9ns ± 0%   27.1ns ± 0%   ~     (p=1.000 n=1+1)
Tokenizer/github.com/segmentio/encoding/json/true-8     26.0ns ± 0%   27.6ns ± 0%   ~     (p=1.000 n=1+1)
Tokenizer/github.com/segmentio/encoding/json/false-8    26.9ns ± 0%   28.2ns ± 0%   ~     (p=1.000 n=1+1)
Tokenizer/github.com/segmentio/encoding/json/number-8   33.6ns ± 0%   34.9ns ± 0%   ~     (p=1.000 n=1+1)
Tokenizer/github.com/segmentio/encoding/json/string-8   31.7ns ± 0%   33.4ns ± 0%   ~     (p=1.000 n=1+1)
Tokenizer/github.com/segmentio/encoding/json/object-8   1.11µs ± 0%   1.24µs ± 0%   ~     (p=1.000 n=1+1)

name                                                   old speed     new speed     delta
Tokenizer/github.com/segmentio/encoding/json/null-8    154MB/s ± 0%  148MB/s ± 0%   ~     (p=1.000 n=1+1)
Tokenizer/github.com/segmentio/encoding/json/true-8    154MB/s ± 0%  145MB/s ± 0%   ~     (p=1.000 n=1+1)
Tokenizer/github.com/segmentio/encoding/json/false-8   186MB/s ± 0%  177MB/s ± 0%   ~     (p=1.000 n=1+1)
Tokenizer/github.com/segmentio/encoding/json/number-8  327MB/s ± 0%  315MB/s ± 0%   ~     (p=1.000 n=1+1)
Tokenizer/github.com/segmentio/encoding/json/string-8  378MB/s ± 0%  360MB/s ± 0%   ~     (p=1.000 n=1+1)
Tokenizer/github.com/segmentio/encoding/json/object-8  613MB/s ± 0%  550MB/s ± 0%   ~     (p=1.000 n=1+1)

Contributor:

> Makes sense. From this spot on, I didn't see any early returns. Are you thinking more along the lines of future-proofing the code?

I was thinking of moving this to the top of the function, but let's not sweat it if it's having a negative impact on the performance of the code 👍

The delta not showing might be because the benchmarks weren't run with a large enough -count?
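For example, something along these lines (paths assumed; benchstat needs several samples per configuration before it will report a delta instead of ~):

go test -run='^$' -bench=Tokenizer -count=10 . > old.txt
# apply the change, then:
go test -run='^$' -bench=Tokenizer -count=10 . > new.txt
benchstat old.txt new.txt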

Contributor:

FYI @achille-roussel @stevevls I added #126 recently to keep track of the position of the tokenizer, but without any performance impact. The caller does have to retain the input byte slice so that it can convert bytes remaining back to an index.
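On the caller side that would look something like the sketch below, where Remaining() is a hypothetical stand-in for whatever accessor #126 actually exposes:

input := []byte(`{"a":1,"b":2}`)
t := json.NewTokenizer(input)
t.Next()

// Convert the unconsumed tail back to an index by retaining the
// original slice. Remaining() is a hypothetical accessor name.
rest := t.Remaining()
position := len(input) - len(rest)
_ = position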
