Conversation

inteon
Member

@inteon inteon commented Jun 16, 2025

Replaces the github.com/json-iterator/go dependency with encoding/json/v2.
Performance is not yet great (feel free to push improvements or create new PRs based on this PR):

# based on 8273db415281d117376643df2325c1fff36a8c41
$ go test -benchmem -run=^$ -bench ^BenchmarkFieldSet/serialize.*$ sigs.k8s.io/structured-merge-diff/v6/fieldpath -count=6 > pr.txt
$ go test -benchmem -run=^$ -bench ^BenchmarkFieldSet/serialize.*$ sigs.k8s.io/structured-merge-diff/v6/fieldpath -count=6 > master.txt
$ benchstat master.txt pr.txt

goos: linux
goarch: amd64
pkg: sigs.k8s.io/structured-merge-diff/v6/fieldpath
cpu: Intel(R) Core(TM) Ultra 7 165H
                            │  master.txt  │               pr.txt               │
                            │    sec/op    │   sec/op     vs base               │
FieldSet/serialize-20-8       5.949µ ±  8%   6.495µ ± 1%   +9.19% (p=0.002 n=6)
FieldSet/deserialize-20-8     18.02µ ±  7%   14.51µ ± 2%  -19.47% (p=0.002 n=6)
FieldSet/serialize-50-8       19.97µ ±  7%   18.68µ ± 7%   -6.47% (p=0.015 n=6)
FieldSet/deserialize-50-8     42.15µ ±  8%   40.97µ ± 3%        ~ (p=0.818 n=6)
FieldSet/serialize-100-8      73.39µ ±  6%   66.18µ ± 7%   -9.83% (p=0.002 n=6)
FieldSet/deserialize-100-8    127.2µ ±  4%   133.7µ ± 6%   +5.09% (p=0.002 n=6)
FieldSet/serialize-500-8      401.3µ ±  7%   357.4µ ± 5%  -10.94% (p=0.002 n=6)
FieldSet/deserialize-500-8    668.2µ ±  6%   681.2µ ± 5%        ~ (p=0.818 n=6)
FieldSet/serialize-1000-8     855.9µ ± 10%   810.0µ ± 4%   -5.35% (p=0.026 n=6)
FieldSet/deserialize-1000-8   1.533m ±  4%   1.494m ± 9%        ~ (p=0.394 n=6)
geomean                       111.5µ         106.5µ        -4.45%

                            │  master.txt   │               pr.txt                │
                            │     B/op      │     B/op      vs base               │
FieldSet/serialize-20-8         2350.0 ± 0%     517.0 ± 0%  -78.00% (p=0.002 n=6)
FieldSet/deserialize-20-8     11.278Ki ± 0%   5.840Ki ± 0%  -48.21% (p=0.002 n=6)
FieldSet/serialize-50-8        6.375Ki ± 0%   1.407Ki ± 0%  -77.93% (p=0.002 n=6)
FieldSet/deserialize-50-8      24.30Ki ± 0%   16.41Ki ± 0%  -32.45% (p=0.002 n=6)
FieldSet/serialize-100-8      20.426Ki ± 0%   4.806Ki ± 0%  -76.47% (p=0.002 n=6)
FieldSet/deserialize-100-8     74.18Ki ± 0%   56.74Ki ± 0%  -23.52% (p=0.002 n=6)
FieldSet/serialize-500-8      112.01Ki ± 1%   24.04Ki ± 0%  -78.54% (p=0.002 n=6)
FieldSet/deserialize-500-8     360.9Ki ± 0%   276.2Ki ± 0%  -23.46% (p=0.002 n=6)
FieldSet/serialize-1000-8     226.75Ki ± 1%   54.03Ki ± 0%  -76.17% (p=0.002 n=6)
FieldSet/deserialize-1000-8    788.7Ki ± 0%   613.0Ki ± 0%  -22.28% (p=0.002 n=6)
geomean                        46.16Ki        18.24Ki       -60.48%

                            │  master.txt  │               pr.txt               │
                            │  allocs/op   │  allocs/op   vs base               │
FieldSet/serialize-20-8         9.000 ± 0%    1.000 ± 0%  -88.89% (p=0.002 n=6)
FieldSet/deserialize-20-8       285.0 ± 0%    206.0 ± 0%  -27.72% (p=0.002 n=6)
FieldSet/serialize-50-8        14.000 ± 0%    1.000 ± 0%  -92.86% (p=0.002 n=6)
FieldSet/deserialize-50-8       832.0 ± 0%    590.0 ± 0%  -29.09% (p=0.002 n=6)
FieldSet/serialize-100-8       32.000 ± 0%    1.000 ± 0%  -96.88% (p=0.002 n=6)
FieldSet/deserialize-100-8     2.784k ± 0%   2.068k ± 0%  -25.72% (p=0.002 n=6)
FieldSet/serialize-500-8      143.000 ± 1%    1.000 ± 0%  -99.30% (p=0.002 n=6)
FieldSet/deserialize-500-8     14.27k ± 0%   10.34k ± 0%  -27.52% (p=0.002 n=6)
FieldSet/serialize-1000-8     307.500 ± 0%    1.000 ± 0%  -99.67% (p=0.002 n=6)
FieldSet/deserialize-1000-8    31.54k ± 0%   22.57k ± 0%  -28.44% (p=0.002 n=6)
geomean                         373.4         47.52       -87.27%

closes #202

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 16, 2025
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jun 16, 2025
@dims
Member

dims commented Jun 17, 2025

xref: kubernetes/kubernetes#132312

@dims
Member

dims commented Jun 18, 2025

For the pull-structured-merge-diff-test failure, please add this to fix the error @inteon

diff --git a/internal/cli/main_test.go b/internal/cli/main_test.go
index 3e409ede0673..8beed5c6cf55 100644
--- a/internal/cli/main_test.go
+++ b/internal/cli/main_test.go
@@ -21,6 +21,7 @@ import (
        "encoding/json"
        "io/ioutil"
        "path/filepath"
+       "strings"
        "testing"
 )

@@ -135,7 +136,7 @@ func (tt *testCase) checkOutput(t *testing.T, got []byte) {
                t.Fatalf("couldn't read expected output %q: %v", tt.expectedOutputPath, err)
        }

-       if a, e := string(got), string(want); a != e {
+       if a, e := strings.TrimSpace(string(got)), strings.TrimSpace(string(want)); a != e {
                t.Errorf("output didn't match expected output: got:\n%v\nwanted:\n%v\n", a, e)
        }
 }

@inteon inteon force-pushed the use_json_v2 branch 2 times, most recently from a4b6871 to bdce391 Compare June 18, 2025 10:14
@inteon
Member Author

inteon commented Jun 18, 2025

For the pull-structured-merge-diff-test failure, please add this to fix the error @inteon
...

I fixed the test failure.

@dims
Member

dims commented Jun 18, 2025

/assign @BenTheElder @liggitt

@liggitt
Contributor

liggitt commented Jun 19, 2025

/assign @jpbetz
who is the primary apimachinery approver on this bit and was deeply involved in the initial performance-driven use of json-iterator in these bits

@liggitt
Contributor

liggitt commented Jun 19, 2025

For the pull-structured-merge-diff-test failure, please add this to fix the error @inteon

I suspect using a json marshal function (like MarshalWrite) that doesn't append a newline would be a more efficient way to accomplish that
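The trailing-newline difference behind this suggestion can be illustrated with a minimal sketch. It uses the stable `encoding/json` package rather than json/v2 (an assumption, made so the example runs on any recent Go toolchain), and the helper names `encodeWithEncoder` and `encodeWithMarshal` are hypothetical, not taken from this PR. In `encoding/json`, `Encoder.Encode` terminates each value with a newline, while `Marshal` does not.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// encodeWithEncoder serializes v through a json.Encoder, which terminates
// every value it writes with a newline character.
func encodeWithEncoder(v any) ([]byte, error) {
	var buf bytes.Buffer
	if err := json.NewEncoder(&buf).Encode(v); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// encodeWithMarshal serializes v with json.Marshal, which returns the
// encoded value without a trailing newline.
func encodeWithMarshal(v any) ([]byte, error) {
	return json.Marshal(v)
}

func main() {
	v := map[string]string{"key": "value"}
	withNewline, _ := encodeWithEncoder(v)
	without, _ := encodeWithMarshal(v)
	fmt.Printf("Encoder.Encode: %q\n", withNewline)
	fmt.Printf("Marshal:        %q\n", without)
}
```

Choosing an encoding call that never emits the newline keeps the byte-for-byte comparison in the test strict, instead of trimming whitespace on both sides.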

return nil, fmt.Errorf("parsing JSON: %v", err)
}

k := rawKey.String()
Contributor


is rawKey.String() the same as decoding to a string, in terms of interpreting escape sequences, etc?
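The distinction being asked about can be sketched as follows. This is an illustration only: it uses `json.RawMessage` from the stable `encoding/json` package as a stand-in for the raw value type in the PR (an assumption), and `rawAndDecoded` is a hypothetical helper. It shows that the raw text of a JSON string token and the decoded Go string differ as soon as the token contains an escape sequence.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawAndDecoded returns the raw text of a JSON string token alongside the
// Go string produced by actually decoding that token.
func rawAndDecoded(token string) (raw string, decoded string, err error) {
	msg := json.RawMessage(token)
	raw = string(msg) // surrounding quotes and escape sequences kept verbatim
	err = json.Unmarshal(msg, &decoded)
	return raw, decoded, err
}

func main() {
	// The token spells "." as the escape sequence \u002e.
	raw, decoded, err := rawAndDecoded(`"a\u002eb"`)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("raw:     %s\n", raw)
	fmt.Printf("decoded: %s\n", decoded)
}
```

If map keys were compared using the raw text, two tokens that decode to the same key could be treated as different keys, which is why the question matters for correctness.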

Comment on lines -330 to -339
{
JSON: `1.0`,
IntoType: reflect.TypeOf(json.Number("")),
Want: json.Number("1.0"),
},
{
JSON: `1`,
IntoType: reflect.TypeOf(json.Number("")),
Want: json.Number("1"),
},
Contributor


curious if it's ok to drop these... were they added to try to catch a specific issue?

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: inteon
Once this PR has been reviewed and has the lgtm label, please ask for approval from jpbetz. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@liggitt
Contributor

liggitt commented Jun 23, 2025

The .../deserialize... benchmarks actually don't look terrible now... I'd be willing to accept that performance drop in pursuit of correctness / safety.

The serialize benchmarks still look pretty rough. Need to see what we can improve there.

@liggitt
Contributor

liggitt commented Jun 24, 2025

did you run the full set of benchmarks to see how we looked across all of them?

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 28, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 30, 2025
@liggitt
Contributor

liggitt commented Jul 1, 2025

Thanks for the updates, how are the overall benchmarks looking (not just the subset in the description)?

As you were adjusting the implementation, were there any unit tests it would make sense to add to catch edge cases that the previous implementation handled and that we want to ensure the new one handles as well? I'm thinking specifically of things like:

  • handling of extra data in the input bytes/buffer when decoding/deserializing (e.g. "somevalue"extrastuff or {"key":"value"}extrastuff)
  • handling of special characters in strings that need escaping where the raw bytes would not be the same as the decoded or encoded/escaped bytes
  • handling of ignorable whitespace when decoding
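The first and third items above can be sketched as a small check. This is a hedged illustration written against the stable `encoding/json` package, not the decoding code in this PR, and `decodeSingleValue` is a hypothetical helper. A streaming `json.Decoder` stops after the first complete value, so a caller that wants to reject `"somevalue"extrastuff` has to look at what follows explicitly, while still tolerating insignificant whitespace.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// decodeSingleValue decodes exactly one JSON value from data and rejects
// any non-whitespace bytes that follow it.
func decodeSingleValue(data []byte, into any) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	if err := dec.Decode(into); err != nil {
		return err
	}
	// Reading another token must hit EOF; anything else is trailing data.
	if _, err := dec.Token(); !errors.Is(err, io.EOF) {
		return errors.New("unexpected data after top-level JSON value")
	}
	return nil
}

func main() {
	inputs := []string{
		`"somevalue"extrastuff`,
		`{"key":"value"}extrastuff`,
		" \"somevalue\" \n",
	}
	for _, in := range inputs {
		var v any
		err := decodeSingleValue([]byte(in), &v)
		fmt.Printf("%q -> value=%v err=%v\n", in, v, err)
	}
}
```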

@jpbetz
Contributor

jpbetz commented Jul 1, 2025

First off- it's amazing to see this happening and the benchmarks are VERY promising. Thanks @inteon!

To get this to the finish line, and merge, what should our criteria be?

I chatted offline with @liggitt briefly, and some of the criteria we discussed were:

  • Golang releases a stable json/v2 (The alternative would be to add an internal copy of json-experiment to this repo like kube-openapi has, but I don't know if it's worth it given how close json/v2 is to stable)
  • github.com/kubernetes/kubernetes CI test stability is not negatively impacted
  • This passes a scale test (SIG instrumentation)
  • We are confident on the correctness (triple check the implementation, shore up with additional functional tests)

Intuitively, it seems like the deserialization is already sufficiently fast. I suspect we need to optimize serialization a bit further since we serialize managed fields on all updates (not just patches). That said, I'm willing to be data driven here. If we can show downstream scale and performance is acceptable, I'm willing to accept a higher serialization perf regression in order to migrate to json/v2.

Thoughts, concerns?

inteon added 6 commits August 19, 2025 12:50
Signed-off-by: Tim Ramlot <[email protected]>
Signed-off-by: Tim Ramlot <[email protected]>
Signed-off-by: Tim Ramlot <[email protected]>
Signed-off-by: Tim Ramlot <[email protected]>
@inteon
Member Author

inteon commented Aug 30, 2025

Update: check the new numbers in my PR description; upgrading encoding/json/v2 did improve performance!

@inteon
Member Author

inteon commented Aug 30, 2025

Did some further tuning and got the # allocations lower than on master.

@BenTheElder
Member

Golang releases a stable json/v2 (The alternative would be to add an internal copy of json-experiment to this repo like kube-openapi has, but I don't know if it's worth it given how close json/v2 is to stable)

even with stable json/v2, we might need to temporarily use a fork, otherwise kubernetes branches that aren't on that go version yet can't update SMD.

but we should encapsulate it and plan to eliminate it when we're ready to require that minimum go version

even kubernetes master isn't on 1.25 yet

@liggitt
Contributor

liggitt commented Sep 2, 2025

Did some further tuning and got the # allocations lower than on master.

Am I reading correctly that B/op and allocs/op are ~equivalent or better than master on pretty much all benchmarks? If so, that's amazing progress!

Paired with a close review and functional/correctness test coverage to make sure the new approach behaves identically to the old version (especially in terms of what it accepts/rejects/produces in edge cases like leading/trailing/non-normalized/invalid inputs), this looks really promising.

@lalitc375
Contributor

Amazing work @inteon in reducing the number of allocs per operation to 1. I did a similar analysis over your change, and saw similar performance. The change should have zero to negligible impact on Kube API server performance. We just have to make sure this new library behaves the same as the existing implementation functionally, which I think existing tests should be able to do(?).

@liggitt
Contributor

liggitt commented Sep 4, 2025

We just have to make sure this new library behaves the same as the existing implementation functionally, which I think existing tests should be able to do(?).

I'm not sure how detailed the existing tests are at all the edge cases of valid and invalid variants on input (handling of escaped values in keys, whitespace before/after/between tokens, valid and invalid syntax, etc), and byte-for-byte assertions about output. Since this needed to effectively rewrite some of the encoding/decoding paths, we need to make sure we have test coverage for those things.

@lalitc375
Contributor

We just have to make sure this new library behaves the same as the existing implementation functionally, which I think existing tests should be able to do(?).

I'm not sure how detailed the existing tests are at all the edge cases of valid and invalid variants on input (handling of escaped values in keys, whitespace before/after/between tokens, valid and invalid syntax, etc), and byte-for-byte assertions about output. Since this needed to effectively rewrite some of the encoding/decoding paths, we need to make sure we have test coverage for those things.

There are not enough tests for Unicode and escape characters. I have added those tests in #300. With these new tests, we should detect regressions in the serialization and deserialization code in the future.

@liggitt
Contributor

liggitt commented Sep 9, 2025

Excellent, #300 looks like a great step forward for test coverage of normalized encoding. We'll probably want similar additions for:

  1. decoding of valid-but-non-normalized values working properly (insignificant leading / trailing / interspersed whitespace, or non-canonical escaped values, etc) and capturing what in-memory values are produced by the decoding
  2. decoding of invalid values erroring properly
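Both kinds of coverage could be expressed as one table of inputs, sketched below under stated assumptions: the cases target plain JSON strings decoded with the stable `encoding/json` package rather than the FieldSet decoding paths in this repository, and `decodeCase`, `cases`, and `runCase` are hypothetical names. Each row records either the in-memory value a valid-but-non-normalized input must produce, or that an invalid input must be rejected.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeCase describes one input and what decoding it should yield.
type decodeCase struct {
	name    string
	input   string
	want    string // expected decoded value when wantErr is false
	wantErr bool
}

var cases = []decodeCase{
	// Valid inputs, including non-normalized forms.
	{name: "canonical", input: `"a"`, want: "a"},
	{name: "surrounding whitespace", input: " \t\n\"a\"\r\n ", want: "a"},
	{name: "non-canonical escape", input: `"\u0061"`, want: "a"},
	{name: "escaped solidus", input: `"a\/b"`, want: "a/b"},
	// Invalid inputs that must be rejected.
	{name: "unterminated string", input: `"a`, wantErr: true},
	{name: "invalid escape", input: `"\q"`, wantErr: true},
	{name: "trailing data", input: `"a"b`, wantErr: true},
}

// runCase decodes the input and reports a mismatch as an error.
func runCase(c decodeCase) error {
	var got string
	err := json.Unmarshal([]byte(c.input), &got)
	switch {
	case c.wantErr && err == nil:
		return fmt.Errorf("%s: expected an error, decoded %q", c.name, got)
	case !c.wantErr && err != nil:
		return fmt.Errorf("%s: unexpected error: %v", c.name, err)
	case !c.wantErr && got != c.want:
		return fmt.Errorf("%s: got %q, want %q", c.name, got, c.want)
	}
	return nil
}

func main() {
	for _, c := range cases {
		if err := runCase(c); err != nil {
			fmt.Println("FAIL", err)
			continue
		}
		fmt.Println("ok  ", c.name)
	}
}
```

Running the same table against the old and the new implementation would be one way to confirm that both accept and reject the same inputs.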

@inteon
Member Author

inteon commented Sep 11, 2025

After upgrading github.com/go-json-experiment/json and rerunning the benchmarks, all benchmarks now outperform the benchmarks on master.

@liggitt
Contributor

liggitt commented Sep 11, 2025

Huh… did something change on s-m-d master? The latest benchmark update looks like some of the relative improvement came from master getting worse...

@liggitt
Contributor

liggitt commented Sep 11, 2025

oh, maybe the test changes in #300 impacted the master benchmark numbers

@dims
Member

dims commented Sep 17, 2025

k/k master is at golang v1.25.1 fyi

Successfully merging this pull request may close these issues.

remove use of json-iterator / reflect2
7 participants