Fix array diffs #45

GGabriele · 2022-01-19T13:08:53Z

As highlighted in issues #24 and #30, the library currently fails to compute correct and reliable diffs for arrays, most notably whenever the "left" array is shorter than the "right" one:

case 1: len(left) < len(right) (bogus diffs)

$ cat before.json
{
   "array": [
      "x-posting-ip"
   ]
 }

$ cat after.json
{
   "array": [
      "accept-encoding",
      "x-forwarded-for"
   ]
 }

$ go run main.go before.json after.json
 {
   "array": [
-    0: "x-posting-ip"
+    0: "accept-encoding"
+    0: "accept-encoding"
   ]
 }

case 1 (continuation): len(left) < len(right) (unreliable diffs)

$ cat before.json
{
   "array": [
      "blabla"
   ]
 }

$ cat after.json
{
   "array": [
      "accept-encoding",
      "x-forwarded-for"
   ]
 }

$ go run main.go before.json after.json
 {
   "array": [
+    0: "accept-encoding"
   ]
 }

case 2: len(left) == len(right) (OK)

$ cat before.json
{
   "array": [
      "blabla",
      "blabla"
   ]
 }

$ go run main.go before.json after.json
 {
   "array": [
-    0: "blabla",
+    0: "accept-encoding",
-    1: "blabla"
+    1: "x-forwarded-for"
   ]
 }

case 3: len(left) > len(right) (diffs are OK, but there is a lingering duplicate element at the bottom of the array)

$ cat before.json
{
   "array": [
      "blabla",
      "blabla",
      "blabla"
   ]
 }

$ go run main.go before.json after.json
 {
   "array": [
-    0: "blabla",
+    0: "accept-encoding",
-    0: "blabla",
-    1: "blabla"
+    1: "x-forwarded-for"
     2: "blabla"                   // shouldn't be here
   ]
 }

The two issues shown in case 1 are the most serious ones, while the one in case 3 is mostly visually confusing.

I believe these issues have two causes:

1. when you loop through the matrices here, you are missing the last element of each row/column
2. the way the similarity and its score are calculated for strings is not reliable

What I'm proposing here is a simple fix for (1), and using strutil to calculate the string similarity for (2). With both changes, the two examples from case 1 now produce the expected diffs:

$ cat before.json
{
   "array": [
      "x-posting-ip"
   ]
 }

$ cat after.json
{
   "array": [
      "accept-encoding",
      "x-forwarded-for"
   ]
 }

$ go run main.go before.json after.json
 {
   "array": [
-    0: "x-posting-ip"
+    0: "accept-encoding"
+    1: "x-forwarded-for"
   ]
 }

$ cat before.json
{
   "array": [
      "blabla"
   ]
 }

$ go run main.go before.json after.json
 {
   "array": [
-    0: "blabla"
+    0: "accept-encoding"
+    1: "x-forwarded-for"
   ]
 }

Changes don't break current test cases:

$ go test -race ./...
# gojsondiff/jd
jd/main.go:42:4: Println arg list ends with redundant newline
# gojsondiff/jp
jp/main.go:24:4: Println arg list ends with redundant newline
ok  	gojsondiff	1.919s
ok  	gojsondiff/formatter	1.104s

The code is quite complex, so I may be missing something obvious. Please let me know if that's the case!

I'm also introducing go.mod in the same PR, which results in a large number of changed lines. The actual code changes are minimal and all in the gojsondiff.go file.

GGabriele added 2 commits January 19, 2022 13:49

Fix array diffs

869f647

use go.mod

ba2c8f6

GGabriele mentioned this pull request Feb 22, 2022

fix: array diffs are incorrect/unreliable Kong/gojsondiff#1

Merged
