Describe the bug
Whilst doing some hacking on the JSON schema inference logic, I noticed that there is logic to "promote" scalars to lists when the two are mixed. For example, if one row has "foo": [1, 2, 3] and another has "foo": 4, then "foo" will be inferred to have type List(Int64) rather than reporting a mismatch.
However, this behaviour doesn't appear to be present in the actual JSON reader.
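To make the promotion rule concrete, here is a minimal, std-only sketch of how such a merge could behave. It is a toy model, not arrow-json's actual inference code: the `InferredType` enum and the `merge` function are hypothetical names invented for illustration, and only the scalar-to-list arm reflects the behaviour described above.

```rust
// Hypothetical toy model of the promotion rule; NOT arrow-json's real code.
#[derive(Debug, Clone, PartialEq)]
enum InferredType {
    Int64,
    Utf8,
    List(Box<InferredType>),
}

/// Merge the types observed for one field in two different rows.
/// Returns `None` when this toy model cannot reconcile the two types.
fn merge(a: &InferredType, b: &InferredType) -> Option<InferredType> {
    use InferredType::*;
    match (a, b) {
        // Two lists: merge their element types.
        (List(x), List(y)) => merge(x, y).map(|t| List(Box::new(t))),
        // Scalar promotion: a scalar mixed with a list is treated as if it
        // were a one-element list, so the result is still a list.
        (List(x), s) | (s, List(x)) => merge(x, s).map(|t| List(Box::new(t))),
        // Identical scalars merge to themselves.
        (x, y) if x == y => Some(x.clone()),
        // Anything else is reported as a mismatch in this sketch.
        _ => None,
    }
}
```

In this model, merging `List(Int64)` (from `"foo": [1, 2, 3]`) with `Int64` (from `"foo": 4`) yields `Some(List(Int64))`. Dropping the promotion arm would make the same merge return `None`, which corresponds to reporting a mismatch at inference time.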
To Reproduce
I added this simple test case that uses the already available "mixed_arrays.json" test file:
    #[test]
    fn test_json_mixed_arrays() {
        let reader = read_file("test/data/mixed_arrays.json", None);
        reader.collect::<Result<Vec<_>, _>>().unwrap();
    }

It returns the following error:
thread 'reader::tests::test_json_mixed_arrays' (2600) panicked at arrow-json\src\reader\mod.rs:2047:47:
called `Result::unwrap()` on an `Err` value: JsonError("whilst decoding field 'b': expected [ got 4")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
test reader::tests::test_json_mixed_arrays ... FAILED
Expected behavior
Either:
- The schema inference fails (i.e. no scalar promotion logic), or
- The test succeeds
Personally, I would prefer the former, as this "scalar-to-list promotion" logic feels like an unconventional and unexpected feature to have for a JSON parser.