JSON reader doesn't support scalar-to-list promotion, even though schema inference does #9484

@Rafferty97

Describe the bug
Whilst doing some hacking on the JSON schema inference logic, I noticed that there is logic to "promote" scalars to lists when the two are mixed. For example, if one row has "foo": [1, 2, 3] and another has "foo": 4, then "foo" will be inferred to have type List(Int64) rather than reporting a mismatch.
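To make the promotion concrete, here is a minimal sketch of how such a merge rule could look. It is a toy model, not the arrow-json implementation: the `Inferred` enum and the `merge` function are hypothetical names, and the rule that mismatched scalars produce an error is an assumption made only to keep the example small.

```rust
/// Toy model of an inferred JSON type (hypothetical; not arrow-json's internal type).
#[derive(Debug, Clone, PartialEq)]
enum Inferred {
    Int64,
    Boolean,
    List(Box<Inferred>),
}

/// Merge the types seen for one field in two different rows.
/// A scalar mixed with a list is promoted to a list of that scalar.
fn merge(a: Inferred, b: Inferred) -> Result<Inferred, String> {
    use Inferred::*;
    match (a, b) {
        (Int64, Int64) => Ok(Int64),
        (Boolean, Boolean) => Ok(Boolean),
        (List(x), List(y)) => Ok(List(Box::new(merge(*x, *y)?))),
        // Scalar-to-list promotion: treat the scalar as a one-element list.
        (List(x), s) | (s, List(x)) => Ok(List(Box::new(merge(*x, s)?))),
        // Assumption of this sketch: any other combination is a mismatch.
        (a, b) => Err(format!("type mismatch: {a:?} vs {b:?}")),
    }
}

fn main() {
    // Row 1 has "foo": [1, 2, 3]; row 2 has "foo": 4.
    let row1 = Inferred::List(Box::new(Inferred::Int64));
    let row2 = Inferred::Int64;
    println!("{:?}", merge(row1, row2));
}
```

Under this rule the order of the rows does not matter: a scalar seen before or after a list yields the same list type.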

However, this behaviour doesn't appear to be present in the actual JSON reader.

To Reproduce
I added this simple test case that uses the already available "mixed_arrays.json" test file:

    #[test]
    fn test_json_mixed_arrays() {
        let reader = read_file("test/data/mixed_arrays.json", None);
        reader.collect::<Result<Vec<_>, _>>().unwrap();
    }

It returns the following error:

    thread 'reader::tests::test_json_mixed_arrays' (2600) panicked at arrow-json\src\reader\mod.rs:2047:47:
    called `Result::unwrap()` on an `Err` value: JsonError("whilst decoding field 'b': expected [ got 4")
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    test reader::tests::test_json_mixed_arrays ... FAILED

Expected behavior
Either:

  • Schema inference fails with a type mismatch (i.e. there is no scalar promotion logic), or
  • The reader applies the same promotion as schema inference, and the test succeeds

Personally, I would prefer the former, as this "scalar-to-list promotion" logic feels like an unconventional and unexpected feature to have for a JSON parser.
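For comparison, the second option would need the reader to mirror the promotion at decode time. The following is only a rough sketch of that idea under stated assumptions: `Value` and `decode_int_list` are hypothetical names invented for illustration, and the real reader decodes from its own internal representation rather than from an enum like this.

```rust
/// Toy JSON value (hypothetical; stands in for whatever the reader decodes from).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Bool(bool),
    Array(Vec<Value>),
}

/// Decode a field whose inferred type is List(Int64).
/// A bare integer is accepted and wrapped into a one-element list,
/// matching what the inferred schema promised.
fn decode_int_list(v: &Value) -> Result<Vec<i64>, String> {
    match v {
        Value::Array(items) => items
            .iter()
            .map(|item| match item {
                Value::Int(n) => Ok(*n),
                other => Err(format!("expected integer got {other:?}")),
            })
            .collect(),
        // Scalar-to-list promotion on the decode side.
        Value::Int(n) => Ok(vec![*n]),
        other => Err(format!("expected [ got {other:?}")),
    }
}

fn main() {
    let rows = [
        Value::Array(vec![Value::Int(1), Value::Int(2), Value::Int(3)]),
        Value::Int(4),
    ];
    for row in &rows {
        println!("{:?}", decode_int_list(row));
    }
}
```

Without the `Value::Int` arm, the second row would be rejected, which is the same kind of mismatch between inference and decoding that the failing test exposes.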
