Warning
This repository has been moved to datafusion-contrib/datafusion-orc
Read Apache ORC in Rust.
- Read ORC files
- Read stripes (the conversion from proto metadata to memory regions)
- Decode stripes (the math of decoding stripes into e.g. booleans, RLE runs, etc.)
- Decode ORC data to Arrow data types (async and sync)
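
To make the stripe-decoding step in the list above concrete, here is a minimal sketch of the byte run-length encoding that ORC uses for byte and boolean streams, as described in the ORC specification. The function names `decode_byte_rle` and `decode_booleans` are hypothetical and are not this crate's API; the sketch uses only the standard library.

```rust
/// Decode an ORC byte-RLE stream into raw bytes.
/// Hypothetical helper for illustration, not part of this crate.
fn decode_byte_rle(input: &[u8]) -> Result<Vec<u8>, String> {
    let mut out = Vec::new();
    let mut pos = 0;
    while pos < input.len() {
        // The control byte is interpreted as a signed value.
        let header = input[pos] as i8;
        pos += 1;
        if header >= 0 {
            // Run: the next byte is repeated (header + 3) times, i.e. 3..=130.
            let value = *input.get(pos).ok_or("truncated run")?;
            pos += 1;
            let len = header as usize + 3;
            out.extend(std::iter::repeat(value).take(len));
        } else {
            // Literals: the next -header bytes (1..=128) are copied verbatim.
            let len = -(header as isize) as usize;
            let end = pos + len;
            let literals = input.get(pos..end).ok_or("truncated literal list")?;
            out.extend_from_slice(literals);
            pos = end;
        }
    }
    Ok(out)
}

/// Decode `count` booleans: byte-RLE first, then unpack bits from the most
/// significant bit of each byte to the least significant.
fn decode_booleans(input: &[u8], count: usize) -> Result<Vec<bool>, String> {
    let bytes = decode_byte_rle(input)?;
    if bytes.len() * 8 < count {
        return Err("not enough bits for requested count".to_string());
    }
    Ok((0..count)
        .map(|i| (bytes[i / 8] & (0x80u8 >> (i % 8))) != 0)
        .collect())
}
```

With this sketch, the input `[0x61, 0x00]` decodes to 100 zero bytes, and `[0xff, 0x80]` read as eight booleans yields one `true` followed by seven `false` values. Integer columns use different run-length schemes that are not shown here.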

| Column Encoding | Read | Write | Rust Type | Arrow DataType |
|---|---|---|---|---|
| SmallInt, Int, BigInt | ✓ | | i16, i32, i64 | Int16, Int32, Int64 |
| Float, Double | ✓ | | f32, f64 | Float32, Float64 |
| String, Char, and VarChar | ✓ | | string | Utf8 |
| Boolean | ✓ | | bool | Boolean |
| TinyInt | ✗ | | | |
| Binary | ✓ | | `Vec<u8>` | Binary |
| Decimal | ✗ | | | |
| Date | ✓ | | chrono::NaiveDate | Date32 |
| Timestamp | ✓ | | chrono::NaiveDateTime | Timestamp(Nanosecond,_) |
| Struct | ✗ | | | |
| List | ✗ | | | |
| Map | ✗ | | | |
| Union | ✗ | | | |

| Compression | Support |
|---|---|
| None | ✓ |
| ZLIB | ✓ |
| SNAPPY | ✗ |
| LZO | ✗ |
| LZ4 | ✗ |
| ZSTD | ✓ |
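
For context on what compression support involves: per the ORC specification, a compressed stream is split into chunks, and each chunk starts with a 3-byte little-endian header that holds `(chunk_length << 1) | is_original`. A set `is_original` bit means the chunk was stored uncompressed. The sketch below parses that header; `ChunkHeader` and `parse_chunk_header` are hypothetical names, not this crate's API, and the actual decompression through a ZLIB or ZSTD library is left out.

```rust
/// Parsed form of the 3-byte ORC compression chunk header.
/// Hypothetical type for illustration, not part of this crate.
#[derive(Debug, PartialEq)]
struct ChunkHeader {
    /// Number of payload bytes that follow the header.
    length: usize,
    /// True when the payload is stored as-is (not compressed).
    is_original: bool,
}

/// Interpret three header bytes as a little-endian 24-bit integer and split
/// it into the length (upper 23 bits) and the is-original flag (lowest bit).
fn parse_chunk_header(bytes: [u8; 3]) -> ChunkHeader {
    let raw = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], 0]);
    ChunkHeader {
        length: (raw >> 1) as usize,
        is_original: (raw & 1) == 1,
    }
}
```

For example, the header bytes `[0x40, 0x0d, 0x03]` describe a compressed chunk of 100,000 bytes, while `[0x0b, 0x00, 0x00]` describes a 5-byte chunk stored in its original form.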