pikulev/hario-core
Release v0.4.0 (#4)

* BREAKING: The old `flatten` (which stringifies nested structures) is now called `stringify`.
* BREAKING: The new `flatten` has different behavior; update your code if you relied on the old `flatten` logic.
* BREAKING: `Pipeline` now requires a list of transformers and a `PipelineConfig` instance (no more `id_fn`/`id_field` in the constructor).
* BREAKING: `Pipeline.process` now expects a list of dicts (e.g., from `HarLog.model_dump()["entries"]`).
* New: A new `flatten` transformer that fully flattens nested HAR entries into a flat dict, with a customizable key separator and flexible array handling via `array_handler`. Designed for advanced analytics and BI.
* New: `PipelineConfig` class for configuring batch size, processing strategy (sequential/thread/process/async), and `max_workers`.
* New: Parallel and batch processing strategies for large HAR files (process, thread, async).
* New: Benchmarks and benchmarking scripts for pipeline performance (see `benchmarks/`).
* New: All transformers (`flatten`, `normalize_sizes`, `normalize_timings`, `set_id`) are now implemented as picklable callable classes, fully compatible with multiprocessing.
* New: `set_id` transformer for assigning IDs to entries using any function (e.g., `by_field`, `uuid`).
* Internal: Test suite and samples updated for the new API and real-world HAR compatibility.
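The new `flatten` behavior can be illustrated with a minimal sketch. This is not the library's actual implementation, only an assumption of the described semantics: nested keys are joined with a separator, and lists are either indexed into the key path or delegated to a caller-supplied `array_handler`.

```python
from typing import Any, Callable, Optional


def flatten(
    entry: dict,
    sep: str = ".",
    array_handler: Optional[Callable[[str, list], dict]] = None,
) -> dict:
    """Recursively flatten a nested HAR entry into a single-level dict.

    Illustrative sketch only: the real hario-core API may differ.
    """
    out: dict = {}

    def walk(value: Any, prefix: str) -> None:
        if isinstance(value, dict):
            for key, item in value.items():
                walk(item, f"{prefix}{sep}{key}" if prefix else key)
        elif isinstance(value, list):
            if array_handler is not None:
                # Delegate list handling to the caller-supplied strategy.
                out.update(array_handler(prefix, value))
            else:
                # Default strategy: index each element into the key path.
                for i, item in enumerate(value):
                    walk(item, f"{prefix}{sep}{i}")
        else:
            out[prefix] = value

    walk(entry, "")
    return out


entry = {
    "request": {
        "method": "GET",
        "headers": [{"name": "Host", "value": "example.com"}],
    }
}
flatten(entry)
# → {"request.method": "GET",
#    "request.headers.0.name": "Host",
#    "request.headers.0.value": "example.com"}
```

A fully flat dict like this maps directly onto a table row, which is what makes the transformer useful for BI and analytics backends.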
* feat(transform): rename flatten to stringify
* feat(transform): add new flatten fn
* feat(transform/flatten): improve array handler
* fix(__init__): add flatten
* feat: use orjson to perform better
* feat: transforms dict in-place, change Transformer protocol, rm stringify
* chore: clean lint
* feat(transforms): make picklable
* chore(benchmark): add cpu_heavy bench
* feat(pipeline): process strategies + config
* chore(benchmarks): useful bench
* feat(pipeline): dict as input, new process protocol
* chore: ignore bench csv results
* feat: reorganize imports
* docs: upd readme
* docs: upd api ref + 0.4.0 changelog
* ref: fmt
* docs: upd changelog v0.4.0
* docs: translate comments
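The interplay between "picklable callable classes" and the batch/parallel strategies can be sketched as follows. This is a hypothetical illustration, not hario-core's API: `SetId` and `run_pipeline` are invented names, but they show why transformers are classes rather than closures, and how batching plus a worker pool implements the `thread` strategy.

```python
from concurrent.futures import ThreadPoolExecutor


class SetId:
    """A picklable callable transformer (illustrative, not the real set_id).

    Being a module-level class with plain attribute state, instances can be
    pickled and shipped to worker processes, unlike lambdas or closures.
    """

    def __init__(self, field: str):
        self.field = field

    def __call__(self, entry: dict) -> dict:
        entry["id"] = entry.get(self.field)
        return entry


def run_pipeline(entries, transformers, batch_size=100, max_workers=4):
    """Split entries into batches and run each batch through the transformer
    chain on a thread pool (a sketch of the 'thread' strategy)."""

    def run_batch(batch):
        for transform in transformers:
            batch = [transform(e) for e in batch]
        return batch

    batches = [
        entries[i:i + batch_size] for i in range(0, len(entries), batch_size)
    ]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        processed = list(pool.map(run_batch, batches))  # preserves batch order
    return [entry for batch in processed for entry in batch]
```

Swapping in `ProcessPoolExecutor` would give a process strategy, and it is exactly that swap which forces transformers to be picklable: each one must cross a process boundary intact.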