Description
I was wondering whether it would make sense for yato to expose a somewhat structured plan (that can then be parsed by other tools) from a given set of SQL files.
The idea would be, for example: run yato on a given set of SQL files at a given time, and perform the actual computations elsewhere / at some different time. Ideally the dependency network would still be represented in the structured plan, so that on triggering re-computation of a given table (say, due to changes in input data) only the downstream nodes need to be executed.
The idea is mainly that planning is the complex part, while the executor is relatively simple and can be re-implemented in other flavors.
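To make this concrete, here is a minimal sketch of what such a plan could look like. The structure, node names, keys, and SQL below are illustrative assumptions, not an existing yato output: each node carries its SQL statement and its upstream dependencies, so an executor can rebuild the DAG without re-parsing the SQL files.

```python
# Hypothetical plan structure (illustrative assumption, not an existing yato format):
# each node carries its SQL and the tables it depends on.
plan = {
    "nodes": {
        "orders_clean": {
            "sql": "CREATE OR REPLACE TABLE orders_clean AS "
                   "SELECT * FROM orders WHERE amount > 0",
            "depends_on": ["orders"],
        },
        "daily_revenue": {
            "sql": "CREATE OR REPLACE TABLE daily_revenue AS "
                   "SELECT order_date, SUM(amount) AS revenue "
                   "FROM orders_clean GROUP BY order_date",
            "depends_on": ["orders_clean"],
        },
    }
}
```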
Potential usage:
- a duckdb extension that, given a plan, executes it within duckdb (whatever client), basically:
  ```sql
  INSTALL yato_executor FROM ...;
  LOAD yato_executor;
  PRAGMA run_yato_plan('path/to/compute_stuff.yato');
  ```
- running the plan in a tighter environment where you might not trust yato or you can't run Python
The idea is that once the plan is known / verified, all is good, AND that on changes only the downstream nodes require recomputation.
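A minimal executor sketch in Python, using the duckdb client and the hypothetical plan dictionary sketched above (the function names and the `warehouse.duckdb` file are illustrative): it rebuilds everything on a full run, or only the changed node and its downstream nodes otherwise.

```python
import duckdb

def downstream(plan, changed):
    """Return the changed node plus every node that transitively depends on it."""
    affected = {changed}
    grew = True
    while grew:
        grew = False
        for name, node in plan["nodes"].items():
            if name not in affected and any(dep in affected for dep in node["depends_on"]):
                affected.add(name)
                grew = True
    return affected

def run_plan(plan, con, changed=None):
    """Execute plan nodes in dependency order.

    If `changed` is given, only that node and its downstream nodes are rebuilt;
    everything else is assumed to already exist in the database.
    """
    todo = downstream(plan, changed) if changed else set(plan["nodes"])
    done = set()
    while todo - done:
        progressed = False
        for name in sorted(todo - done):
            node = plan["nodes"][name]
            # a node is ready once every dependency that is part of this run is built
            if all(dep in done or dep not in todo for dep in node["depends_on"]):
                con.execute(node["sql"])
                done.add(name)
                progressed = True
        if not progressed:
            raise ValueError("cycle or unresolved dependency in plan")

con = duckdb.connect("warehouse.duckdb")      # hypothetical database file
run_plan(plan, con)                           # full build (assumes a source table `orders` exists)
run_plan(plan, con, changed="orders_clean")   # rebuild orders_clean and daily_revenue only
```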
Example demo that could be done with this: use yato to compute the SQL dependencies once, then generate a web page with the mermaid diagram AND the relevant SQL statements, to be executed in the browser via DuckDB-Wasm. Then, on changing the queries or the inputs, only the relevant tables are recomputed.
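The mermaid diagram itself is straightforward to derive from the same structure; a short sketch, again assuming the hypothetical plan dictionary above:

```python
def plan_to_mermaid(plan):
    """Render the dependency graph of the hypothetical plan as a mermaid flowchart."""
    lines = ["graph TD"]
    for name, node in plan["nodes"].items():
        for dep in node["depends_on"]:
            lines.append(f"    {dep} --> {name}")
    return "\n".join(lines)

print(plan_to_mermaid(plan))
# graph TD
#     orders --> orders_clean
#     orders_clean --> daily_revenue
```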
This has been heavily inspired by a talk by @laffra on PySheets (slides at https://blobs.duckdb.org/events/duckdb-amsterdam-meetup2/chris-laffra-pysheets-quack-by-example-duckdb-in-a-spreadsheet-with-wasm.pdf) and a follow-up chat.