forked from PRQL/prql

PRQL is a modern language for transforming data — a simpler and more powerful SQL

alexvoda/prql

PRQL

Pipelined Relational Query Language, pronounced "Prequel".

PRQL is a modern language for transforming data — a simpler and more powerful SQL. Like SQL, it's readable, explicit and declarative. Unlike SQL, it forms a logical pipeline of transformations, and supports abstractions such as variables and functions. It can be used with any database that uses SQL, since it transpiles to SQL.

PRQL was discussed on Hacker News and Lobsters earlier this year when it was just a proposal.

Here's a short example of the language; for more examples, visit prql-lang.org. To experiment with PRQL in the browser, check out PRQL Playground.

from employees                                # Each line transforms the previous result.
filter start_date > @2021-01-01               # Clear date syntax.
derive [                                      # `derive` adds columns / variables.
  gross_salary = salary + payroll_tax,
  gross_cost = gross_salary + benefits_cost   # Variables can use other variables.
]
filter gross_cost > 0
group [title, country] (                      # `group` runs a pipeline over each group.
  aggregate [                                 # `aggregate` reduces each group to a row.
    average salary,
    sum     salary,
    average gross_salary,
    sum     gross_salary,
    average gross_cost,
    sum_gross_cost = sum gross_cost,          # `=` sets a column name.
    ct = count,
  ]
)
sort [sum_gross_cost, -country]               # `-country` means descending order.
filter ct > 200
take 20

Resources

Contributing

If you're interested in joining the community to build a better SQL, there are many ways to contribute, big and small:

  • Star this repo.
  • Send a link to PRQL to a couple of people whose opinion you respect.
  • Subscribe to Issue #1 for updates.
  • Join the Discord.
  • Contribute towards the code. There are many ways of contributing, for any level of experience with Rust. And if you have Rust questions, there are lots of friendly people on the Discord who will patiently help you.
    • Find an issue labeled help wanted or good first issue and try to fix it. Feel free to PR partial solutions, or ask any questions on the Issue or Discord.
    • Start with something tiny! Write a test, write a docstring, or make some Rust nicer; it's a great way to get started in 30 minutes.
  • Contribute towards the language.
    • Find instances where the compiler produces incorrect results, and post a bug report — feel free to use the online compiler.
    • Open an issue / append to an existing issue with examples of queries that are difficult to express in PRQL — especially if more difficult than SQL.
    • With sufficient examples, suggest a change to the language! (Though suggestions without examples are difficult to engage with, so please do anchor suggestions in examples.)

Any of these will inspire others to invest their time and energy into the project; thank you in advance.

Development environment

Setting up a local dev environment is simple, thanks to the Rust ecosystem:

  • Install rustup & cargo.
  • That's it! Running cargo test should complete successfully.
  • For more advanced development, e.g. adjusting insta outputs or compiling for web, run the commands in Taskfile.yml, either by copying & pasting them or by installing Task and running task setup-dev.
  • For quick contributions, hit . in GitHub to launch a github.dev instance.
  • Any problems: post an issue and we'll help.

Contributors

Many thanks to those who've made our progress possible:

Contributors

Core developers

We have a few core developers who are responsible for reviewing code, making decisions on the direction of the language, and project administration:

We welcome others to join who have a track record of contributions.

Inspired by

  • dplyr is a beautiful language for manipulating data, in R. It's very similar to PRQL. It only works on in-memory R data.
    • There's also dbplyr, which compiles a subset of dplyr to SQL, though it requires an R runtime.
  • Kusto is also a beautiful pipelined language, very similar to PRQL. But it can only use Kusto-compatible DBs.
    • A Kusto-to-SQL transpiler would be a legitimate alternative to PRQL, though there would be some impedance mismatch in some areas. My central criticism of Kusto is that it gives up broad compatibility without getting that much in return.
  • Against SQL gives a fairly complete description of SQL's weaknesses, both for analytical and transactional queries. @jamii consistently writes insightful pieces, and it's worth sponsoring him for his updates.
  • Julia's DataPipes.jl & Chain.jl, which demonstrate how effective point-free pipelines can be, and how line-breaks can work as pipes.
  • OCaml, for its elegant and simple syntax.

Similar projects

  • Ecto is a sophisticated ORM library in Elixir which has pipelined queries as well as more traditional ORM features.
  • Morel is a functional language for data, also with a pipeline concept. It doesn't compile to SQL but states that it can access external data.
  • Malloy, from Looker & @lloydtabb, is a new language which combines a declarative syntax for querying with a modelling layer.
  • FunSQL.jl is a library in Julia which compiles a nice query syntax to SQL. It requires a Julia runtime.
  • LINQ is a pipelined language for the .NET ecosystem which can (mostly) compile to SQL. It was one of the first languages to take this approach.
  • Sift is an experimental language which heavily uses pipes and relational algebra.
  • After writing this proposal (including the name!), I found Preql. Despite the similar name and compiling to SQL, it seems to focus more on making the language python-like, which is very different to this proposal.

If any of these descriptions can be improved, please feel free to PR changes.

How is PRQL different from these?

Many languages have attempted to replace SQL, and yet SQL has massively grown in usage and importance in the past decade. There are lots of reasonable critiques of these attempts. So a reasonable question is "Why are y'all building something that many others have failed at?". Some thoughts:

  • PRQL is open. It's not designed for a specific database. PRQL will always be fully open-source. There will never be a commercial product. We'll never have to balance profitability against compatibility, or try to expand up the stack to justify a valuation. Whether someone is building a new tool or writing a simple query, PRQL can be more compatible across DBs than SQL.
  • PRQL is analytical. The biggest growth in SQL usage has been from querying large amounts of data, often from analytical DBs that are specifically designed for this — with columnar storage and wide denormalized tables. SQL carries a lot of baggage unrelated to this, and focusing on the analytical use-case lets us make a better language.
  • PRQL is simple. There's often a tradeoff between power and accessibility (Rust is powerful, Excel is accessible), but there are also instances where we can expand the frontier. PRQL's orthogonality is an example of synthesizing this tradeoff: having a single filter rather than WHERE & HAVING & QUALIFY brings both more power and more accessibility.
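
To make the single-filter point concrete, here is a minimal sketch that reuses only constructs from the example at the top of this README (the employees table and its columns are the same illustrative ones). The same filter transform appears before and after the aggregation; in SQL, the first would typically be written as a WHERE clause and the second as a HAVING clause.

```prql
from employees
filter start_date > @2021-01-01   # Before aggregation: plays the role of SQL's WHERE.
group [title, country] (
  aggregate [
    ct = count,
  ]
)
filter ct > 200                   # After aggregation: plays the role of SQL's HAVING.
```

Because each line transforms the previous result, where a filter sits in the pipeline determines what it applies to, so no separate keyword is needed.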

In the same way that "SQL was invented in the 1970s and therefore must be bad" is questionable logic, "n languages have tried and failed, so SQL cannot be improved" suffers from a similar fallacy. SQL isn't bad because it's old. It's bad because, in some cases, it's bad.
