LQN is a Common Lisp library, query language and terminal utility to query and
transform text files such as JSON and CSV, as well as Lisp data (LDN). The
terminal utilities will parse the input data to internal lisp structures
according to the input mode. Then the lqn query language can be used for queries
and transformations.
lqn started as an experiment and programming exercise, but it has turned into
a little language I find rather useful, both in the terminal and, more
interestingly, as a meta language for writing macros in CL. The main
purpose of the design is to make something that is intuitive, terse, yet
flexible enough that you can write generic CL if you need to. I also wanted to
make something that requires a relatively simple compiler.
Here is a small tutorial: https://inconvergent.net/2024/lisp-query-notation/.
An expanded version of the tutorial can be seen in this paper: https://inconvergent.net/code/lqn.pdf
When using LQN in the terminal there are three commands, one per input
mode: jqn, tqn and lqn, for JSON, text and Lisp data respectively.
(For installation see below.) You can find some terminal command examples in
bin/lqn-sh.lisp, bin/jqn-sh.lisp, and bin/tqn-sh.lisp.
Symbol documentation can be seen in docs/lqn.md.
Internally JSON arrays/lists are represented as vectors, and JSON
objects/dicts are represented as hash-tables (ht). A text file is thus a
vector of strings. We use object in the context of Operators and other
LQN utilities to refer to either a vector or a ht. Lisp data is read
directly.
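To make the representation concrete, here is a small plain CL sketch that builds by hand the kind of structure described above. The helper `make-ht` is a hypothetical name, and the use of string keys in an `equal` hash-table is an assumption; the lqn reader may construct its tables differently.

```lisp
;; Plain CL sketch of the JSON document
;;   [{"id": 1, "tags": ["a", "b"]}, {"id": 2, "tags": []}]
;; as a vector of hash-tables.
;; ASSUMPTION: string keys in an EQUAL hash-table; lqn itself may differ.
(defun make-ht (&rest kvs)
  "Hypothetical helper: build an EQUAL hash-table from alternating keys and values."
  (let ((ht (make-hash-table :test #'equal)))
    (loop for (k v) on kvs by #'cddr
          do (setf (gethash k ht) v))
    ht))

(defparameter *doc*
  (vector (make-ht "id" 1 "tags" (vector "a" "b"))
          (make-ht "id" 2 "tags" (vector))))

;; Regular CL accessors work on this representation:
(length *doc*)                            ; number of objects in the array
(gethash "id" (aref *doc* 0))             ; the "id" of the first object
(aref (gethash "tags" (aref *doc* 0)) 1)  ; second tag of the first object
```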
The following operators have special behaviour. You can also write generic CL
code in almost all contexts, as we demonstrate soon. In operators we use _ to
refer to the current value.
In the following sections [d] represents an optional default value. E.g. if
key/index is missing, or if a function would otherwise return nil.
k is an initial counter value, whereas .. means that there can be arbitrary
arguments/expr.
expr denotes any expression or operator; like (+ 1 _) or #[:id].
In operators, and several functions, :keywords can be used to represent
lowercase strings. This is useful in the terminal to avoid escaping strings,
particularly when using Selector operators. You can use "Strings" instead
if you need case or whitespace.
(|| expr ..) pipes the results from the first expr to the second, and so
on. Returns the result of the last expr. The Pipe operator surrounds all
queries by default, so it is usually not necessary to use it explicitly.
For convenience the pipe has the following default translations:
- fx: to (?map (fx _)): map fx across all items.
- :word: to [(isub? _ "word")] to filter by "word".
- "Word": to [(sub? _ "Word")] to filter by Word, case sensitive.
- (..): to itself. That is, expressions are not translated, so this is the default translation for top level expressions in any query.
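As a rough illustration of the pipe and two of its default translations, the following plain CL sketch chains a case-insensitive substring filter (what a :word translates to) with a map of a function across all items (what a bare function name translates to). The names `keep-isub` and `pipe-sketch` are hypothetical; this is not how lqn implements the pipe.

```lisp
;; Plain CL sketch of a two-step pipe on a vector of strings.
;; KEEP-ISUB and PIPE-SKETCH are hypothetical names, not part of lqn.
(defun keep-isub (items word)
  "Keep the strings in ITEMS that contain WORD, ignoring case."
  (remove-if-not (lambda (s) (search word s :test #'char-equal)) items))

(defun pipe-sketch (items)
  "Feed the result of each step into the next; return the last result."
  (let* ((step1 (keep-isub items "word"))               ; filter step
         (step2 (map 'vector #'string-upcase step1)))   ; map step
    step2))

(pipe-sketch (vector "a Word here" "nothing" "more words"))
```

With this input the first step keeps the two strings that contain "word" in any case, and the second step upcases them.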
Select :keys, indexes or paths from nested structure:
- (@ k): get this key/index/path from current value.
- (@ k [d]): get this key/index/path from current value.
- (@ o k [d]): get this key/index/path from o.
Paths support wildcards (*) and numerical indices for nested structures. E.g. this
is a valid path: :*/0/things.
Map operations over vector; or over the values of a ht:
- #(fx): map (fx _) across all items.
- #(expr ..): evaluate these expressions sequentially on all items in sequence.
Select from one structure into a new data structure using selectors:
- {s1 sel ..}: from ht into new ht.
- #{s1 sel ..}: from vector of hts into new vector of hts.
- #[s1 sel ..]: from vector of hts into new vector.
A selector is a triple (mode key expr). Only key is required. If expr is
not provided the expr is _, that is: the value of the key. The modes are
as follows:
- +: always include this expr. [default]
- ?: include expr if the key is present and not nil.
- %: include Selector if expr is not nil.
- -: drop this key in #{} and {} operators; ignore Selector entirely in #[]. E.g. {_ -@key3} to select all keys except key3. expr is ignored.
Selectors can either be written out in full, or they can be written in short
form depending on what you want to achieve. The @ in the following examples
is used to append a mode to a key without having to wrap the Selector in
parentheses. If you need e.g. case or spaces you can use "strings". Here are
some examples using {}. It behaves the same for the other Selector
operators:
{_} ; select all keys.
{_ :-@key1} ; select all keys except "key1".
{:key1 "Key2"} ; select "key1" and "Key2".
{:+@key} ; same as :key [+ mode is default].
{"+@Key"} ; select "Key".
{:?@key } ; select "key" if the value is not nil.
{(:%@key expr)} ; select "key" if expr is not nil.
{("?@Key" expr)} ; select "Key" if the value is not nil.
{("%@Key" expr)} ; select "Key" if expr is not nil.
{(:+ "Key" expr)} ; same as ("+@Key" expr).
; Use `_` in `expr` to refer to the value of the selected key:
{(:key1 sup) ; convert value of "key1" to uppercase
(:key3 (or _ "That")) ; select the value of "key3", or literally "That".
(:key2 (+ 33 _))} ; add 33 to value of "key2"
; override and drop keys:
{_ ; select all keys, then override these:
(:key2 (sdwn _)) ; lowercase the value of "key2"
:-@key3} ; drop "key3"
We use {} in the examples but all Selector operators have the same behaviour.
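For readers who think in plain CL, the override-and-drop example can be read roughly as the following sketch. `select-sketch` is a hypothetical name, string keys are an assumption, and the sketch only overrides "key2" when it is present; how lqn treats a missing key in + mode is not covered here.

```lisp
;; Plain CL sketch of what {_ (:key2 (sdwn _)) :-@key3} describes:
;; copy every key, override "key2" with its lowercased value, drop "key3".
;; SELECT-SKETCH is a hypothetical name; string keys are an assumption.
(defun select-sketch (ht)
  (let ((new (make-hash-table :test #'equal)))
    (maphash (lambda (k v) (setf (gethash k new) v)) ht)  ; _ : select all keys
    (multiple-value-bind (v found) (gethash "key2" new)
      (when found
        (setf (gethash "key2" new) (string-downcase v)))) ; (:key2 (sdwn _))
    (remhash "key3" new)                                  ; :-@key3
    new))

(defparameter *in* (make-hash-table :test #'equal))
(setf (gethash "key1" *in*) 1
      (gethash "key2" *in*) "HeLLo"
      (gethash "key3" *in*) "drop me")

;; *OUT* is a new table; *IN* is left untouched.
(defparameter *out* (select-sketch *in*))
```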
Filter vector; or the values of a ht:
- [expr1 .. exprn] to keep any object or value that satisfies the expressions.
The expressions in a filter behave somewhat like the Selectors in the Selector
operators, and they also take a mode. These EXPR Selectors are used with the
[], ?srch, ?xpr, ?txpr and ?mxpr operators. The modes behave like this:
- +: if there are multiple expressions with + mode, require ALL of them to be satisfied.
- ?: if there are any clauses with ? mode, it will select items where either of these clauses is satisfied.
- -: items that match any clause with - mode will ALWAYS be dropped.
If this is not what you need, you can compose boolean expressions with regular CL boolean operators. Here are some examples:
[:hello] ; strings containing "hello".
[:hi "Hello"] ; strings containing either "Hello" OR "hi".
[:+@hi :+@hello] ; strings containing "hi" AND "hello".
[:+@hi :+@hello "OH"] ; strings containing ("hi" AND "hello") OR "OH".
[int!?] ; items that can be parsed as int.
[(> _ 3)] ; numbers larger than 3.
[_ :-@hi] ; strings except those that contain "hi".
[(+@pref? _ "start") ; strings that start with "start" and end with "end".
(+@post? _ "end")]
[(fx1 _)] ; items where this expression is not nil.
[(or (fx1 _) (fx2 _))] ; ...
Reduce vector; or the values of a ht:
- (?fld init fx): fold (fx acc _) with init as the first acc value. acc is inserted as the first argument to fx.
- (?fld init (fx .. _ ..)): fold (fx acc .. _ ..). The accumulator is inserted as the first argument to fx.
- (?fld init acc (fx .. acc .. nxt)): fold (fx .. acc .. nxt). Use this if you need to name the accumulator explicitly.
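The fold forms above correspond closely to `reduce` with an initial value in plain CL. The sketch below uses a hypothetical helper, `fold-sketch`, to show the argument order: the accumulator comes first and the current item second.

```lisp
;; Plain CL analogue of the fold forms, using REDUCE.
;; FOLD-SKETCH is a hypothetical name, not part of lqn.
(defun fold-sketch (init fx items)
  "Fold (FX acc item) over ITEMS, starting with INIT as the first acc."
  (reduce fx items :initial-value init))

;; Analogue of (?fld 0 +) on a vector of numbers.
(fold-sketch 0 #'+ (vector 1 2 3 4))            ; => 10

;; Analogue of the named form (?fld init acc (fx .. acc .. nxt)):
;; naming both arguments lets you place the accumulator anywhere.
(fold-sketch "" (lambda (acc nxt) (concatenate 'string nxt acc))
             (vector "a" "b" "c"))              ; => "cba"
```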
Group input into a new ht:
- (?grp expr [tx-expr]): keys are given by expr, and values are given by tx-expr (or _).
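A minimal plain CL sketch of this kind of grouping is shown below. `group-sketch` is a hypothetical name, and collecting each group as a list in input order is an assumption; lqn may store the grouped values differently.

```lisp
;; Plain CL sketch of grouping a vector into a new hash-table.
;; GROUP-SKETCH is a hypothetical name. ASSUMPTION: each key maps to a
;; list of the (transformed) items that produced that key.
(defun group-sketch (key-fx items &optional (tx-fx #'identity))
  "Group ITEMS by (KEY-FX item); store (TX-FX item) under each key."
  (let ((ht (make-hash-table :test #'equal)))
    (loop for item across items
          do (push (funcall tx-fx item) (gethash (funcall key-fx item) ht)))
    ;; PUSH builds each group in reverse; restore the input order.
    (maphash (lambda (k v) (setf (gethash k ht) (nreverse v))) ht)
    ht))

;; Group strings by length, storing the upcased string.
(defparameter *groups*
  (group-sketch #'length (vector "a" "bb" "c" "dd") #'string-upcase))
```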
Repeat the same expression while something is true:
- (?rec test-expr expr): repeat expr while test-expr. _ refers to the input value, then to the most recent evaluation of expr. Use (cnt) to get the number of the current iteration. (par) always refers to the input value.
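The control flow described here can be sketched in plain CL as a loop that threads the current value, an iteration counter and the original input through two functions. `rec-sketch` is a hypothetical name, and passing the three values as explicit arguments is only a stand-in for _, (cnt) and (par).

```lisp
;; Plain CL sketch of "repeat expr while test-expr".
;; REC-SKETCH is a hypothetical name, not part of lqn.
(defun rec-sketch (test-fx fx input)
  "Apply FX repeatedly while TEST-FX is true.
TEST-FX and FX receive the current value, the iteration count and the input."
  (loop with cur = input
        for cnt from 0
        while (funcall test-fx cur cnt input)
        do (setf cur (funcall fx cur cnt input))
        finally (return cur)))

;; Double the value while it is below 100, starting from 3.
(rec-sketch (lambda (cur cnt par) (declare (ignore cnt par)) (< cur 100))
            (lambda (cur cnt par) (declare (ignore cnt par)) (* 2 cur))
            3)                                  ; => 192
```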
Iterate a datastructure (as if with ?txpr) and collect the matches in a new
vector:
- (?srch sel): collect _ whenever the Selector matches.
- (?srch sel .. expr): collect expr whenever the Selector matches.
Perform operation when pattern or condition is satisfied:
- (?xpr sel): match current value against EXPR Selector. Return the result if not nil.
- (?xpr sel hit-expr): match current value against EXPR Selector. Evaluates hit-expr if not nil. _ is the matching item.
- (?xpr sel .. hit-expr miss-expr): match current value against expr selectors. Evaluate hit-expr if not nil; else evaluate miss-expr. _ is the matching item.
Recursively traverse a nested structure of sequences and hts and return a
new value for each match:
- (?txpr sel .. tx-expr): recursively traverse current value and replace matches with tx-expr. tx-expr can be a function name or expression. Also traverses vectors and ht values.
- (?mxpr (sel .. tx-expr) .. (sel .. tx-expr)): one or more matches and transforms. Performs the transform of the first match only.
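The traversal can be pictured with the following plain CL sketch, which returns a copy of a nested structure with every matching value replaced. `txpr-sketch` is a hypothetical name. The sketch only descends into vectors and hash-table values, treats strings as leaves, and does not descend into a value once it has been replaced; these are simplifying assumptions, not a description of lqn's behaviour.

```lisp
;; Plain CL sketch of a recursive match-and-replace traversal.
;; TXPR-SKETCH is a hypothetical name, not part of lqn.
(defun txpr-sketch (match-fx tx-fx o)
  "Return a copy of O where every value satisfying MATCH-FX is replaced
by (TX-FX value). Descends into vectors and hash-table values."
  (cond ((funcall match-fx o) (funcall tx-fx o))
        ((stringp o) o) ; strings are vectors in CL; treat them as leaves
        ((vectorp o)
         (map 'vector (lambda (v) (txpr-sketch match-fx tx-fx v)) o))
        ((hash-table-p o)
         (let ((new (make-hash-table :test (hash-table-test o))))
           (maphash (lambda (k v)
                      (setf (gethash k new) (txpr-sketch match-fx tx-fx v)))
                    o)
           new))
        (t o)))

;; Increment every integer, at any depth.
(txpr-sketch #'integerp #'1+ (vector 1 "a" (vector 2 3)))
```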
The internal representation of data in lqn means you can use the regular CL
utilities such as gethash, aref, subseq, length etc. But for
convenience there are some utility functions/macros defined in lqn. Some
of them are described below. There are more in the documentation.
Defined in the query scope:
- (fi [k=0]): counts files from k.
- (fn): name of the current file; or ":internal:", "pipe".
- (hld k v): hold this value at this key in a key value store.
- (ghv k [d]): get the value of this key; or d.
- (nope [d]): stop execution, return d.
- (err [msg]): raise error with msg.
- (wrn [msg]): raise warn with msg.
Defined in all operators:
- _: the current value.
- (cnt): counts from 0 in the enclosing Selector.
- (key): the current key if the current value is a ht. Otherwise (cnt).
- (itr): the current object in the iteration of the enclosing Selector.
- (par): the object containing (itr).
- (psize): number of items in (par).
- (isize): number of items in (itr).
General utilities:
- (?? a expr [res=expr]): execute expr only if a is not nil. If expr is not nil it returns expr or res; otherwise nil.
- (fmt f ..): format f as string with these (format) args.
- (fmt s): get printed representation of s.
- (out f ..): format f to *standard-output* with these (format) args. Returns nil.
- (out s): output printed representation of s to *standard-output*. Returns nil.
- (msym? a b): compare symbol a to b; if b is a keyword or symbol a perfect match is required; if b is a string it performs a substring match; if b is an expression, a is compared to the evaluated value of b.
- (noop ..): do nothing, return nil.
For all sequences and hts:
- (@* o d i ..): pick these indices/keys from sequence/ht into new vector.
- (size? o [d]): length of sequence or number of keys in ht.
- (all? o [empty]): are all items in sequence something? or empty.
- (some? o [empty]): are some items in sequence something? or empty.
- (empty? o [d]): is sequence or ht empty?
- (compct o): remove nil, empty vectors, empty hts and keys with empty hts.
Make or join hts:
- (cat$ ..): add all keys from these hts to a new ht. Left to right.
- (new$ :k1 expr1 ..): new ht with these keys and expressions.
Primarily for sequences (string, vector, list):
- (new* ..): new vector with these elements.
- (ind* s i): get this index from sequence.
- (sel ..): get new vector with these ind*s or seqs from sequence.
- (seq v i [j]): get range i .. or i .. (1- j) from sequence.
- (head s [n=10]): first n items of sequence.
- (tail s [n=10]): last n items of sequence.
- (cat* s ..): concatenate these sequences to a vector.
- (flatn* s [n=1] [str=nil]): flatten sequence n times into a vector. If str=t strings are flattened into individual chars as well.
- (flatall* s [str=nil]): flatten all sequences (except strings) into new vector. Use t as the second argument to flatten strings to individual chars as well.
- (flatn$ s n): flatten ht into vector (new* k0 v0 k1 v1 ..).
Primarily for string searching. [i] means case insensitive:
- ([i]pref? s pref [d]): s if pref is a prefix of s; or d.
- ([i]sub? s sub [d]): s if sub is a substring of s; or d.
- ([i]subx? s sub): index where sub starts in s.
- ([i]suf? s suf [d]): s if suf is a suffix of s; or d.
- (repl s from to): replace from with to in s.
String manipulation:
- (sup s ..): str! and upcase.
- (sdwn s ..): str! and downcase.
- (trim s): trim leading and trailing whitespace from string.
- (splt s x [trim=t] [prune=nil]): split s at all x into vector of strings. trim removes whitespace. prune drops empty strings.
- (join s x ..): join sequence with x (strings or chars), returns string.
- (strcat s ..): concatenate these strings, or all strings in one or more sequences of strings.
(is? o [d]) returns o if not nil, empty sequence, or empty ht; or d.
These functions return the argument if the argument is the corresponding type:
flt?, int?, ht?, lst?, num?, str?, vec?, seq?.
These functions return the argument parsed as the corresponding type if
possible; otherwise they return the optional second argument: int!?, flt!?,
num!?, str!?, vec!?, seq!?.
The following functions will coerce the argument, or fail if the coercion is
not supported: str!, int!, flt!, lst!, sym!.
lqn requires SBCL and is pretty easy to install via quicklisp. SBCL is
available in most package managers, and you can get
quicklisp at https://www.quicklisp.org/beta/. Make sure lqn is available in
your quicklisp local-projects folder. Mine is at
~/quicklisp/local-projects/.
Then create aliases so that SBCL executes the shell wrappers, e.g.:
alias jqn="sbcl --script ~/path/to/lqn/bin/jqn-sh.lisp"
alias tqn="sbcl --script ~/path/to/lqn/bin/tqn-sh.lisp"
alias lqn="sbcl --script ~/path/to/lqn/bin/lqn-sh.lisp"
Unfortunately this will tend to have a high startup time. To make it run faster
you can create an SBCL image/core that has lqn preloaded and dump it using
sb-ext:save-lisp-and-die. Then use the core in the alias instead of SBCL.
You can also preload your own libraries, which will then be available to lqn.
You can see an example bash script for making your own core in bin/core.sh.
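As a rough idea of what building such a core involves, here is a minimal sketch. The file name `make-core.lisp`, the output name `lqn.core` and the system name `:lqn` are assumptions, and the sketch assumes quicklisp is loaded from your SBCL init file. The script the project actually uses is bin/core.sh, so prefer that if the two differ.

```lisp
;; make-core.lisp (hypothetical file name); run with: sbcl --load make-core.lisp
;; ASSUMPTIONS: quicklisp is set up in your SBCL init file, and the system
;; is named :lqn. See bin/core.sh for the project's own script.
(ql:quickload :lqn)
;; (ql:quickload :my-own-library) ; optionally preload your own libraries
(sb-ext:save-lisp-and-die "lqn.core") ; writes the core file and exits SBCL
```

SBCL can then be started from the saved image with its `--core` runtime option, for example `sbcl --core ~/path/to/lqn.core`, in place of plain `sbcl` in the aliases. Whether the wrapper scripts need further changes to run from a core is not covered by this sketch.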