The rust-kql project is a set of Rust crates for parsing and evaluating Kusto Query Language (KQL) queries. Queries are evaluated with the DataFusion engine. Because the Kusto engine and DataFusion differ, only a subset of KQL is supported. Future work may add support for DataFusion features that have no counterpart in KQL.
A simple command line tool is provided to demonstrate how to use the parser and planner crates.
The kqlparser crate provides a parser for KQL queries. It is based on the nom parser library.
See the Status section below for the current state of the parser.
Most simple queries can be parsed, but because of ambiguities in the KQL grammar, some queries may fail to parse.
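To give a feel for the shape of the input the parser has to handle, here is a minimal sketch that splits a query into its pipeline stages. It is not the kqlparser API and does not use nom; `split_pipeline` is a hypothetical helper written with the standard library only, and it assumes backslash escapes in string literals (verbatim `@"..."` strings are not handled).

```rust
/// Split a KQL query into pipeline stages at top-level `|` characters.
/// Pipes inside string literals or parentheses are left alone.
/// Hypothetical helper for illustration; not part of kqlparser.
fn split_pipeline(query: &str) -> Vec<String> {
    let mut stages = Vec::new();
    let mut current = String::new();
    let mut depth = 0usize; // nesting level of parentheses
    let mut quote: Option<char> = None; // active string delimiter, if any
    let mut escaped = false; // previous character was a backslash

    for c in query.chars() {
        if let Some(q) = quote {
            // Inside a string literal: copy everything until the closing quote.
            current.push(c);
            if escaped {
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == q {
                quote = None;
            }
            continue;
        }
        match c {
            '"' | '\'' => {
                quote = Some(c);
                current.push(c);
            }
            '(' => {
                depth += 1;
                current.push(c);
            }
            ')' => {
                depth = depth.saturating_sub(1);
                current.push(c);
            }
            '|' if depth == 0 => {
                // A top-level pipe ends the current stage.
                stages.push(current.trim().to_string());
                current.clear();
            }
            _ => current.push(c),
        }
    }
    stages.push(current.trim().to_string());
    stages
}

fn main() {
    let query = r#"logins | join (users | where age > 30) on name | where note == "a|b""#;
    for stage in split_pipeline(query) {
        println!("{stage}");
    }
}
```

A real parser produces a typed syntax tree rather than strings; the sketch only shows why nested subqueries and string literals need care when splitting on `|`.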
The datafusion-kql crate provides a planner to convert parsed KQL queries into DataFusion logical plans.
See the Status section below for the current state of the planner.
Due to the differences between Kusto and DataFusion, development of the planner is progressing more slowly than that of the parser.
Very simple queries can be executed, but it is uncertain whether more complex queries will ever work.
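The idea of turning a pipeline into a logical plan can be sketched as follows. The `Plan` enum and `build_plan` function are hypothetical stand-ins invented for this example; they are not the datafusion-kql API, and DataFusion's real `LogicalPlan` is far richer. The sketch assumes the query has already been split into stages and covers only `where`, `project`, and `take`.

```rust
/// Hypothetical, simplified plan nodes used only for this illustration.
#[derive(Debug, PartialEq)]
enum Plan {
    Scan { table: String },
    Filter { predicate: String, input: Box<Plan> },
    Projection { columns: Vec<String>, input: Box<Plan> },
    Limit { rows: usize, input: Box<Plan> },
}

/// Fold pipeline stages into a plan tree.
/// The first stage names the source table; each later stage wraps the plan built so far.
fn build_plan(stages: &[&str]) -> Result<Plan, String> {
    let (source, operators) = stages.split_first().ok_or("empty query")?;
    let mut plan = Plan::Scan { table: source.trim().to_string() };

    for stage in operators {
        let stage = stage.trim();
        // Separate the operator name from its arguments.
        let (name, args) = stage
            .split_once(char::is_whitespace)
            .unwrap_or((stage, ""));
        let args = args.trim();

        plan = match name {
            "where" => Plan::Filter {
                predicate: args.to_string(),
                input: Box::new(plan),
            },
            "project" => Plan::Projection {
                columns: args.split(',').map(|c| c.trim().to_string()).collect(),
                input: Box::new(plan),
            },
            "take" => Plan::Limit {
                rows: args
                    .parse::<usize>()
                    .map_err(|e| format!("bad take count: {e}"))?,
                input: Box::new(plan),
            },
            other => {
                return Err(format!("operator `{other}` is not supported in this sketch"));
            }
        };
    }
    Ok(plan)
}

fn main() {
    let plan = build_plan(&["users", "where age > 30", "project name, age", "take 5"]);
    println!("{plan:#?}");
}
```

The last operator in the query ends up as the root of the tree, which mirrors how relational engines generally represent a pipeline: each node consumes the output of its input node.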
The kq crate provides a simple command line tool to show how to use the kqlparser and datafusion-kql crates.
Example usage of the kq command line tool:
kq -f users.csv 'users | where name == "iwan" and age > 30'
kq -f logins.csv 'logins | summarize count(name) by name'
kq -f users.csv -f logins.csv 'logins | join (users) on name | project name, age, login_time'
✔️ natively implemented, like existing DataFusion functions
✅ (mostly) done
🚧 partial / in progress
❌ not started
| Type | Parser | Planner |
|---|---|---|
| bool | ✅ | ✅ |
| datetime | 🚧 | ✅ |
| decimal | 🚧<sup>1</sup> | ❌ |
| dynamic | ✅ | ❌ |
| guid | ❌ | ❌ |
| int | ✅ | ✅ |
| long | ✅ | ✅ |
| real | ✅ | ✅ |
| string | ✅ | ✅ |
| timespan | ✅ | ✅ |
| Operator | Parser | Planner |
|---|---|---|
| as | ✅ | ❌ |
| consume | ✅ | ❌ |
| count | ✅ | ✅ |
| datatable | ✅ | ✅ |
| distinct | ✅ | ❌ |
| evaluate | ✅ | ❌ |
| extend | ✅ | ✅ |
| externaldata | ✅ | ❌ |
| facet | ✅ | ❌ |
| find | ✅ | ❌ |
| fork | ✅ | ❌ |
| getschema | ✅ | ✅ |
| join | ✅ | 🚧 |
| lookup | ✅ | ❌ |
| mv-apply | ✅ | ❌ |
| mv-expand | ✅ | ✅ |
| | ✅ | ✅ |
| project | ✅ | ✅ |
| project-away | ✅ | ✅ |
| project-keep | ✅ | ✅ |
| project-rename | ✅ | ✅ |
| project-reorder | ✅ | ❌ |
| parse | ✅ | ❌ |
| parse-where | ✅ | ❌ |
| parse-kv | ✅ | ❌ |
| partition | ✅ | ❌ |
| range | ✅ | 🚧 |
| reduce | ✅ | ❌ |
| render | ✅ | ❌ |
| sample | ✅ | ❌ |
| sample-distinct | ✅ | ❌ |
| scan | ❌ | ❌ |
| search | ❌ | ❌ |
| serialize | ✅ | ✅ |
| summarize | ✅ | ✅ |
| sort | ✅ | ✅ |
| take | ✅ | ✅ |
| top | ✅ | ✅ |
| top-nested | ❌ | ❌ |
| top-hitters | ❌ | ❌ |
| union | ✅ | 🚧 |
| where | ✅ | ✅ |
| Statement | Parser | Planner |
|---|---|---|
| alias | ❌ | ❌ |
| let | ✅ | ❌ |
| pattern | ❌ | ❌ |
| query parameters declaration | ❌ | ❌ |
| restrict | ❌ | ❌ |
| set | ❌ | ❌ |
| tabular expression | ✅ | 🚧 |
| Function | Implemented |
|---|---|
| abs() | ✔️ |
| acos() | ✔️ |
| asin() | ✔️ |
| atan() | ✔️ |
| atan2() | ✔️ |
| beta_cdf() | ❌ |
| beta_inv() | ❌ |
| beta_pdf() | ❌ |
| cos() | ✔️ |
| cot() | ✔️ |
| degrees() | ✔️ |
| erf() | ❌ |
| erfc() | ❌ |
| exp() | ✔️ |
| exp10() | ❌ |
| exp2() | ❌ |
| gamma() | ❌ |
| isfinite() | ❌ |
| isinf() | ❌ |
| isnan() | ❌ |
| log() | ✔️ |
| log10() | ✔️ |
| log2() | ✔️ |
| loggamma() | ❌ |
| not() | ❌ |
| pi() | ✔️ |
| pow() | ✔️ |
| radians() | ✔️ |
| rand() | 🚧 |
| range() | ❌ |
| round() | ✔️ |
| sign() | ❌ |
| sin() | ✔️ |
| sqrt() | ✔️ |
| tan() | ✔️ |
| welch_test() | ❌ |
| Function | Implemented |
|---|---|
| case() | ❌ |
| coalesce() | ✔️ |
| iff() | ❌ |
| max_of() | ❌ |
| min_of() | ❌ |
| Function | Implemented |
|---|---|
| base64_encode_tostring() | ❌ |
| base64_encode_fromguid() | ❌ |
| base64_decode_tostring() | ❌ |
| base64_decode_toarray() | ❌ |
| base64_decode_toguid() | ❌ |
| countof() | ❌ |
| extract() | ❌ |
| extract_all() | ❌ |
| extract_json() | ❌ |
| has_any_index() | ❌ |
| indexof() | 🚧 |
| isempty() | ❌ |
| isnotempty() | ❌ |
| isnotnull() | ❌ |
| isnull() | ❌ |
| parse_command_line() | ❌ |
| parse_csv() | ❌ |
| parse_ipv4() | ❌ |
| parse_ipv4_mask() | ❌ |
| parse_ipv6() | ❌ |
| parse_ipv6_mask() | ❌ |
| parse_json() | ❌ |
| parse_url() | ❌ |
| parse_urlquery() | ❌ |
| parse_version() | ❌ |
| replace_regex() | ✅ |
| replace_string() | ✅ |
| replace_strings() | ❌ |
| punycode_from_string() | ❌ |
| punycode_to_string() | ❌ |
| reverse() | ✔️ |
| split() | ✅ |
| strcat() | ✅ |
| strcat_delim() | ❌ |
| strcmp() | ❌ |
| strlen() | ✅ |
| strrep() | 🚧 |
| substring() | ✅ |
| tohex() | ❌ |
| tolower() | ✅ |
| toupper() | ✅ |
| translate() | ❌ |
| trim() | ✔️ |
| trim_end() | ❌ |
| trim_start() | ❌ |
| url_decode() | ❌ |
| url_encode() | ❌ |
| Function | Implemented |
|---|---|
| binary_and() | ❌ |
| binary_not() | ❌ |
| binary_or() | ❌ |
| binary_shift_left() | ❌ |
| binary_shift_right() | ❌ |
| binary_xor() | ❌ |
| bitset_count_ones() | ❌ |
| Function | Implemented |
|---|---|
| tobool() | ❌ |
| todatetime() | ❌ |
| todecimal() | ❌ |
| todouble() | ❌ |
| toguid() | ❌ |
| toint() | ❌ |
| tolong() | ❌ |
| tostring() | ❌ |
| totimespan() | ❌ |
| Function | Implemented |
|---|---|
| gettype() | ❌ |
| Function | Implemented |
|---|---|
| column_ifexists() | ❌ |
| current_cluster_endpoint() | ❌ |
| current_database() | ❌ |
| current_principal() | ❌ |
| current_principal_details() | ❌ |
| current_principal_is_member_of() | ❌ |
| cursor_after() | ❌ |
| estimate_data_size() | ❌ |
| extent_id() | ❌ |
| extent_tags() | ❌ |
| ingestion_time() | ❌ |
| Function | Implemented |
|---|---|
| next() | ❌ |
| prev() | ❌ |
| row_cumsum() | ❌ |
| row_number() | ✔️ |
| row_rank_dense() | ❌ |
| row_rank_min() | ❌ |
| Function | Implemented |
|---|---|
| bin() | ❌ |
| bin_at() | ❌ |
| ceiling() | ❌ |
| Function | Implemented |
|---|---|
| hash() | ❌ |
| hash_combine() | ❌ |
| hash_many() | ❌ |
| hash_md5() | ❌ |
| hash_sha1() | ❌ |
| hash_sha256() | ❌ |
| hash_xxhash64() | ❌ |
| Function | Implemented |
|---|---|
| dcount_hll() | ❌ |
| hll_merge() | ❌ |
| percentile_tdigest() | ❌ |
| percentile_array_tdigest() | ❌ |
| percentrank_tdigest() | ❌ |
| rank_tdigest() | ❌ |
| merge_tdigest() | ❌ |
| Function | Implemented |
|---|---|
| toscalar() | ❌ |
| Function | Implemented |
|---|---|
| ipv4_compare() | ❌ |
| ipv4_is_in_range() | ❌ |
| ipv4_is_in_any_range() | ❌ |
| ipv4_is_private() | ❌ |
| ipv4_netmask_suffix() | ❌ |
| ipv4_range_to_cidr_list() | ❌ |
| ipv6_compare() | ❌ |
| ipv6_is_match() | ❌ |
| format_ipv4() | ❌ |
| format_ipv4_mask() | ❌ |
| ipv6_is_in_range() | ❌ |
| ipv6_is_in_any_range() | ❌ |
| geo_info_from_ip_address() | ❌ |
| has_ipv4() | ❌ |
| has_ipv4_prefix() | ❌ |
| has_any_ipv4() | ❌ |
| has_any_ipv4_prefix() | ❌ |
| Function | Implemented |
|---|---|
| convert_angle() | ❌ |
| convert_energy() | ❌ |
| convert_force() | ❌ |
| convert_length() | ❌ |
| convert_mass() | ❌ |
| convert_speed() | ❌ |
| convert_temperature() | ❌ |
| convert_volume() | ❌ |
| Function | Implemented |
|---|---|
| array_concat() | ❌ |
| array_iff() | ❌ |
| array_index_of() | ❌ |
| array_join() | ✔️ |
| array_length() | ✔️ |
| array_reverse() | ✔️ |
| array_rotate_left() | ❌ |
| array_rotate_right() | ❌ |
| array_shift_left() | ❌ |
| array_shift_right() | ❌ |
| array_slice() | ✔️ |
| array_sort_asc() | ❌ |
| array_sort_desc() | ❌ |
| array_split() | ❌ |
| array_sum() | ❌ |
| bag_has_key() | ❌ |
| bag_keys() | ❌ |
| bag_merge() | ❌ |
| bag_pack() | ❌ |
| bag_pack_columns() | ❌ |
| bag_remove_keys() | ❌ |
| bag_set_key() | ❌ |
| jaccard_index() | ❌ |
| pack_all() | ❌ |
| pack_array() | ❌ |
| repeat() | ❌ |
| set_difference() | ❌ |
| set_has_element() | ❌ |
| set_intersect() | ❌ |
| set_union() | ❌ |
| treepath() | ❌ |
| zip() | ❌ |
| Function | Implemented |
|---|---|
| ago() | ❌ |
| datetime_add() | ❌ |
| datetime_diff() | ❌ |
| datetime_local_to_utc() | ❌ |
| datetime_part() | ❌ |
| datetime_utc_to_local() | ❌ |
| dayofmonth() | ❌ |
| dayofweek() | ❌ |
| dayofyear() | ❌ |
| endofday() | ❌ |
| endofmonth() | ❌ |
| endofweek() | ❌ |
| endofyear() | ❌ |
| format_datetime() | ❌ |
| format_timespan() | ❌ |
| getyear() | ❌ |
| hourofday() | ❌ |
| make_datetime() | ❌ |
| make_timespan() | ❌ |
| monthofyear() | ❌ |
| now() | ✔️ |
| startofday() | ❌ # today |
| startofmonth() | ❌ |
| startofweek() | ❌ |
| startofyear() | ❌ |
| unixtime_microseconds_todatetime() | ❌ |
| unixtime_milliseconds_todatetime() | ❌ |
| unixtime_nanoseconds_todatetime() | ❌ |
| unixtime_seconds_todatetime() | ❌ |
| weekofyear() | ❌ |
| Function | Implemented |
|---|---|
| series_cosine_similarity() | ❌ |
| series_decompose() | ❌ |
| series_decompose_anomalies() | ❌ |
| series_decompose_forecast() | ❌ |
| series_dot_product() | ❌ |
| series_fill_backward() | ❌ |
| series_fill_constant() | ❌ |
| series_fill_forward() | ❌ |
| series_fill_linear() | ❌ |
| series_fft() | ❌ |
| series_fir() | ❌ |
| series_fit_2lines() | ❌ |
| series_fit_2lines_dynamic() | ❌ |
| series_fit_line() | ❌ |
| series_fit_line_dynamic() | ❌ |
| series_fit_poly() | ❌ |
| series_ifft() | ❌ |
| series_iir() | ❌ |
| series_magnitude() | ❌ |
| series_outliers() | ❌ |
| series_pearson_correlation() | ❌ |
| series_periods_detect() | ❌ |
| series_periods_validate() | ❌ |
| series_product() | ❌ |
| series_seasonal() | ❌ |
| series_stats() | ❌ |
| series_stats_dynamic() | ❌ |
| series_sum() | ❌ |
| Function | Implemented |
|---|---|
| series_abs() | ❌ |
| series_acos() | ❌ |
| series_add() | ❌ |
| series_asin() | ❌ |
| series_atan() | ❌ |
| series_ceiling() | ❌ |
| series_cos() | ❌ |
| series_divide() | ❌ |
| series_equals() | ❌ |
| series_exp() | ❌ |
| series_floor() | ❌ |
| series_greater() | ❌ |
| series_greater_equals() | ❌ |
| series_less() | ❌ |
| series_less_equals() | ❌ |
| series_log() | ❌ |
| series_multiply() | ❌ |
| series_not_equals() | ❌ |
| series_pow() | ❌ |
| series_sign() | ❌ |
| series_sin() | ❌ |
| series_subtract() | ❌ |
| series_tan() | ❌ |
| Function | Implemented |
|---|---|
| geo_angle() | ❌ |
| geo_azimuth() | ❌ |
| geo_closest_point_on_line() | ❌ |
| geo_closest_point_on_polygon() | ❌ |
| geo_distance_2points() | ❌ |
| geo_distance_point_to_line() | ❌ |
| geo_distance_point_to_polygon() | ❌ |
| geo_from_wkt() | ❌ |
| geo_intersects_2lines() | ❌ |
| geo_intersects_2polygons() | ❌ |
| geo_intersects_line_with_polygon() | ❌ |
| geo_intersection_2lines() | ❌ |
| geo_intersection_2polygons() | ❌ |
| geo_intersection_line_with_polygon() | ❌ |
| geo_point_buffer() | ❌ |
| geo_point_in_circle() | ❌ |
| geo_point_in_polygon() | ❌ |
| geo_point_to_geohash() | ❌ |
| geo_point_to_s2cell() | ❌ |
| geo_point_to_h3cell() | ❌ |
| geo_line_buffer() | ❌ |
| geo_line_centroid() | ❌ |
| geo_line_densify() | ❌ |
| geo_line_interpolate_point() | ❌ |
| geo_line_length() | ❌ |
| geo_line_locate_point() | ❌ |
| geo_line_simplify() | ❌ |
| geo_line_to_s2cells() | ❌ |
| geo_polygon_area() | ❌ |
| geo_polygon_buffer() | ❌ |
| geo_polygon_centroid() | ❌ |
| geo_polygon_densify() | ❌ |
| geo_polygon_perimeter() | ❌ |
| geo_polygon_simplify() | ❌ |
| geo_polygon_to_s2cells() | ❌ |
| geo_polygon_to_h3cells() | ❌ |
| geo_geohash_to_central_point() | ❌ |
| geo_geohash_neighbors() | ❌ |
| geo_geohash_to_polygon() | ❌ |
| geo_s2cell_to_central_point() | ❌ |
| geo_s2cell_neighbors() | ❌ |
| geo_s2cell_to_polygon() | ❌ |
| geo_h3cell_to_central_point() | ❌ |
| geo_h3cell_neighbors() | ❌ |
| geo_h3cell_to_polygon() | ❌ |
| geo_h3cell_parent() | ❌ |
| geo_h3cell_children() | ❌ |
| geo_h3cell_level() | ❌ |
| geo_h3cell_rings() | ❌ |
| geo_simplify_polygons_array() | ❌ |
| geo_union_lines_array() | ❌ |
| geo_union_polygons_array() | ❌ |
| Function | Implemented |
|---|---|
| avg() | ✔️ |
| avgif() | ❌ |
| count() | ✔️ |
| countif() | ❌ |
| count_distinct() | ❌ |
| count_distinctif() | ❌ |
| dcount() | ❌ |
| dcountif() | ❌ |
| hll() | ❌ |
| hll_if() | ❌ |
| hll_merge() | ❌ |
| max() | ✔️ |
| maxif() | ❌ |
| min() | ✔️ |
| minif() | ❌ |
| percentile() | ❌ |
| percentiles() | ❌ |
| percentiles_array() | ❌ |
| percentilesw() | ❌ |
| percentilesw_array() | ❌ |
| stdev() | ❌ |
| stdevif() | ❌ |
| stdevp() | ❌ |
| sum() | ✔️ |
| sumif() | ❌ |
| tdigest() | ❌ |
| tdigest_merge() | ❌ |
| variance() | ❌ |
| varianceif() | ❌ |
| variancep() | ❌ |
| variancepif() | ❌ |
| Function | Implemented |
|---|---|
| binary_all_and() | ❌ |
| binary_all_or() | ❌ |
| binary_all_xor() | ❌ |
| Function | Implemented |
|---|---|
| arg_max() | ❌ |
| arg_min() | ❌ |
| take_any() | ❌ |
| take_anyif() | ❌ |
| Function | Implemented |
|---|---|
| buildschema() | ❌ |
| make_bag() | ❌ |
| make_bag_if() | ❌ |
| make_list() | ❌ |
| make_list_if() | ❌ |
| make_list_with_nulls() | ❌ |
| make_set() | ❌ |
| make_set_if() | ❌ |
Footnotes
1. Parsed as a 64-bit floating-point number instead of a 128-bit decimal.
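The practical effect of parsing decimals as 64-bit floats can be sketched with a small comparison. The fixed-point representation below (an `i128` scaled by 10^18) is an assumption chosen for illustration; it is not how Kusto or kqlparser store decimals, and `parse_scaled` is a hypothetical helper that accepts only non-negative literals.

```rust
/// Number of fractional digits kept by the fixed-point sketch.
const SCALE: u32 = 18;

/// Parse a non-negative decimal literal into an i128 scaled by 10^SCALE.
/// Returns None for malformed input, too many fractional digits, or overflow.
fn parse_scaled(text: &str) -> Option<i128> {
    let (int_part, frac_part) = text.split_once('.').unwrap_or((text, ""));
    let all_digits = |s: &str| s.bytes().all(|b| b.is_ascii_digit());
    if int_part.is_empty() || !all_digits(int_part) || !all_digits(frac_part) {
        return None;
    }
    if frac_part.len() > SCALE as usize {
        return None;
    }
    let int_value: i128 = int_part.parse().ok()?;
    let frac_value: i128 = if frac_part.is_empty() {
        0
    } else {
        frac_part.parse().ok()?
    };
    // Pad the fractional digits on the right so they line up with the scale.
    let frac_scaled = frac_value * 10_i128.pow(SCALE - frac_part.len() as u32);
    int_value
        .checked_mul(10_i128.pow(SCALE))?
        .checked_add(frac_scaled)
}

fn main() {
    // Binary floating point cannot represent 0.1, 0.2, or 0.3 exactly.
    let float_exact = 0.1_f64 + 0.2_f64 == 0.3_f64;

    // Scaled integers keep the decimal digits exactly.
    let a = parse_scaled("0.1").unwrap();
    let b = parse_scaled("0.2").unwrap();
    let c = parse_scaled("0.3").unwrap();
    let fixed_exact = a + b == c;

    println!("f64:   0.1 + 0.2 == 0.3 is {float_exact}");
    println!("fixed: 0.1 + 0.2 == 0.3 is {fixed_exact}");
}
```

Queries that compare or sum decimal literals may therefore give slightly different results than they would with a true 128-bit decimal type.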