From d0bd47947b0adf4c5738f099db8d30664947fcfd Mon Sep 17 00:00:00 2001 From: Andrei Zavada Date: Sat, 18 Feb 2017 06:31:07 +0200 Subject: [PATCH 01/15] describe inverse distrib functions (PERCENTILE etc) --- .../querying/select/aggregate-functions.md | 45 +++++++++++++++---- 1 file changed, 37 insertions(+), 8 deletions(-) diff --git a/content/riak/ts/1.6.0/using/querying/select/aggregate-functions.md b/content/riak/ts/1.6.0/using/querying/select/aggregate-functions.md index e7574e6dfa..54ae1e9d7c 100644 --- a/content/riak/ts/1.6.0/using/querying/select/aggregate-functions.md +++ b/content/riak/ts/1.6.0/using/querying/select/aggregate-functions.md @@ -25,18 +25,21 @@ aliases: [arithmetic]: ../arithmetic-operations -You can turn a set of rows in your Riak TS table into a value with the aggregate feature. This document will walk you through the functions that make up aggregation in Riak TS. +You can turn a set of rows in your Riak TS table into a value with the aggregate feature. This document will walk you through the functions that make up aggregation in Riak TS. ## Aggregate Functions -* `COUNT()` - Returns the number of entries that match specified criteria. -* `SUM()` - Returns the sum of entries that match specified criteria. -* `MEAN()` & `AVG()` - Returns the average of entries that match specified criteria. -* `MIN()` - Returns the smallest value of entries that match specified criteria. -* `MAX()` - Returns the largest value of entries that match specified criteria. -* `STDDEV()`/`STDDEV_SAMP()` - Returns the statistical standard deviation of all entries that match specified criteria using Sample Standard Deviation. -* `STDDEV_POP()` - Returns the statistical standard deviation of all entries that match specified criteria using Population Standard Deviation. +* `COUNT` - Returns the number of entries that match specified criteria. +* `SUM` - Returns the sum of entries that match specified criteria. 
+* `MEAN` & `AVG` - Returns the average of entries that match specified criteria. +* `MIN` - Returns the smallest value of entries that match specified criteria. +* `MAX` - Returns the largest value of entries that match specified criteria. +* `STDDEV`/`STDDEV_SAMP` - Returns the statistical standard deviation of all entries that match specified criteria using Sample Standard Deviation. +* `STDDEV_POP` - Returns the statistical standard deviation of all entries that match specified criteria using Population Standard Deviation. +* `PERCENTILE_DISC`/`PERCENTILE_CONT` - Return a given percentile value in the observations represented by entries in the selection, assuming discrete/continuous distrinution model. +* `MEDIAN` - equivalent to `PERCENTILE_DISC` called with 0.5 as a parameter. +* `MODE` - Returns the mode of the population represented by entries in the selection. {{% note title="A Note On Negation" %}} You cannot negate an aggregate function. If you attempt something like: `select -count(temperature)`, you will receive an error. Instead, you can achieve negation with `-1*`; for instance: `-1*COUNT(...)`. @@ -143,3 +146,29 @@ Returns `NULL` if no values were returned or all values were `NULL`. |-------------------|-------------| | sint64 | sint64 | | double | double | + +### `PERCENTILE_DISC`, `PERCENTILE_CONT` + +Calculate a percentile, given as a value in the range [0..1], in a population of entries in the selection, with null values discarded. The column argument must be of numeric type (`sint64`, `double` or `timestamp`). Return type is the type of the column argument for `PERCENTILE_DISC`, and `double` for `PERCENTILE_CONT`. + +The `_DISC`/`_CONT` variants differ in the discrete/continuous distribution model assumed for the population. For the former, the value returned is the largest observation that is less than or equal to the percentile computed. 
For the latter, the result is the linear interpolation between two observations surrounding the percentile. + +```sql +SELECT PERCENTILE_DISC(x, 0.3), PERCENTILE_CONT(x, 0.3) FROM Table WHERE ... +``` + +### MEDIAN + +Equivalent to `PERCENTILE_DISC(x, 0.5)`. + +### MODE + +Calculate the mode (i.e., the value occurring with the highest frequency) of observations in the selection, with nulls discarded. If there are more than one modes, the lowest one is returned. The column argument must be of numeric type (`sint64`, `double` or `timestamp`). Return type is the type of the column argument. + +{{% note title="Notes on inverse distribution functions" %}} +1. These functions cannot be used in conjunction with `ORDER BY` or `GROUP BY` clauses, or with any other column specifiers. + +2. Multiple inverse distrinution function calls are permitted as long as they all have the same column argument. + +3. Inverse distrinution functions use query buffers. Queries with large selection size may incur increased latency depending on the value of `riak_kv.query.timeseries.qbuf_inmem_max` in your riak.conf and the I/O throughput of storage backing up query buffers (`riak_kv.query.timeseries.qbuf_root_path`). +{{% /note %}} From 3ed80008c52fb5f4da1642b9923b6f76bedd2e9b Mon Sep 17 00:00:00 2001 From: Andrei Zavada Date: Wed, 22 Feb 2017 19:51:47 +0200 Subject: [PATCH 02/15] clarify on percentile value range --- .../riak/ts/1.6.0/using/querying/select/aggregate-functions.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/riak/ts/1.6.0/using/querying/select/aggregate-functions.md b/content/riak/ts/1.6.0/using/querying/select/aggregate-functions.md index 54ae1e9d7c..72745fdaf1 100644 --- a/content/riak/ts/1.6.0/using/querying/select/aggregate-functions.md +++ b/content/riak/ts/1.6.0/using/querying/select/aggregate-functions.md @@ -149,7 +149,7 @@ Returns `NULL` if no values were returned or all values were `NULL`. 
### `PERCENTILE_DISC`, `PERCENTILE_CONT` -Calculate a percentile, given as a value in the range [0..1], in a population of entries in the selection, with null values discarded. The column argument must be of numeric type (`sint64`, `double` or `timestamp`). Return type is the type of the column argument for `PERCENTILE_DISC`, and `double` for `PERCENTILE_CONT`. +Calculate a percentile, given as a value in the range [0.0..1.0], in a population of entries in the selection, with null values discarded. The column argument must be of numeric type (`sint64`, `double` or `timestamp`). Return type is the type of the column argument for `PERCENTILE_DISC`, and `double` for `PERCENTILE_CONT`. The `_DISC`/`_CONT` variants differ in the discrete/continuous distribution model assumed for the population. For the former, the value returned is the largest observation that is less than or equal to the percentile computed. For the latter, the result is the linear interpolation between two observations surrounding the percentile. From a876fc444db078772c70537fd752291f8e2fdbc8 Mon Sep 17 00:00:00 2001 From: Andrei Zavada Date: Mon, 27 Feb 2017 18:22:19 +0200 Subject: [PATCH 03/15] revert inaccurate s/Maximum concurrent queries/Maximum query queues/ --- content/riak/ts/1.6.0/configuring/riakconf.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/riak/ts/1.6.0/configuring/riakconf.md b/content/riak/ts/1.6.0/configuring/riakconf.md index e01c94cf9b..d0b7f8ea1a 100644 --- a/content/riak/ts/1.6.0/configuring/riakconf.md +++ b/content/riak/ts/1.6.0/configuring/riakconf.md @@ -72,9 +72,9 @@ riak_kv.query.timeseries.timeout = 10000 *This setting was formerly `timeseries_query_timeout_ms`, please update accordingly.* -### Maximum query queues +### Maximum concurrent queries -`riak_kv.query.timeseries.max_concurrent_queries`: the maximum number of query queues that can run concurrently per node (each queue serving one query). Default is 3. 
+`riak_kv.query.timeseries.max_concurrent_queries`: the maximum number of queries that can run concurrently per node (each queue serving one query). Default is 3. The total number of queries that can be run on a cluster is the number of nodes multiplied by the `max_concurrent_queries` value. This constraint is to prevent an unbounded number of queries overloading the cluster. From eeb10124ec2402a73532b6afef737dec48224970 Mon Sep 17 00:00:00 2001 From: Eric Date: Tue, 14 Mar 2017 09:05:00 -0700 Subject: [PATCH 04/15] Update canonical link for LIMIT --- content/riak/ts/1.6.0/using/querying/select/limit.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/riak/ts/1.6.0/using/querying/select/limit.md b/content/riak/ts/1.6.0/using/querying/select/limit.md index 85e36d3de6..3a3cc6ccfa 100644 --- a/content/riak/ts/1.6.0/using/querying/select/limit.md +++ b/content/riak/ts/1.6.0/using/querying/select/limit.md @@ -12,7 +12,7 @@ project_version: "1.6.0" toc: true version_history: in: "1.6.0+" -canonical_link: "https://docs.basho.com/riak/ts/latest/using/querying/limit/" +canonical_link: "https://docs.basho.com/riak/ts/latest/using/querying/select/limit/" --- [select]: /riak/ts/1.6.0/using/querying/select From d11ae9c151090709b6f33afd307b7a0f408cc539 Mon Sep 17 00:00:00 2001 From: Eric Date: Tue, 14 Mar 2017 09:05:32 -0700 Subject: [PATCH 05/15] Add page on IN keyword for SELECT --- .../riak/ts/1.6.0/using/querying/select/in.md | 62 +++++++++++++++++++ 1 file changed, 62 insertions(+) create mode 100644 content/riak/ts/1.6.0/using/querying/select/in.md diff --git a/content/riak/ts/1.6.0/using/querying/select/in.md b/content/riak/ts/1.6.0/using/querying/select/in.md new file mode 100644 index 0000000000..c73524ba95 --- /dev/null +++ b/content/riak/ts/1.6.0/using/querying/select/in.md @@ -0,0 +1,62 @@ +--- +title: "IN in Riak TS" +description: "Using the IN keyword in Riak TS." 
+menu: + riak_ts-1.6.0: + name: "IN" + identifier: "in_riakts" + weight: 170 + parent: "select_riakts" +project: "riak_ts" +project_version: "1.6.0" +toc: true +version_history: + in: "1.6.0+" +canonical_link: "https://docs.basho.com/riak/ts/latest/using/querying/select/in/" +--- + +[select]: /riak/ts/1.6.0/using/querying/select/ +[query guidelines]: /riak/ts/1.6.0/using/querying/guidelines/ + +The IN keyword is used with [`SELECT`][select] to return results where a specified column matches one or more given values. + +This document shows how to run various queries using `IN`. See the [guidelines][query guidelines] for more information on limitations and rules for queries in Riak TS. + +## Overview + +The IN keyword returns results from a SELECT statement which match one or more literal values. + +`IN` has the following syntax: + +```sql +SELECT * FROM «table_name» WHERE «column_name» IN («values») +``` + +{{% note title="WARNING" %}} +Before you run `SELECT` you must ensure the node issuing the query has adequate memory to receive the response. If the returning rows do not fit into the memory of the requesting node, the node is likely to fail. +{{% /note %}} + +## Example + +The following table defines a schema for sensor data. 
+ +```sql +CREATE TABLE SensorData +( + id SINT64 NOT NULL, + time TIMESTAMP NOT NULL, + value DOUBLE, + PRIMARY KEY ( + (id, QUANTUM(time, 15, 'm')), + id, time + ) +) +``` + +### Basic + +Return only results where the value column matches the given values: + +```sql +SELECT * FROM SensorData WHERE value IN (1.2, 3.4, 5.6); +``` From 78487cae322d3bed421a8cf620c4032b2503f2f2 Mon Sep 17 00:00:00 2001 From: Lauren Date: Tue, 14 Mar 2017 12:52:21 -0400 Subject: [PATCH 06/15] Edits --- .../querying/select/aggregate-functions.md | 56 ++++++++++++++----- 1 file changed, 41 insertions(+), 15 deletions(-) diff --git a/content/riak/ts/1.6.0/using/querying/select/aggregate-functions.md b/content/riak/ts/1.6.0/using/querying/select/aggregate-functions.md index 72745fdaf1..d72b09a2ce 100644 --- a/content/riak/ts/1.6.0/using/querying/select/aggregate-functions.md +++ b/content/riak/ts/1.6.0/using/querying/select/aggregate-functions.md @@ -23,6 +23,7 @@ aliases: [arithmetic]: ../arithmetic-operations +[riak.conf]: /riak/ts/1.6.0/configuring/riakconf/ You can turn a set of rows in your Riak TS table into a value with the aggregate feature. This document will walk you through the functions that make up aggregation in Riak TS. @@ -37,9 +38,10 @@ You can turn a set of rows in your Riak TS table into a value with the aggregate * `MAX` - Returns the largest value of entries that match specified criteria. * `STDDEV`/`STDDEV_SAMP` - Returns the statistical standard deviation of all entries that match specified criteria using Sample Standard Deviation. * `STDDEV_POP` - Returns the statistical standard deviation of all entries that match specified criteria using Population Standard Deviation. -* `PERCENTILE_DISC`/`PERCENTILE_CONT` - Return a given percentile value in the observations represented by entries in the selection, assuming discrete/continuous distrinution model. -* `MEDIAN` - equivalent to `PERCENTILE_DISC` called with 0.5 as a parameter. 
-* `MODE` - Returns the mode of the population represented by entries in the selection. +* `PERCENTILE_CONT` - Assuming a continuous distribution model, returns an interpolated value that would occur at the given percentile with respect to the sort specification. +* `PERCENTILE_DISC` - Assuming a discrete distribution model, returns an element from the set determined by the percentile value and sort specification. +* `MEDIAN` - Returns the element from the set determined by a percentile value of 0.5; equivalent to `PERCENTILE_DISC` called with 0.5. +* `MODE` - Returns the value that appears most often in the selection. {{% note title="A Note On Negation" %}} You cannot negate an aggregate function. If you attempt something like: `select -count(temperature)`, you will receive an error. Instead, you can achieve negation with `-1*`; for instance: `-1*COUNT(...)`. @@ -147,28 +149,52 @@ Returns `NULL` if no values were returned or all values were `NULL`. | sint64 | sint64 | | double | double | -### `PERCENTILE_DISC`, `PERCENTILE_CONT` -Calculate a percentile, given as a value in the range [0.0..1.0], in a population of entries in the selection, with null values discarded. The column argument must be of numeric type (`sint64`, `double` or `timestamp`). Return type is the type of the column argument for `PERCENTILE_DISC`, and `double` for `PERCENTILE_CONT`. +### `PERCENTILE_DISC` & `PERCENTILE_CONT` -The `_DISC`/`_CONT` variants differ in the discrete/continuous distribution model assumed for the population. For the former, the value returned is the largest observation that is less than or equal to the percentile computed. For the latter, the result is the linear interpolation between two observations surrounding the percentile. +Calculate a percentile, given as a value in the range [0.0..1.0], for entries in the selection with null values discarded. These are inverse distribution functions and cannot be used in conjunction with `ORDER BY` or `GROUP BY` clauses, or with any other column specifiers. 
See note below for more information and guidelines. + +| Column Input Type | Return Type for `PERCENTILE_DISC` | Return Type for `PERCENTILE_CONT` | +|-------------------|-----------------------------------|-----------------------------------| +| sint64 | sint64 | double | +| double | double | double | +| timestamp | timestamp | double | + +`_DISC` and `_CONT` differ in that `_DISC` has a discrete distribution model and `_CONT` has a continuous distribution model. `_DISC` returns the largest observation that is less than or equal to the percentile computed. `_CONT` returns the linear interpolation between two observations surrounding the percentile. ```sql -SELECT PERCENTILE_DISC(x, 0.3), PERCENTILE_CONT(x, 0.3) FROM Table WHERE ... +SELECT PERCENTILE_DISC(x, 0.3), PERCENTILE_CONT(x, 0.3) FROM GeoCheckin WHERE ... ``` -### MEDIAN -Equivalent to `PERCENTILE_DISC(x, 0.5)`. +### `MEDIAN` + +Calculate the median (the 50th-percentile value, equivalent to `PERCENTILE_DISC(x, 0.5)`) for entries in the selection with null values discarded. This is an inverse distribution function and cannot be used in conjunction with `ORDER BY` or `GROUP BY` clauses, or with any other column specifiers. See note below for more information and guidelines. + +| Column Input Type | Return Type | +|-------------------|-------------| +| sint64 | sint64 | +| double | double | +| timestamp | timestamp | + + +### `MODE` + +Calculate the value occurring with the highest frequency among entries in the selection with nulls discarded. If more than one mode is found, the lowest one is returned. This is an inverse distribution function and cannot be used in conjunction with `ORDER BY` or `GROUP BY` clauses, or with any other column specifiers. See note below for more information and guidelines. 
+ +| Column Input Type | Return Type | +|-------------------|-------------| +| sint64 | sint64 | +| double | double | +| timestamp | timestamp | -### MODE -Calculate the mode (i.e., the value occurring with the highest frequency) of observations in the selection, with nulls discarded. If there are more than one modes, the lowest one is returned. The column argument must be of numeric type (`sint64`, `double` or `timestamp`). Return type is the type of the column argument. +{{% note title="Inverse distribution functions" %}} +1. Inverse distribution functions cannot be used in conjunction with `ORDER BY` or `GROUP BY` clauses, or with any other column specifiers. -{{% note title="Notes on inverse distribution functions" %}} -1. These functions cannot be used in conjunction with `ORDER BY` or `GROUP BY` clauses, or with any other column specifiers. +2. Multiple inverse distribution function calls are permitted as long as they all have the same column argument. -2. Multiple inverse distrinution function calls are permitted as long as they all have the same column argument. +3. Inverse distribution functions use query buffers. Queries with large selection size may incur increased latency depending on the value of `riak_kv.query.timeseries.qbuf_inmem_max` in your [riak.conf] and the I/O throughput of storage backing up query buffers (`riak_kv.query.timeseries.qbuf_root_path`). -3. Inverse distrinution functions use query buffers. Queries with large selection size may incur increased latency depending on the value of `riak_kv.query.timeseries.qbuf_inmem_max` in your riak.conf and the I/O throughput of storage backing up query buffers (`riak_kv.query.timeseries.qbuf_root_path`). +`PERCENTILE_DISC`, `PERCENTILE_CONT`, `MEDIAN` and `MODE` are all inverse distribution functions. 
{{% /note %}} From 9a5bd3328a5cb1423c95f2649c72ae60b434c9de Mon Sep 17 00:00:00 2001 From: Lauren Date: Tue, 14 Mar 2017 13:33:52 -0400 Subject: [PATCH 07/15] Edits --- content/riak/ts/1.6.0/using/querying/select/in.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/riak/ts/1.6.0/using/querying/select/in.md b/content/riak/ts/1.6.0/using/querying/select/in.md index c73524ba95..69244415a8 100644 --- a/content/riak/ts/1.6.0/using/querying/select/in.md +++ b/content/riak/ts/1.6.0/using/querying/select/in.md @@ -18,18 +18,18 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using/querying/select/in/ [select]: /riak/ts/1.6.0/using/querying/select/ [query guidelines]: /riak/ts/1.6.0/using/querying/guidelines/ -The IN keyword is used with [`SELECT`][select] to return results where a specified column matches one or more given values. +The IN clause is used with [`SELECT`][select] to return results where a specified column matches one or more given values. This document shows how to run various queries using `IN`. See the [guidelines][query guidelines] for more information on limitations and rules for queries in Riak TS. ## Overview -The IN keyword returns results from a SELECT statement which match one or more literal values. +The IN clause returns results from a SELECT statement that match one or more literal values. `IN` has the following syntax: ```sql -SELECT * FROM «table_name» WHERE «column_name» IN («values») +SELECT * FROM »table_name« WHERE »column_name« IN (»values«) ``` {{% note title="WARNING" %}} From 1c7c86a9df03fe5630d71d89b86b1c76b1f1119a Mon Sep 17 00:00:00 2001 From: Lauren Rother Date: Tue, 14 Mar 2017 15:55:41 -0400 Subject: [PATCH 08/15] Update language Edits to language and some formatting. 
--- .../riak/ts/1.6.0/using/querying/delete.md | 2 +- .../riak/ts/1.6.0/using/querying/explain.md | 8 ++++--- .../ts/1.6.0/using/querying/guidelines.md | 3 +-- .../riak/ts/1.6.0/using/querying/reference.md | 6 ++--- .../riak/ts/1.6.0/using/querying/select.md | 22 ++++++++++--------- .../1.6.0/using/querying/select/group-by.md | 12 +++++----- .../ts/1.6.0/using/querying/select/limit.md | 6 ++--- .../1.6.0/using/querying/select/order-by.md | 8 +++---- 8 files changed, 35 insertions(+), 32 deletions(-) diff --git a/content/riak/ts/1.6.0/using/querying/delete.md b/content/riak/ts/1.6.0/using/querying/delete.md index c6e4a0bf18..ba69071086 100644 --- a/content/riak/ts/1.6.0/using/querying/delete.md +++ b/content/riak/ts/1.6.0/using/querying/delete.md @@ -38,7 +38,7 @@ The DELETE statement removes whole records matching a WHERE clause and a given t DELETE FROM «table_name» WHERE column1 = value1 [AND column2 = value ...] AND { time = t | time op t1 AND time op t2 }, where op = { >, <, >=, <= } ``` -The WHERE clause in `DELETE` should include all columns comprising `PRIMARY KEY` in the table definition. +The WHERE clause in `DELETE` should include all columns comprising the primary key in the table definition. Timestamp values can be provided as milliseconds or in [supported ISO 8601 formats][time rep]. `DELETE` acts as a single-key delete, like the [HTTP DELETE command][http delete]. diff --git a/content/riak/ts/1.6.0/using/querying/explain.md b/content/riak/ts/1.6.0/using/querying/explain.md index 8447b20a10..8e3fe52792 100644 --- a/content/riak/ts/1.6.0/using/querying/explain.md +++ b/content/riak/ts/1.6.0/using/querying/explain.md @@ -20,7 +20,8 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using/querying/explain" [planning]: /riak/ts/1.6.0/using/planning [riak shell]: /riak/ts/1.6.0/using/riakshell -You can use the EXPLAIN statement to better understand how a query you would like to run will be executed. 
This document will show you how to use `EXPLAIN` in Riak TS. +You can use an EXPLAIN statement to better understand how a query you would like to run will be executed. This document will show you how to use `EXPLAIN` in Riak TS. + ## EXPLAIN Guidelines @@ -33,10 +34,11 @@ The details about each subquery include: * A flag indicating if that flag is inclusive (less than or greater than or equal), * The range scan end key, * Another inclusive flag for the end key, and -* The value of the filter, i.e. constrained columns which are not a part of the partition key +* The value of the filter, i.e. constrained columns which are not a part of the partition key. To use `EXPLAIN`, the table in question must first be [defined][planning] and [activated][creating-activating]. Only metadata about the desired query is returned. + ## EXPLAIN Example Assuming our standard example table, GeoCheckin, has been created: @@ -56,7 +58,7 @@ CREATE TABLE GeoCheckin ) ``` -We can run `EXPLAIN` in riak shell as follows: +We can run `EXPLAIN` in [riak shell] as follows: ``` riak-shell>EXPLAIN SELECT * FROM GeoCheckin WHERE myfamily = 'family1' AND myseries = 'series1' AND time >= 2 AND time <= 7000000 AND weather='fair'; diff --git a/content/riak/ts/1.6.0/using/querying/guidelines.md b/content/riak/ts/1.6.0/using/querying/guidelines.md index e9cc468530..0736bfc247 100644 --- a/content/riak/ts/1.6.0/using/querying/guidelines.md +++ b/content/riak/ts/1.6.0/using/querying/guidelines.md @@ -66,7 +66,6 @@ It is possible to use ISO 8601-compliant date/time strings rather than integer t ### Local Key - Any field in the local key but not in the partition key can be queried with any operator supported for that field's type. Bounded ranges are not required. 
Any filter is allowed, including `OR` and `!=` ``` @@ -145,7 +144,7 @@ The following operators are supported for each data type: Blob data should be queried using integers in base 16 (hex) notation, preceded by `0x` using riak shell or by providing any block of data (e.g. binary, text or JSON) through a Riak client library. -However, we do not recommend using blob columns in primary keys yet, due to limitations in the Riak TS 1.5 `list_keys` API. +However, we do not recommend using blob columns in primary keys yet, due to limitations in the `list_keys` API. ### Query parameters diff --git a/content/riak/ts/1.6.0/using/querying/reference.md b/content/riak/ts/1.6.0/using/querying/reference.md index ea16fbd442..6de19ab1b4 100644 --- a/content/riak/ts/1.6.0/using/querying/reference.md +++ b/content/riak/ts/1.6.0/using/querying/reference.md @@ -30,7 +30,7 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using/querying/reference" [arithmetic]: /riak/ts/1.6.0/using/querying/select/arithmetic-operations/ [aggregate]: /riak/ts/1.6.0/using/querying/select/aggregate-functions/ -This document lists each SQL statement available in Riak TS. +This document lists the SQL implementations available in Riak TS. ## DESCRIBE @@ -126,7 +126,7 @@ See the [Creating and Activating Tables][create table] page for more information ## GROUP BY -The GROUP BY statement is used with `SELECT` to pick out and condense rows sharing the same value, then return a single row. +The GROUP BY clause is used with `SELECT` to pick out and condense rows sharing the same value, then return a single row. `GROUP BY` has the following syntax: @@ -180,7 +180,7 @@ See the [LIMIT in Riak TS][limit] page for more information and usage examples. ## OFFSET -The OFFSET clause is used with `SELECT` to skip a specified number of results then return remaining results. +The OFFSET modifier is used with `SELECT` to skip a specified number of results then return remaining results. 
`OFFSET` has the following syntax: diff --git a/content/riak/ts/1.6.0/using/querying/select.md b/content/riak/ts/1.6.0/using/querying/select.md index 905ea69cba..f4fea9006b 100644 --- a/content/riak/ts/1.6.0/using/querying/select.md +++ b/content/riak/ts/1.6.0/using/querying/select.md @@ -20,10 +20,13 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using/querying/select" [arithmetic operations]: arithmetic-operations/ [GROUP BY]: group-by/ [guidelines]: /riak/ts/1.6.0/using/querying/guidelines +[IN]: in/ [iso8601]: ../../timerepresentations/ [iso8601 accuracy]: /riak/ts/1.6.0/using/timerepresentations/#reduced-accuracy [ISO 8601]: https://en.wikipedia.org/wiki/ISO_8601 [learn timestamps accuracy]: /riak/ts/1.6.0/learn-about/timestamps/#reduced-accuracy +[LIMIT]: limit/ +[ORDER BY]: order-by/ You can use the SELECT statement in Riak TS to query your TS dataset. This document will show you how to run various queries using `SELECT`. @@ -31,6 +34,9 @@ You can use the SELECT statement in Riak TS to query your TS dataset. This docum * See [aggregate functions] to learn how turn a set of rows in your Riak TS table into a value. * See [arithmetic operations] to see a list of operations available with `SELECT`. * See [GROUP BY] to learn how to condense rows sharing the same value. +* See [ORDER BY] to learn how to sort your results by one or more columns. +* See [LIMIT] to learn how to limit your results. +* See [IN] to learn how to return results where a column matches one or more given values. For all of the examples on this page, we are using our standard example GeoCheckin table: @@ -59,7 +65,7 @@ When querying with user-supplied data, it is essential that you protect against ## Querying Columns -All queries using `SELECT` must include a 'where' clause with all components. +All queries using `SELECT` must include a WHERE clause with all components. 
You can select particular columns from the data to query: @@ -281,8 +287,7 @@ See [our documentation on ISO 8601 support][iso8601] for more details on how to ## IS [NOT] NULL -For queries which operate on ranges which contain columns that are nullable, the -WHERE clause may contain NULL condition predicates. +For queries which operate on ranges which contain columns that are nullable, the WHERE clause may contain NULL condition predicates. ```sql _expression_ IS NULL @@ -291,8 +296,7 @@ NOT _expression_ IS NULL -- equivalent to: _expression_ IS NOT NULL NOT _expression_ IS NOT NULL -- equivalent to: _expression_ IS NULL ``` -For example, the following query extracts the key column values for all records -within the time range, which do not have a value set for the temperature column: +For example, the following query extracts the key column values for all records within the time range, which do not have a value set for the temperature column: ```sql SELECT region, state, time @@ -302,13 +306,11 @@ WHERE time > 1234560 AND time < 1234569 AND temperature IS NULL ``` -IS NOT NULL is often used to operate on records which are fully populated to the -point that an application requires. Within this context, the above query to -locate records which do not satisfy such a requirement can be utilized to -investigate and subsequently reconcile records which are partially populated. +`IS NOT NULL` is often used to operate on records that are fully populated to the point that an application requires. Within this context, the above query to locate records that do not satisfy such a requirement can be utilized to +investigate and subsequently reconcile records that are partially populated. Another common use of NULL condition predicates is determining the relative -cardinality of the subset not containing a value for the _column_. +cardinality of the subset not containing a value for the column. 
## Client NULL Results diff --git a/content/riak/ts/1.6.0/using/querying/select/group-by.md b/content/riak/ts/1.6.0/using/querying/select/group-by.md index 4598a2e8c3..c1db1e2408 100644 --- a/content/riak/ts/1.6.0/using/querying/select/group-by.md +++ b/content/riak/ts/1.6.0/using/querying/select/group-by.md @@ -18,18 +18,18 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using/querying/select/gro [aggregate function]: ../aggregate-functions [guidelines]: /riak/ts/1.6.0/using/querying/guidelines -The GROUP BY statement is used with `SELECT` to pick out and condense rows sharing the same value and return a single row. `GROUP BY` is useful for aggregating an attribute of a device over a time period; for instance, you could use it to pull average values for every 30 minute period over the last 24 hours. +The GROUP BY clause is used with `SELECT` to pick out and condense rows sharing the same value and return a single row. `GROUP BY` is useful for aggregating an attribute of a device over a time period; for instance, you could use it to pull average values for every 30 minute period over the last 24 hours. This document will show you how to run various queries using `GROUP BY`. See the [guidelines] for more information on limitations and rules for queries in TS. ## GROUP BY Basics -`GROUP BY` returns a single row for each unique combination of values for columns specified in the GROUP BY statement. There is no guaranteed order for the returned rows. +`GROUP BY` returns a single row for each unique combination of values for columns specified in the GROUP BY clause. There is no guaranteed order for the returned rows. -The SELECT statement must contain only the columns specified in `GROUP BY`. Columns not used as groups can appear as function parameters. The GROUP BY statement works on all rows, not just the values in the partition key, so all columns are available. +The SELECT statement must contain only the columns specified in `GROUP BY`. 
Columns not used as groups can appear as function parameters. The GROUP BY clause works on all rows, not just the values in the partition key, so all columns are available. -The [aggregate function] may be used with the GROUP BY statement. If used, `SELECT` may contain the columns specified in either `GROUP BY` or the [aggregate function]. +[Aggregate functions][aggregate function] may be used with the GROUP BY clause. If used, `SELECT` may contain the columns specified in either `GROUP BY` or the [aggregate function]. {{% note title="WARNING" %}} Before you run `GROUP BY` you must ensure the node issuing the query has adequate memory to receive the response. If the returning rows do not fit into the memory of the requesting node, the node is likely to fail. @@ -118,7 +118,7 @@ GROUP BY project; ### GROUP BY columns not in the partition key -Since the GROUP BY statement works on all rows, all columns are available, which means that using `GROUP BY` on the partition key values is not very useful because the partition key limits the result set. +Since the GROUP BY clause works on all rows, all columns are available, which means that using `GROUP BY` on the partition key values is not very useful because the partition key limits the result set. 
If we create the following table: @@ -129,7 +129,7 @@ visits SINT64, a_time TIMESTAMP NOT NULL, PRIMARY KEY(userid, a_time)); ``` -And and try to run the GROUP BY statement including `userid` in the SELECT statement: +And try to run the GROUP BY clause including `userid` in the SELECT statement: ```sql SELECT userid, SUM(visits) diff --git a/content/riak/ts/1.6.0/using/querying/select/limit.md b/content/riak/ts/1.6.0/using/querying/select/limit.md index 3a3cc6ccfa..606ecdfd19 100644 --- a/content/riak/ts/1.6.0/using/querying/select/limit.md +++ b/content/riak/ts/1.6.0/using/querying/select/limit.md @@ -19,14 +19,14 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using/querying/select/lim [query guidelines]: /riak/ts/1.6.0/using/querying/guidelines/ [configuring]: /riak/ts/1.6.0/configuring/riakconf/#maximum-returned-data-size -The LIMIT statement is used with [`SELECT`][select] to return a limited number of results. +The LIMIT clause is used with [`SELECT`][select] to return a limited number of results. This document shows how to run various queries using `LIMIT`. See the [guidelines][query guidelines] for more information on limitations and rules for queries in Riak TS. {{% note title="A Note on Latency" %}} `LIMIT` uses on-disk query buffer to prevent overload, which adds some overhead and increases the query latency. -You may adjust various parameters in [riak.conf](/riak/ts/1.6.0/configuring/riakconf/) depending on how much memory your riak nodes will have, including `max_running_fsms`, `max_quanta_span`, `max_concurrent_queries`. It is also worth noting that `max_returned_data_size` is calculated differently for LIMIT statements; you can read more about that [here](/riak/ts/1.6.0/configuring/riakconf/#maximum-returned-data-size). All of these settings impact the maximum size of data you can retrieve at one time, and it is important to understand your environmental limitations or you run the risk of an out-of-memory condition. 
+You may adjust various parameters in [riak.conf](/riak/ts/1.6.0/configuring/riakconf/) depending on how much memory your Riak nodes will have, including `max_running_fsms`, `max_quanta_span`, `max_concurrent_queries`. It is also worth noting that `max_returned_data_size` is calculated differently for LIMIT clauses; you can read more about that [here](/riak/ts/1.6.0/configuring/riakconf/#maximum-returned-data-size). All of these settings impact the maximum size of data you can retrieve at one time, and it is important to understand your environmental limitations or you run the risk of an out-of-memory condition. However, the most effective means of speeding up your `LIMIT` queries is to place the query buffer directory (`timeseries_query_buffers_root_path`) on fast storage or in memory-backed /tmp directory. {{% /note %}} @@ -34,7 +34,7 @@ However, the most effective means of speeding up your `LIMIT` queries is to plac ## Overview -The LIMIT statement returns a limited number of results from a SELECT statement. +The LIMIT clause returns a limited number of results from a SELECT statement. `LIMIT` has the following syntax: diff --git a/content/riak/ts/1.6.0/using/querying/select/order-by.md b/content/riak/ts/1.6.0/using/querying/select/order-by.md index 62e45f5205..be6d6a6688 100644 --- a/content/riak/ts/1.6.0/using/querying/select/order-by.md +++ b/content/riak/ts/1.6.0/using/querying/select/order-by.md @@ -19,14 +19,14 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using/querying/select/ord [query guidelines]: /riak/ts/1.6.0/using/querying/guidelines/ [configuring]: /riak/ts/1.6.0/configuring/riakconf/#maximum-returned-data-size -The ORDER BY statement is used with [`SELECT`][select] to sort results by one or more columns in ascending or descending order. `ORDER BY` is useful for operations such as returning the most recent results in a set. 
+The ORDER BY clause is used with [`SELECT`][select] to sort results by one or more columns in ascending or descending order. `ORDER BY` is useful for operations such as returning the most recent results in a set. This document shows how to run various queries using `ORDER BY`. See the [guidelines][query guidelines] for more information on limitations and rules for queries in Riak TS. {{% note title="A Note on Latency" %}} `ORDER BY` uses on-disk query buffer to prevent overload, which adds some overhead and increases the query latency. -You may adjust various parameters in [riak.conf](/riak/ts/1.6.0/configuring/riakconf/) depending on how much memory your riak nodes will have, including `max_running_fsms`, `max_quanta_span`, `max_concurrent_queries`. It is also worth noting that `max_returned_data_size` is calculated differently for ORDER BY statements; you can read more about that [here](/riak/ts/1.6.0/configuring/riakconf/#maximum-returned-data-size). All of these settings impact the maximum size of data you can retrieve at one time, and it is important to understand your environmental limitations or you run the risk of an out-of-memory condition. +You may adjust various parameters in [riak.conf](/riak/ts/1.6.0/configuring/riakconf/) depending on how much memory your Riak nodes will have, including `max_running_fsms`, `max_quanta_span`, `max_concurrent_queries`. It is also worth noting that `max_returned_data_size` is calculated differently for ORDER BY clauses; you can read more about that [here](/riak/ts/1.6.0/configuring/riakconf/#maximum-returned-data-size). All of these settings impact the maximum size of data you can retrieve at one time, and it is important to understand your environmental limitations or you run the risk of an out-of-memory condition. However, the most effective means of speeding up your `ORDER BY` queries is to place the query buffer directory (`timeseries_query_buffers_root_path`) on fast storage or in memory-backed /tmp directory. 
{{% /note %}} @@ -34,7 +34,7 @@ However, the most effective means of speeding up your `ORDER BY` queries is to p ## Overview -The ORDER BY statement sorts results according to the specified column(s) and any optional keywords or clauses used. +The ORDER BY clause sorts results according to the specified column(s) and any optional modifiers or functions used. `ORDER BY` has the following syntax: @@ -46,7 +46,7 @@ During an `ORDER BY` sort if two rows are equal according to the leftmost column ### Options -The following keywords can be appended to `ORDER BY` to further sort results: +The following modifiers can be appended to `ORDER BY` to further sort results: #### `ASC` From 0c4b9ffcf146e0add85af24c48670196df30e7fd Mon Sep 17 00:00:00 2001 From: Eric Date: Tue, 14 Mar 2017 13:42:20 -0700 Subject: [PATCH 09/15] Add table management section --- .../ts/1.0.0/using/creating-activating.md | 5 ++ .../ts/1.1.0/using/creating-activating.md | 5 ++ .../ts/1.2.0/using/creating-activating.md | 5 ++ .../ts/1.3.0/using/creating-activating.md | 9 ++- .../ts/1.3.1/using/creating-activating.md | 9 ++- .../ts/1.4.0/using/creating-activating.md | 7 +- .../1.4.0/using/global-object-expiration.md | 7 +- .../configuring/global-object-expiration.md | 7 +- .../ts/1.5.0/using/creating-activating.md | 11 ++- .../configuring/global-object-expiration.md | 7 +- .../ts/1.5.1/using/creating-activating.md | 11 ++- content/riak/ts/1.6.0/configuring.md | 5 +- content/riak/ts/1.6.0/table-management.md | 33 ++++++++ .../creating-activating.md | 40 +++++----- .../global-object-expiration.md | 21 +++-- .../per-table-object-expiration.md | 76 +++++++++++++++++++ content/riak/ts/1.6.0/using.md | 4 +- content/riak/ts/1.6.0/using/deleting-data.md | 2 +- content/riak/ts/1.6.0/using/planning.md | 18 ++--- content/riak/ts/1.6.0/using/querying.md | 10 +-- .../riak/ts/1.6.0/using/querying/explain.md | 4 +- .../ts/1.6.0/using/querying/guidelines.md | 6 +- content/riak/ts/1.6.0/using/riakshell.md | 6 +- 
.../ts/1.6.0/using/timerepresentations.md | 4 +- content/riak/ts/1.6.0/using/writingdata.md | 8 +- 25 files changed, 238 insertions(+), 82 deletions(-) create mode 100644 content/riak/ts/1.6.0/table-management.md rename content/riak/ts/1.6.0/{using => table-management}/creating-activating.md (92%) rename content/riak/ts/1.6.0/{configuring => table-management}/global-object-expiration.md (79%) create mode 100644 content/riak/ts/1.6.0/table-management/per-table-object-expiration.md diff --git a/content/riak/ts/1.0.0/using/creating-activating.md b/content/riak/ts/1.0.0/using/creating-activating.md index 14aa9aa671..543f2aee53 100644 --- a/content/riak/ts/1.0.0/using/creating-activating.md +++ b/content/riak/ts/1.0.0/using/creating-activating.md @@ -10,6 +10,11 @@ menu: project: "riak_ts" project_version: "1.0.0" toc: true +version_history: + in: "1.0.0+" + locations: + - [">=1.6.0", "table-management/creating-activating"] + - ["<=1.5.1", "using/creating-activating"] aliases: - /riakts/1.0.0/using/creating-activating/ --- diff --git a/content/riak/ts/1.1.0/using/creating-activating.md b/content/riak/ts/1.1.0/using/creating-activating.md index e0609f44fc..f4b4102d48 100644 --- a/content/riak/ts/1.1.0/using/creating-activating.md +++ b/content/riak/ts/1.1.0/using/creating-activating.md @@ -10,6 +10,11 @@ menu: project: "riak_ts" project_version: "1.1.0" toc: true +version_history: + in: "1.0.0+" + locations: + - [">=1.6.0", "table-management/creating-activating"] + - ["<=1.5.1", "using/creating-activating"] aliases: - /riakts/1.1.0/using/creating-activating/ --- diff --git a/content/riak/ts/1.2.0/using/creating-activating.md b/content/riak/ts/1.2.0/using/creating-activating.md index 17cf034834..21c5b20caf 100644 --- a/content/riak/ts/1.2.0/using/creating-activating.md +++ b/content/riak/ts/1.2.0/using/creating-activating.md @@ -10,6 +10,11 @@ menu: project: "riak_ts" project_version: "1.2.0" toc: true +version_history: + in: "1.0.0+" + locations: + - [">=1.6.0", 
"table-management/creating-activating"] + - ["<=1.5.1", "using/creating-activating"] aliases: - /riakts/1.2.0/using/creating-activating/ --- diff --git a/content/riak/ts/1.3.0/using/creating-activating.md b/content/riak/ts/1.3.0/using/creating-activating.md index 38f23be484..96f616d0f6 100644 --- a/content/riak/ts/1.3.0/using/creating-activating.md +++ b/content/riak/ts/1.3.0/using/creating-activating.md @@ -10,6 +10,11 @@ menu: project: "riak_ts" project_version: "1.3.0" toc: true +version_history: + in: "1.0.0+" + locations: + - [">=1.6.0", "table-management/creating-activating"] + - ["<=1.5.1", "using/creating-activating"] aliases: - /riakts/1.3.0/using/creating-activating/ --- @@ -28,7 +33,7 @@ aliases: Once you have [planned out your table][planning] you can create it by: -* Executing a `CREATE TABLE` statement using any Riak client, +* Executing a `CREATE TABLE` statement using any Riak client, * Using riak shell, or * Running the `riak-admin` command (as root, using `su` or `sudo`). @@ -181,4 +186,4 @@ ddl: {ddl_v1,<<"GeoCheckin">>, ## Next Steps -Now that you've created and activated your Riak TS table, you can [write data][writing] to it. \ No newline at end of file +Now that you've created and activated your Riak TS table, you can [write data][writing] to it. 
diff --git a/content/riak/ts/1.3.1/using/creating-activating.md b/content/riak/ts/1.3.1/using/creating-activating.md index 69bc8fe85a..05fe6b70b7 100644 --- a/content/riak/ts/1.3.1/using/creating-activating.md +++ b/content/riak/ts/1.3.1/using/creating-activating.md @@ -10,6 +10,11 @@ menu: project: "riak_ts" project_version: "1.3.1" toc: true +version_history: + in: "1.0.0+" + locations: + - [">=1.6.0", "table-management/creating-activating"] + - ["<=1.5.1", "using/creating-activating"] aliases: - /riakts/1.3.1/using/creating-activating/ --- @@ -28,7 +33,7 @@ aliases: Once you have [planned out your table][planning] you can create it by: -* Executing a `CREATE TABLE` statement using any Riak client, +* Executing a `CREATE TABLE` statement using any Riak client, * Using riak shell, or * Running the `riak-admin` command (as root, using `su` or `sudo`). @@ -181,4 +186,4 @@ ddl: {ddl_v1,<<"GeoCheckin">>, ## Next Steps -Now that you've created and activated your Riak TS table, you can [write data][writing] to it. \ No newline at end of file +Now that you've created and activated your Riak TS table, you can [write data][writing] to it. 
diff --git a/content/riak/ts/1.4.0/using/creating-activating.md b/content/riak/ts/1.4.0/using/creating-activating.md index 337c1495a1..5c3b29c36e 100644 --- a/content/riak/ts/1.4.0/using/creating-activating.md +++ b/content/riak/ts/1.4.0/using/creating-activating.md @@ -10,6 +10,11 @@ menu: project: "riak_ts" project_version: "1.4.0" toc: true +version_history: + in: "1.0.0+" + locations: + - [">=1.6.0", "table-management/creating-activating"] + - ["<=1.5.1", "using/creating-activating"] aliases: - /riakts/1.4.0/using/creating-activating/ canonical_link: "https://docs.basho.com/riak/ts/latest/using/creating-activating" @@ -30,7 +35,7 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using/creating-activating Once you have [planned out your table][planning] you can create it by: -* Executing a CREATE TABLE statement using any Riak TS client, +* Executing a CREATE TABLE statement using any Riak TS client, * Using riak shell, or * Running the `riak-admin` command (as root, using `su` or `sudo`). diff --git a/content/riak/ts/1.4.0/using/global-object-expiration.md b/content/riak/ts/1.4.0/using/global-object-expiration.md index 1fdb080423..b90231efdc 100644 --- a/content/riak/ts/1.4.0/using/global-object-expiration.md +++ b/content/riak/ts/1.4.0/using/global-object-expiration.md @@ -13,7 +13,8 @@ toc: true version_history: in: "1.4.0+" locations: - - [">=1.5.0", "configuring/global-object-expiration"] + - [">=1.6.0", "table-management/global-object-expiration"] + - ["<=1.5.1", "configuring/global-object-expiration"] - ["<=1.4.0", "using/global-object-expiration"] aliases: - /riakts/1.4.0/using/global-object-expiration/ @@ -22,7 +23,7 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using/global-object-expir [ttl]: https://en.wikipedia.org/wiki/Time_to_live -By default, LevelDB keeps all of your data. But Riak TS allows you to configure global object expiration (`expiry`) or [time to live (TTL)][ttl] for your data. 
+By default, LevelDB keeps all of your data. But Riak TS allows you to configure global object expiration (`expiry`) or [time to live (TTL)][ttl] for your data. {{% note %}} Currently only global expiration is supported in Riak TS. @@ -76,7 +77,7 @@ Global expiration supports two modes: - `whole_file` - the whole sorted string table (`.sst`) file is deleted when all of its objects are expired. - `normal` - individual objects are removed as part of the usual compaction process. -We recommend using `whole_file` with time series data that has a similar lifespan, as it will be much more efficient. +We recommend using `whole_file` with time series data that has a similar lifespan, as it will be much more efficient. The following example configure objects to expire after 1 day: diff --git a/content/riak/ts/1.5.0/configuring/global-object-expiration.md b/content/riak/ts/1.5.0/configuring/global-object-expiration.md index 4ce271523e..f1e41b306a 100644 --- a/content/riak/ts/1.5.0/configuring/global-object-expiration.md +++ b/content/riak/ts/1.5.0/configuring/global-object-expiration.md @@ -13,7 +13,8 @@ toc: true version_history: in: "1.4.0+" locations: - - [">=1.5.0", "configuring/global-object-expiration"] + - [">=1.6.0", "table-management/global-object-expiration"] + - ["<=1.5.1", "configuring/global-object-expiration"] - ["<=1.4.0", "using/global-object-expiration"] aliases: - /riakts/1.5.0/configuring/global-object-expiration/ @@ -22,7 +23,7 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/configuring/global-object [ttl]: https://en.wikipedia.org/wiki/Time_to_live -By default, LevelDB keeps all of your data. But Riak TS allows you to configure global object expiration (`expiry`) or [time to live (TTL)][ttl] for your data. +By default, LevelDB keeps all of your data. But Riak TS allows you to configure global object expiration (`expiry`) or [time to live (TTL)][ttl] for your data. {{% note %}} Currently only global expiration is supported in Riak TS. 
@@ -76,7 +77,7 @@ Global expiration supports two modes: - `whole_file` - the whole sorted string table (`.sst`) file is deleted when all of its objects are expired. - `normal` - individual objects are removed as part of the usual compaction process. -We recommend using `whole_file` with time series data that has a similar lifespan, as it will be much more efficient. +We recommend using `whole_file` with time series data that has a similar lifespan, as it will be much more efficient. The following example configure objects to expire after 1 day: diff --git a/content/riak/ts/1.5.0/using/creating-activating.md b/content/riak/ts/1.5.0/using/creating-activating.md index 1a8d6132da..cdb6533059 100644 --- a/content/riak/ts/1.5.0/using/creating-activating.md +++ b/content/riak/ts/1.5.0/using/creating-activating.md @@ -10,6 +10,11 @@ menu: project: "riak_ts" project_version: "1.5.0" toc: true +version_history: + in: "1.0.0+" + locations: + - [">=1.6.0", "table-management/creating-activating"] + - ["<=1.5.1", "using/creating-activating"] aliases: - /riakts/1.5.0/using/creating-activating/ canonical_link: "https://docs.basho.com/riak/ts/latest/using/creating-activating" @@ -31,7 +36,7 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using/creating-activating Once you have [planned out your table][planning] you can create it by: -* Executing a CREATE TABLE statement using any Riak TS client, +* Executing a CREATE TABLE statement using any Riak TS client, * Using `riak-shell`, or * Running the `riak-admin` command (as root, using `su` or `sudo`). @@ -56,7 +61,7 @@ CREATE TABLE GeoCheckin ## `CREATE TABLE` in Client Library -Using one of the Riak TS client libraries, execute the CREATE TABLE statement via that library's query functionality. This will create and activate the table in one step. +Using one of the Riak TS client libraries, execute the CREATE TABLE statement via that library's query functionality. This will create and activate the table in one step. 
```csharp string tableName = "GeoCheckin"; @@ -234,7 +239,7 @@ CREATE TABLE (...) WITH ( custom_prop = 42.24) ``` -Any property with any string or numeric value can be associated with a table, including but not limited to standard [Riak bucket properties]. +Any property with any string or numeric value can be associated with a table, including but not limited to standard [Riak bucket properties]. Please note the following when using `WITH`: diff --git a/content/riak/ts/1.5.1/configuring/global-object-expiration.md b/content/riak/ts/1.5.1/configuring/global-object-expiration.md index 3391a66081..dba3b816c3 100644 --- a/content/riak/ts/1.5.1/configuring/global-object-expiration.md +++ b/content/riak/ts/1.5.1/configuring/global-object-expiration.md @@ -13,7 +13,8 @@ toc: true version_history: in: "1.4.0+" locations: - - [">=1.5.1", "configuring/global-object-expiration"] + - [">=1.6.0", "table-management/global-object-expiration"] + - ["<=1.5.1", "configuring/global-object-expiration"] - ["<=1.4.0", "using/global-object-expiration"] aliases: - /riakts/1.5.1/configuring/global-object-expiration/ @@ -22,7 +23,7 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/configuring/global-object [ttl]: https://en.wikipedia.org/wiki/Time_to_live -By default, LevelDB keeps all of your data. But Riak TS allows you to configure global object expiration (`expiry`) or [time to live (TTL)][ttl] for your data. +By default, LevelDB keeps all of your data. But Riak TS allows you to configure global object expiration (`expiry`) or [time to live (TTL)][ttl] for your data. {{% note %}} Currently only global expiration is supported in Riak TS. @@ -76,7 +77,7 @@ Global expiration supports two modes: - `whole_file` - the whole sorted string table (`.sst`) file is deleted when all of its objects are expired. - `normal` - individual objects are removed as part of the usual compaction process. 
-We recommend using `whole_file` with time series data that has a similar lifespan, as it will be much more efficient. +We recommend using `whole_file` with time series data that has a similar lifespan, as it will be much more efficient. The following example configure objects to expire after 1 day: diff --git a/content/riak/ts/1.5.1/using/creating-activating.md b/content/riak/ts/1.5.1/using/creating-activating.md index fc34c79bca..a1c998d383 100644 --- a/content/riak/ts/1.5.1/using/creating-activating.md +++ b/content/riak/ts/1.5.1/using/creating-activating.md @@ -10,6 +10,11 @@ menu: project: "riak_ts" project_version: "1.5.1" toc: true +version_history: + in: "1.0.0+" + locations: + - [">=1.6.0", "table-management/creating-activating"] + - ["<=1.5.1", "using/creating-activating"] aliases: - /riakts/1.5.1/using/creating-activating/ canonical_link: "https://docs.basho.com/riak/ts/latest/using/creating-activating" @@ -31,7 +36,7 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using/creating-activating Once you have [planned out your table][planning] you can create it by: -* Executing a CREATE TABLE statement using any Riak TS client, +* Executing a CREATE TABLE statement using any Riak TS client, * Using `riak-shell`, or * Running the `riak-admin` command (as root, using `su` or `sudo`). @@ -56,7 +61,7 @@ CREATE TABLE GeoCheckin ## `CREATE TABLE` in Client Library -Using one of the Riak TS client libraries, execute the CREATE TABLE statement via that library's query functionality. This will create and activate the table in one step. +Using one of the Riak TS client libraries, execute the CREATE TABLE statement via that library's query functionality. This will create and activate the table in one step. ```csharp string tableName = "GeoCheckin"; @@ -234,7 +239,7 @@ CREATE TABLE (...) WITH ( custom_prop = 42.24) ``` -Any property with any string or numeric value can be associated with a table, including but not limited to standard [Riak bucket properties]. 
+Any property with any string or numeric value can be associated with a table, including but not limited to standard [Riak bucket properties]. Please note the following when using `WITH`: diff --git a/content/riak/ts/1.6.0/configuring.md b/content/riak/ts/1.6.0/configuring.md index d8714ed021..4b406d9ac7 100644 --- a/content/riak/ts/1.6.0/configuring.md +++ b/content/riak/ts/1.6.0/configuring.md @@ -22,9 +22,9 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/configuring" [riakconf]: /riak/ts/1.6.0/configuring/riakconf/ [mdc]: /riak/ts/1.6.0/configuring/mdc/ -[global expiry]: /riak/ts/1.6.0/configuring/global-object-expiration/ +[global expiry]: /riak/ts/1.6.0/table-management/global-object-expiration/ [kv config]: /riak/kv/2.2.0/configuring/reference -[WITH]: /riak/ts/1.6.0/using/creating-activating/#using-the-with-clause +[WITH]: /riak/ts/1.6.0/table-management/creating-activating/#using-the-with-clause Riak TS mostly relies on Riak KV's [default configuration settings][kv config]. However, there are a few TS-specific configurations you should know about: @@ -32,4 +32,3 @@ Riak TS mostly relies on Riak KV's [default configuration settings][kv config]. * If you are using Riak TS Enterprise Edition, you can learn more about configuring multi-datacenter replication (MDC) [here][mdc]. * You have the option of deleting data via global object expiration. For more information on how to configure global expiry, go [here][global expiry]. * The default `n_val` (the number of distinct copies of each record kept in your cluster for safety and availability) is 3. You can only change this default per-table when you create a table using `WITH`. Read more about that [here][WITH]. 
- diff --git a/content/riak/ts/1.6.0/table-management.md b/content/riak/ts/1.6.0/table-management.md new file mode 100644 index 0000000000..8492525136 --- /dev/null +++ b/content/riak/ts/1.6.0/table-management.md @@ -0,0 +1,33 @@ +--- +title: "Table Management in Riak TS" +description: "Table Management in Riak TS" +menu: + riak_ts-1.6.0: + name: "Table Management" + identifier: "table_management" + weight: 275 + pre: icon-reorder +project: "riak_ts" +project_version: "1.6.0" +aliases: + - /riakts/1.6.0/table-management/ +canonical_link: "https://docs.basho.com/riak/ts/latest/table-management/" +--- + +[plan]: /riak/ts/1.6.0/using/planning/ +[activating]: /riak/ts/1.6.0/table-management/creating-activating/ +[global expiry]: /riak/ts/1.6.0/table-management/global-object-expiration/ +[table expiry]: /riak/ts/1.6.0/table-management/per-table-object-expiration/ +[mdc]: /riak/ts/1.6.0/configuring/mdc/ +[writing]: /riak/ts/1.6.0/using/writingdata/ + +After [planning][plan] your Riak TS table: + +1. [Create and activate][activating] your Riak TS table. (You'll need `sudo` and `su` access for this step.) +2. [Enable global object expiration][global expiry] if you plan on setting up an object retention policy. If you are an Enterprise user, see the [per table object expiration][table expiry] instructions to enable object retention on a per-table basis. +3. Then, if you are an Enterprise user, [set up Multi-Datacenter replication][mdc]. +4. [Write data][writing] to your table. + +Check out [riak shell][riakshell], a handy tool for using TS. + +Then check out how to [query][querying] your data, [customize your Riak TS configuration][configuring], analyze your data with [aggregate functions][aggregate], or apply some [arithmetic operations][arithmetic].
diff --git a/content/riak/ts/1.6.0/using/creating-activating.md b/content/riak/ts/1.6.0/table-management/creating-activating.md similarity index 92% rename from content/riak/ts/1.6.0/using/creating-activating.md rename to content/riak/ts/1.6.0/table-management/creating-activating.md index 44edbd0d73..5faa35f16c 100644 --- a/content/riak/ts/1.6.0/using/creating-activating.md +++ b/content/riak/ts/1.6.0/table-management/creating-activating.md @@ -4,34 +4,40 @@ description: "Creating and Activating Your Riak TS Table" menu: riak_ts-1.6.0: name: "Create Your Table" - identifier: "creating_activating_riakts" - weight: 302 - parent: "using" + identifier: "table_management_activating_riakts" + weight: 100 + parent: "table_management" project: "riak_ts" project_version: "1.6.0" toc: true +version_history: + in: "1.0.0+" + locations: + - [">=1.6.0", "table-management/creating-activating"] + - ["<=1.5.1", "using/creating-activating"] aliases: - /riakts/1.6.0/using/creating-activating/ -canonical_link: "https://docs.basho.com/riak/ts/latest/using/creating-activating" + - /riakts/1.6.0/table-management/creating-activating/ +canonical_link: "https://docs.basho.com/riak/ts/latest/table-management/creating-activating" --- -[csharp]: ../../developing/csharp#query -[describe]: ../querying/describe/ -[erlang]: ../../developing/erlang/#query-2 -[java]: ../../developing/java#query -[nodejs]: ../../developing/nodejs/#query -[php]: ../../developing/php#query -[python]: ../../developing/python#query -[ruby]: ../../developing/ruby#sql-queries -[planning]: ../planning/ -[writing]: ../writingdata/ +[csharp]: /riak/ts/1.6.0/developing/csharp#query +[describe]: /riak/ts/1.6.0/using/querying/describe/ +[erlang]: /riak/ts/1.6.0/developing/erlang/#query-2 +[java]: /riak/ts/1.6.0/developing/java#query +[nodejs]: /riak/ts/1.6.0/developing/nodejs/#query +[php]: /riak/ts/1.6.0/developing/php#query +[python]: /riak/ts/1.6.0/developing/python#query +[ruby]: /riak/ts/1.6.0/developing/ruby#sql-queries 
+[planning]: /riak/ts/1.6.0/using/planning/ +[writing]: /riak/ts/1.6.0/using/writingdata/ [Riak bucket properties]: /riak/kv/2.2.0/configuring/reference/#default-bucket-properties Once you have [planned out your table][planning] you can create it by: -* Executing a CREATE TABLE statement using any Riak TS client, +* Executing a CREATE TABLE statement using any Riak TS client, * Using `riak-shell`, or * Running the `riak-admin` command (as root, using `su` or `sudo`). @@ -56,7 +62,7 @@ CREATE TABLE GeoCheckin ## `CREATE TABLE` in Client Library -Using one of the Riak TS client libraries, execute the CREATE TABLE statement via that library's query functionality. This will create and activate the table in one step. +Using one of the Riak TS client libraries, execute the CREATE TABLE statement via that library's query functionality. This will create and activate the table in one step. ```csharp string tableName = "GeoCheckin"; @@ -234,7 +240,7 @@ CREATE TABLE (...) WITH ( custom_prop = 42.24) ``` -Any property with any string or numeric value can be associated with a table, including but not limited to standard [Riak bucket properties]. +Any property with any string or numeric value can be associated with a table, including but not limited to standard [Riak bucket properties]. Please note the following when using `WITH`: diff --git a/content/riak/ts/1.6.0/configuring/global-object-expiration.md b/content/riak/ts/1.6.0/table-management/global-object-expiration.md similarity index 79% rename from content/riak/ts/1.6.0/configuring/global-object-expiration.md rename to content/riak/ts/1.6.0/table-management/global-object-expiration.md index a807b2b8b4..fda7f97688 100644 --- a/content/riak/ts/1.6.0/configuring/global-object-expiration.md +++ b/content/riak/ts/1.6.0/table-management/global-object-expiration.md @@ -4,29 +4,28 @@ description: "Enabling and configuring global object expiration for Riak TS." 
menu: riak_ts-1.6.0: name: "Global Object Expiration" - identifier: "config_expiry" - weight: 320 - parent: "configure" + identifier: "table_mng_global_expiry" + weight: 200 + parent: "table_management" project: "riak_ts" project_version: "1.6.0" toc: true version_history: in: "1.4.0+" locations: - - [">=1.6.0", "configuring/global-object-expiration"] + - [">=1.6.0", "table-management/global-object-expiration"] + - ["<=1.5.1", "configuring/global-object-expiration"] - ["<=1.4.0", "using/global-object-expiration"] aliases: - /riakts/1.6.0/configuring/global-object-expiration/ -canonical_link: "https://docs.basho.com/riak/ts/latest/configuring/global-object-expiration" + - /riakts/1.6.0/table-management/global-object-expiration/ +canonical_link: "https://docs.basho.com/riak/ts/latest/table-management/global-object-expiration" --- [ttl]: https://en.wikipedia.org/wiki/Time_to_live +[table expiry]: /riak/ts/1.6.0/table-management/per-table-object-expiration -By default, LevelDB keeps all of your data. But Riak TS allows you to configure global object expiration (`expiry`) or [time to live (TTL)][ttl] for your data. - -{{% note %}} -Currently only global expiration is supported in Riak TS. -{{% /note %}} +By default, LevelDB keeps all of your data. But Riak TS allows you to configure object expiration (`expiry`) or [time to live (TTL)][ttl] for your data on a global or [per table basis][table expiry]. Expiration is disabled by default, but enabling it lets you expire older objects to reclaim the space used or purge data with a limited time value. @@ -76,7 +75,7 @@ Global expiration supports two modes: - `whole_file` - the whole sorted string table (`.sst`) file is deleted when all of its objects are expired. - `normal` - individual objects are removed as part of the usual compaction process. -We recommend using `whole_file` with time series data that has a similar lifespan, as it will be much more efficient. 
+We recommend using `whole_file` with time series data that has a similar lifespan, as it will be much more efficient. The following example configure objects to expire after 1 day: diff --git a/content/riak/ts/1.6.0/table-management/per-table-object-expiration.md b/content/riak/ts/1.6.0/table-management/per-table-object-expiration.md new file mode 100644 index 0000000000..e8f6e17acf --- /dev/null +++ b/content/riak/ts/1.6.0/table-management/per-table-object-expiration.md @@ -0,0 +1,76 @@ +--- +title: "Configure Per Table Object Expiration" +description: "Enabling and configuring per-table object expiration for Riak TS." +menu: + riak_ts-1.6.0: + name: "Per Table Object Expiration" + identifier: "table_mng_per_table_expiry" + weight: 300 + parent: "table_management" +project: "riak_ts" +project_version: "1.6.0" +toc: true +commercial_offering: true +version_history: + in: "1.6.0+" + locations: + - [">=1.6.0", "table-management/per-table-object-expiration"] +aliases: + - /riakts/1.6.0/table-management/per-table-object-expiration/ +canonical_link: "https://docs.basho.com/riak/ts/latest/table-management/per-table-object-expiration/" +--- + +[ttl]: https://en.wikipedia.org/wiki/Time_to_live +[global expiry]: /riak/ts/1.6.0/table-management/global-object-expiration/ +[create table]: /riak/ts/1.6.0/table-management/creating-activating/ +[create table with]: /riak/ts/1.6.0/table-management/creating-activating/#using-with + +By default, LevelDB keeps all of your data. But Riak TS allows you to configure object expiration (`expiry`) or [time to live (TTL)][ttl] for your data on a [global][global expiry] or per table basis. + +Expiration is disabled by default, but enabling it lets you expire older objects to reclaim the space used or purge data with a limited time value. 
+ +## Enabling Expiry + +To enable object expiry for global or per table use, add the `leveldb.expiration` setting to your riak.conf file: + +```riak.conf +leveldb.expiration = on +``` + +{{% note %}} +Turning on object expiration will not retroactively expire previous data. Only data created while expiration is on will be scheduled for expiration. +{{% /note %}} + +## Expiry Table Properties + +|Property Name|Values| +|---|---| +|expiration|`enabled` / `disabled`| +|defaut_time_to_live|`unlimited` or a duration string| +|expiration_mode|`use_global_config` / `per_item` / `whole_file`| + +Each table can have one or more of the properties listed above. If any properties are omitted on the table, the property values in riak.conf will be used. If the properties are not set within riak.conf, the default values will be used. + +## Creating Tables With Expiry + +To enable object expiration on a per table basis, [create a table][create table] and specify the expiry properties for objects using the optional [WITH clause][create table with]. 
For example: + +```sql +CREATE TABLE GeoCheckin +( + id SINT64 NOT NULL, + region VARCHAR NOT NULL, + state VARCHAR NOT NULL, + time TIMESTAMP NOT NULL, + weather VARCHAR NOT NULL, + temperature DOUBLE, + PRIMARY KEY ( + (id, QUANTUM(time, 15, 'm')), + id, time + ) +) WITH ( + expiration = enabled + default_time_to_live = 123.4m + expiration_mode = whole_file +) +``` diff --git a/content/riak/ts/1.6.0/using.md b/content/riak/ts/1.6.0/using.md index 63237f8dde..90e3cbf6c2 100644 --- a/content/riak/ts/1.6.0/using.md +++ b/content/riak/ts/1.6.0/using.md @@ -16,7 +16,7 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using" --- -[activating]: creating-activating/ +[activating]: /riak/ts/1.6.0/table-management/creating-activating/ [aggregate]: querying/select/aggregate-functions/ [arithmetic]: querying/select/arithmetic-operations/ [configuring]: /riak/ts/1.6.0/configuring/ @@ -36,5 +36,5 @@ Now that you've [downloaded][download] and [installed][installing] Riak TS, ther 3. [Write data][writing] to your table. Check out [riak shell][riakshell], a handy tool for using TS. - + Then check out how to [query][querying] your data, [customize your Riak TS configuration][configuring], analyze your data with [aggregate functions][aggregate], or apply some [arithmetic operations][arithmetic]. diff --git a/content/riak/ts/1.6.0/using/deleting-data.md b/content/riak/ts/1.6.0/using/deleting-data.md index 977a4e6f82..6bbba48310 100644 --- a/content/riak/ts/1.6.0/using/deleting-data.md +++ b/content/riak/ts/1.6.0/using/deleting-data.md @@ -18,7 +18,7 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using/deleting-data" --- [delete]: /riak/ts/1.6.0/using/querying/delete -[expiry]: /riak/ts/1.6.0/configuring/global-object-expiration +[expiry]: /riak/ts/1.6.0/table-management/global-object-expiration Riak TS offers several ways to delete data: with clients, using the DELETE statement, and through global expiry. 
Global expiry is more efficient than other delete options but operates on all of your data. `DELETE` works per-row but takes more resources to run. diff --git a/content/riak/ts/1.6.0/using/planning.md b/content/riak/ts/1.6.0/using/planning.md index 2b22f76301..dbce6cf54b 100644 --- a/content/riak/ts/1.6.0/using/planning.md +++ b/content/riak/ts/1.6.0/using/planning.md @@ -16,7 +16,7 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using/planning" --- -[activating]: ../creating-activating/ +[activating]: /riak/ts/1.6.0/table-management/creating-activating/ [table arch]: ../../learn-about/tablearchitecture/ [bestpractices]: ../../learn-about/bestpractices/ [describe]: ../querying/describe/ @@ -26,9 +26,9 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using/planning" [order by]: /riak/ts/1.6.0/using/querying/select/order-by -You've [installed][installing] Riak TS, and you're ready to create a table. +You've [installed][installing] Riak TS, and you're ready to create a table. -* If you're just looking to test out Riak TS, you can jump ahead to [Create a Table][activating], and test out our sample table. +* If you're just looking to test out Riak TS, you can jump ahead to [Create a Table][activating], and test out our sample table. * If you're looking to set up your production environment, keep reading for guidelines on how best to structure your table. * If you're looking for more information about what the components of a Riak TS table do, check out [Table Architecture][table arch]. * Riak TS tables are closely tied to SQL tables. If would like to know more about how Riak TS integrates SQL, check out [SQL for Riak TS][sql]. @@ -256,7 +256,7 @@ Please take care in defining how you address your unique data, as it will affect A table's local key determines how it is stored and ordered on disk. 
Adding the `ASC` or `DESC` keywords to the local key lets you control the sort order of records on disk, and avoid sorting using [`ORDER BY`][order by] or at the application level. -Ordering rows using `ASC` or `DESC` on the local key reduces workload on the cluster because no sorting is required when a query is executed. This may make using `ASC` or `DESC` on the local key a better choice than using `ORDER BY`. +Ordering rows using `ASC` or `DESC` on the local key reduces workload on the cluster because no sorting is required when a query is executed. This may make using `ASC` or `DESC` on the local key a better choice than using `ORDER BY`. The `ASC` or `DESC` keywords must be applied to the local key, not the partition key. The keywords can only be applied to `SINT64`, `TIMESTAMP` and `VARCHAR` columns. @@ -332,14 +332,14 @@ In the new `descending_table`, the `DESC` keyword has been added to `b` in the l ## Quantum -Choosing the right quantum for your Riak TS table is incredibly important, as it can optimize or diminish your query performance. The short guide is: +Choosing the right quantum for your Riak TS table is incredibly important, as it can optimize or diminish your query performance. The short guide is: 1. If you care most about individual query [latency](#latency), then spread your data around the cluster so a typical query spans all nodes, or 2. If you care most about [throughput](#throughput), then localize your data so that typical queries are confined to single nodes, or 3. If you can't predict your usage or you have a mixed-use case, optimize for [latency](#latency) because the fractional latency gains of less data localization are much higher than the throughput losses. -4. Finally, if you simply don't know what you prefer, there's more information in the section below to help you decide. +4. Finally, if you simply don't know what you prefer, there's more information in the section below to help you decide. 
### Your quanta use-case @@ -390,12 +390,12 @@ To optimize query latency, your quantum should be small enough that your normal- If you're up for a little math, you can use the following formula to identify the optimal quantum: ``` -Q ~ t/N +Q ~ t/N ``` `t` is the time spanned by your typical query, `N` is the number of nodes in your cluster, and `Q` is your quantum. -For example, if you have a 5-node cluster and your typical query is for 40 hours of data, your quantum should be no larger than 8 hours: +For example, if you have a 5-node cluster and your typical query is for 40 hours of data, your quantum should be no larger than 8 hours: ``` 8 = 40/5 @@ -423,7 +423,7 @@ For example, if your typical query is an hour’s worth of data, then your quant Take care not to let 100% of your queries hit a single node, however, or you risk crashing the node. -#### How large should my quantum be? +#### How large should my quantum be? For the best throughput, we suggest that your maximum quantum size be chosen to minimize the number of concurrent queries hitting the same node. For the best performance, your quantum should be bounded by the total time spanned by your instantaneous volume of concurrent queries. diff --git a/content/riak/ts/1.6.0/using/querying.md b/content/riak/ts/1.6.0/using/querying.md index 0f76f7f10b..cbc102f5e8 100644 --- a/content/riak/ts/1.6.0/using/querying.md +++ b/content/riak/ts/1.6.0/using/querying.md @@ -15,7 +15,7 @@ aliases: canonical_link: "https://docs.basho.com/riak/ts/latest/using/querying" --- -[activating]: ../creating-activating/ +[activating]: /riak/ts/1.6.0/table-management/creating-activating/ [DESCRIBE]: describe/ [guidelines]: guidelines/ [planning]: ../planning/ @@ -28,16 +28,16 @@ You've [planned][planning] and [created][activating] your Riak TS table, and you Riak TS offers you several ways to define, manipulate, and query the data within your TS table. You can: -* Use [SELECT] to run various queries on your TS dataset. 
+* Use [SELECT] to run various queries on your TS dataset. * Use [DESCRIBE] to see a full definition of your TS table. * Use [SHOW TABLES] to list all the TS tables you have. * Use [SHOW CREATE TABLE] to generate the SQL required to recreate a TS table. -You can also take a look at the [guidelines] to get an idea of the rules and best practices for running queries. +You can also take a look at the [guidelines] to get an idea of the rules and best practices for running queries. {{% note title="WARNING" %}} -When querying, you must ensure the node issuing the query has adequate memory to receive the response. Queries will return rows based on the timespan (quanta) specified, if the returning rows do not fit into the memory of the requesting node, the node is likely to fail. +When querying, you must ensure the node issuing the query has adequate memory to receive the response. Queries will return rows based on the timespan (quanta) specified, if the returning rows do not fit into the memory of the requesting node, the node is likely to fail. Any given query consists of subqueries. If a single subquery loads a result that does not fit into memory, an out of memory error will occur on the subquery node and the requesting node will return a timeout error as it waits for the subquery to return. 
-{{% /note %}} \ No newline at end of file +{{% /note %}} diff --git a/content/riak/ts/1.6.0/using/querying/explain.md b/content/riak/ts/1.6.0/using/querying/explain.md index 8e3fe52792..1e3a725197 100644 --- a/content/riak/ts/1.6.0/using/querying/explain.md +++ b/content/riak/ts/1.6.0/using/querying/explain.md @@ -15,7 +15,7 @@ aliases: canonical_link: "https://docs.basho.com/riak/ts/latest/using/querying/explain" --- -[creating-activating]: /riak/ts/1.6.0/using/creating-activating +[creating-activating]: /riak/ts/1.6.0/table-management/creating-activating [develop]: /riak/ts/1.6.0/developing [planning]: /riak/ts/1.6.0/using/planning [riak shell]: /riak/ts/1.6.0/using/riakshell @@ -25,7 +25,7 @@ You can use an EXPLAIN statement to better understand how a query you would like ## EXPLAIN Guidelines -`EXPLAIN` takes a SELECT statement as an argument and shows information about each subquery. The constraints placed upon the WHERE clause in the SELECT statement determine the subquery values. +`EXPLAIN` takes a SELECT statement as an argument and shows information about each subquery. The constraints placed upon the WHERE clause in the SELECT statement determine the subquery values. 
The details about each subquery include: diff --git a/content/riak/ts/1.6.0/using/querying/guidelines.md b/content/riak/ts/1.6.0/using/querying/guidelines.md index 0736bfc247..f6ea174f1b 100644 --- a/content/riak/ts/1.6.0/using/querying/guidelines.md +++ b/content/riak/ts/1.6.0/using/querying/guidelines.md @@ -17,7 +17,7 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using/querying/guidelines --- [table arch]: ../../../learn-about/tablearchitecture/#data-modeling -[activating]: ../../creating-activating/ +[activating]: /riak/ts/1.6.0/table-management/creating-activating/ [writing]: ../../writingdata/ [planning]: ../../planning#column-definitions [iso8601]: ../../../timerepresentations/ @@ -25,7 +25,7 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using/querying/guidelines [configuring]: ../../../configuring/riakconf/ -Riak TS supports several kinds of queries of your TS data. To create the most successful queries possible, there are some guidelines and limitations you should know. +Riak TS supports several kinds of queries of your TS data. To create the most successful queries possible, there are some guidelines and limitations you should know. This document will cover the basic rules of querying in Riak TS, general guidelines to help you create the best queries possible, and all current limitations impacting queries in TS. @@ -175,4 +175,4 @@ CREATE TABLE GeoCheckin With the above quantum and with the default `max_quanta_span` of 5000, the maximum timeframe we can query at a time is going to be 5000 minutes provided that the data returned from the query wouldn’t exceed the limits set in `max_returned_data_size`. -See the Data Modeling section in [Table Architecture][table arch] for more information on selecting your quanta and setting parameters. \ No newline at end of file +See the Data Modeling section in [Table Architecture][table arch] for more information on selecting your quanta and setting parameters. 
diff --git a/content/riak/ts/1.6.0/using/riakshell.md b/content/riak/ts/1.6.0/using/riakshell.md index a6391952ae..8f830de675 100644 --- a/content/riak/ts/1.6.0/using/riakshell.md +++ b/content/riak/ts/1.6.0/using/riakshell.md @@ -16,7 +16,7 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using/riakshell" --- [nodename]: /riak/kv/2.2.0/using/cluster-operations/changing-cluster-info/ -[creating]: /riak/ts/1.6.0/using/creating-activating +[creating]: /riak/ts/1.6.0/table-management/creating-activating [writing]: /riak/ts/1.6.0/using/writingdata [riak shell README]: https://github.com/basho/riak_shell/blob/develop/README.md @@ -26,7 +26,7 @@ You can use riak shell within Riak TS to run SQL and logging commands from one p ## Capabilities -The following are supported in riak shell: +The following are supported in riak shell: * logging * log replay @@ -346,7 +346,7 @@ You can get more specific help by calling `help` with the extension name and fun ## Configuration -You can configure riak shell from the riak_shell.config file. You can find the file in your Riak TS directory. +You can configure riak shell from the riak_shell.config file. You can find the file in your Riak TS directory. 
The following things can be configured: diff --git a/content/riak/ts/1.6.0/using/timerepresentations.md b/content/riak/ts/1.6.0/using/timerepresentations.md index ae01ebd91d..1995ff2fd4 100644 --- a/content/riak/ts/1.6.0/using/timerepresentations.md +++ b/content/riak/ts/1.6.0/using/timerepresentations.md @@ -16,7 +16,7 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using/timerepresentations --- -[activating]: ../creating-activating/ +[activating]: /riak/ts/1.6.0/table-management/creating-activating/ [planning]: ../planning/ [querying]: ../querying/ [config reference]: /riak/kv/2.2.0/configuring/reference/#the-advanced-config-file @@ -125,4 +125,4 @@ Effectively, there is no way in the UNIX time scheme to differentiate an event t Similarly, Riak TS would treat `915148800` as the start of a new time quantum, and any data points which a client added for that second would be considered to be in the first time quantum in 1999. -The data is not lost, but a query against 1998 time quanta will not produce those data points despite the fact that some of the events flagged as `915148800` technically occurred in 1998. \ No newline at end of file +The data is not lost, but a query against 1998 time quanta will not produce those data points despite the fact that some of the events flagged as `915148800` technically occurred in 1998. 
diff --git a/content/riak/ts/1.6.0/using/writingdata.md b/content/riak/ts/1.6.0/using/writingdata.md index 419d9d0366..7e17d6a630 100644 --- a/content/riak/ts/1.6.0/using/writingdata.md +++ b/content/riak/ts/1.6.0/using/writingdata.md @@ -16,7 +16,7 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/using/writingdata" --- -[activating]: ../creating-activating/ +[activating]: /riak/ts/1.6.0/table-management/creating-activating/ [planning]: ../planning/ [querying]: ../querying/ [http]: /riak/ts/1.6.0/developing/http/ @@ -169,7 +169,7 @@ var Riak = require('basho-riak-client'); var hosts = [ 'myriakdb.host:8087' ]; var client = new Riak.Client(hosts); -var columns = [ +var columns = [ { name: 'id', type: Riak.Commands.TS.ColumnType.Int64 }, { name: 'region', type: Riak.Commands.TS.ColumnType.Varchar }, { name: 'state', type: Riak.Commands.TS.ColumnType.Varchar }, @@ -200,7 +200,7 @@ client.execute(store); ``` ```erlang -%% TS 1.3 or newer. Records are represented as tuples. +%% TS 1.3 or newer. Records are represented as tuples. {ok, Pid} = riakc_pb_socket:start_link("myriakdb.host", 8087). riakc_ts:put(Pid, "GeoCheckin", [{1, <<"South Atlantic">>, <<"Florida">>, 1451606401, <<"hot">>, 23.5}, {2, <<"East North Central">>, <<"Illinois">>, 1451606402, <<"windy">>, 19.8}]). ``` @@ -331,7 +331,7 @@ You can add data via SQL statements either through the [query interface][queryin {{% note title="INSERT limitations" %}} Writing data via an SQL INSERT statement (as demonstrated below) has been found to be 3x slower than using one of our supported clients or the riak shell to insert data under a normal workload (10 bytes per column, up to ~ 50 columns). In these cases, we strongly recommend that you only `INSERT` small data updates and do not use it in a production environment. -Larger workloads should only use a supported client to insert data. +Larger workloads should only use a supported client to insert data. 
{{% /note %}} Here are a couple examples of adding rows from SQL: From 6e957f85318cf44053d5e8ea074e8d04b69ac5ed Mon Sep 17 00:00:00 2001 From: Eric Date: Tue, 14 Mar 2017 14:18:28 -0700 Subject: [PATCH 10/15] WIP add group by time example --- .../1.6.0/using/querying/select/group-by.md | 34 ++++++++++++++----- 1 file changed, 25 insertions(+), 9 deletions(-) diff --git a/content/riak/ts/1.6.0/using/querying/select/group-by.md b/content/riak/ts/1.6.0/using/querying/select/group-by.md index c1db1e2408..4f75546a4c 100644 --- a/content/riak/ts/1.6.0/using/querying/select/group-by.md +++ b/content/riak/ts/1.6.0/using/querying/select/group-by.md @@ -22,21 +22,27 @@ The GROUP BY clause is used with `SELECT` to pick out and condense rows sharing This document will show you how to run various queries using `GROUP BY`. See the [guidelines] for more information on limitations and rules for queries in TS. - + ## GROUP BY Basics -`GROUP BY` returns a single row for each unique combination of values for columns specified in the GROUP BY clause. There is no guaranteed order for the returned rows. +`GROUP BY` returns a single row for each unique combination of values for columns specified in the GROUP BY clause. There is no guaranteed order for the returned rows. The SELECT statement must contain only the columns specified in `GROUP BY`. Columns not used as groups can appear as function parameters. The GROUP BY clause works on all rows, not just the values in the partition key, so all columns are available. [Aggregate functions][aggregate function] may be used with the GROUP BY clause. If used, `SELECT` may contain the columns specified in either `GROUP BY` or the [aggregate function]. +Grouping by time is also possible using the time function with the GROUP BY clause. 
The time function has the following syntax: + +```sql +GROUP BY time(«column_name», «duration») +``` + {{% note title="WARNING" %}} -Before you run `GROUP BY` you must ensure the node issuing the query has adequate memory to receive the response. If the returning rows do not fit into the memory of the requesting node, the node is likely to fail. +Before you run `GROUP BY` you must ensure the node issuing the query has adequate memory to receive the response. If the returning rows do not fit into the memory of the requesting node, the node is likely to fail. {{% /note %}} -## GROUP BY Examples +## GROUP BY Examples The following table defines a schema for tasks, including which project they are part of and when they were completed. @@ -64,10 +70,10 @@ GROUP BY project; ### More than one group -You can group as many columns as you choose, and the order of the grouping has no effect. +You can group as many columns as you choose, and the order of the grouping has no effect. The query below returns one column per unique project, name combination, and counts how many rows have the same project, name combination. - + ```sql SELECT project, COUNT(name) FROM tasks @@ -124,8 +130,8 @@ If we create the following table: ```sql CREATE TABLE tasks2 ( -userid VARCHAR NOT NULL, -visits SINT64, +userid VARCHAR NOT NULL, +visits SINT64, a_time TIMESTAMP NOT NULL, PRIMARY KEY(userid, a_time)); ``` @@ -140,4 +146,14 @@ GROUP BY userid; The result set would only have the group 'roddy' because it is required by the WHERE clause. -If, however, we combine two column names from the partition key in the group using `SUM` without specifying `userid`, `GROUP BY` will return multiple result rows for the `userid` 'roddy' with one column per visit. \ No newline at end of file +If, however, we combine two column names from the partition key in the group using `SUM` without specifying `userid`, `GROUP BY` will return multiple result rows for the `userid` 'roddy' with one column per visit. 
+ +### GROUP BY time + +The query below returns the number of tasks completed each day: + +```sql +SELECT COUNT(*) +FROM tasks +GROUP BY time(completed, 1d); +``` From a3aec7a274ab9ee6758166de5a2d6e25735d552e Mon Sep 17 00:00:00 2001 From: Lauren Rother Date: Wed, 15 Mar 2017 11:04:39 -0400 Subject: [PATCH 11/15] Update BLOB info --- content/riak/ts/1.6.0/using/querying/guidelines.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/riak/ts/1.6.0/using/querying/guidelines.md b/content/riak/ts/1.6.0/using/querying/guidelines.md index 0736bfc247..7521a187b8 100644 --- a/content/riak/ts/1.6.0/using/querying/guidelines.md +++ b/content/riak/ts/1.6.0/using/querying/guidelines.md @@ -142,9 +142,9 @@ The following operators are supported for each data type: ### Blob data in queries and primary keys -Blob data should be queried using integers in base 16 (hex) notation, preceded by `0x` using riak shell or by providing any block of data (e.g. binary, text or JSON) through a Riak client library. +Blob data should be queried using integers in base 16 (hex) notation, preceded by `0x` using riak shell or by providing any block of data (e.g. binary, text or JSON) through a Riak client library. Blob data may only use strict equality (=, !=). -However, we do not recommend using blob columns in primary keys yet, due to limitations in the `list_keys` API. +We do not recommend using blob columns in primary keys, due to limitations in the `list_keys` API. 
### Query parameters From 22ea3d3a5012563c52789a78f86766a87c308e70 Mon Sep 17 00:00:00 2001 From: Lauren Date: Wed, 15 Mar 2017 12:04:16 -0400 Subject: [PATCH 12/15] < > to > < --- content/riak/ts/1.6.0/using/querying/select/group-by.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/riak/ts/1.6.0/using/querying/select/group-by.md b/content/riak/ts/1.6.0/using/querying/select/group-by.md index 4f75546a4c..11d6d8c849 100644 --- a/content/riak/ts/1.6.0/using/querying/select/group-by.md +++ b/content/riak/ts/1.6.0/using/querying/select/group-by.md @@ -34,7 +34,7 @@ The SELECT statement must contain only the columns specified in `GROUP BY`. Colu Grouping by time is also possible using the time function with the GROUP BY clause. The time function has the following syntax: ```sql -GROUP BY time(«column_name», «duration») +GROUP BY time(»column_name«, »duration«) ``` {{% note title="WARNING" %}} From 298505f1f1ee8ad7716b6ca058704fe662d00ef0 Mon Sep 17 00:00:00 2001 From: Eric Date: Wed, 15 Mar 2017 11:51:02 -0700 Subject: [PATCH 13/15] Update based on feedback --- .../per-table-object-expiration.md | 26 ++++++++++++++----- 1 file changed, 19 insertions(+), 7 deletions(-) diff --git a/content/riak/ts/1.6.0/table-management/per-table-object-expiration.md b/content/riak/ts/1.6.0/table-management/per-table-object-expiration.md index e8f6e17acf..3875d55305 100644 --- a/content/riak/ts/1.6.0/table-management/per-table-object-expiration.md +++ b/content/riak/ts/1.6.0/table-management/per-table-object-expiration.md @@ -22,32 +22,44 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/table-management/per-tabl [ttl]: https://en.wikipedia.org/wiki/Time_to_live [global expiry]: /riak/ts/1.6.0/table-management/global-object-expiration/ +[expiry retention]: /riak/ts/1.6.0/table-management/global-object-expiration/#setting-retention-time +[expiry modes]: /riak/ts/1.6.0/table-management/global-object-expiration/#expiry-modes [create table]: 
/riak/ts/1.6.0/table-management/creating-activating/ [create table with]: /riak/ts/1.6.0/table-management/creating-activating/#using-with -By default, LevelDB keeps all of your data. But Riak TS allows you to configure object expiration (`expiry`) or [time to live (TTL)][ttl] for your data on a [global][global expiry] or per table basis. +By default, LevelDB keeps all of your data. But Riak TS allows you to configure object expiration (`expiry`) or [time to live (TTL)][ttl] for your data on a [global][global expiry] or a per table basis. Expiration is disabled by default, but enabling it lets you expire older objects to reclaim the space used or purge data with a limited time value. ## Enabling Expiry -To enable object expiry for global or per table use, add the `leveldb.expiration` setting to your riak.conf file: +Enabling object expiry on a per table basis is similar to [enabling expiry globally][global expiry]. The expiration setting, `leveldb.expiration`, must be enabled in your riak.conf file in order to use per table or global expiry features: ```riak.conf leveldb.expiration = on ``` +You also have the option of setting a [retention time][expiry retention] and an [expiry mode][expiry modes] in your riak.conf. For example: + +```riak.conf +leveldb.expiration = on +leveldb.expiration.retention_time = 5h +leveldb.expiration.mode = whole_file +``` + +These settings will apply globally. So if you enable expiry on a table and the `default_time_to_live` or `expiration_mode` table properties are not set, the table will inherit the values set in riak.conf. + {{% note %}} Turning on object expiration will not retroactively expire previous data. Only data created while expiration is on will be scheduled for expiration. 
{{% /note %}} ## Expiry Table Properties -|Property Name|Values| -|---|---| -|expiration|`enabled` / `disabled`| -|defaut_time_to_live|`unlimited` or a duration string| -|expiration_mode|`use_global_config` / `per_item` / `whole_file`| +|Property Name|Default|Values| +|---|---|---| +|expiration|`disabled|``enabled` / `disabled`| +|defaut_time_to_live|`unlimited`|`unlimited` or a duration string| +|expiration_mode|`whole_file`|`use_global_config` / `per_item` / `whole_file`| Each table can have one or more of the properties listed above. If any properties are omitted on the table, the property values in riak.conf will be used. If the properties are not set within riak.conf, the default values will be used. From ed67081bffd6d8d3be76db2eeda0a8c6d638bb8f Mon Sep 17 00:00:00 2001 From: "John R. Daily" Date: Thu, 16 Mar 2017 15:11:41 -0400 Subject: [PATCH 14/15] Revise and extend expiry docs This is a work in progress, and I have left some of the link syntax broken for the moment. * Add information on `alter table` * Capture some of the nuance around older data and expiry * Make explicit the outcome of enabling expiry without corresponding ttl * Clarify that 1 minute is the finest level of granularity --- .../global-object-expiration.md | 9 ++++++--- .../per-table-object-expiration.md | 20 +++++++++++++------ 2 files changed, 20 insertions(+), 9 deletions(-) diff --git a/content/riak/ts/1.6.0/table-management/global-object-expiration.md b/content/riak/ts/1.6.0/table-management/global-object-expiration.md index fda7f97688..fc11b52afc 100644 --- a/content/riak/ts/1.6.0/table-management/global-object-expiration.md +++ b/content/riak/ts/1.6.0/table-management/global-object-expiration.md @@ -37,9 +37,10 @@ To enable global object expiry, add the `leveldb.expiration` setting to your ria leveldb.expiration = on ``` -{{% note %}} -Turning on global object expiration will not retroactively expire previous data. 
Only data created while expiration is on will be scheduled for expiration. -{{% /note %}} +Enabling expiry will instruct LevelDB to start tracking write times for data. New or updated data, along with recently-written data that has not yet been compacted, will be eligible for expiry. Older data will not expire until it is rewritten, due to data updates or internal read repair[XXX http://docs.basho.com/riak/kv/2.2.0/learn/concepts/replication/]. + +If expiration is enabled without configuring a retention time, LevelDB will track the age of data but will not expire it. + ## Setting Retention Time @@ -68,6 +69,8 @@ leveldb.expiration = on leveldb.expiration.retention_time = 8d9h ``` +The minimum *effective* expiry period is 1 minute, and in fact data can remain for almost 2 minutes if written shortly after the "top" of the minute. + ## Expiry Modes Global expiration supports two modes: diff --git a/content/riak/ts/1.6.0/table-management/per-table-object-expiration.md b/content/riak/ts/1.6.0/table-management/per-table-object-expiration.md index 3875d55305..8f7b17a4ba 100644 --- a/content/riak/ts/1.6.0/table-management/per-table-object-expiration.md +++ b/content/riak/ts/1.6.0/table-management/per-table-object-expiration.md @@ -27,7 +27,7 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/table-management/per-tabl [create table]: /riak/ts/1.6.0/table-management/creating-activating/ [create table with]: /riak/ts/1.6.0/table-management/creating-activating/#using-with -By default, LevelDB keeps all of your data. But Riak TS allows you to configure object expiration (`expiry`) or [time to live (TTL)][ttl] for your data on a [global][global expiry] or a per table basis. +By default, LevelDB keeps all of your data. But Riak TS allows you to configure object expiration (`expiry`) or [time to live (TTL)][ttl] for your data on a [global][global expiry] or, when using Basho's Enterprise Edition, a per table basis. 
Expiration is disabled by default, but enabling it lets you expire older objects to reclaim the space used or purge data with a limited time value. @@ -49,16 +49,14 @@ leveldb.expiration.mode = whole_file These settings will apply globally. So if you enable expiry on a table and the `default_time_to_live` or `expiration_mode` table properties are not set, the table will inherit the values set in riak.conf. -{{% note %}} -Turning on object expiration will not retroactively expire previous data. Only data created while expiration is on will be scheduled for expiration. -{{% /note %}} +See [global expiry][global expiry] for a discussion of what happens to existing data when expiry is enabled. ## Expiry Table Properties |Property Name|Default|Values| |---|---|---| |expiration|`disabled|``enabled` / `disabled`| -|defaut_time_to_live|`unlimited`|`unlimited` or a duration string| +|default_time_to_live|`unlimited`|`unlimited` or a duration string| |expiration_mode|`whole_file`|`use_global_config` / `per_item` / `whole_file`| Each table can have one or more of the properties listed above. If any properties are omitted on the table, the property values in riak.conf will be used. If the properties are not set within riak.conf, the default values will be used. @@ -82,7 +80,17 @@ CREATE TABLE GeoCheckin ) ) WITH ( expiration = enabled - default_time_to_live = 123.4m + default_time_to_live = 123m + expiration_mode = whole_file +) +``` + +For existing tables, bucket properties can be modified with the `ALTER TABLE` command. 
+ +``` +ALTER TABLE GeoCheckin WITH ( + expiration = enabled + default_time_to_live = 123m expiration_mode = whole_file ) ``` From 4f1fce607cf4d8a70fcf92309c9ae7c7ba5efe1d Mon Sep 17 00:00:00 2001 From: Lauren Date: Fri, 17 Mar 2017 12:18:50 -0400 Subject: [PATCH 15/15] Update link + formate --- .../riak/ts/1.6.0/table-management/global-object-expiration.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/riak/ts/1.6.0/table-management/global-object-expiration.md b/content/riak/ts/1.6.0/table-management/global-object-expiration.md index fc11b52afc..50542af83a 100644 --- a/content/riak/ts/1.6.0/table-management/global-object-expiration.md +++ b/content/riak/ts/1.6.0/table-management/global-object-expiration.md @@ -24,6 +24,7 @@ canonical_link: "https://docs.basho.com/riak/ts/latest/table-management/global-o [ttl]: https://en.wikipedia.org/wiki/Time_to_live [table expiry]: /riak/ts/1.6.0/table-management/per-table-object-expiration +[read repair]: /riak/kv/2.2.1/learn/concepts/replication/#read-repair By default, LevelDB keeps all of your data. But Riak TS allows you to configure object expiration (`expiry`) or [time to live (TTL)][ttl] for your data on a global or [per table basis][table expiry]. @@ -37,7 +38,7 @@ To enable global object expiry, add the `leveldb.expiration` setting to your ria leveldb.expiration = on ``` -Enabling expiry will instruct LevelDB to start tracking write times for data. New or updated data, along with recently-written data that has not yet been compacted, will be eligible for expiry. Older data will not expire until it is rewritten, due to data updates or internal read repair[XXX http://docs.basho.com/riak/kv/2.2.0/learn/concepts/replication/]. +Enabling expiry will instruct LevelDB to start tracking write times for data. New or updated data, along with recently-written data that has not yet been compacted, will be eligible for expiry. 
Older data will not expire until it is rewritten, due to data updates or internal [read repair]. If expiration is enabled without configuring a retention time, LevelDB will track the age of data but will not expire it.