-
Create an AWS account, or make sure you can log in to an existing one.
-
Log in to the AWS Console and record the region name shown in the upper-right corner. If your AWS account uses single sign-on (SSO), the console login URL may be different, e.g. https://(example).awsapps.com/start/.
-
Create a Redshift cluster or locate an existing one. (For example, Amazon provides a tutorial to create a new serverless cluster.) Record the hostname, port number, database name, workgroup name (for serverless clusters), and cluster identifier (for provisioned clusters).
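If the AWS CLI is already configured (see the later CLI step), the values to record can also be read from the command line. The following is a minimal sketch rather than part of the original example: the function names `serverless_endpoint` and `provisioned_endpoint` are hypothetical helpers, and the workgroup name or cluster identifier you pass in is your own.

```shell
# Hypothetical helpers (not part of the example repository).

# Print the hostname and port of a serverless workgroup.
# $1 = workgroup name
serverless_endpoint() {
  aws redshift-serverless get-workgroup \
    --workgroup-name "$1" \
    --query 'workgroup.endpoint.[address, port]' \
    --output text
}

# Print the hostname, port, and default database name of a provisioned cluster.
# $1 = cluster identifier
provisioned_endpoint() {
  aws redshift describe-clusters \
    --cluster-identifier "$1" \
    --query 'Clusters[0].[Endpoint.Address, Endpoint.Port, DBName]' \
    --output text
}

# Usage (placeholder names):
#   serverless_endpoint my-workgroup
#   provisioned_endpoint my-cluster
```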
-
Ensure that the VPC security group that Redshift is running in has an inbound rule that accepts connections from your IP address on the Redshift port.
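The inbound rule can be added in the console, or from the AWS CLI once it is configured. As a rough sketch, assuming you know the security group ID and your public IPv4 address: the helper name `allow_redshift_from` is hypothetical, and 5439 is used as the fallback only because it is Redshift's default port, so pass the port you recorded if yours differs.

```shell
# Hypothetical helper (not part of the example repository).
# $1 = security group ID
# $2 = your public IPv4 address
# $3 = Redshift port (optional; falls back to 5439, the Redshift default)
allow_redshift_from() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" \
    --protocol tcp \
    --port "${3:-5439}" \
    --cidr "$2/32"
}

# Usage (placeholder values):
#   allow_redshift_from sg-0123456789abcdef0 203.0.113.10
```

The `/32` suffix restricts the rule to a single address, which keeps the cluster closed to everyone else.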
-
If using a provisioned cluster, make sure the cluster is running and not paused. To resume a paused cluster, choose the cluster in the AWS Console, then "Actions", then "Resume".
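The same check can be scripted with the AWS CLI once it is configured. This is only a sketch: `resume_if_paused` is a hypothetical helper name, and it assumes your credentials are allowed to describe and resume the cluster.

```shell
# Hypothetical helper (not part of the example repository).
# $1 = cluster identifier
resume_if_paused() {
  # Ask Redshift for the current status of the cluster.
  cluster_status=$(aws redshift describe-clusters \
    --cluster-identifier "$1" \
    --query 'Clusters[0].ClusterStatus' \
    --output text)

  if [ "$cluster_status" = "paused" ]; then
    # Start the resume operation; the cluster takes a while to become available.
    aws redshift resume-cluster --cluster-identifier "$1" >/dev/null
    echo "resuming $1"
  else
    echo "$1 is $cluster_status"
  fi
}

# Usage (placeholder name):
#   resume_if_paused my-cluster
```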
-
This example uses the `sample_data_dev` database, which is built-in and contains example datasets. If you wish to use the example as-is, then also create the `sample_data_dev` database from the console:
  - From the AWS console, find your cluster.
  - Choose "Query Data".
  - Choose your database in the panel on the left to connect to it.
  - Expand the database in the panel on the left and expand "native databases", then expand `sample_data_dev`.
  - Click the folder icon on the `tpch` schema listed, which will have a tooltip labeled "Open sample notebooks". This will ask if you want to create the sample TPC-H data.
  - Confirm that you want to create the database, then wait for Redshift to populate the data.
-
Configure the AWS CLI:
aws configure sso  # If you have never logged in before
export AWS_PROFILE=<...>  # This comes from `aws configure sso`, or use `aws configure list-profiles`
export AWS_REGION=<...>
-
Run this command in your terminal to log in with the AWS CLI:
aws sso login  # This will open the browser
-
If your cluster is not publicly accessible, and you are not running from within AWS (e.g. on an EC2 machine with access to Redshift), you will need to create a jump box and an SSH tunnel to access the cluster. See the AWS documentation on bastion hosts.
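Once a jump box exists, the tunnel itself is a standard SSH local port forward. A minimal sketch follows, assuming you can already SSH to the bastion; `open_tunnel` is a hypothetical helper name, and the bastion login and Redshift hostname are placeholders for your own values.

```shell
# Hypothetical helper (not part of the example repository).
# $1 = bastion login, e.g. user@bastion-host
# $2 = Redshift hostname you recorded earlier
# $3 = Redshift port (optional; falls back to 5439, the Redshift default)
open_tunnel() {
  # -N: do not run a remote command, only forward ports
  # -L: listen on a local port and forward it to Redshift via the bastion
  ssh -N -L "${3:-5439}:$2:${3:-5439}" "$1"
}

# Usage (placeholder values); this blocks until interrupted, so run it in a
# separate terminal:
#   open_tunnel ec2-user@bastion.example.com my-redshift-host.example.com
```

While the tunnel is open, the local end of the connection is `localhost` on the forwarded port, which is what the connection `uri` should point at instead of the Redshift hostname.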
-
Install the Redshift ADBC driver:
dbc install --level user redshift
-
Customize the C++ program `main.cpp`:
  - Change the connection arguments in the `AdbcDatabaseSetOption()` calls:
    - Change the value of `uri` to match the hostname and port you recorded in the earlier step, or your SSH tunnel if necessary
    - Change the value of `redshift.cluster_type` to match your Redshift cluster type
    - Change the value of `redshift.workgroup_name` or `redshift.cluster_identifier` to match the workgroup name or cluster identifier you recorded in the earlier step
    - Change the value of `redshift.db_name` to match the database name you recorded in the earlier step (or leave it as `sample_data_dev` to use the built-in sample database)
  - If you changed the database name, also change the SQL SELECT statement in `AdbcStatementSetSqlQuery()`
-
Build and run the C++ program:
Using Make:
pixi run make
./redshift_demo
Or using CMake:
pixi run cmake -B build
pixi run cmake --build build
./build/redshift_demo
-
Clean build artifacts:
Using Make:
pixi run make clean
Using CMake:
rm -rf build