Connecting C++ and Amazon Redshift with ADBC

Instructions

Prerequisites

  1. Install Pixi

  2. Install dbc

  3. Install the AWS CLI

  4. Create an AWS account or be able to log in to an existing one

Set Up Redshift

  1. Log in to the AWS Console and record the region name shown in the upper-right corner. If your AWS account uses single sign-on (SSO), the console login URL may be different, e.g. https://(example).awsapps.com/start/.

  2. Create a Redshift cluster or locate an existing one. (For example, Amazon provides a tutorial to create a new serverless cluster.) Record the hostname, port number, database name, workgroup name (for serverless clusters), and cluster identifier (for provisioned clusters).

  3. Ensure that the VPC security group that Redshift is running in has an inbound rule that accepts connections from your IP address on the Redshift port.

  4. If using a provisioned cluster, make sure the cluster is running and not paused. To resume a paused cluster, choose the cluster in the AWS Console, then "Actions", then "Resume".

  5. This example uses the built-in sample_data_dev database, which contains example datasets. To use the example as-is, also create the sample_data_dev database from the console:

    1. From the AWS console, find your cluster.
    2. Choose "Query Data".
    3. Choose your database in the panel on the left to connect to it.
    4. Expand the database in the panel on the left and expand "native databases", then expand sample_data_dev.
    5. Click the folder icon next to the tpch schema (its tooltip reads "Open sample notebooks"). You will be asked whether you want to create the sample TPC-H data.
    6. Confirm that you want to create the database, then wait for Redshift to populate the data.
  6. Configure the AWS CLI:

    aws configure sso         # If you have never logged in before
    export AWS_PROFILE=<...>  # The profile name chosen during `aws configure sso`.
                              # Or list profiles with `aws configure list-profiles`
    export AWS_REGION=<...>   # The region name recorded in step 1
  7. Run this command in your terminal to log in with the AWS CLI:

    aws sso login             # This will open the browser
  8. If your cluster is not publicly accessible and you are not running from within AWS (e.g. on an EC2 machine with access to Redshift), you will need to create a jump box and an SSH tunnel to reach the cluster. See the AWS documentation on bastion hosts.

Connect to Redshift

  1. Install the Redshift ADBC driver:

    dbc install --level user redshift
  2. Customize the C++ program main.cpp:

    • Change the connection arguments in the AdbcDatabaseSetOption() calls
      • Change the value of uri to match the hostname and port you recorded in the earlier step, or your SSH tunnel if necessary
      • Change the value of redshift.cluster_type to match your Redshift cluster type
      • Change the value of redshift.workgroup_name or redshift.cluster_identifier to match the workgroup name or cluster identifier you recorded in the earlier step
      • Change the value of redshift.db_name to match the database name you recorded in the earlier step (or leave it as sample_data_dev to use the built-in sample database)
    • If you changed the database name, also change the SQL SELECT statement in AdbcStatementSetSqlQuery()
  3. Build and run the C++ program:

    Using Make:

    pixi run make
    ./redshift_demo

    Or using CMake:

    pixi run cmake -B build
    pixi run cmake --build build
    ./build/redshift_demo
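
The option names from step 2 can be gathered in one place before they are handed to the driver. The following is only a sketch: `ClusterKind`, `RedshiftTarget`, `OptionList`, and `BuildRedshiftOptions` are hypothetical names introduced here for illustration and are not taken from main.cpp or from ADBC. The accepted values for `uri` and `redshift.cluster_type` are defined by the driver and are left to the caller. The sketch assumes a serverless cluster is identified by `redshift.workgroup_name` and a provisioned cluster by `redshift.cluster_identifier`, matching the values recorded during setup.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper types for illustration; they are not part of ADBC,
// the Redshift driver, or the repository's main.cpp.
enum class ClusterKind { kServerless, kProvisioned };

struct RedshiftTarget {
  std::string uri;           // Value for "uri" (host and port, or the SSH tunnel endpoint)
  std::string cluster_type;  // Value for "redshift.cluster_type" (see the driver docs)
  ClusterKind kind = ClusterKind::kServerless;
  std::string name;          // Workgroup name or cluster identifier
  std::string db_name = "sample_data_dev";
};

using OptionList = std::vector<std::pair<std::string, std::string>>;

// Returns the key/value pairs in the order they would be set.
// Throws std::invalid_argument if a required value is empty.
OptionList BuildRedshiftOptions(const RedshiftTarget& target) {
  if (target.uri.empty() || target.cluster_type.empty() ||
      target.name.empty() || target.db_name.empty()) {
    throw std::invalid_argument(
        "uri, cluster_type, name, and db_name are required");
  }
  // Serverless uses a workgroup name; provisioned uses a cluster identifier.
  const char* name_key = target.kind == ClusterKind::kServerless
                             ? "redshift.workgroup_name"
                             : "redshift.cluster_identifier";
  OptionList options;
  options.emplace_back("uri", target.uri);
  options.emplace_back("redshift.cluster_type", target.cluster_type);
  options.emplace_back(name_key, target.name);
  options.emplace_back("redshift.db_name", target.db_name);
  return options;
}
```

Each returned pair could then be passed to `AdbcDatabaseSetOption(&database, key.c_str(), value.c_str(), &error)` before `AdbcDatabaseInit()` is called, checking every return value against `ADBC_STATUS_OK`. Keeping the values in one struct makes it harder to set a workgroup name on a provisioned cluster by accident.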

Clean up

  1. Clean build artifacts:

    Using Make:

    pixi run make clean

    Using CMake:

    rm -rf build