- Supports dbt version
1.0.* - Supports Seeds
- Correctly detects views and their columns
- Support incremental models
- Support two incremental update strategies:
insert_overwriteandappend - Does not support the use of
unique_key
- Support two incremental update strategies:
- Only supports Athena engine 2
pip install dbt-athena-adapter- Or
pip install git+https://github.com/Tomme/dbt-athena.git
To start, you will need an S3 bucket, for instance my-staging-bucket and an Athena database:
CREATE DATABASE IF NOT EXISTS analytics_dev
COMMENT 'Analytics models generated by dbt (development)'
LOCATION 's3://my-staging-bucket/'
WITH DBPROPERTIES ('creator'='Foo Bar', 'email'='[email protected]');Notes:
- Take note of your AWS region code (e.g.
us-west-2oreu-west-2, etc.). - You can also use AWS Glue to create and manage Athena databases.
This plugin does not accept any credentials directly. Instead, credentials are determined automatically based on aws cli/boto3 conventions and
stored login info. You can configure the AWS profile name to use via aws_profile_name. Checkout DBT profile configuration below for details.
A dbt profile can be configured to run against AWS Athena using the following configuration:
| Option | Description | Required? | Example |
|---|---|---|---|
| s3_staging_dir | S3 location to store Athena query results and metadata | Required | s3://bucket/dbt/ |
| region_name | AWS region of your Athena instance | Required | eu-west-1 |
| schema | Specify the schema (Athena database) to build models into (lowercase only) | Required | dbt |
| database | Specify the database (Data catalog) to build models into (lowercase only) | Required | awsdatacatalog |
| poll_interval | Interval in seconds to use for polling the status of query results in Athena | Optional | 5 |
| aws_profile_name | Profile to use from your AWS shared credentials file. | Optional | my-profile |
| work_group | Identifier of Athena workgroup | Optional | my-custom-workgroup |
| num_retries | Number of times to retry a failing query | Optional | 3 |
Example profiles.yml entry:
athena:
target: dev
outputs:
dev:
type: athena
s3_staging_dir: s3://athena-query-results/dbt/
region_name: eu-west-1
schema: dbt
database: awsdatacatalog
aws_profile_name: my-profile
work_group: my-workgroupAdditional information
threadsis supporteddatabaseandcatalogcan be used interchangeably
external_location(default=none)- The location where Athena saves your table in Amazon S3
- If
nonethen it will default to{s3_staging_dir}/tables - If you are using a static value, when your table/partition is recreated underlying data will be cleaned up and overwritten by new data
partitioned_by(default=none)- An array list of columns by which the table will be partitioned
- Limited to creation of 100 partitions (currently)
bucketed_by(default=none)- An array list of columns to bucket data
bucket_count(default=none)- The number of buckets for bucketing your data
format(default='parquet')- The data format for the table
- Supports
ORC,PARQUET,AVRO,JSON, orTEXTFILE
write_compression(default=none)- The compression type to use for any storage format that allows compression to be specified. To see which options are available, check out CREATE TABLE AS
field_delimiter(default=none)- Custom field delimiter, for when format is set to
TEXTFILE
- Custom field delimiter, for when format is set to
More information: CREATE TABLE AS
Support for incremental models:
- Support two incremental update strategies with partitioned tables:
insert_overwriteandappend - Does not support the use of
unique_key
Due to the nature of AWS Athena, not all core dbt functionality is supported. The following features of dbt are not implemented on Athena:
- Snapshots
-
Quoting is not currently supported
- If you need to quote your sources, escape the quote characters in your source definitions:
version: 2 sources: - name: my_source tables: - name: first_table identifier: "first table" # Not like that - name: second_table identifier: "\"second table\"" # Like this
-
Tables, schemas and database should only be lowercase
-
Only supports Athena engine 2
First, install the adapter and its dependencies using make (see Makefile):
make install_depsNext, configure the environment variables in dev.env to match your Athena development environment. Finally, run the tests using make:
make run_tests