Contributor

@royendo royendo commented Nov 4, 2025

Summary

This PR extends the OLAP engine documentation from PR #8210 (which added BigQuery and Snowflake) to cover four recently added OLAP engines: MySQL, Postgres, Redshift, and Athena.

Changes

Updated: partitioned-models.md

Section: "Using Other OLAP Engines for Partition Queries"

  • Updated intro text to list all 6 supported engines (BigQuery, Snowflake, MySQL, Postgres, Redshift, Athena)
  • Added complete working examples for each new engine:
    • MySQL: Daily partitioning using DATE() function
    • Postgres: Daily partitioning using DATE_TRUNC()
    • Redshift: Monthly partitioning using DATE_TRUNC()
    • Athena: Multi-column partitioning (year, month)
  • Updated tip box to mention all engines

Example pattern added:

partitions:
  connector: mysql  # or postgres, redshift, athena
  sql: SELECT DISTINCT partition_column FROM table

connector: mysql  # or postgres, redshift, athena
sql: SELECT * FROM table WHERE condition

output:
  connector: duckdb  # Fast dashboard queries
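
For reviewers who want to see the generic pattern filled in, here is a minimal sketch of the MySQL daily-partition case. The table and column names (`orders`, `created_at`), the `day` alias, and the file path in the comment are hypothetical, and the `{{ .partition.day }}` template reference assumes the partitioned-model templating syntax applies unchanged to this connector. It is an illustration, not the exact example added to the docs.

```yaml
# Hypothetical file: models/orders_by_day.yaml
type: model

# One partition per distinct calendar day found in the source table.
partitions:
  connector: mysql
  sql: SELECT DISTINCT DATE(created_at) AS day FROM orders

# Runs once per partition; the value comes from the partition row above.
# Assumption: the partition column is exposed as .partition.day
connector: mysql
sql: |
  SELECT *
  FROM orders
  WHERE DATE(created_at) = '{{ .partition.day }}'

# Materialize the results in DuckDB so dashboards do not hit MySQL.
output:
  connector: duckdb
```

The same shape should carry over to Postgres and Redshift by swapping `DATE()` for `DATE_TRUNC()` with the desired grain.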

Updated: custom-apis.md

Section: "Using Alternative OLAP Engines" (renamed from "Using BigQuery or Snowflake")

  • Renamed section to be more inclusive
  • Updated intro to list all 6 supported engines
  • Added API examples for each new engine:
    • MySQL: Querying orders table
    • Postgres: Querying events table
    • Redshift: Querying transactions table
    • Athena: Querying S3 data
  • Expanded cost warning to cover all database/warehouse billing models:
    • BigQuery: Per TB scanned
    • Snowflake: Warehouse compute time
    • Redshift: Cluster compute time
    • Athena: Per TB scanned
    • MySQL/Postgres: Instance compute time and IOPS
  • Updated "When to use" guidance to be more generic

Example API pattern added:

type: api
connector: mysql  # or postgres, redshift, athena
sql: SELECT * FROM table WHERE date >= '2025-01-01' LIMIT 100
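
As a slightly more concrete sketch of the same API pattern, the Postgres case might look like the following. The `events` table, its columns, and the file path in the comment are hypothetical; only `type`, `connector`, and `sql` are taken from the pattern above.

```yaml
# Hypothetical file: apis/recent_events.yaml
type: api
connector: postgres
sql: |
  SELECT event_id, event_type, created_at
  FROM events
  WHERE created_at >= '2025-01-01'
  ORDER BY created_at DESC
  LIMIT 100
```

Keeping an explicit date filter and a `LIMIT` in the query bounds how much work each API call pushes to the source database, which matters given the billing models listed in the cost warning.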

Documentation Structure

Both files now follow a consistent structure:

  1. Introduction - Lists all 6 supported engines
  2. BigQuery example - Original from PR #8210 (docs: Document BigQuery and Snowflake support in partitioned models, for PRs #8069 and #8108)
  3. Snowflake example - Original from PR #8210
  4. MySQL example - New
  5. Postgres example - New
  6. Redshift example - New
  7. Athena example - New
  8. Guidance - Updated to cover all engines

Benefits Documented

  • Flexibility: Query data from source without ingestion
  • Cost optimization: Only query what you need
  • Performance: Output to DuckDB for fast dashboards
  • Real-time access: Query latest data from source
  • Native features: Leverage source-specific optimizations

Related PRs

#8210, #8169, #8224, #8232, #8180

cc @kanshul

🤖 Generated with Claude Code

…ation

Extends PR #8210 (BigQuery and Snowflake) to include documentation for additional OLAP engines added in recent PRs:
- PR #8169: MySQL OLAP interface
- PR #8224: Postgres OLAP interface
- PR #8232: Redshift OLAP interface
- PR #8180: Athena OLAP interface

### Changes to partitioned-models.md

- Updated intro text to list all 6 supported OLAP engines (BigQuery, Snowflake, MySQL, Postgres, Redshift, Athena)
- Added 4 new complete examples showing partition patterns for each engine:
  - **MySQL**: Daily partitioning using DATE() function
  - **Postgres**: Daily partitioning using DATE_TRUNC()
  - **Redshift**: Monthly partitioning using DATE_TRUNC()
  - **Athena**: Multi-column partitioning (year, month)
- Updated tip box to include all engines

### Changes to custom-apis.md

- Renamed section from "Using BigQuery or Snowflake" to "Using Alternative OLAP Engines"
- Updated intro to list all 6 supported engines
- Added 4 new API examples for MySQL, Postgres, Redshift, and Athena
- Expanded cost warning to cover all database/warehouse billing models
- Updated "When to use" guidance to be more generic

All examples follow the same pattern as PR #8210: use the source connector for querying, then output to DuckDB for fast dashboard performance.
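
A minimal sketch of that pattern for the Athena multi-column case, assuming a hypothetical `events` table partitioned by string-typed `year` and `month` columns, and assuming each partition column is available to the template as `.partition.<column>`:

```yaml
type: model

# One partition per distinct (year, month) pair.
partitions:
  connector: athena
  sql: SELECT DISTINCT year, month FROM events

# Query the source connector once per partition.
# Assumption: year and month are strings; drop the quotes if they are integers.
connector: athena
sql: |
  SELECT *
  FROM events
  WHERE year = '{{ .partition.year }}'
    AND month = '{{ .partition.month }}'

# Write results to DuckDB for fast dashboard queries.
output:
  connector: duckdb
```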

Related PRs: #8210, #8169, #8224, #8232, #8180

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Contributor Author

royendo commented Nov 4, 2025

Not sure how I feel about the long page of examples, but until we have an example directory integrated in the docs, I think it's fine.

@royendo royendo marked this pull request as ready for review November 4, 2025 14:38
Contributor

I see the Snowflake example doesn't have output: duckdb like the other examples do

Contributor

The comment on consistent examples on the Custom APIs page also applies here. E.g., some examples use transactions, others use events.

Contributor Author

I get the reasoning for Snowflake not having the YAML, but IMO slightly different SQL should be fine for examples.

3 participants