Finalize Tinybird pipes/datasources #3001
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,4 +1,5 @@ | ||
| TOKEN dub_tinybird_token APPEND | ||
| TOKEN "dub_tinybird_token" APPEND | ||
| SCHEMA > | ||
| `id` String `json:$.id`, | ||
This file was deleted.
This file was deleted.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -2,32 +2,44 @@ | ||
| SCHEMA > | ||
| `timestamp` DateTime64(3), | ||
| `event_id` String, | ||
| `event_name` String, | ||
| `click_id` String, | ||
| `workspace_id` LowCardinality(String), | ||
| `link_id` String, | ||
| `domain` String, | ||
| `key` String, | ||
| `url` String, | ||
| `event_id` String, | ||
Comment on lines +6 to +11

Contributor

🧩 Analysis chain

Nullability mismatch between raw and MV can break ingestion.

The raw source defines `workspace_id`, `domain`, and `key` as `Nullable(String)`; the MV has non-nullable `String`/`LowCardinality(String)`. Inserts with NULLs will fail unless the pipe coerces them to `''`. Align the schemas or coalesce in the pipe.

Option A: align the MV schema:

```diff
- `workspace_id` LowCardinality(String),
+ `workspace_id` LowCardinality(Nullable(String)),
- `domain` String,
+ `domain` Nullable(String),
- `key` String,
+ `key` Nullable(String),
```

Option B: keep the MV non-nullable and coalesce upstream:

```sql
-- In the MV-building pipe
COALESCE(workspace_id, '') AS workspace_id,
COALESCE(domain, '') AS domain,
COALESCE(`key`, '') AS `key`,
```

Please confirm which approach you prefer; I can propose a full diff once chosen.

Also applies to: 17-19, 28-41

I'll help you verify this review comment by examining the datasource files and checking for the nullability mismatch. Let me start by exploring the codebase.

Length of output: 116

🏁 Script executed:

```shell
#!/bin/bash
# List all files in the tinybird datasources directory
fd . packages/tinybird/datasources --type f | head -20
```

Length of output: 786

🏁 Script executed:

```shell
#!/bin/bash
# Show the MV datasource file content
cat packages/tinybird/datasources/dub_sale_events_mv.datasource
```

Length of output: 1268

🏁 Script executed:

```shell
#!/bin/bash
# Look for the raw datasource file
fd "dub_sale_events" packages/tinybird/datasources --type f
```

Length of output: 116

🏁 Script executed:

```shell
#!/bin/bash
# Search for pipe files that might reference these datasources
fd -e pipe packages/tinybird/ | xargs grep -l "dub_sale_events" 2>/dev/null | head -5
```

Length of output: 377

Length of output: 1929

🏁 Script executed:

```shell
#!/bin/bash
# Search for pipe files related to dub_sale_events with corrected syntax
fd . packages/tinybird --full-path '.*dub_sale_events.*\.pipe'
```

Length of output: 2007

🏁 Script executed:

```shell
#!/bin/bash
# List all pipe files in tinybird
fd -e pipe packages/tinybird/
```

Length of output: 377

🏁 Script executed:

```shell
#!/bin/bash
# Check the MV pipe definition
fd "dub_sale_events_pipe" packages/tinybird/ -e pipe
```

Length of output: 49

🏁 Script executed:

```shell
#!/bin/bash
# Show the pipe file content to see NULL handling
cat packages/tinybird/pipes/dub_sale_events_pipe.pipe
```

Length of output: 2043

Nullability mismatch confirmed: the pipe's COALESCE is insufficient without a default fallback.

The raw source defines […]

Choose and implement one approach: […]

Apply the same fix to the other affected fields at lines 17-19 and 28-41 of the MV datasource if they have the same issue.
| `event_name` String, | ||
| `customer_id` String, | ||
| `payment_processor` LowCardinality(String), | ||
| `invoice_id` String, | ||
| `amount` UInt32, | ||
| `url` String, | ||
| `sale_type` String, | ||
| `metadata` String, | ||
| `continent` LowCardinality(String), | ||
| `country` LowCardinality(String), | ||
| `city` String, | ||
| `region` String, | ||
| `city` LowCardinality(String), | ||
| `region` LowCardinality(String), | ||
| `latitude` String, | ||
| `longitude` String, | ||
| `device` LowCardinality(String), | ||
| `browser` LowCardinality(String), | ||
| `os` LowCardinality(String), | ||
| `engine` LowCardinality(String), | ||
| `trigger` String, | ||
| `ua` String, | ||
| `referer` String, | ||
| `referer_url` String, | ||
| `qr` UInt8, | ||
| `ip` String, | ||
| `metadata` String | ||
| `device_model` LowCardinality(String), | ||
| `device_vendor` LowCardinality(String), | ||
| `browser_version` String, | ||
| `os_version` String, | ||
| `engine` LowCardinality(String), | ||
| `engine_version` String, | ||
| `cpu_architecture` LowCardinality(String), | ||
| `qr` UInt8, | ||
| `bot` UInt8 | ||
| ENGINE "MergeTree" | ||
| ENGINE_PARTITION_KEY "toYYYYMM(timestamp)" | ||
| ENGINE_SORTING_KEY "link_id, timestamp" | ||
| ENGINE_SORTING_KEY "workspace_id, link_id, timestamp" | ||
This file was deleted.
🧩 Analysis chain
Non-nullable fields may reject NULLs from the pipe.
`workspace_id`, `domain`, and `key` are non-nullable. If the upstream join/coalesce yields NULL, inserts will fail. Either (a) make these columns Nullable, or (b) guarantee non-NULLs in the pipe (see the suggested fallbacks in `dub_lead_events_pipe.pipe`).
Option A (make columns nullable):
Option B (keep non-nullable): keep the schema as-is and adopt the pipe changes proposed earlier.
Also applies to: 11-15, 41-41
🏁 Script executed:
Length of output: 61
🏁 Script executed:
Length of output: 51
🏁 Script executed:
Length of output: 1066
🏁 Script executed:
Length of output: 1150
🏁 Script executed:
Length of output: 1493
Address NULL handling for the `workspace_id`, `domain`, and `key` fields.
The datasource declares `workspace_id`, `domain`, and `key` as non-nullable, but the upstream pipe uses `coalesce()` without a final fallback. If both the `lead_event` and `link_metadata` sources are NULL, `coalesce()` returns NULL, causing insert failures.
Choose one approach:
Option A (make columns nullable):
Option B (ensure non-NULL in pipe): Add fallback values (e.g., `coalesce(..., '')`) to guarantee non-NULL results before insert.
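For illustration, Option B could look roughly like the query below inside the pipe's SQL node. This is a minimal sketch, not the contents of `dub_lead_events_pipe.pipe`: the datasource names `dub_lead_events` and `dub_links_metadata`, the `link_id` join key, and the selected columns are assumptions made for the example. Only the `lead_event` and `link_metadata` aliases and the three column names come from the comment above.

```sql
-- Sketch only: datasource names and the join key are assumptions for
-- illustration; they are not taken from the actual pipe file.
SELECT
    lead_event.link_id AS link_id,
    -- The trailing '' is a non-NULL constant, so each expression falls back
    -- to an empty string when both sources are NULL.
    coalesce(lead_event.workspace_id, link_metadata.workspace_id, '') AS workspace_id,
    coalesce(lead_event.domain, link_metadata.domain, '') AS domain,
    coalesce(lead_event.`key`, link_metadata.`key`, '') AS `key`
FROM dub_lead_events AS lead_event              -- hypothetical raw datasource
LEFT JOIN dub_links_metadata AS link_metadata   -- hypothetical metadata datasource
    ON lead_event.link_id = link_metadata.link_id
```

The trade-off is that an empty string becomes indistinguishable from a missing value downstream, which is the cost of keeping the MV columns non-nullable.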