@rudolfix rudolfix commented May 26, 2025

Description

changes in transformations:

  • fixes transformation decorator overloads so they have correct typing
  • passes TransformationConfiguration as base spec so buffer is always injected
  • wraps tranformation_function
  • makes str SQL a model
  • tests configurations and parametrized transformations

changes in resources

  • allows resources to return a value; this is the same as yielding once
  • allows base spec to be passed to resource function
  • makes DltResource and SourceFactory wrap the decorated function and fixes signatures
  • allows inner resources to be injectable, warns for transformers
  • normalizes and tests how functions are wrapped and unwrapped so signatures and configs are available
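
A minimal sketch of the "return is the same as yielding once" rule, written in plain Python rather than against dlt itself. `iterate_items` is a hypothetical helper, not part of dlt's API; it only illustrates how a returned value can be treated as a single yielded item.

```python
import inspect
from typing import Any, Callable, Iterator


def iterate_items(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Iterator[Any]:
    """Hypothetical helper: treat a plain return value like a single yield."""
    result = func(*args, **kwargs)
    if inspect.isgenerator(result):
        # generator functions are iterated as usual
        yield from result
    else:
        # a returned value is emitted once, as if it had been yielded
        yield result


def yielding() -> Iterator[list]:
    yield [1, 2, 3]


def returning() -> list:
    return [1, 2, 3]


items_from_yield = list(iterate_items(yielding))
items_from_return = list(iterate_items(returning))
```

Under this assumption both functions produce the same single item, so downstream code does not need to know which style the author used.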

merges standalone resources with regular resources

  • all are DltResource
  • we generate the correct typing for __call__! (huge QoL improvement IMO)
  • all resources can be configured, including inner resources and default args; previously only standalone resources could. this unifies config injection behavior for resources and sources. we should mention this in the release notes
  • resources can return another resource if they have DltResource in their type annotation
  • resources can be renamed with lambda names; sections can also be renamed

other changes
read commit list carefully

I'm aware of one config test that is not passing. I'll fix it after merge

rudolfix and others added 20 commits May 21, 2025 11:04
* changes contrib and README

* Apply suggestions from code review

Co-authored-by: Anton Burnashev <[email protected]>

---------

Co-authored-by: Anton Burnashev <[email protected]>
…ion as base spec so buffer is always injected (3) wraps tranformation_function (4) makes str SQL a model (5) tests configurations and parametrized transformations
…also functions (3) allows base spec to be passed to resource function (4) makes DltResource and SourceFactory to wrap decorated function and fixes signatures (5) allows inner resources to be injectable, warns for transformers (6) normalizes and tests how functions are wrapped and unwrapped so signatures and configs are available
…rom providers but explicit cannot. if those were instances of base configurations, behavior was inconsistent (explicit values were treated like defaults). also if native value is found for a config and it does not accept native values, config resolution will fail, previously it was ignored
…sources (2) we generate the correct types for __call__! (3) all resources can be configured including inner resources and including default params, previously only standalone could. that unifies behavior for resources and sources re. config injection (4) resources can return another resources if have DltResource in type annotation (5) resources can be renamed with lambda names also sections can be renamed
@rudolfix rudolfix requested a review from sh-rp May 26, 2025 11:28
@sh-rp sh-rp merged commit 0b4326c into feat/2527-transformations May 28, 2025
if rv is None:
    raise InvalidResourceDataTypeIsNone(name, rv, NoneType)
# is it Pipe or resource
if hasattr(rv, "_gen_idx") or hasattr(rv, "_pipe"):

do we have users that rely on this working? maybe we should add a warning in the release notes. I have never seen this, but it makes sense that some people would try.

section: Optional[str] = None,
) -> Any:
"""
Decorator to mark a function as a transformation. Returns a DltTransformation object.

ok, yes, we need to add proper docstrings (and add this class to the docstrings linter)

)

# build transformation function
@wraps(func)

cool :)
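
For context on why `@wraps(func)` matters here: `functools.wraps` records the original function in `__wrapped__`, which lets `inspect.signature` and `inspect.unwrap` see through the wrapper. A minimal sketch using only the standard library; `traced` and `orders` are hypothetical names chosen for illustration, not code from this PR.

```python
import inspect
from functools import wraps
from typing import Any, Callable


def traced(func: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical decorator used only for this illustration."""

    @wraps(func)  # copies __name__ and __doc__, sets __wrapped__
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        return func(*args, **kwargs)

    return wrapper


def orders(limit: int = 100) -> str:
    """Build a query for the orders table."""
    return f"SELECT * FROM orders LIMIT {limit}"


# two layers of wrapping, as can happen with stacked decorators
wrapped_orders = traced(traced(orders))

# inspect.unwrap follows the __wrapped__ chain back to the original
unwrapped = inspect.unwrap(wrapped_orders)

# inspect.signature follows __wrapped__ by default
signature = inspect.signature(wrapped_orders)
```

Without `@wraps`, the signature of `wrapped_orders` would be reported as `(*args, **kwargs)`, and any machinery that derives configuration from the signature would lose the parameter names and defaults.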

):
resolved_transformation_type = "model"
)
# if we cannot reach the destination, or a running outside of a pipeline, we extract frames

comment is out of date

)


class InvalidResourceReturnsResource(InvalidResourceDataType):

ok, this is good :) but we should still put it in the release notes I think

assert "100" in query


def test_yield_tables_fallback() -> None:

Is this not tested in another place already? I will check.

# set default value which is cred class instance
dest_config.credentials = ConnectionStringCredentials("mysql+pymsql://USER@/dlt_data")
config = dest.configuration(dest_config)
# will be able to merge and resolve credentials

this is very cool! I also think though that this changed config behavior will have to go into the release notes properly.
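
The changed config behavior is summarized in the commit list as "default values can be overridden from providers but explicit cannot". A simplified sketch of that precedence, assuming a flat key lookup; `resolve_value`, the `providers` mapping, and the key names are hypothetical and do not reflect dlt's actual resolver.

```python
from typing import Any, Mapping

_MISSING = object()  # sentinel: distinguishes "not given" from None


def resolve_value(
    key: str,
    provider_values: Mapping[str, Any],
    *,
    explicit: Any = _MISSING,
    default: Any = _MISSING,
) -> Any:
    """Hypothetical resolver: explicit value, then provider, then default."""
    if explicit is not _MISSING:
        # explicit values are never overridden by providers
        return explicit
    if key in provider_values:
        # a provider value replaces the default
        return provider_values[key]
    if default is not _MISSING:
        return default
    raise KeyError(f"no value found for {key!r}")


providers = {"credentials": "mysql://from_provider"}

from_default = resolve_value("credentials", providers, default="mysql://default")
from_explicit = resolve_value("credentials", providers, explicit="mysql://explicit")
untouched_default = resolve_value("limit", providers, default=5000)
```

Here `from_default` picks up the provider value, `from_explicit` keeps the explicit one, and `untouched_default` falls back to the default because no provider supplies that key.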

incremental: Optional[TIncrementalConfig] = None,
_impl_cls: Type[TDltResourceImpl] = DltResource, # type: ignore[assignment]
- section: Optional[str] = None,
+ section: Optional[TTableHintTemplate[str]] = None,

This is a confusing type, I know how it works, but to the user it does not make sense to have it named like this here. Maybe it should be something like ResolvableHint or ResolvableValue.
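
To make the discussion concrete, here is a sketch of what such a type could look like, assuming it means "either a plain value or a function that computes the value from a data item". `ResolvableValue` borrows the name suggested above and `resolve_hint` is hypothetical; neither is taken from dlt.

```python
from typing import Any, Callable, TypeVar, Union

T = TypeVar("T")

# assumption: a hint is either a concrete value or a callable taking a data item
ResolvableValue = Union[T, Callable[[Any], T]]


def resolve_hint(hint: ResolvableValue[T], item: Any) -> T:
    """Hypothetical helper: return the hint itself or the result of calling it."""
    if callable(hint):
        # dynamic hint: compute the value from the data item
        return hint(item)
    # static hint: use the value as given
    return hint


static_section = resolve_hint("sources", {"tenant": "eu"})
dynamic_section = resolve_hint(lambda item: item["tenant"], {"tenant": "eu"})
```

A name like `ResolvableValue` tells the reader that the argument may be resolved later, which is not obvious from a name that mentions table hints on a `section` parameter.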

sh-rp added a commit that referenced this pull request May 30, 2025
* triggers devel tests

* fixed malformed docstring

* use native sqlglot type annotation

* pass hints via SQLGlot metadata

* fix linter errors and tests

* fix a few more tests and edge cases

* fix bug in lineage

* enable columns schema for both ReadableRelation Types

* add more tests and make lineage tests independent from loading

* add lineage tests for all sql destinations

* enable tests on ci and disable column schema for sqlalchemy for now

* fix some more tests

* add sqlalchemy hack

* first fix for snowflake and some smaller changes and clarifications

* fix sqlglot schema creation, makes clickhouse work

* re-add transformations tests folder

* fix lineage datatype

* disable databricks and synapse ibis backend tests

* move transformation code from prototype excluding old lineage and including updates so that linter passes, no real code changes yet.

* fix some of the python extractor based transformations

* fix most tests

* make basic transformation tests run on all destinations

* enable all current transformation tests for all destinations
run some duckdb transformations on all OSes

* a little bit of cleanup

* move common transactions and mark all destination transaction tests as essential for now

* Add improvements from review in prototype PR and some cleanup

* exclude dremio

* fix some transformations tests

* fix row_counts for snowflake and add some comments

* converts SupportsReadableRelation to an ABC

* add scalar access to SupportsReadableRelation

* simplify transformation signature

* add top level dlt objects and some small changes

* second part of removing transformation extra args

* add clickhouse tests

* add config based transformation source

* add better transformation examples

* use fruitshop template for testing

* remove custom row_counts method in favor of "global" test one

* first draft of transformations doc

* some work on the docs page

* feat: 2540 lineage `allow_unknown_columns` and `allow_anonymous_columns` (#2577)

* test compute_columns_schema() and exception handling

* convert transformation code examples to snippets

* finish first round of transformation docs

* Quite a few PR fixes

* fixes some tests

* add support and docs for dataframe and arrow operations

* add config and fallback if destination not reachable

* fix scalar method
fallback to models if pipeline destination is not available

* hopefully fix one test

* Docs: addition of normalizer behaviour to transformations docs (#2639)

* Normalizer info added

* Unnecessary paragraph removed, regular normalization linked

* feat: 2540 - SQLGlot type mapping (#2587)

* fixes some tests

* post rebase cleanup

* renamed kwarg

* type handling done; WIP

* sqlglot-dlt type mapping completed

* added docstrings to tests

* removed unused test file

* attach metadata to DataType

* refactored test to parameterized form

* refactor function names

* bug fix .to_py()

* rename compute_columns_schema() kwargs

* refactor type conversion branches

* fixes some tests

* add support and docs for dataframe and arrow operations

* add config and fallback if destination not reachable

* fix scalar method
fallback to models if pipeline destination is not available

* fix: update return type in athena_adapter docstring to reflect correct destination (#2599)

* list secrets in vault config provider to avoid calls to backend (#2597)

* fixes bug where configuration section was not propagated when embedded configuration is resolved

* splits vault provider settings per vault type

* adds option to list secrets to vault and google secrets provider

* uses google secrets provider with global cache for tests

* documents vault provider

* test and docs fixes

* slightly clarify clickhouse docs (#2594)

* slightly clarify clickhouse docs

* Update clickhouse.md

* Extract dataset code snippets into tests snippets system (#2598)

* extracts dataset code blocks into tested snippets and uses fruitshop pipeline as base dataset for demonstration purposes

* add ibis group

* Enabling 'model' loader_file_format for athena, synapse and dremio (#2556)

* Athena model loader format initial support

* test_verify_capabilities_data_types adjusted for athena

* Synapse enabled

* The offset logic for tsql made unreachable

* Athena test config without iceberg removed, dremio added

* Unnecessary synapse workaround removed

* fix some typos in cursor-restapi docs (#2608)

* fix some typos in cursor-restapi docs

* fix typo

* refactor init-command for use in dlt project (#2568)

* refactor init-command for use in dlt project

* remove config.toml from project docs

* fix ibis mypy error

---------

Co-authored-by: dave <[email protected]>

* docs: Fix incorrect nesting in secrets.toml (#2614)

* fixes parquet data writer settings docs & rewrites configuration docs (#2583)

* fixes parquet data writer settings docs

* adds section to dlt resource decorator

* fixes and tests how config sections are created when single resource is extracted

* fixes config sections for parallel doc example

* exports postgres adapter

* rewrites configuration docs, moves a few docs sections in sidebar

* snippet fixes

* accepts docs changes from review

Co-authored-by: Violetta Mishechkina <[email protected]>

* adds tip how to eject core source

* linter fixes

---------

Co-authored-by: Violetta Mishechkina <[email protected]>

* enables fsspec per-thread instance cache and updates documentation (#2621)

* bumps pendulum and docs (#2624)

* fixes sql database docstrings and docs

* bumps poetry to 3.0.1 and drop dlt poetry

* Added dedup sort example (#2235)

* Added dedup sort example

* Updated formatting

* Updated

* Updated

* Update docs/website/docs/general-usage/incremental-loading.md

---------

Co-authored-by: Alena Astrakhantseva <[email protected]>
Co-authored-by: Marcin Rudolf <[email protected]>

* Docs: add advanced project tutorial (#2338)

* hopefully fix one test

* trigger ci

* improve tests, lint

---------

Co-authored-by: David Scharf <[email protected]>
Co-authored-by: Anton Burnashev <[email protected]>
Co-authored-by: rudolfix <[email protected]>
Co-authored-by: anuunchin <[email protected]>
Co-authored-by: hsm207 <[email protected]>
Co-authored-by: djudjuu <[email protected]>
Co-authored-by: Alexander Grueneberg <[email protected]>
Co-authored-by: Violetta Mishechkina <[email protected]>
Co-authored-by: dat-a-man <[email protected]>
Co-authored-by: Alena Astrakhantseva <[email protected]>

* qualify all queries that come into the transformations

* fix lineage for snowflake and clickhouse lineage

* apply schema fix for sqlglot and remove special treatment of snowflake

* align datasets interfaces with ibis implementation ["col"] selects column and not table with one column

* disable incremental on transformations decorator and warn if incremental args are discovered

* fixes one more test

* fixes snowflake tests after sqlglot schema fix

* removes standalone resources, fixes transformation function wrapping (#2684)

* changes contrib and README (#2666)

* changes contrib and README

* Apply suggestions from code review

Co-authored-by: Anton Burnashev <[email protected]>

---------

Co-authored-by: Anton Burnashev <[email protected]>

* raises if resolving dataclass without configspec

* adds function type inspect that follows wrappers

* removes make fun, uses wraps

* adds conftest to transformations

* (1) fixes tranformation overloads (2) passes TransformationConfiguration as base spec so buffer is always injected (3) wraps tranformation_function (4) makes str SQL a model (5) tests configurations and parametrized transformations

* (1) removes resources returning resources (2) allows resources to be also functions (3) allows base spec to be passed to resource function (4) makes DltResource and SourceFactory to wrap decorated function and fixes signatures (5) allows inner resources to be injectable, warns for transformers (6) normalizes and tests how functions are wrapped and unwrapped so signatures and configs are available

* normalizes config resolve behavior: default values can be overridden from providers but explicit cannot. if those were instances of base configurations, behavior was inconsistent (explicit values were treated like defaults). also if native value is found for a config and it does not accept native values, config resolution will fail, previously it was ignored

* do not use config specs cached in module when creating autospecs

* fixes venv tests when uv is present

* if incremental parses from another incremental as native value, it copies original type correctly

* merges standalone resources with regular resources: (1) all are DltResources (2) we generate the correct types for __call__! (3) all resources can be configured including inner resources and including default params, previously only standalone could. that unifies behavior for resources and sources re. config injection (4) resources can return another resources if have DltResource in type annotation (5) resources can be renamed with lambda names also sections can be renamed

* fixes transformation decorators so they generate correct typing

* binds params to resource function instead of using defaults to avoid generating config injection in rest_api

* removes remaining full_refresh flags

* fixes Makefile commands to run common and local destination tests

* fixes xdg home test

* fixes venv tests for uv

* linter and docstring fixes

---------

Co-authored-by: Anton Burnashev <[email protected]>

* allows for initial values that are configurations also in case no native initial values are supported

* fixes docs linting

* Outer select quotes columns (#2694)

* fix normalizer tests

* fix a few small tests

* remove dependency on ibis for common tests (not supported on python 3.13)

* fixes for python 3.9

* fix sqlglot schema propagation and retrieval

* fixes leaking sqlalchemy credentials into other test

* skip not materialized columns in sqlglot schema generation

---------

Co-authored-by: Marcin Rudolf <[email protected]>
Co-authored-by: zilto <[email protected]>
Co-authored-by: Thierry Jean <[email protected]>
Co-authored-by: anuunchin <[email protected]>
Co-authored-by: Anton Burnashev <[email protected]>
Co-authored-by: hsm207 <[email protected]>
Co-authored-by: djudjuu <[email protected]>
Co-authored-by: Alexander Grueneberg <[email protected]>
Co-authored-by: Violetta Mishechkina <[email protected]>
Co-authored-by: dat-a-man <[email protected]>
Co-authored-by: Alena Astrakhantseva <[email protected]>
lucargir pushed a commit to lucargir/dlt that referenced this pull request Jun 6, 2025
@rudolfix rudolfix deleted the fix/fixes-transformer-function-wrapping branch June 6, 2025 12:09
dat-a-man added a commit that referenced this pull request Jun 24, 2025
Co-authored-by: Violetta Mishechkina <[email protected]>
Co-authored-by: dat-a-man <[email protected]>
Co-authored-by: Alena Astrakhantseva <[email protected]>

* qualify all queries that come into the transformations

* fix lineage for snowflake and clickhouse

* apply schema fix for sqlglot and remove special treatment of snowflake

* align datasets interfaces with ibis implementation: ["col"] selects a column and not a table with one column

* disable incremental on transformations decorator and warn if incremental args are discovered

* fixes one more test

* fixes snowflake tests after sqlglot schema fix

* removes standalone resources, fixes transformation function wrapping (#2684)

* changes contrib and README (#2666)

* changes contrib and README

* Apply suggestions from code review

Co-authored-by: Anton Burnashev <[email protected]>

---------

Co-authored-by: Anton Burnashev <[email protected]>

* raises if resolving dataclass without configspec

* adds function type inspect that follows wrappers

* removes make fun, uses wraps

* adds conftest to transformations

* (1) fixes transformation overloads (2) passes TransformationConfiguration as base spec so buffer is always injected (3) wraps tranformation_function (4) makes str SQL a model (5) tests configurations and parametrized transformations

* (1) removes resources returning resources (2) allows resources to also be functions (3) allows base spec to be passed to resource function (4) makes DltResource and SourceFactory wrap the decorated function and fixes signatures (5) allows inner resources to be injectable, warns for transformers (6) normalizes and tests how functions are wrapped and unwrapped so signatures and configs are available

* normalizes config resolve behavior: default values can be overridden from providers but explicit values cannot. if those were instances of base configurations, behavior was inconsistent (explicit values were treated like defaults). also, if a native value is found for a config that does not accept native values, config resolution now fails; previously it was ignored

* do not use config specs cached in module when creating autospecs

* fixes venv tests when uv is present

* if incremental parses from another incremental as native value, it copies the original type correctly

* merges standalone resources with regular resources: (1) all are DltResources (2) we generate the correct types for __call__! (3) all resources can be configured including inner resources and including default params, previously only standalone could. that unifies behavior for resources and sources re. config injection (4) resources can return other resources if they have DltResource in the type annotation (5) resources can be renamed with lambda names; sections can also be renamed

* fixes transformation decorators so they generate correct typing

* binds params to resource function instead of using defaults to avoid generating config injection in rest_api

* removes remaining full_refresh flags

* fixes Makefile commands to run common and local destination tests

* fixes xdg home test

* fixes venv tests for uv

* linter and docstring fixes

---------

Co-authored-by: Anton Burnashev <[email protected]>
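
The wrapping commits above ("adds function type inspect that follows wrappers", "removes make fun, uses wraps", "allows resources to also be functions") build on the standard `functools.wraps` and `inspect.unwrap` mechanism. Below is a minimal stdlib-only sketch of that idea, not dlt's actual implementation. The names `as_resource` and `fetch_items` are hypothetical and exist only for illustration.

```python
import functools
import inspect


def as_resource(func):
    """Hypothetical decorator: wraps `func` and treats a plain return as a single yield."""

    @functools.wraps(func)  # copies __name__ and __doc__, and sets __wrapped__
    def wrapper(*args, **kwargs):
        if inspect.isgeneratorfunction(func):
            # generator functions are passed through item by item
            yield from func(*args, **kwargs)
        else:
            # a function that returns is handled like a generator that yields once
            yield func(*args, **kwargs)

    return wrapper


@as_resource
def fetch_items(limit: int = 3) -> list:
    """Returns all items at once instead of yielding them."""
    return list(range(limit))


# inspect.signature follows __wrapped__, so the original signature stays visible
signature = inspect.signature(fetch_items)
# inspect.unwrap walks the __wrapped__ chain back to the undecorated function
original = inspect.unwrap(fetch_items)
# the returned list arrives as one yielded item
items = list(fetch_items(limit=2))
```

Because the wrapper keeps `__wrapped__`, tooling that inspects the decorated object (type checkers aside) can still recover the original parameters, which is presumably what makes signatures and configs available after decoration.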

* allows for initial values that are configurations also in case no native initial values are supported

* fixes docs linting

* Outer select quotes columns (#2694)

* fix normalizer tests

* fix a few small tests

* remove dependency on ibis for common tests (not supported on python 3.13)

* fixes for python 3.9

* fix sqlglot schema propagation and retrieval

* fixes leaking sqlalchemy credentials into other test

* skip not materialized columns in sqlglot schema generation

---------

Co-authored-by: Marcin Rudolf <[email protected]>
Co-authored-by: zilto <[email protected]>
Co-authored-by: Thierry Jean <[email protected]>
Co-authored-by: anuunchin <[email protected]>
Co-authored-by: Anton Burnashev <[email protected]>
Co-authored-by: hsm207 <[email protected]>
Co-authored-by: djudjuu <[email protected]>
Co-authored-by: Alexander Grueneberg <[email protected]>
Co-authored-by: Violetta Mishechkina <[email protected]>
Co-authored-by: dat-a-man <[email protected]>
Co-authored-by: Alena Astrakhantseva <[email protected]>
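
The config-resolution commit in the list above states that default values can be overridden from providers while explicitly passed values cannot. A toy sketch of that precedence follows, assuming a plain dict as a stand-in for a config provider. `resolve_value`, `PROVIDER` and `_MISSING` are hypothetical names, and this is not dlt's resolver.

```python
from typing import Any, Dict

_MISSING = object()  # sentinel meaning "the caller passed nothing explicitly"

# hypothetical provider, standing in for config files or environment variables
PROVIDER: Dict[str, Any] = {"batch_size": 500}


def resolve_value(key: str, default: Any, explicit: Any = _MISSING) -> Any:
    """Explicit value wins; otherwise the provider may override the default."""
    if explicit is not _MISSING:
        return explicit
    return PROVIDER.get(key, default)


# default is overridden because the provider has a value for the key
from_provider = resolve_value("batch_size", default=100)
# explicit value is kept even though the provider has a value
from_explicit = resolve_value("batch_size", default=100, explicit=50)
# provider has no value, so the default is used
from_default = resolve_value("timeout", default=30)
```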


3 participants