forked from dlt-hub/dlt
[pull] devel from dlt-hub:devel #34
Merged
master merge for 1.0.0 release
Merge broken docs fix to master
master merge for 1.1.0 release
master merge for 1.2.0 release
master merge for 1.3.0 release
master merge for 1.4.0 release
master merge for 1.4.1 release
master merge for 1.5.0 release
master merge for 1.6.0 release
master merge for 1.6.1 release
master merge for 1.7.0 release
master merge for 1.8.0 release
master merge for 1.8.1 release
master merge for 1.9.0 release
master merge for 1.10.0 release
master merge 1.11.0 release
* Upsert for iceberg * Docs adjustment * test_resolve_merge_strategy corrected for iceberg * Docs adjustments and athena iceberg test fix [ci skip] * Pyiceberg bumped, improved error messages, batching for iceberg * Test fix for filesystem iceberg
…vely None values (#2633) * Initial implementation of all none warning in json normalizer * Warning added to arrow extractor * Tests added * No pandas used in test * Tests moved to avoid module import errors * Json null column warning moved to verify_normalized_table * Sql database test, null columns now included * Changes removed from merge_column and moved to coerce_non_null_value, tests adjusted * Null cols with none data type excluded from sql jobs * Creation of sql queries now omits columns with none type * X-normalizer is now removed when type is set, data type is now not None, but not set at all * x-normalizer popped if empty * No early return in _infer_column, incomplete cols removed from prepare_load_table * root_table_column_names reverted back to original in gen_upsert_sql, test skeleton for merge disposition * Removal of null columns unified * Test fixes, column processing hint
…#2707) * Replace deprecated pkg_resources with packaging * Empty commit to trigger tests again * fix marimo e2e test --------- Co-authored-by: dave <[email protected]>
* uppercase env var * allow both upper and lower * fix marimo e2e test (reverted cherry-picked)
* adds parquet support to postgres via adbc * use selector to compute list of file formats in caps * adds docs on failing data types * adds direct test for all data types * fixes test
* does additional cast on arrow < 19 to detect null values in non null columns before writing parquet * detects concurrent writes to databricks, allows for retry * removes cffi version of psycopg2
* added constants for load_id col in _dlt_loads table * formatting
* Adding an option to set database location for Athena * athena db_location documentation * fixing formatting
* first pass formatting raise statements * added custom exception * updated Exception classes * linting & formatting * fixed tests * fix imports * capitalize argument to please mypy * add raise keyword * applied review comments * passing tests * fixed test * fixed broken check logic * format * fixes bigquery test --------- Co-authored-by: Marcin Rudolf <[email protected]>
* Psycopg2SqlClient: accept dsn options * accept dsn options (refacto) + doc + test * fix postgres doc (dsn with options) * fix: test postgres query params with options * fix: correct postgres DSN format with options * fix: unquote is unnecessary
* extended release notes 1.12.0 * add to sidebars.js * fix snippets langs * fix admonition * 1.12.1 * fix 1.12.1 * remove transformations mention * fix admonition * move it to getting started, reduce the number of titles * change critical to substantial * add what's new to the top bar * experiment with version * move release notes back to Reference, added What's new to the top bar * maybe slash helps * skip if master * skip if master didn't work, change it back * remove what's new, add new section "Release Highlights" * update icons * fix the dark icon * fix the dark icon white strokes * fix the dark icon white strokes 2 * update icons * Update slug from release_notes to release-notes * fix inactive-1 * apply Marcin comments --------- Co-authored-by: Elise Boyd <[email protected]>
* moves source state handling to extract, uses contextvars to propagate current pipe context, does not store last state in global var * implements thread pool with shutdown timeout, adds warning when threads do not join, switch default method to spawn if in orchestrator * detects prefect, dagster and marimo in telemetry * propagates pipe context in pipe iterator using contextvars * cleans up dlt.current module * enables running in wasm/pyodide * bumps for 1.12.4a0 wasm release * Update tests/common/runners/test_runners.py Co-authored-by: djudjuu <[email protected]> --------- Co-authored-by: djudjuu <[email protected]>
* extended release notes 1.12.0 * add to sidebars.js * fix snippets langs * fix admonition * 1.12.1 * fix 1.12.1 * remove transformations mention * fix admonition * move it to getting started, reduce the number of titles * change critical to substantial * add what's new to the top bar * experiment with version * move release notes back to Reference, added What's new to the top bar * maybe slash helps * skip if master * skip if master didn't work, change it back * remove what's new, add new section "Release Highlights" * update icons * fix the dark icon * fix the dark icon white strokes * fix the dark icon white strokes 2 * update icons * Update slug from release_notes to release-notes * fix inactive-1 * apply Marcin comments --------- Co-authored-by: Elise Boyd <[email protected]>
* Missing configs location hint with cli commands * Test for update last run context * Test for running drop command from wrong dir * Improved last run context impl * adds unique uri to identify run_context * passes run_context used to create pipeline and uses it to populate local state, adds more elements * uses run_dir to obtain run context run dir * extracts get_run_context_warning in exception message and covers two warning cases * Test for get_run_context_warning --------- Co-authored-by: Marcin Rudolf <[email protected]>
* add simple wasm notebook * add first version of deployment script * adds pyodide exec_info helper * small updates to the example notebook * add example page with transformations notebook into docs * fix stupid typing error * disable threading in dlt if platform without threading detected * move to playground * simplify playground notebook fix typos add tests for playground notebook * add missing marimo dependency for tests * PR reviews plus simple tests * add playground link to intro page * adds marimo wasm contributing guide * one more contributing note * move notebook deployment to own file with own rules * add comments to marimo cells
…nts (#2839) * Pyiceberg's python constraint moved from project wide deps * Python restriction removed from unnecessary places
* update playground text * Update docs/website/docs/tutorial/playground.mdx Co-authored-by: Alena Astrakhantseva <[email protected]> --------- Co-authored-by: Alena Astrakhantseva <[email protected]>
* clean marimo app strings * fix marimo tests and remove studio name in 2 more places
Co-authored-by: Katharina Lenz <[email protected]>
* Add DuckLake setup section to documentation * update: adding example of secrets.toml
* bump to latest lancedb * do not pass api-key to embedding_func, align schema for orphan deletion * bump lancedb * updated example * use pyarrow helpers in type mapper * removes code duplication from lancedb_client, moves jobs to a separate module * sets nullability, fixes schema on merge to include vector column if not added by the user, removes nullability on auto-embed columns in adapter * read vector field from config * fix nullability test hint * unit test add_vector_column * more specific ValueError parsing * no longer accept value error when opening table * schema alignment test next versions * no fusion datatype typecasting * refactor * problems with json loading * test fixes * fixes column normalization when reading existing schema * warn against orphan removal without settings * added docs * todos, check for merge-disposition * fixed missing load tests * fixed tests * fixed multiple merge keys condition * pyarrow precision types * remove unused code * added max precision in LanceDB tests * remove arrow to fsiont_tupe tests * refactor * prepare_load_table in orphan removal job * documentation update * refactor * adds method to get dict of non-default values from configuration * moves parquet and csv format configuration from data writers to destination * adds parquet format to destination caps to allow lancedb to have custom settings * adds more lancedb configs, moves connect method to credentials, allows lancedb client to be passed instead of creds * forces arrow list struct to be saved in parquet, not the parquet default * looks for row key only for merge disposition * moves fill_empty_source_column_values_with_placeholder to pyarrow helper * tests bring own vector and explicit client as credentials * ignores lancedb in mypy.ini * adds missing docs * deprecates file format configs in data writers * fix unit tests for add_vector_column * adjust example code to updated lancedb exceptions * skip lancedb example (because running on fork breaks) --------- 
Co-authored-by: Marcin Rudolf <[email protected]> Co-authored-by: MOLKA ZHANI <[email protected]>
…s star schema demonstration (#2845) * update fruitshop source with more data * fix tests
* Add Unity Catalog integration details to Iceberg docs in dlt+ * Link managed Iceberg table support. --------- Co-authored-by: Violetta Mishechkina <[email protected]>
* add lint step to check file sizes * move filesizes check to bottom
* add missing source section for filesystem readers * add basic tests for all sources
* rename flag for executing raw queries to "execute_raw_query" * return sge queries from the internal _query method which removes a lot of unneeded transpiling clean up make_transformation function tests still pending * adds some tests to readable dataset and a test for column hint merging * allows any dialect when writing queries and fixes tests * update docs and set correct quoting to queries in normalization and load stage * fixes normalizer tests * fix limit on mssql normalize aliases in normalization step * add missing quote to alias * revert identifier normalization step in normalizer_query and use bigquery compiler for bigquery destinations * post rebase fix * smallish pr fixes * add materializable sqlmodel and handle hints in extractor * add and test always_materialize setting * add test for sql transformation type * convert transformation functions to need yield instead of return * migrate tests and docs snippets to yield in transformations * add simple test for materializable model * use correct compiler for converting ibis into sqlglot for each dialect fixes on transformation test * add first simple version of using unbound ibis tables in transformations * skip ibis test on python 3.9 * fix query building in new relation * return a "real" relation from a transformation * add ibis option when getting table from dataset natively support unbound ibis tables in transformations and when getting relations from dataset * update model item format tests to use relation * * remove one unneeded test (same thing is already tested in transformations) * fix wei conversion in lineage * adds support for adding resource hints to pyarrow items * switch most read access tests to default dataset * update datasets and transformations docs pages * separate ibis and default dbapi datasets and fix typing * update transformation tests and small typing fixes for updated datasets * fix default dataset type * fix wei sqlglot conversion * add sqlglot dialect type and some cleanup * 
fix dataset snippets * fix sqlglot schema test * removes ibis relation and dataset consolidates relation and dataset baseclasses with implementations updates interfaces/protocols for relation and dataset and makes those the publicly available interface with "Relation" and "Dataset" remove query method from relation interface * fix one doc snippet * rename dataset and relation interfaces * fix relationship between cursor and relation, remove function wiring hack in favor of explicit forwarding for better typing * clean up readablerelation (no actual code changes) * fix str test to assume pretty sql (which it is now) fix one transformation snippet * small changes from review comments: * query method on dataset * typing update of table method * rename query method to "to_sql" on relation * clean up transform function a bit (could maybe be even better) reject non-sql strings in transformation to not shadow errors * add support for "non-generator" transformations * move hints computation into resource class * smallish PR fixes * add support for dynamic hints in transformations -> this allows to have multiple relations with different schemas in the relation, so this is allowed now too * fixes dynamic table caching * Enhances ReadableDBAPIRelation: min/max, filter with expression (#2833) * Min max, filter with expr_or_string * Fix in min max test * Overload fix and docs * Test read interfaces partially uses default relation max * prevent sqlglot schema from adding default hints info, only allow parametrized types and don't supply hints if none are present in dlt schema * make multi schema transformations work again * move model item format tests to transformations folder * re-order interface tests and fix playground dataset access * PR review test updated * update dataset and transformation pages * update transformations tests to new fruitshop * Last PR fixes * update columns_schema property --------- Co-authored-by: Marcin Rudolf <[email protected]> Co-authored-by: 
anuunchin <[email protected]>
…ss (#2854) * do not run lancedb custom destination example test on forked subprocess * use lancedb connection correctly
…2871) * Added troubleshooting steps for Databricks and updated documentation for using OneLake as a destination with the filesystem. * Updated
* add dataset migration guide * Update docs/website/docs/general-usage/dataset-access/dataset.md Co-authored-by: anuunchin <[email protected]> --------- Co-authored-by: anuunchin <[email protected]>
note: docs tests rely on postgres credentials at the moment, so they did not pass; linting and type checking of this new snippet passed.
) * adds dlt workspace extra, updates exception and github workflows * renames app from "marimo app" to "pipeline dashboard" updates --marimo flag to --dashboard * rename studio folders to dashboard * removes all other references to studio * exclude lockfile and markdown files from lfs * update workspace extra dependency versions * bump version
See Commits and Changes for more details.
Created by pull[bot] (v2.0.0-alpha.1)