Conversation

@Gandalfdore (Contributor) commented Jan 23, 2025

This pull request includes changes to the data.csv file and the kdp/processor.py file to add new features and improve error handling. The most important changes are summarized below:

Data updates:

  • data.csv: Added new rows with various features and values to the dataset.

Error handling improvements:

  • kdp/processor.py: Enhanced the _add_pipeline_text method to validate required statistics for text features and raise a ValueError if any required stats are missing or invalid. This change ensures better error handling and provides clear error messages.
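The validation described above could look roughly like the sketch below. This is not the code from kdp/processor.py: the helper name `validate_text_stats` and the stat keys `vocab` and `sequence_length` are assumptions chosen for illustration, and the real `_add_pipeline_text` method may check different statistics.

```python
from typing import Any, Mapping

# Hypothetical stat names; the keys actually required by kdp/processor.py may differ.
REQUIRED_TEXT_STATS = ("vocab", "sequence_length")


def validate_text_stats(feature_name: str, stats: Mapping[str, Any]) -> None:
    """Raise ValueError if a required text statistic is missing or invalid."""
    # First report every missing key at once, so the message is actionable.
    missing = [key for key in REQUIRED_TEXT_STATS if key not in stats]
    if missing:
        raise ValueError(
            f"Text feature '{feature_name}' is missing required stats: {missing}"
        )

    # The vocabulary must be a non-empty list or tuple of tokens.
    vocab = stats["vocab"]
    if not isinstance(vocab, (list, tuple)) or len(vocab) == 0:
        raise ValueError(
            f"Text feature '{feature_name}' has an invalid 'vocab': "
            "expected a non-empty list of tokens"
        )

    # The sequence length must be a positive integer (bool is rejected explicitly).
    seq_len = stats["sequence_length"]
    if isinstance(seq_len, bool) or not isinstance(seq_len, int) or seq_len <= 0:
        raise ValueError(
            f"Text feature '{feature_name}' has an invalid 'sequence_length': "
            f"expected a positive integer, got {seq_len!r}"
        )
```

Failing early with the feature name in the message means a missing statistic surfaces when the pipeline is built, rather than as an opaque error later inside the model graph.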

@piotrlaczkowski (Collaborator) left a comment

Some changes are still needed. I also think each test case should check that the model serializes and deserializes correctly (save, then load and predict). We need to be sure that every part of the graph has well-defined parameters; you can check this in the model visualization. If the parameters of any node are not well defined, the model will fail to deserialize.

@piotrlaczkowski piotrlaczkowski merged commit 84bfdc6 into UnicoLab:main Jan 27, 2025
1 of 2 checks passed
github-actions bot pushed a commit that referenced this pull request Mar 31, 2025
## 1.10.0 (2025-03-31)

* ops(KDP): fixing action wrapper version ([cae2785](cae2785))
* ops(KDP): fixing actions and node versions ([28cd93b](28cd93b))
* ops(KDP): fixing semantic release version problem ([c6b612e](c6b612e))
* ops(KDP): improving release workflows ([3f91e19](3f91e19))
* ops(KDP): increasing python version allowance ([1d06b76](1d06b76))
* ops(KDP): increasing python version allowance ([6070e69](6070e69))
* ops(KDP): relaxing python requirements ([fa903e4](fa903e4))
* docs(KDP): add an example_usages ([e980e12](e980e12))
* docs(KDP): add to the docs to showcase new features ([790f102](790f102))
* docs(KDP): added a diagram ([48d2e57](48d2e57))
* docs(KDP): added docs ([d8da4c5](d8da4c5))
* docs(KDP): added docs ([dd728ef](dd728ef))
* docs(KDP): adding concrete examples of usage of advance features to docs (#21) ([ea419cf](ea419cf)), closes [#21](#21)
* docs(KDP): adding new styling ([55ae7ce](55ae7ce))
* docs(KDP): adding some images and API documentation tools ([a4be43f](a4be43f))
* docs(KDP): adjusting numerical embeddings docs ([45f8872](45f8872))
* docs(KDP): fixing missing links ([076c9b3](076c9b3))
* docs(KDP): improving documentation ([02a137a](02a137a))
* docs(KDP): improving documentation ([91d4f85](91d4f85))
* docs(KDP): improving some remaining docs ([0c932b0](0c932b0))
* docs(KDP): refactoring documentation ([1ccc3f2](1ccc3f2))
* docs(KDP): removing what we do not have ([93706fb](93706fb))
* docs(KDP): reorganising documentation for better UX ([a67c634](a67c634))
* docs(KDP): reorganising documentation for better UX ([052454e](052454e))
* docs(KDP): revamping entire docs ([d0ef7b7](d0ef7b7))
* docs(KDP): smart processing for custom pipelines ([8e7d0a7](8e7d0a7))
* docs(KDP): updating DistributionEncoder docs ([1916c40](1916c40))
* feat(KDP): add the AdvancedNumericalEmbedding feature ([8fa90e7](8fa90e7))
* feat(KDP): added option to specify the distribution manually ([d3cce76](d3cce76))
* feat(KDP): added selective retention of outputs based on dependencies among layers ([c17acd2](c17acd2))
* feat(KDP): adding DistributionAwareEncoder layer and numeric preprocessing ([9bfe276](9bfe276))
* feat(KDP): adding auto config / recommender ([00e75d6](00e75d6))
* feat(KDP): adding categorical features hashing ([078df9d](078df9d))
* feat(KDP): adding DistributionAwareEncoder layer for numeric features preprocessing. (#20) ([c988087](c988087)), closes [#20](#20)
* feat(KDP): adding feature selection mechanism to the preprocessor (docs, tests) (#19) ([462bfc1](462bfc1)), closes [#19](#19)
* feat(KDP): adding Gated Residual Variable / Features Selection Network capability ([f82b788](f82b788))
* feat(KDP): adding MoE feature and tests ([fac7806](fac7806))
* feat(KDP): adding numerical embedding layers (#26) ([5c3a974](5c3a974)), closes [#26](#26)
* feat(KDP): adding passthrough feature ([2965916](2965916))
* feat(kdp): adding TabularAttentionLayers and implementation ([f567585](f567585))
* feat(kdp): adding TabularAttentionLayers and implementation (#11) ([cfbd38b](cfbd38b)), closes [#11](#11)
* feat(KDP): Enhance Dynamic Preprocessing Pipeline (#24) ([bd90f11](bd90f11)), closes [#24](#24)
* feat(KDP): global embedding for numeric features option added ([83f6996](83f6996))
* feat(KDP): Integrate Advanced Numerical Embedding (#25) ([185292c](185292c)), closes [#25](#25)
* feat(KDP): smart processing for custom pipelines ([448f63f](448f63f))
* fix(KDP): add new examples for tabular attention cases and more complex Mixed Transformers and Tabul ([16340f2](16340f2))
* fix(KDP): add transdormer() method to ProcessingModel ([0c6c65c](0c6c65c))
* fix(KDP): added missing code for the example for distribution aware layer for custom pipelines ([9ca9fad](9ca9fad))
* fix(KDP): added missing code for the example for distribution aware layer for custom pipelines ([9b92475](9b92475))
* fix(KDP): added docstrings ([0eb968d](0eb968d))
* fix(KDP): added fixes for the distribution estimator and tests ([da79bb9](da79bb9))
* fix(KDP): Added get_feature_importances() method and fixed the docs. ([664023f](664023f))
* fix(KDP): added prefered_distribution parameter for NumericalFeatures ([84b2eb5](84b2eb5))
* fix(KDP): adding FeatureSelection to Text and Date features (#28) ([e1f453f](e1f453f)), closes [#28](#28)
* fix(kdp): adding pre-commit fixes ([ec98d29](ec98d29))
* fix(KDP): broke transform method into 2 separate methods and end-to-end tests ([11f258d](11f258d))
* fix(KDP): changed the order of the transformers and the tabularAttention applications ([d387826](d387826))
* fix(KDP): DistributionAwareEncoder fix and tests for custom pipelines (#23) ([ad91096](ad91096)), closes [#23](#23)
* fix(KDP): edited some of the tests to reflect the changes in processor.py ([23b36ce](23b36ce))
* fix(KDP): edited the docs ([34476c6](34476c6))
* fix(KDP): Fix linter errors: unused variable and rearranged imports ([2c2a447](2c2a447))
* fix(KDP): Fix remaining unused imports with ruff ([fe3a014](fe3a014))
* fix(KDP): fix_tabukar_att_and_transfor_order_and_add_docs (#17) ([4b1c510](4b1c510)), closes [#17](#17)
* fix(KDP): fixed all the algorithms for distribution detection all tests pass now ([52dad69](52dad69))
* fix(KDP): Fixed dimensions mismatch for the input of the Tabular Attention ([5a66ad1](5a66ad1))
* fix(KDP): fixed issues between graph and eager mode plus others ([a3fe7e1](a3fe7e1))
* fix(KDP): fixes to the tests ([bc0c543](bc0c543))
* fix(KDP): Fixing Distribution-Aware Encoder and adding comprehensive testing (#22) ([97b41c3](97b41c3)), closes [#22](#22)
* fix(kdp): fixing docs requirements and release for docs ([9d8f8b3](9d8f8b3))
* fix(KDP): fixing failing tests ([966434d](966434d))
* fix(kdp): fixing formatting issues ([352e72b](352e72b))
* fix(KDP): fixing layers functionality ([ffe8d89](ffe8d89))
* fix(kdp): fixing tests formatting ([955ed08](955ed08))
* fix(KDP): fixing the doc ([fdaa101](fdaa101))
* fix(KDP): improving docs UX ([f51daf8](f51daf8))
* fix(KDP): reformatting with pre-commits ([2f01e67](2f01e67))
* fix(KDP): reformatting with pre-commits ([2523f89](2523f89))
* fix(KDP): removed an unused method ([a170db8](a170db8))
* fix(KDP): Removed some buggy feature ([da24b7b](da24b7b))
* fix(KDP): Removing data.csv ([3ca0daf](3ca0daf))
* fix(KDP): small fixes ([47a6267](47a6267))
* test(KDP): add end to end and unit test for the "TabularAttention" and the "MultiResolutionTabularAt ([1cf5d09](1cf5d09))
* test(kdp): add end to end tests ([a1b3018](a1b3018))
* test(kdp): add end to end tests (#13) ([5737d9b](5737d9b)), closes [#13](#13)
* test(KDP): add more tests ([a4d536c](a4d536c))
* test(KDP): add more tests (#15) ([84bfdc6](84bfdc6)), closes [#15](#15)
* test(KDP): add tests ([5cb4e8e](5cb4e8e))
* test(KDP): add tests ([55cbbb3](55cbbb3))
* test(KDP): add tests and fix (#14) ([5308029](5308029)), closes [#14](#14)
* test(KDP): add tests for various cases and also a ValueError for missing vocab scenario ([82a99a3](82a99a3))
* test(KDP): add unit test for gated residual network ([1d980e0](1d980e0))
* test(KDP): added test for advanced features ([d4fc5f3](d4fc5f3))
* test(KDP): added test for advanced features ([c18a59b](c18a59b))
* test(KDP): adding passthrough tests ([f98dc2c](f98dc2c))
* test(KDP): dummy ([4181bb3](4181bb3))
* test(KDP): dummy commit ([2883d79](2883d79))
* test(KDP): dummy commit ([1f1e35b](1f1e35b))
* test(KDP): empty commit for testing ([9b0f386](9b0f386))
* test(KDP): extending tests for preprocessor module ([2e23b3d](2e23b3d))
* refactor(KDP): improving auto configuration functionality and UX ([7b76a99](7b76a99))
* refactor(KDP): maintenance on preprocessor to optimize code and refactor ([7146afe](7146afe))
* refactor(KDP): removing tf-proba dependency ([83cb73d](83cb73d))
* refactor(KDP): splitting custom_layers ([f029f77](f029f77))
* refactor(KDP): splitting layers into separate files ([c188267](c188267))
* refactor(KDP): splitting more tests for layers ([84293f0](84293f0))
* feat(validation): add day of the month add assertions and error handling ([fa88c24](fa88c24))
* fix: update distribution aware encoder and tests ([62f0dba](62f0dba))
* fix(validation): added unit tests and fixed some small bugs (#12) ([3042e2a](3042e2a)), closes [#12](#12)
* Merge branch 'main' into feat_adding_grvs ([147bceb](147bceb))
* Merge branch 'main' into feat_dist_aware_embedding_numerical ([27dff94](27dff94))
* Merge branch 'main' into fix_tab_att_and_transfor ([6312d09](6312d09))
* Merge branch 'main' into tabular_attention_tests ([9491590](9491590))
* Merge branch 'piotrlaczkowski:main' into cutom_preprocess_smart ([1d7e48a](1d7e48a))
* Merge branch 'piotrlaczkowski:main' into feat_num_emb ([0bf0cc3](0bf0cc3))
* Merge branch 'piotrlaczkowski:main' into feat_num_emb ([1dd8e83](1dd8e83))
* Merge pull request #1 from piotrlaczkowski/main ([609dd5b](609dd5b)), closes [#1](#1)
* test(validation): added unit tests and fixed a little type mismatch ([d0303cb](d0303cb))