Conversation

milkshakeiii
Contributor

@milkshakeiii milkshakeiii commented Oct 26, 2023

  • Emulate most aspects of the pandas get_dummies interface
  • Tests and doctest examples
  • In most cases, the performance bottleneck is the BigQuery column count.
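
For reviewers less familiar with the `get_dummies` semantics being emulated, here is a minimal pure-Python sketch of one-hot encoding. It is not the code in this PR and does not use BigQuery DataFrames or pandas; the helper name `one_hot` and its handling of `None` are assumptions made for illustration. It shows why the output column count tracks the number of distinct values per input column.

```python
def one_hot(values, prefix, prefix_sep="_", drop_first=False):
    """Return {column_name: [bool, ...]}, one column per distinct non-null value.

    Hypothetical helper for illustration only; it mirrors the general shape of
    the pandas get_dummies parameters (prefix, prefix_sep, drop_first).
    """
    # Distinct non-null values, sorted so the column order is deterministic.
    levels = sorted({v for v in values if v is not None})
    if drop_first:
        # Drop the first level to avoid a redundant (collinear) column.
        levels = levels[1:]
    # A None row compares unequal to every level, so it is False in all columns.
    return {
        f"{prefix}{prefix_sep}{level}": [v == level for v in values]
        for level in levels
    }


colors = ["red", "blue", None, "red"]
encoded = one_hot(colors, prefix="color")
encoded_dropped = one_hot(colors, prefix="color", drop_first=True)

print(list(encoded))          # one output column per distinct value
print(list(encoded_dropped))  # one fewer column when drop_first=True
```

A column with k distinct values produces k output columns (or k - 1 with `drop_first=True`), so high-cardinality inputs widen the result quickly.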

@milkshakeiii milkshakeiii requested review from a team as code owners October 26, 2023 06:39
@milkshakeiii milkshakeiii requested a review from shobsi October 26, 2023 06:39
@product-auto-label product-auto-label bot added size: l Pull request size is large. api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. labels Oct 26, 2023
@milkshakeiii milkshakeiii requested a review from shobsi October 27, 2023 19:19
@milkshakeiii
Contributor Author

Thanks for the review! Working on addressing these comments today.

Contributor

@TrevorBergeron TrevorBergeron left a comment

Logic looks good, just a few suggestions on cutting down the method length a bit

@milkshakeiii milkshakeiii merged commit d8baad5 into main Nov 1, 2023
@milkshakeiii milkshakeiii deleted the b297352026-get-dummies branch November 1, 2023 20:41
ashleyxuu pushed a commit that referenced this pull request Nov 1, 2023
* feat: add pd.get_dummies

* remove unneeded prefix case

* param/documentation fixes

* be stricter about types in test

* be stricter about types in series test

* remove unneeded comment

* adjust for type difference in pandas 1

* add example code (tested)

* fix None columns and add test cases

* variable names and _get_unique_values per-column

* account for pandas 1 behavior difference

* remove already_seen set

* avoid unnecessary join/projection

* fix column ordering edge case

* adjust for picky examples checker

* example tweak

* make part of the example comments

* use ellipsis in doctest comment

* add <BLANKLINES> to doctest string

* extract parameter standardization

* extract submethods

---------

Co-authored-by: Henry J Solberg <[email protected]>