feat: autocompletion added for dataset and relation #2891
rudolfix
left a comment
LGTM, but please read the comments because I added a few fixes.
This is working very well in Jupyter Notebook now, and I'll try to merge it ASAP if the tests pass.
            raise NotImplementedError("Schema may not be set")

        @property
        def columns(self) -> list[str]:
NOTE: this is part of the implementation and is not exposed to the user. We can keep it here, but at some point we should decide which props to promote to the public interface.
It's a purposeful addition. .columns is common across pd.DataFrame, pl.DataFrame, pyarrow.Table, etc.
dlt/destinations/dataset/dataset.py (outdated)

        def __getattr__(self, name: str) -> Any:
            """Retrieve a `Relation` via `__getitem__` if standard `__getattr__` returns `None`."""
            attribute = self.__dict__.get(name, None)
__getattr__ is called only when name is not found by normal lookup (so it is not in __dict__), which means you get None here every time. The old implementation was raising ValueError (via self.table) rather than AttributeError, which IMO was the cause of your problems.
Do you have a suggestion for the fix? Without the line I've added, the completion mechanism fails to call Dataset._ipython_key_completions_().
It has to do with getattr(dataset_obj, "_ipython_key_completions_") failing inside IPython (i.e., code we can't change). I haven't looked into it further.
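A minimal sketch of the Python behaviour behind this exchange, using two hypothetical toy classes (not dlt's actual code). It assumes only standard semantics: getattr() with a default and hasattr() swallow AttributeError and nothing else, so a __getattr__ that raises ValueError lets the error escape from any code that probes for optional attributes. Whether this is exactly what happened inside IPython is not confirmed here.

```python
class RaisesValueError:
    """Hypothetical stand-in for the old behaviour: unknown names raise ValueError."""

    def __init__(self, tables):
        self._tables = set(tables)

    def __getattr__(self, name):
        # Only reached when normal lookup (instance __dict__, class, bases) fails.
        if name in self._tables:
            return f"relation:{name}"
        raise ValueError(f"Table {name} not found")


class RaisesAttributeError:
    """Hypothetical stand-in for the fixed behaviour: unknown names raise AttributeError."""

    def __init__(self, tables):
        self._tables = set(tables)

    def __getattr__(self, name):
        # By the time we get here, `name` is never a key of self.__dict__,
        # so self.__dict__.get(name) would always return None.
        if name.startswith("_"):
            # Guard private/dunder probes and avoid recursion if _tables is unset.
            raise AttributeError(name)
        if name in self._tables:
            return f"relation:{name}"
        raise AttributeError(f"Table {name} not found")


def probe(obj, name):
    """Mimic a caller that checks for an optional attribute."""
    try:
        return getattr(obj, name, None)
    except ValueError as exc:
        return f"leaked {type(exc).__name__}"
```

With these classes, probe(RaisesAttributeError(["items"]), "missing") falls back to the default, while the same probe on RaisesValueError reports the leaked exception.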
        def columns(self) -> list[str]:
            return list(self.schema.get("columns", {}).keys())

        def _ipython_key_completions_(self) -> list[str]:
same comment as above, only complete columns should be returned here.
Here, we're retrieving the columns defined on the relation, not the physical columns. For example, given SELECT foo AS bar FROM table, calling .columns on the relation should return bar even though bar does not exist on table.
This is different from completion on the dataset, where you made a good point that we only want tables that are actually available.
* autocompletion added for dataset and relation
* fixed getattr
* raises correct exceptions in __getitem__ and __getattr__ in relation / dataset
* adds more tests to get notebook column completion
* shows only complete tables
---------
Co-authored-by: Marcin Rudolf <[email protected]>
This adds tab completion for dlt.Dataset and dlt.Relation. They suggest table names and column names, respectively.
I had to modify __getattr__ on the ReadableDBAPIDataset class because it was breaking _ipython_key_completions_. Its implementation effectively prevented any use of getattr() to dynamically retrieve attributes or methods by name. My fix involved manually searching the object's __dict__.
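For readers unfamiliar with the hook, here is a rough sketch of how key completion can be wired up. ToyDataset and ToyRelation are hypothetical stand-ins, not dlt's classes; the only real API assumed is IPython's documented _ipython_key_completions_() hook, which returns the candidate keys for a subscript expression such as obj["<TAB>. The columns property mirrors the one-liner shown in the diff above.

```python
from typing import Any


class ToyRelation:
    """Hypothetical relation; `schema` describes the query result, not the physical table."""

    def __init__(self, schema: dict[str, Any]) -> None:
        self.schema = schema

    @property
    def columns(self) -> list[str]:
        return list(self.schema.get("columns", {}).keys())

    def _ipython_key_completions_(self) -> list[str]:
        # IPython calls this to complete relation["<TAB>
        return self.columns


class ToyDataset:
    """Hypothetical dataset mapping table names to schema dicts."""

    def __init__(self, tables: dict[str, dict[str, Any]]) -> None:
        self._tables = tables

    def __getitem__(self, name: str) -> ToyRelation:
        if name not in self._tables:
            raise KeyError(name)
        return ToyRelation(self._tables[name])

    def __getattr__(self, name: str) -> ToyRelation:
        # Reached only after normal lookup fails. Raise AttributeError so that
        # getattr(obj, name, default) and hasattr(obj, name) keep working.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self[name]
        except KeyError as exc:
            raise AttributeError(name) from exc

    def _ipython_key_completions_(self) -> list[str]:
        # IPython calls this to complete dataset["<TAB>
        return list(self._tables)
```

In this sketch, attribute access and item access share one lookup path, and only the exception type differs: KeyError from __getitem__, AttributeError from __getattr__.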