integration with typing annotations for declarative #7535

Closed
zzzeek opened this issue Jan 5, 2022 · 20 comments
Labels: alchemy 2, Epic, feature, typing
Milestone: 2.0

Comments

@zzzeek
Member

zzzeek commented Jan 5, 2022

this is a continuation of the thinking from sqlalchemy/sqlalchemy2-stubs#170 with some experimentation at sqlalchemy/sqlalchemy2-stubs@a436c38 .

Current thinking:

  1. the mypy plugin does not have much future. Other type checkers like pylance are vastly more widely used at this point via IDEs, and they will never support plugins nor do we want to have to maintain a whole suite of plugins. mypy plugins done by third parties are essentially unworkable as they rely upon intricate details of their fairly vast and complex codebase which change all the time.
  2. SQLAlchemy 2.0 will have typing integrated, so we can iterate much better on various typing techniques. how we get type checkers to not rely upon the previous two stub packages is an issue to be solved but we assume that will be figured out.
  3. the approach at sqlalchemy/sqlalchemy2-stubs@a436c38 shows that we can do a dataclasses-esque thing without much difficulty. Let's just build it into ORM constructs like relationship(), column_property(), synonym(), etc. directly.
  4. Column is from core, and that's not changing to do anything ORM-ish. so that's the one construct where declarative users will use something else that wraps around it, like mapped_column(). This will basically be column_property() with some helpers.
  5. RelationshipProperty, ColumnProperty will all need to implement Mapped directly, via a subclass such as _DeclarativeMappedPlaceholder, something like that. The __get__, __set__, etc. methods can raise NotImplementedError for instance-level access, since under normal use there will never be an instance of a class with these attributes set; they will be replaced by InstrumentedAttribute.
  6. all the ORM declarative elements should be able to pull their info from __annotations__, if arguments are not present. this shouldn't be that hard to accomplish. Yes it means I'd like a lookup of Mapped[int] -> Column(Integer), stuff like that. Big deal.
  7. if __annotations__ only returned strings, as was the issue in PEP 563, PEP 649 and pydantic pydantic/pydantic#2678 , we could still work with that because we already do relationship() using strings, and column types are trivial. looks like that area of python is getting a lot of attention though so there should be good behaviors there.
  8. Yes we will make Optional work with "nullable" the way people usually want. If a type isn't Optional then it gets nullable=False by default. Yes you can have Object() and "id" will be None, which isn't in the annotations; we can have options to do it in both ways, this is just going to have to be a practicality over purity thing.
  9. we lose with __init__() being added. There's no way to have a class with attributes where the attributes magically create an __init__ method, unless you use dataclasses, which the type checkers have added hardcoded rules to accomplish. same with __table__ and all that. Propose a stub class that people will use, like:
class DeclarativeStub:
    if typing.TYPE_CHECKING:
        def __init__(self, **kw) -> None: ...

        __table__: FromClause
        __mapper__: Mapper

if someone wants their __init__ to have full typing information, they have to write it out. I don't see any way around that other than trying to trick the static checkers into thinking we are using dataclasses, and that does not seem like a wise choice.
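
A minimal runtime sketch of the stub idea above, using only the standard library. `KeywordInit` is a hypothetical stand-in for whatever constructor declarative installs at runtime, and `Any` replaces `FromClause` and `Mapper` so the block stays self-contained; none of this is SQLAlchemy's actual implementation.

```python
import typing
from typing import Any


class DeclarativeStub:
    # Only static checkers evaluate this branch; TYPE_CHECKING is False at runtime,
    # so the class body below contributes nothing when the module is imported.
    if typing.TYPE_CHECKING:
        def __init__(self, **kw: Any) -> None: ...

        __table__: Any
        __mapper__: Any


class KeywordInit:
    """Hypothetical stand-in for the keyword constructor supplied at runtime."""

    def __init__(self, **kw: Any) -> None:
        for key, value in kw.items():
            setattr(self, key, value)


class User(DeclarativeStub, KeywordInit):
    pass


user = User(name="spongebob")
```

The stub only tells the checker that `User(**kw)` is acceptable; the constructor that actually runs comes from elsewhere in the MRO.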

so here's a mapping:

import datetime
from typing import List, Optional

from sqlalchemy import ForeignKey
from sqlalchemy.orm import registry
from sqlalchemy.orm import DeclarativeStub  # proposed above, does not exist yet
from sqlalchemy.orm import mapped_column, relationship, Mapped

mapper_registry = registry()

@mapper_registry.as_declarative_base()
class Base(DeclarativeStub):
    pass

class User(Base):
    __tablename__ = 'user'

    id: Mapped[int] = mapped_column(primary_key=True)
    name: Mapped[str] = mapped_column()
    some_date: Mapped[Optional[datetime.datetime]] = mapped_column()
    addresses: Mapped[List["Address"]] = relationship(back_populates="user")

class Address(Base):
    __tablename__ = "address"

    id: Mapped[int] = mapped_column(primary_key=True)
    user_id: Mapped[int] = mapped_column(ForeignKey('user.id'))
    email_address: Mapped[str] = mapped_column()
    user: Mapped["User"] = relationship(back_populates="addresses")

so I think that can work? but I'm not sure: if mapped_column() is typed as Mapped[Any], can I declare that as Mapped[int]? I think so? I can't get to testing the above until I get a good chunk of SQLAlchemy 2.0 typed; right now things are a mess with it not being typed and the stubs being more wrong every day.
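
To make points 6 and 8 of the list concrete, here is a small stdlib-only sketch of reading `__annotations__`, looking up a SQL type for `Mapped[int]`, and deriving `nullable` from `Optional`. The `Mapped` class, `DEFAULT_TYPE_MAP` and `describe_column()` are illustrative assumptions, not the ORM's real objects, and the SQL types are plain strings.

```python
import datetime
from typing import Generic, Optional, TypeVar, Union, get_args, get_origin

_T = TypeVar("_T")


class Mapped(Generic[_T]):
    """Stand-in for the ORM's Mapped[] marker; carries no behavior here."""


# Hypothetical default lookup from Python types to SQL type names.
DEFAULT_TYPE_MAP = {int: "Integer", str: "String", datetime.datetime: "DateTime"}


def describe_column(annotation):
    """Return (sql_type_name, nullable) for an annotation like Mapped[Optional[int]]."""
    if get_origin(annotation) is not Mapped:
        raise TypeError(f"expected Mapped[...], got {annotation!r}")
    (inner,) = get_args(annotation)
    nullable = False
    # Optional[X] is Union[X, None]: unwrap it and mark the column nullable.
    if get_origin(inner) is Union and type(None) in get_args(inner):
        nullable = True
        (inner,) = [arg for arg in get_args(inner) if arg is not type(None)]
    return DEFAULT_TYPE_MAP[inner], nullable


class User:
    id: Mapped[int]
    name: Mapped[str]
    some_date: Mapped[Optional[datetime.datetime]]


columns = {name: describe_column(hint) for name, hint in User.__annotations__.items()}
```

Here a non-Optional annotation yields `nullable=False`, matching the default proposed in point 8.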

so that's one way, now can it also work the other way? I think to some extent at least? here's that:

from typing import Optional

from sqlalchemy import DateTime, ForeignKey, Integer, String
from sqlalchemy.orm import registry
from sqlalchemy.orm import DeclarativeStub  # proposed above, does not exist yet
from sqlalchemy.orm import mapped_column, relationship, Mapped

mapper_registry = registry()

@mapper_registry.as_declarative_base()
class Base(DeclarativeStub):
    pass

class User(Base):
    __tablename__ = 'user'

    id = mapped_column(Integer, primary_key=True)
    name = mapped_column(String, nullable=False)
    some_date = mapped_column(DateTime)
    addresses = relationship("Address", back_populates="user", uselist=True)

class Address(Base):
    __tablename__ = "address"

    id = mapped_column(Integer, primary_key=True)
    user_id = mapped_column(Integer, ForeignKey('user.id'), nullable=False)
    email_address = mapped_column(String, nullable=False)
    user = relationship("User", back_populates="addresses", uselist=False)

so here, I think it works if we type things as:

from typing import Any, Literal, Optional, overload

@overload
def mapped_column(type_: TypeEngine[_T], *, primary_key: Literal[True]) -> Mapped[_T]: ...

@overload
def mapped_column(type_: TypeEngine[_T], *, nullable: Literal[False]) -> Mapped[_T]: ...

@overload
def mapped_column(type_: TypeEngine[_T], *, nullable: Literal[True] = ...) -> Mapped[Optional[_T]]: ...

@overload
def mapped_column() -> Mapped[Any]: ...

similar idea for relationship, etc. with the constructs all working in both ways it should make for a lot more typing happening automatically at least? not sure. this is seeming easier than it did previously, which is making me not trust that I'm not missing something. anyway, thinking this. need to get lots of annotations into 2.0 main before we can fluently play with this stuff.
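
A self-contained sketch of how such overloads could sit on top of a single runtime function. `TypeEngine`, `Mapped` and `Integer` here are minimal stand-ins written for this example, and the runtime default (primary keys non-nullable, everything else nullable) is an assumption. What a given checker infers from the overloads has not been verified here.

```python
from typing import Any, Generic, Literal, Optional, TypeVar, overload

_T = TypeVar("_T")


class TypeEngine(Generic[_T]):
    """Stand-in for a SQL type that knows its Python type."""


class Integer(TypeEngine[int]):
    pass


class Mapped(Generic[_T]):
    """Stand-in for the mapped-attribute marker; records what it was built from."""

    def __init__(self, type_: Any, nullable: bool) -> None:
        self.type_ = type_
        self.nullable = nullable


@overload
def mapped_column(type_: TypeEngine[_T], *, primary_key: Literal[True]) -> Mapped[_T]: ...
@overload
def mapped_column(type_: TypeEngine[_T], *, nullable: Literal[False]) -> Mapped[_T]: ...
@overload
def mapped_column(type_: TypeEngine[_T], *, nullable: Literal[True] = ...) -> Mapped[Optional[_T]]: ...
@overload
def mapped_column() -> Mapped[Any]: ...


def mapped_column(type_=None, *, primary_key=False, nullable=None):
    # Runtime implementation shared by all overloads. Assumed default:
    # primary keys are NOT NULL, other columns are nullable unless told otherwise.
    if nullable is None:
        nullable = not primary_key
    return Mapped(type_, nullable)


pk = mapped_column(Integer(), primary_key=True)
plain = mapped_column(Integer())
required = mapped_column(Integer(), nullable=False)
```

The overloads exist only for the type checker; at runtime the last definition replaces them, so the three calls above all go through the same function.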

zzzeek added the feature, alchemy 2 and typing labels Jan 5, 2022
zzzeek added this to the 2.0 milestone Jan 5, 2022
@CaselIT
Member

CaselIT commented Jan 5, 2022

> 6. all the ORM declarative elements should be able to pull their info from __annotations__, if arguments are not present. this shouldn't be that hard to accomplish. Yes it means I'd like a lookup of Mapped[int] -> Column(Integer), stuff like that. Big deal.

not sure if I like this idea, I would need to think a bit about it.
Maybe we can require a lookup dict to be passed in the registry? something like {int: Integer} or {int: lambda: Integer(some_option=True)}? We may also opt to provide a default one?

CaselIT added the Epic label Jan 5, 2022
@zzzeek
Member Author

zzzeek commented Jan 5, 2022

for relationships, we usually will need the left-annotation to be the explicit part:

user: Mapped["User"] = relationship()

This is because we can't annotate relationship to produce the correct type in the other direction, when the class argument is a string:

# can't type this as Mapped[User] without explicit type
user = relationship("User")

So i was thinking of being consistent, or at least allowing consistency, to have the mapping start up from the "x: Mapped[y]" side.
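
A stdlib-only sketch of why the left-hand annotation can carry the relationship target even when it is written as a string: the forward reference is resolved later, once both classes exist. `Mapped`, `relationship_target()` and the explicit `namespace` dict are assumptions made for this example; how the ORM would actually locate the target class is not shown in the thread.

```python
from typing import Generic, List, TypeVar, get_args, get_origin, get_type_hints

_T = TypeVar("_T")


class Mapped(Generic[_T]):
    """Stand-in for the ORM's Mapped[] marker."""


class User:
    addresses: Mapped[List["Address"]]


class Address:
    user: Mapped["User"]


def relationship_target(cls, attr_name, namespace):
    """Resolve the annotation on cls.attr_name to (target_class, uselist)."""
    # get_type_hints() evaluates the string forward references against namespace.
    hints = get_type_hints(cls, globalns=namespace)
    (inner,) = get_args(hints[attr_name])
    if get_origin(inner) is list:
        (target,) = get_args(inner)
        return target, True
    return inner, False


# Both classes exist by now, so the names inside the annotations can be resolved.
namespace = {"User": User, "Address": Address}
```

For example, `relationship_target(User, "addresses", namespace)` yields the `Address` class together with `True` for the collection flag, so both the target and `uselist` come from the annotation alone.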

for the type lookup, we already have this mapping right here: https://github.com/sqlalchemy/sqlalchemy/blob/main/lib/sqlalchemy/sql/sqltypes.py#L2998 . we already do this, nobody needs to override it or anything. if you have a specific type you want to use with Mapped[int] then you specify it in the mapped_column(), no differently than people do now anyway with "cast()" or whatever.

@zzzeek
Member Author

zzzeek commented Jan 5, 2022

there's the general problem of str->Unicode for databases with explicit unicode datatypes and data conversion issues like pyodbc. so we might need to improve that part.

@zzzeek
Member Author

zzzeek commented Jan 5, 2022

i'd propose a subclass of str: class plain_str(str): or something like that
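
A tiny sketch of how a `plain_str` marker could be told apart from `str` in a type lookup. The direction of the mapping (`str` to a Unicode type, `plain_str` to a plain string type) is an assumption, since the thread does not spell it out, and the SQL type names are just strings.

```python
class plain_str(str):
    """Marker type: still a str at runtime, but distinguishable in annotations."""


# Hypothetical lookup table; the values are illustrative names only.
TYPE_MAP = {str: "Unicode", plain_str: "String"}


def lookup(py_type):
    # Walk the MRO so the most specific entry wins: plain_str is checked before str.
    for klass in py_type.__mro__:
        if klass in TYPE_MAP:
            return TYPE_MAP[klass]
    raise LookupError(py_type)
```

Because `plain_str` subclasses `str`, values annotated with it need no conversion; only the lookup treats them differently.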

@CaselIT
Member

CaselIT commented Jan 5, 2022

I think it would be useful to provide an override at the registry level. For example, for pg I would like to map str to Text rather than varchar, and float to double precision. Also maybe folks would like to map int to bigint.

@CaselIT
Member

CaselIT commented Jan 5, 2022

The mapping we have is also not really complete since it's not dialect specific, like uuid is missing.

Also I'm not sure whether we can also support list->array, extracting the type from a list annotation so that list[str] -> Array(Text)
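
On the mechanics of extracting the element type from a list annotation, the standard library already exposes what is needed. This sketch only shows that the information is recoverable; `SCALAR_TYPES` and `sql_type_for()` are hypothetical names and the result is a plain string, not a real SQL type object.

```python
from typing import List, get_args, get_origin

# Hypothetical scalar lookup; values are illustrative names only.
SCALAR_TYPES = {str: "Text", int: "Integer"}


def sql_type_for(annotation):
    """Map a scalar or (possibly nested) list annotation to a SQL type name."""
    if get_origin(annotation) is list:
        # List[X] and list[X] both report list as origin and (X,) as args.
        (item,) = get_args(annotation)
        return f"ARRAY({sql_type_for(item)})"
    return SCALAR_TYPES[annotation]
```

Whether such a mapping belongs in a default type map is a separate question from whether it can be computed.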

@zzzeek
Member Author

zzzeek commented Jan 5, 2022

UUID is just going to be missing, this would be a very basic typing-saver only.

re: override registry it can be something local to the registry() object, not a global.

@zzzeek
Member Author

zzzeek commented Jan 5, 2022

well then if there's a registry override there's where you put your UUID :)

@zzzeek
Member Author

zzzeek commented Jan 5, 2022

things like Array are also very unusual edge cases

@CaselIT
Member

CaselIT commented Jan 5, 2022

> override registry it can be something local to the registry() object, not a global.

sure, that was my suggestion. since that's for declarative, we always have a registry available, so it should not be that much of an issue to support. we can have a constructor argument called type_map and a property that returns an immutable dict with the current map
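
A rough sketch of that shape: a per-registry `type_map` constructor argument layered over defaults, exposed through a read-only property. The `Registry` class below is a hypothetical stand-in written for this example, not the real `registry()`, and the SQL types are plain strings.

```python
import uuid
from types import MappingProxyType


class Registry:
    """Hypothetical stand-in for registry(); only the type-map handling is shown."""

    _default_type_map = {int: "Integer", str: "String", float: "Float"}

    def __init__(self, type_map=None):
        # Per-registry overrides are layered over the defaults, never global.
        merged = dict(self._default_type_map)
        merged.update(type_map or {})
        self._type_map = merged

    @property
    def type_map(self):
        # MappingProxyType gives callers a read-only view of the current map.
        return MappingProxyType(self._type_map)


pg_registry = Registry(
    type_map={str: "Text", float: "DOUBLE PRECISION", uuid.UUID: "UUID"}
)
default_registry = Registry()
```

Types missing from the defaults, such as UUID, are added through the same override, and other registries are unaffected.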

> well then if there's a registry override there's where you put your UUID :)

indeed, it would be there

@sqla-tester
Collaborator

Mike Bayer referenced this issue:

WIP for ORM typing https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/3495

sqlalchemy-bot pushed a commit that referenced this issue Jan 14, 2022
introduces:

1. new mapped_column() helper
2. DeclarativeBase helper
3. declared_attr has been re-typed
4. rework of Mapped[] to return InstrumentedAttribute for
   class get, so works without Mapped itself having expression
   methods
5. ORM constructs now generic on [_T]

also includes some early typing work, most of which will
be in later commits:

1. URL and History become typing.NamedTuple
2. come up with type-checking friendly way of type
   checking cy extensions, where type checking will be applied
   to the py versions, just needed to come up with a succinct
   conditional pattern for the imports

References: #6810
References: #7535
References: #7562
Change-Id: Ie5d9a44631626c021d130ca4ce395aba623c71fb
@CaselIT
Member

CaselIT commented Jan 23, 2022

I think this pep would be very useful to specify a particular sqlalchemy type for a column https://www.python.org/dev/peps/pep-0593/

We could have something like

class Foo:
  foo: int
  bar: Annotated[str, String(42)]

This would also solve the type map issue we talked about above (or could be used as an alternative to it, or in conjunction with it)
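
For reference, a stdlib-only sketch of how the extra argument of `Annotated` can be read back at runtime. `String` is a stand-in class defined here and `column_types()` is a hypothetical helper; this shows the PEP 593 mechanics only, not a proposed ORM API.

```python
from typing import Annotated, get_args, get_origin


class String:
    """Stand-in for a SQL string type with a length."""

    def __init__(self, length):
        self.length = length


class Foo:
    foo: int
    bar: Annotated[str, String(42)]


def column_types(cls):
    """Return {attribute: (python_type, sql_type_or_None)} from the annotations."""
    result = {}
    for name, hint in cls.__annotations__.items():
        if get_origin(hint) is Annotated:
            # get_args() returns the underlying type followed by the metadata.
            python_type, *metadata = get_args(hint)
            result[name] = (python_type, metadata[0])
        else:
            result[name] = (hint, None)
    return result


types = column_types(Foo)
```

Attributes without metadata come back with `None`, which is where a type map lookup could take over.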

@zzzeek
Member Author

zzzeek commented Jan 23, 2022

possibly. it looks like "yet another syntax" so far, as opposed to "bar: mapped_column(String(42))" but something to consider

@CaselIT
Member

CaselIT commented Jan 23, 2022

Maybe it could be in place of other syntaxes? This seems to be the official Python way of doing this kind of thing.

@zzzeek
Member Author

zzzeek commented Jan 23, 2022

the Column construct accepts a dozen different kinds of parameters and arguments. Where do things like ForeignKey(), "key", "name" etc. get specified if Annotated[str, String] takes the place of Column /mapped_column? also it feels redundant to have to type things like Annotated[str, TEXT] - mapped_column(TEXT) is more succinct. there is more variability in a single Python type to multiple database types, than the other way around (the use case for the "type map", which would be much less prominent since it is usually not needed).

@zzzeek
Member Author

zzzeek commented Jan 23, 2022

that is, I think the canonical way to map declaratively is going to stay mostly like it is:

class User(Base):
    __tablename__ = 'user'

    id = mapped_column(Integer, primary_key=True)
    name = mapped_column(String, nullable=False)
    some_date = mapped_column(DateTime)
    addresses = relationship(lambda: Address, back_populates="user", uselist=True)

the above is what people are used to, it's extremely close to what's documented everywhere and we can derive all the typings above without any annotations. obviously relationship() will usually not be able to use lambda like that so still some exception there.

@CaselIT
Member

CaselIT commented Jan 23, 2022

Ok so I remembered that we wanted to allow mapping without specifying mapped_column / relationship.

> Where do things like ForeignKey(), "key", "name" etc. get specified if Annotated[str, String]

I think Annotated allows any object after the first argument (and also allows multiple arguments), so I guess Annotated[str, Column(...)] would be an option.

Probably just something to take into consideration if we want to experiment

> obviously relationship() will usually not be able to use lambda like that so still some exception there.

I just thought of this, and most likely the type checkers will not like it, but we could make relationship a generic class and use __new__ to return instances of RelationshipProperty.
Is something like this allowed?

_T = TypeVar('_T')

class relationship(Generic[_T]):
    @overload
    def __new__(cls, thing, uselist: Literal[False] = ..., **kw) -> _T: ...

    @overload
    def __new__(cls, thing, uselist: Literal[True], **kw) -> List[_T]: ...

    def __new__(cls, thing, **kw):
        ...
        return RelationshipProperty(thing, **kw)

so that we could use it as

if TYPE_CHECKING:
    from .other import Address

class User(Base):
    __tablename__ = 'user'

    id = mapped_column(Integer, primary_key=True)
    name = mapped_column(String, nullable=False)
    some_date = mapped_column(DateTime)
    addresses = relationship['Address']('Address', back_populates="user", uselist=True)
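
On the runtime half of that question, a stdlib-only sketch suggests the subscript-then-call form does execute: a `Generic` subclass may define a `__new__` that returns an unrelated object. `RelationshipProperty` below is a stand-in class written for this example. Whether static checkers accept a `__new__` declared to return something other than the class is a separate matter and is not verified here.

```python
from typing import Generic, TypeVar

_T = TypeVar("_T")


class RelationshipProperty:
    """Stand-in for the real construct; just records its arguments."""

    def __init__(self, argument, **kw):
        self.argument = argument
        self.options = kw


class relationship(Generic[_T]):
    def __new__(cls, argument, **kw):
        # The returned object is not an instance of cls, so Python skips
        # relationship.__init__ and hands the object straight to the caller.
        return RelationshipProperty(argument, **kw)


# Subscripting creates a generic alias; calling the alias calls relationship().
addresses = relationship["Address"]("Address", back_populates="user", uselist=True)
```

The string inside the subscript is kept as a forward reference and has no runtime effect; its only purpose would be to tell a checker the target type.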

@sqla-tester
Collaborator

Mike Bayer referenced this issue:

establish mypy / typing approach for v2.0 https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/3562

@sqla-tester
Collaborator

Mike Bayer has proposed a fix for this issue in the main branch:

establish mypy / typing approach for v2.0 https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/3562

@zzzeek
Member Author

zzzeek commented Feb 7, 2022

docs for all this stuff are going to be a separate commit. typing affects a lot of things so this commit is just grouping all of it together towards the goal of expediency.

sqlalchemy-bot pushed a commit that referenced this issue Feb 9, 2022
The Mypy plugin is not maintainable long-term and will be replaced
by new APIs that allow for typing to work inline without the need
for plugins.

Change-Id: Icc7a203df1d0b19bde2fd852719b7b7215774c58
References: #7535
(cherry picked from commit 491e850)
relsunkaev pushed a commit to relsunkaev/sqlalchemy that referenced this issue Feb 15, 2022
large patch to get ORM / typing efforts started.
this is to support adding new test cases to mypy,
support dropping sqlalchemy2-stubs entirely from the
test suite, validate major ORM typing reorganization
to eliminate the need for the mypy plugin.

* New declarative approach which uses annotation
  introspection, fixes: sqlalchemy#7535
* Mapped[] is now at the base of all ORM constructs
  that find themselves in classes, to support direct
  typing without plugins
* Mypy plugin updated for new typing structures
* Mypy test suite broken out into "plugin" tests vs.
  "plain" tests, and enhanced to better support test
  structures where we assert that various objects are
  introspected by the type checker as we expect.
  as we go forward with typing, we will
  add new use cases to "plain" where we can assert that
  types are introspected as we expect.
* For typing support, users will be much more exposed to the
  class names of things.  Add these all to "sqlalchemy" import
  space.
* Column(ForeignKey()) no longer needs to be `@declared_attr`
  if the FK refers to a remote table
* composite() attributes mapped to a dataclass no longer
  need to implement a `__composite_values__()` method
* with_variant() accepts multiple dialect names

Change-Id: I22797c0be73a8fbbd2d6f5e0c0b7258b17fe145d
Fixes: sqlalchemy#7535
Fixes: sqlalchemy#7551
References: sqlalchemy#6810