my approach to ColumnElement[TypeEngine] was wrong, is still wrong, sqlalchemy-stubs had it right, and we have to change it #7519

Closed
zzzeek opened this issue Jan 1, 2022 · 7 comments
Labels: SQLA mypy plugin (mypy plugin issues only; general pep-484 issues should be "typing"), typing (pep-484 typing issues, independent of "mypy")
@zzzeek
Member

zzzeek commented Jan 1, 2022

working with the pep-646 stuff I can now see the whole way I had us do sqlalchemy2-stubs, while it still looks more correct to me, is not workable, even in non-pep-646 cases.

The issue is this. If we have ColumnElement[TypeEngine[_T]] like we do now, there is no way in pep-484 to get that to resolve into "_T", which is what we need when we execute SQL statements. A hypothetical method def scalar(self, scalar_select_statement) as illustrated in the program below would need to be typed as:

class Connection:
    def scalar(self, statement: ScalarSelect[_TE[_T]]) -> _T:

which is not supported: "TypeVar "Type[_TE@scalar]" is not subscriptable"
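The restriction is not only a type-checker rule; it also shows up at runtime, because a TypeVar object cannot be subscripted. A minimal sketch, assuming stand-in classes with the same names as in the programs below (the helper `try_nested_subscript` is hypothetical and only exists for this illustration):

```python
from typing import Generic, TypeVar

# Python data values: int, str, etc.
_T = TypeVar("_T")


class TypeEngine(Generic[_T]):
    """Stand-in for a SQL datatype that points to a Python datatype."""


# A TypeVar bound to TypeEngine, as in the first program.
_TE = TypeVar("_TE", bound=TypeEngine)


def try_nested_subscript() -> str:
    """Attempt the unsupported ``_TE[_T]`` form and report what happens."""
    try:
        _TE[_T]  # type: ignore[index]
    except TypeError as err:
        # TypeVar instances do not support subscription.
        return f"TypeError: {err}"
    return "no error"


if __name__ == "__main__":
    print(try_nested_subscript())
```

So an annotation such as `ScalarSelect[_TE[_T]]` fails as soon as it is evaluated, before a type checker even gets a chance to reject it.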

first program illustrates the typing approach we have now, which ends up at the impossible scalar method below:

from typing import Any, Generic, List, TypeVar, overload

# represents Python data values.  int, str, etc.
_T = TypeVar("_T", bound=Any)



class TypeEngine(Generic[_T]):
    """describe a SQL datatype.

    the _T here will point to a Python datatype.

    """

class String(TypeEngine[str]):
    """describe a string SQL datatype.

    Python datatype here is str.

    """


class Integer(TypeEngine[int]):
    """describe an integer SQL datatype.

    Python datatype here is int.

    """

class JSON(TypeEngine[Any]):
    """describe a JSON SQL datatype.

    Since JSON can return scalar values and there can be a JSON decoder
    in place, JSON's python datatype can't be realistically constrained.


    """

class ARRAY(TypeEngine[List[_T]]):
    """describe an ARRAY SQL datatype.

    Here we have a bigger problem which is that ARRAY has another
    TypeEngine inside of it.

    """

"""so far so good!  here's where it goes wrong.
"""




_TE = TypeVar("_TE", bound="TypeEngine")



class ColumnExpression(Generic[_TE]):
    """describe a SQL column expression.

    a SQL column expression represents some kind of SQL and a SQL
    datatype that's returned.   So where we have the first syntactical
    challenge is that when we want to get the Python datatype from
    a ColumnExpression.  So that's like ``ColumnExpression[_TE[_T]]`` if
    we want to extract that.

    Which, I believe we cannot!  Which means the whole way I did
    sqlalchemy2-stubs with ColumnExpression[TypeEngine] is wrong, and
    what sqlalchemy-stubs did was correct in this regard.

    However!  There are also behaviors that are implied by _TE here, like
    if I say column('q') + column('y'), the "+" operator is different if
    TE is String or Integer.  there are also JSON / ARRAY index operators
    that have different behavior on ColumnExpression.

    """
    def __init__(self, name: str, type_: _TE):
        self.name = name
        self.type_ = type_

    # add operator.  in these cases, the type coming in is preserved
    # for both integer addition and string concatenation, so we don't
    # actually need to know about String / Integer.  OK
    @overload
    def __add__(self, other: "ColumnExpression[String]") -> "ColumnExpression[String]":
        ...

    @overload
    def __add__(self, other: "ColumnExpression[Integer]") -> "ColumnExpression[Integer]":
        ...

    # here, we know how to type __getitem__ because the ColumnExpression has
    # the ARRAY datatype inside of it.  can we distinguish against JSON /
    # HSTORE?
    #
    # a: yes, because List[_T].__getitem__[int] is always going to return
    # a _T, whether it's JSON or ARRAY,  Mapping[str, T].__getitem__[str]
    # will always return _T, etc.

    @overload
    def __getitem__(self: "ColumnExpression[ARRAY[List[_TE]]]", index: int) -> "ColumnExpression[_TE]":
        ...




class ScalarSelect(Generic[_TE]):
    """scalar select statement.  has one column.

    here we illustrate it typed to the column's TypeEngine.

    """

    def __init__(self, column: ColumnExpression[_TE]):
        self.column = column


"""here's where the whole thing breaks!   time to run the statement and get
Python values.

Here we illustrate a scalar column / scalar python value.  This same process
applies to pep-646 where we have Result/Row with TypeVarTuple, same problem
is there as well, this is just simpler to illustrate.

"""
class Connection:
    def scalar(self, statement: ScalarSelect[_TE[_T]]) -> _T:
        """this can't be done!  we have ScalarSelect[_TE] and we need to
        unwrap the _T from it.  not supported.   this is game over
        for how I did sqlalchemy2-stubs - you can't resolve two levels of
        TypeVar.

        same issue happens in the pep-646 case.

        """
        ...


    """pep 646 methods look like these below.  again, we can't go from

    Select[Integer, String] to Row[int, str].  it's not possible AFAICT,
    but not for any reason beyond that we can't do it with scalar values
    either

    """

    # def execute(self, statement: Select[Unpack[ColumnTypeTuple]]) -> Result[PythonTypeTuple]:
    #     ...

    # def first(self, statement: Select[Unpack[ColumnTypeTuple]]) -> Row[PythonTypeTuple]:
    #     ...

if __name__ == '__main__':
    # (variable) c1: ColumnExpression[String] - I like how this looks
    c1 = ColumnExpression("c1", String())

    # (variable) stmt: ScalarSelect[String] - I also like
    stmt = ScalarSelect(c1)

    conn = Connection()

    # (variable) value: _T@scalar - this part is impossible, so non-starter
    value = conn.scalar(stmt)

second program illustrates how sqlalchemy-stubs did it, which while it still makes me wince to see Column[int], is the only way we can get Python types back at the end, and this also will work in the pep-646 case. It's also how we did Mapping[int] in any case. So I was wrong.

from typing import Any, Generic, List, TypeVar, overload

# represents Python data values.  int, str, etc.
_T = TypeVar("_T", bound=Any)



class TypeEngine(Generic[_T]):
    """describe a SQL datatype.

    the _T here will point to a Python datatype.

    """

class String(TypeEngine[str]):
    """describe a string SQL datatype.

    Python datatype here is str.

    """


class Integer(TypeEngine[int]):
    """describe an integer SQL datatype.

    Python datatype here is int.

    """

class JSON(TypeEngine[Any]):
    """describe a JSON SQL datatype.

    Since JSON can return scalar values and there can be a JSON decoder
    in place, JSON's python datatype can't be realistically constrained.


    """

class ARRAY(TypeEngine[List[_T]]):
    """describe an ARRAY SQL datatype.

    Here we have a bigger problem which is that ARRAY has another
    TypeEngine inside of it.

    """

"""so far so good!  here's where it goes wrong.
"""




class ColumnExpression(Generic[_T]):
    """describe a SQL column expression.

    a SQL column expression represents some kind of SQL and a SQL
    datatype that's returned.

    Here, we do what sqlalchemy-stubs does, and BTW also our own
    Mapping[_T] in the ORM anyway, and just put the Python type there.

    We always can get that Python type because we are handed TypeEngine
    objects that give it to us.  It feels "weird" to have
    ``Column[int]`` to me, because we are losing the "Integer"- ness of it,
    and particularly I think ``Column[JSON]``, ``Column[ARRAY[Integer]]``
    is way more descriptive than ``Column[Any]``, ``Column[List[int]]``.

    The Column object etc. is descriptive of a SQL / schema construct on the
    database.  There is no Python type intrinsic to that, so it feels wrong
    to have things like ``Column[int]``, you might just be running CREATE
    TABLE and your program has nothing to do with ints or anything.  But
    as I've said myself in my own docs, pep-484 typing is about python
    types, not database types, hence when i have "nullable=False" I still
    have Mapping[] assuming Optional because in Python-land the attribute can
    still be None (we probably should change that too
    because it just annoys people).

    from the spec: https://www.python.org/dev/peps/pep-0484/#generics
    "Since type information about objects kept in containers cannot be
    statically inferred in a generic way, abstract base classes have been
    extended to support subscription to denote expected types for
    container elements. "  - the ColumnElement contains a TypeEngine, not
    a str or int.  It *indirectly* refers to those things.   This is where
    I can't get my head around this feeling correct.

    then again the whole way SQLAlchemy works is based on a fairly meta
    concept.



    """
    def __init__(self, name: str, type_: TypeEngine[_T]):
        self.name = name
        self.type_ = type_

    # add operator.  in these cases, the type coming in is preserved
    # for both integer addition and string concatenation, so we don't
    # actually need to know about String / Integer.  OK
    @overload
    def __add__(self, other: "ColumnExpression[str]") -> "ColumnExpression[str]":
        ...

    @overload
    def __add__(self, other: "ColumnExpression[int]") -> "ColumnExpression[int]":
        ...

    # this is OK
    @overload
    def __getitem__(self: "ColumnExpression[List[_T]]", index: int) -> "ColumnExpression[_T]":
        ...




class ScalarSelect(Generic[_T]):
    """scalar select statement.  has one column.

    here we illustrate it typed to the column's Python type.

    """

    def __init__(self, column: ColumnExpression[_T]):
        self.column = column


"""
this all works now

"""
class Connection:
    def scalar(self, statement: ScalarSelect[_T]) -> _T:
        """no problem

        """
        ...


    """pep 646 methods look like these below, these are fine because we just
    have TypeTuple that's a tuple of Python types, and that's it.

    """

    # def execute(self, statement: Select[Unpack[TypeTuple]]) -> Result[TypeTuple]:
    #     ...

    # def first(self, statement: Select[Unpack[TypeTuple]]) -> Row[TypeTuple]:
    #     ...

if __name__ == '__main__':
    # (variable) c1: ColumnExpression[str]  - ugly
    c1 = ColumnExpression("c1", String())


    # (variable) stmt: ScalarSelect[str]  - ugly
    stmt = ScalarSelect(c1)

    conn = Connection()

    # value is a str, what we want
    value = conn.scalar(stmt)
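The commented-out pep 646 methods in the second program can be sketched roughly as below, assuming Python 3.11+ (or `typing_extensions` on older interpreters). This is only an illustration: the `stub_row` attribute is a hypothetical stand-in for a database round trip, and the sketch sidesteps the separate question of how a real `Select` would collect its type tuple from column arguments.

```python
from typing import Generic, Tuple

try:  # Python 3.11+
    from typing import TypeVarTuple, Unpack
except ImportError:  # assumption: typing_extensions is installed
    from typing_extensions import TypeVarTuple, Unpack

# A tuple of plain Python types, e.g. (int, str).
_Ts = TypeVarTuple("_Ts")


class Row(Generic[Unpack[_Ts]]):
    """A result row typed to the Python types of its columns."""

    def __init__(self, values: Tuple[Unpack[_Ts]]):
        self.values = values


class Select(Generic[Unpack[_Ts]]):
    """A select statement typed to the Python types it returns."""

    def __init__(self, stub_row: Tuple[Unpack[_Ts]]):
        # hypothetical: canned data instead of executing SQL
        self.stub_row = stub_row


class Connection:
    def first(self, statement: Select[Unpack[_Ts]]) -> Row[Unpack[_Ts]]:
        # Same type tuple on both sides, so no unwrapping step is needed.
        return Row(statement.stub_row)


if __name__ == "__main__":
    # a type checker should see stmt as Select[int, str]
    stmt = Select((1, "spongebob"))
    row = Connection().first(stmt)
    print(row.values)
```

Because `Select` and `Row` share one tuple of Python types, `first()` needs no "TypeEngine to Python type" conversion, which is the step the first program could not express.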

just as a note it's also not possible to have both types in the generic either, no way to extract "_T":

class ColumnExpression(Generic[_TE, _T]):
    """describe a SQL column expression.

    Here's another way.  Put *both* types in the generic.


    """

    def __init__(self, name: str, type_: _TE[_T]):
        """doesn't work.  I can't get _T out of _TE here, same reasons
        as before."""
        self.name = name
        self.type_ = type_

this significantly impacts the mypy plugin where the type indirection logic I have there would need to be removed. column expressions would just be typed to Python types directly and anything special about the specific TypeEngine in use has to be lost.
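As a rough sketch of what "typed to Python types directly" could look like, operator behavior can be keyed on the Python type through overloads on `self`. The classes here are the stand-ins from the second program, not SQLAlchemy's real API, and the string-building `__add__` body is invented for the illustration:

```python
from typing import Any, Generic, TypeVar, overload

_T = TypeVar("_T")


class TypeEngine(Generic[_T]):
    """Stand-in SQL datatype pointing at a Python datatype."""


class String(TypeEngine[str]):
    pass


class Integer(TypeEngine[int]):
    pass


class ColumnExpression(Generic[_T]):
    def __init__(self, name: str, type_: TypeEngine[_T]):
        self.name = name
        self.type_ = type_

    # Statically, only str + str and int + int are accepted.
    @overload
    def __add__(
        self: "ColumnExpression[str]", other: "ColumnExpression[str]"
    ) -> "ColumnExpression[str]": ...

    @overload
    def __add__(
        self: "ColumnExpression[int]", other: "ColumnExpression[int]"
    ) -> "ColumnExpression[int]": ...

    def __add__(self, other: "ColumnExpression[Any]") -> "ColumnExpression[Any]":
        # In this sketch the object still carries its TypeEngine at runtime,
        # so it can choose concatenation vs. addition; only the static
        # annotation has been reduced to str / int.
        op = "||" if isinstance(self.type_, String) else "+"
        return ColumnExpression(f"{self.name} {op} {other.name}", self.type_)


if __name__ == "__main__":
    s = ColumnExpression("a", String()) + ColumnExpression("b", String())
    i = ColumnExpression("x", Integer()) + ColumnExpression("y", Integer())
    print(s.name, "|", i.name)
```

What the annotations can no longer express is anything that differs between two TypeEngine classes sharing a Python type; that distinction exists only on the runtime object.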

@zzzeek zzzeek added the "SQLA mypy plugin" and "typing" labels Jan 1, 2022
@zzzeek zzzeek added this to the 2.0 milestone Jan 1, 2022
@zzzeek
Member Author

zzzeek commented Jan 1, 2022

I've posted python/typing#999 just so i can get the most complete understanding of this possible.

@zzzeek
Member Author

zzzeek commented Jan 1, 2022

and they replied with where the feature is being discussed at python/typing#548

@CaselIT
Member

CaselIT commented Jan 2, 2022

Type operations seem to be really limited in python at the moment, so even using a Protocol I don't see a way of doing what we seek.

@zzzeek
Member Author

zzzeek commented Jan 2, 2022

the linked discussions make it very clear that what I thought we could do is not supported at all right now, that a lot of people are looking for it, and maybe we'll get it "someday". So our direction now as far as that is clear.

for pep-646 I'm still stumped, just posted python/typing#1001 .

@zzzeek
Member Author

zzzeek commented Jan 2, 2022

well while I wait for an answer to my pep 646 question, in the meantime, if we can't do pep646 and get a Row from a Select that's typed, this whole issue doesn't matter.

re: the whole issue of declarative mapping like i proposed in https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/3430, I'm thinking of just adding a new callable to ORM called mapped_column(). mapped_column(), column_property(), relationship() and all of those, return MapperProperty like they always do, and we get MapperProperty to act like Mapped[], and basically we're done. it doesn't actually matter how Column etc. are typed, either Column[TypeEngine[_T]] or Column[_T] should be able to work with this scheme.
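A minimal sketch of that idea, assuming a descriptor-based `MapperProperty` that is generic over the Python type. Everything here (the `mapped_column()` signature, the descriptor methods, the `User` class) is hypothetical and written only to illustrate the proposal; it is not SQLAlchemy's actual implementation:

```python
from typing import Any, Generic, Optional, TypeVar, overload

_T = TypeVar("_T")


class TypeEngine(Generic[_T]):
    """Stand-in SQL datatype pointing at a Python datatype."""


class Integer(TypeEngine[int]):
    pass


class String(TypeEngine[str]):
    pass


class MapperProperty(Generic[_T]):
    """Hypothetical descriptor that acts like Mapped[_T]."""

    def __init__(self, type_: TypeEngine[_T]):
        self.type_ = type_
        self.key = ""

    def __set_name__(self, owner: type, name: str) -> None:
        self.key = name

    @overload
    def __get__(self, instance: None, owner: Any) -> "MapperProperty[_T]": ...

    @overload
    def __get__(self, instance: object, owner: Any) -> Optional[_T]: ...

    def __get__(self, instance: Any, owner: Any) -> Any:
        if instance is None:
            return self  # class-level access: the property itself
        return instance.__dict__.get(self.key)  # instance-level: the value

    def __set__(self, instance: Any, value: Optional[_T]) -> None:
        instance.__dict__[self.key] = value


def mapped_column(type_: TypeEngine[_T]) -> MapperProperty[_T]:
    """Hypothetical callable; the Python type comes from the TypeEngine."""
    return MapperProperty(type_)


class User:
    id = mapped_column(Integer())
    name = mapped_column(String())


if __name__ == "__main__":
    u = User()
    u.name = "ed"
    print(u.name, u.id)
```

The typing lives entirely on the returned property, so this sketch would look the same whether `Column` itself ends up typed as `Column[TypeEngine[_T]]` or `Column[_T]`.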

@CaselIT
Member

CaselIT commented Jan 2, 2022

I'm thinking of just adding a new callable to ORM called mapped_column()

that's probably the best solution.

@sqla-tester
Collaborator

Mike Bayer has proposed a fix for this issue in the main branch:

initial reorganize for static typing https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/3447

3 participants