Annotation-based syntax for ctypes structs #104533

Open
orent opened this issue May 16, 2023 · 9 comments
Labels
topic-ctypes type-feature A feature request or enhancement

Comments

@orent

orent commented May 16, 2023

Turn this:

class S(ctypes.Structure):
    _fields_ = [('a', ctypes.c_int), ('b', ctypes.c_char_p)]

Into this:

class S(ctypes.Structure):
    a : ctypes.c_int
    b : ctypes.c_char_p

See discussion on https://discuss.python.org/t/annotation-based-sugar-for-ctypes/26579

Working on implementation

@orent orent added the type-feature A feature request or enhancement label May 16, 2023
@zwergziege

I would welcome this very much. It would also be very nice to have annotations for incomplete types, if possible. A natural next step would be using generics for pointers. I don't know, though, whether generics first have to stabilize with regard to how generic parameters are accessed at runtime (see python/typing#629). If ctypes didn't insist on capitalized types, we would get code like this

class Incomplete(Struct):
  some_data : c_int
  children : Pointer['Incomplete']

which looks very clean and natural imho.

@orent
Author

orent commented Jun 18, 2023

You are welcome to open a separate issue. Your proposal is orthogonal to this one.

@junkmd
Contributor

junkmd commented Aug 13, 2023

I would like to point out the problems with this approach.

The problem is that type checkers interpret the return type of fields solely as c_foo.

As shown below, even if c_int is specified for a field, the type returned at runtime is int.

>>> import ctypes
>>>      
>>> class Foo(ctypes.Structure):
...     pass
... 
>>> Foo._fields_ = [('x', ctypes.c_int)]
>>> Foo.x
<Field type=c_long, ofs=0, size=4>
>>> type(Foo.x) 
<class '_ctypes.CField'>
>>> foo = Foo()
>>> foo
<__main__.Foo object at 0x0000023CDB80B8C0>
>>> foo.x
0
>>> foo.x = 3
>>> foo.x
3
>>> foo.x = ctypes.c_int(2)
>>> foo.x
2

In order to convey appropriate type information by annotating for fields like below, a special treatment needs to be introduced to type checkers.

class Foo(Structure):
    x: c_int

This effort requires reaching out not only to the cpython community but also to the broader community of type checker developers.

  • Currently, in typeshed, fields are annotated with subclasses of _SimpleCData, like ctypes.wintypes.POINT. However, since this approach leads to type information mismatches between runtime and static, I am considering submitting an issue in typeshed to address this matter.

  • If implementing such functionality, it might be beneficial to make CField public and allow its definition similar to fields within Django models.
    However, same as in the discussion, I also think "there is a need to maintain both syntaxes and make them mutually exclusive".

     class Foo(Structure):
         x = CField(c_int)

@DrInfiniteExplorer

I made a dumb wrapper for ctypes to support this sort of thing at DrInfiniteExplorer/dtypes a few years back after I got tired of writing hundreds of definitions with _fields_ for a few days straight.
I didn't work extensively with proper typecheckers at the time as my main project was kind of rushed, but I did hack in a simple way to forward-declare structs to make pointers, as well as simple this-type-pointers.
I've recently gotten a bit more active with my projects and might spend more time on this. One thing that has been nagging me is that (as I've learned) tuples aren't valid types for annotations, so I'm thinking that bitfields could be declared with something like bitfield : Annotated[ctypes.c_uint8, Bitfield(2)] and have structify turn that into a proper _fields_ tuple.
What are your thoughts and plans for this kind of thing? I think I'll continue making small improvements to dtypes, but if the official ctypes gets cooler then I'm down with that, and either way we might all benefit from discussing and sharing ideas.
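
The Annotated idea can be sketched roughly as follows. This is not the dtypes implementation: the Bitfield marker and this structify function are hypothetical stand-ins written for illustration, and the sketch assumes Python 3.9+ (for typing.Annotated) and only uses the documented three-element form of a _fields_ entry for bit fields.

```python
import ctypes
from typing import Annotated, get_args, get_origin, get_type_hints


class Bitfield:
    """Hypothetical marker that carries a bit width inside Annotated[...]."""

    def __init__(self, bits):
        self.bits = bits


def structify(cls):
    """Hypothetical decorator: build _fields_ from (possibly Annotated) hints."""
    # include_extras=True keeps the Annotated metadata instead of stripping it.
    hints = get_type_hints(cls, include_extras=True)
    fields = []
    for name in cls.__annotations__:
        hint = hints[name]
        if get_origin(hint) is Annotated:
            ctype, *extras = get_args(hint)
            widths = [m.bits for m in extras if isinstance(m, Bitfield)]
            # A bit field is a 3-tuple (name, type, width) in _fields_.
            fields.append((name, ctype, widths[0]) if widths else (name, ctype))
        else:
            fields.append((name, hint))
    cls._fields_ = fields
    return cls


@structify
class Flags(ctypes.Structure):
    low: Annotated[ctypes.c_uint8, Bitfield(2)]
    high: Annotated[ctypes.c_uint8, Bitfield(6)]
    value: ctypes.c_uint16


flags = Flags()
flags.low = 3
flags.high = 40
flags.value = 1000
```

Keeping the width in Annotated metadata leaves the first argument a plain ctypes type, so anything that ignores the extras still sees an ordinary annotation.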

@picnixz
Member

picnixz commented Jun 15, 2024

I had something similar because I'm too lazy to declare structs and unions using the regular syntax. Instead, I have something like that:

@cschema
class MyStruct(cStruct):
    i: ctypes.c_int

    class a(cStruct):
        x: ctypes.c_longlong
        y: ctypes.c_longlong

which makes it equivalent to

class MyInnerStruct(ctypes.Structure):
    _fields_ = [('x', ctypes.c_longlong), ('y', ctypes.c_longlong)]

class MyStruct(ctypes.Structure):
    _fields_ = [('i', ctypes.c_int), ('a', MyInnerStruct)]

With this approach, I can have nested classes and unions. Also, I used cStruct and cUnion as new classes in order to add support for metaclass keyword arguments (which I cannot do on the native ctypes.Structure and ctypes.Union). For instance:

@cschema
class MyStruct(cStruct):
    class _(cStruct, anon=True):
        x: ctypes.c_longlong
        y: ctypes.c_longlong

becomes equivalent to

class MyInnerStruct(ctypes.Structure):
    _fields_ = [('x', ctypes.c_longlong), ('y', ctypes.c_longlong)]

class MyStruct(ctypes.Structure):
    _anonymous_ = ['_']
    _fields_ = [('_', MyInnerStruct)]

Similarly, I can have something like:

class MyStruct(cStruct, pack=32):
    pass

class MyUnion(cUnion, pack=32):
    pass

instead of

class MyStruct(ctypes.Structure):
    _pack_ = 32

class MyUnion(ctypes.Union):
    _pack_ = 32

Note that this last construction is only useful when using nested unions. I've added other features such as:

@cschema
class MyClass(cStruct):
    field: cArray[ctypes.c_int, 32]

to be equivalent to

class MyClass(ctypes.Structure):
    _fields_ = [('field', ctypes.c_int * 32)]

Again, cArray is a custom class with a special __class_getitem__ implementation. Similarly, I have something like:

@cschema
class MyClass(cStruct):
    field: cPointer[ctypes.c_int]

instead of

class MyClass(ctypes.Structure):
    _fields_ = [('field', ctypes.POINTER(ctypes.c_int))]

For type-checkers, I implemented my own mypy plugin since this construction is quite hacky. Note that I needed this @cschema decorator to be able to process the class body like for dataclasses and the special cStruct & co classes to support metaclass keyword arguments.

If someone is interested, I could try to make this implementation more elegant and perhaps close to how dataclasses are used.
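
For readers unfamiliar with the __class_getitem__ trick, here is a minimal sketch of how helpers like cArray and cPointer could work. It is an assumption about the general technique, not the code described above: these two classes are hypothetical, and they simply return the plain ctypes type, which is why a type checker would still need a plugin to understand them.

```python
import ctypes


class cArray:
    """Hypothetical helper: cArray[T, n] evaluates to the ctypes type T * n."""

    def __class_getitem__(cls, params):
        element_type, length = params
        return element_type * length


class cPointer:
    """Hypothetical helper: cPointer[T] evaluates to ctypes.POINTER(T)."""

    def __class_getitem__(cls, element_type):
        return ctypes.POINTER(element_type)


# The subscripted helpers are ordinary ctypes types at runtime, so they can
# be used directly in _fields_ (or picked up from annotations by a decorator).
class MyClass(ctypes.Structure):
    _fields_ = [
        ("field", cArray[ctypes.c_int, 32]),
        ("ptr", cPointer[ctypes.c_int]),
    ]


obj = MyClass()
obj.field[5] = 7
```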

@encukou
Member

encukou commented Jan 20, 2025

I think there's space for third-party libraries to experiment with such syntax. As far as I know, there are no roadblocks for a decorator that wants to read annotations and generate _fields_.

In the long run (that is in stdlib), I'm not convinced annotations are the way to go. For specifying extra info other than the type, the options feel rather ugly. (That's bitfields currently, but maybe we also want, for example, explicit offsets.)

With changes to ctypes, we could eventually have something like this instead:

class Foo(Structure):
    x = CField(c_int)
    flag = CField(c_int, bit_size=1)
    bits = CField(c_int, bit_size=3)

But that's a long way off; a third-party annotation-based decorator looks like a good idea for current CPython. (And if it needs anything from CPython that's not a clean, well-supported API, then we should fix ctypes.)
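
A third-party decorator of the kind mentioned here could look roughly like the sketch below. The name struct_from_annotations is made up for illustration; the sketch only relies on typing.get_type_hints and on the fact that _fields_ may be assigned after the class has been created, and it handles plain fields only (no bit fields, packing, or anonymous members).

```python
import ctypes
import typing


def struct_from_annotations(cls):
    """Hypothetical decorator: generate _fields_ from class annotations."""
    # get_type_hints resolves string annotations into real objects.
    hints = typing.get_type_hints(cls)
    # Use only this class's own annotations, in definition order.
    cls._fields_ = [(name, hints[name]) for name in cls.__annotations__]
    return cls


@struct_from_annotations
class S(ctypes.Structure):
    a: ctypes.c_int
    b: ctypes.c_char_p


s = S(1, b"hello")
```

As pointed out earlier in the thread, a type checker would still see s.a as c_int here, while the value at runtime is a plain int.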

@picnixz
Member

picnixz commented Jan 20, 2025

> For specifying extra info other than the type, the options feel rather ugly. (That's bitfields currently, but maybe we also want, for example, explicit offsets.)

I haven't thought about this one (ideally, the annotation-based syntax would have been ok if PEP-637 had been accepted, but that's not the case :(). If I have time, I'll try making a 3rd-party package.

> With changes to ctypes, we could eventually have something like this instead:

I may have done this locally as well actually. I don't remember (note that I never used my own library, it was just for fun :'))

@ZeroIntensity
Member

FWIW, I implemented this in my silly library a while back, but with mappings of Python types to C types (e.g. a type annotation of str was converted to c_wchar_p).
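
A mapping of this kind might be sketched as below. The table and the decorator name are assumptions made for illustration (they are not taken from the library mentioned above), and the choice of C type for each Python type is only one plausible convention.

```python
import ctypes

# Hypothetical mapping from Python types to ctypes types.
PY_TO_C = {
    int: ctypes.c_int,
    float: ctypes.c_double,
    str: ctypes.c_wchar_p,
    bytes: ctypes.c_char_p,
}


def struct_from_python_types(cls):
    """Hypothetical decorator: translate Python annotations into _fields_."""
    # Annotations that are not in the table are assumed to be ctypes types
    # already and are passed through unchanged. String annotations are not
    # handled in this sketch.
    cls._fields_ = [
        (name, PY_TO_C.get(tp, tp)) for name, tp in cls.__annotations__.items()
    ]
    return cls


@struct_from_python_types
class Person(ctypes.Structure):
    name: str
    age: int
    height: float


person = Person("Ada", 36, 1.7)
```

With this style the annotations match what attribute access returns at runtime, at the cost of hiding the exact C type from the reader.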

@zwergziege

> In the long run (that is in stdlib), I'm not convinced annotations are the way to go. For specifying extra info other than the type, the options feel rather ugly. (That's bitfields currently, but maybe we also want, for example, explicit offsets.)

Shouldn't it be possible to just write something similar to dataclasses with default values, such as this:

class Foo(Structure):
    x : c_int
    flag : c_int = CFieldInfo(bit_size=1)
    bits : c_int = CFieldInfo(bit_size=3)

Then we can extract type info from annotations and the rest from the CFieldInfo.
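
A rough sketch of how that extraction could work with today's ctypes follows. CFieldInfo and the cstruct decorator are hypothetical names; nothing like them exists in the standard library, and only bit_size is handled.

```python
import ctypes
import typing


class CFieldInfo:
    """Hypothetical container for per-field extras such as a bit width."""

    def __init__(self, bit_size=None):
        self.bit_size = bit_size


def cstruct(cls):
    """Hypothetical decorator: types from annotations, extras from defaults."""
    hints = typing.get_type_hints(cls)
    fields = []
    for name in cls.__annotations__:
        info = cls.__dict__.get(name)
        if isinstance(info, CFieldInfo) and info.bit_size is not None:
            fields.append((name, hints[name], info.bit_size))
        else:
            fields.append((name, hints[name]))
    # Assigning _fields_ replaces the CFieldInfo class attributes with
    # ctypes field descriptors of the same names.
    cls._fields_ = fields
    return cls


@cstruct
class Foo(ctypes.Structure):
    x: ctypes.c_int
    flag: ctypes.c_int = CFieldInfo(bit_size=1)
    bits: ctypes.c_int = CFieldInfo(bit_size=3)


foo = Foo()
foo.x = 7
foo.bits = 3
```

One caveat of this style is that a type checker will flag the CFieldInfo default as incompatible with the annotated type unless it is given special treatment, much as dataclasses.field() is.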
