BUG: Modules from Fortran interface not accessible in Numpy 2.x (current 2.1.2) #27622

Closed
2sn opened this issue Oct 23, 2024 · 12 comments · Fixed by #27695

@2sn
Contributor

2sn commented Oct 23, 2024

Describe the issue:

Using NumPy 2.x (tried 2.0.2 and 2.1.2), I can no longer access module data.

After running f2py I get the following pyf file:

!    -*- f90 -*-
! Note: the context of this file is case sensitive.

python module _solver ! in 
    interface  ! in :_solver
        module types ! in :_solver:solver.f90
            integer, parameter,optional :: int32=selected_int_kind(8)
            integer, parameter,optional :: int64=selected_int_kind(16)
            integer, parameter,optional :: real64=selected_real_kind(15)
        end module types
        module lanedata ! in :_solver:solver.f90
            use types, only: real64,int32
            integer(kind=4), parameter,optional :: maxdata=1048575
            real(kind=8), allocatable,dimension(:,:) :: theta
            integer(kind=4) :: ndata
        end module lanedata
        subroutine freelanedata ! in :_solver:solver.f90
            use lanedata, only: ndata,theta
        end subroutine freelanedata
        subroutine lane(dx,n,w) ! in :_solver:solver.f90
            use types, only: int32,real64
            use lanedata, only: maxdata,theta,ndata
            real(kind=8) intent(in) :: dx
            real(kind=8) intent(in) :: n
            real(kind=8) intent(in) :: w
        end subroutine lane
        subroutine rk4(x0,y0,y1,dx,n,w,z0,z1) ! in :_solver:solver.f90
            use types, only: real64
            real(kind=8) intent(in) :: x0
            real(kind=8) intent(in) :: y0
            real(kind=8) intent(in) :: y1
            real(kind=8) intent(in) :: dx
            real(kind=8) intent(in) :: n
            real(kind=8) intent(in) :: w
            real(kind=8) intent(out) :: z0
            real(kind=8) intent(out) :: z1
        end subroutine rk4
    end interface 
end python module _solver

! This file was auto-generated with f2py (version:2.1.2).
! See:
! https://web.archive.org/web/20140822061353/http://cens.ioc.ee/projects/f2py2e

but when I try to access the data, as suggested in the online doc (just checking that this is still as it used to be, https://numpy.org/doc/2.1/f2py/python-usage.html#fortran-90-module-data), I get

 In [2]: print(laneemden._solver.__doc__)
This module '_solver' is auto-generated with f2py (version:2.1.2).
Functions:
    freelanedata()
    lane(dx,n,w)
    z0,z1 = rk4(x0,y0,y1,dx,n,w)
.

or

In [3]: dir(laneemden._solver)
Out[3]: 
['__doc__',
 '__f2py_numpy_version__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__solver_error',
 '__spec__',
 '__version__',
 'freelanedata',
 'lane',
 'rk4']

so there is no trace of module lanedata. This used to work (and still does) in NumPy 1.x.
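
The same check can be scripted instead of reading `dir()` by eye. The following is a minimal sketch, not part of f2py: `missing_fortran_modules` is a hypothetical helper name, and the stand-in object only mimics the attribute listing shown above so that the snippet runs without a compiled extension.

```python
from types import SimpleNamespace


def missing_fortran_modules(extension, expected):
    """Return the expected Fortran module names that `extension` does not expose.

    `extension` can be any object with attributes, e.g. an f2py-built module.
    """
    return [name for name in expected if not hasattr(extension, name)]


# Stand-in mimicking the listing reported above (assumption: no real build here).
fake_solver = SimpleNamespace(freelanedata=None, lane=None, rk4=None)

print(missing_fortran_modules(fake_solver, ["lanedata"]))
```

With a real build one would pass the imported extension (here `laneemden._solver`) in place of the stand-in.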

Reproduce the code example:

from laneemden._solver import lanedata

Here is my code, solver.f90:

! 20111212 Alexander Heger

! 1/z**2 d/dz (z**2 d/dz theta(z)) + theta(z)**n = 0
! for small z one can approximate
! theta(z) = 1. + (-1/6.)*z**2 + (n/120.)*z**4 + O(z**6)
! Therefore lim(z)-->0 d**2 theta(z)/d z**2 = -1/3

! 20120423 Alexander Heger

! if we include a constant rotation rate Omega the equation becomes
! 1/z**2 d/dz (z**2 d/dz theta(z)) + theta(z)**n - w = 0
! where
! w = W/rho_c
! W = 2 Omega**2 / 4 pi G
! for small z one can approximate
! theta(z) = 1. + (w - 1)/6. *z**2 + (1.-w)*(n/120.)*z**4 + O(z**6)
! Therefore lim(z)-->0 d**2 theta(z)/d z**2 = (w-1.)/3

module types

  implicit none

  INTEGER, PARAMETER :: int32 = SELECTED_INT_KIND(8)
  INTEGER, PARAMETER :: int64 = SELECTED_INT_KIND(16)
  INTEGER, PARAMETER :: real64 = SELECTED_REAL_KIND(15)

end module types


module lanedata

  use types, only: &
       real64, int32

  implicit none

  save

  integer(kind=int32), parameter :: &
       maxdata = 2**20-1

  real(kind=real64), dimension(:,:), allocatable :: &
       theta

  integer(kind=int32) :: &
       ndata

end module lanedata


subroutine freelanedata()

  use lanedata, only: &
       ndata, theta

  implicit none

  if (allocated(theta)) deallocate(theta)
  ndata = -1

end subroutine freelanedata


subroutine lane(dx, n, w)

  ! lane emden integration, return data,
  ! including first invalid (rho < 0) point for interpolation.

  use types, only: &
       int32, real64

  use lanedata, only: &
       maxdata, theta, ndata

  implicit none

!f2py real(kind=real64), intent(in) :: dx, n, w

  real(kind=real64), intent(in) :: &
       dx, n, w

  real(kind=real64) :: &
       x
  integer(kind=int32) :: &
       i, j

  real(kind=real64), dimension(0:1) :: &
       y, z

  real(kind=real64), dimension(:,:), allocatable :: &
       theta_

  if (allocated(theta)) deallocate(theta)
  j = maxdata
  allocate(theta_(0:j, 0:1))

  x = 0.d0
  y(0:1) = (/1.d0, 0.d0/)

  i = 0
  theta_(i,:) = y(:)
  do while (.True.)
     call rk4(x,y(0),y(1),dx,n,w,z(0),z(1))
     if (i == j) then
        j = j + maxdata
        allocate(theta(0:j,0:1))
        theta(0:i,:) = theta_(0:i,:)
        deallocate(theta_)
        call move_alloc(theta, theta_)
     endif

     i = i+1
     theta_(i, :) = z(:)
     x = x + dx
     y(:) = z(:)

     if (y(0) < 0.d0) exit
  enddo
  ndata = i

  allocate(theta(0:ndata, 0:1))
  theta(0:ndata,0:1) = theta_(0:ndata,0:1)
  deallocate(theta_)

end subroutine lane


subroutine rk4(x0, y0, y1, dx, n, w, z0, z1)

  use types, only: &
       real64

  implicit none

  real(kind=real64), parameter :: &
       p13 = 1.d0 / 3.d0, &
       p16 = 1.d0 / 6.d0

  real(kind=real64), intent(in)  :: &
       x0, y0, y1
  real(kind=real64), intent(in)  :: &
       dx, n, w
  real(kind=real64), intent(out) :: &
       z0, z1

  real(kind=real64) :: &
       xh, dh
  real(kind=real64) :: &
       k10, k11, k20, k21, k30, k31, k40, k41

!f2py real(8), intent(in) :: x0, dx, n, w
!f2py real(8), intent(in) :: y0, y1
!f2py real(8), intent(out) :: z0, z1

  xh = x0 + 0.5d0 * dx
  dh = 0.5d0 * dx

  k10 = y1
  if (x0 == 0) then
     k11 = (w - 1.d0) * p13
  else
     k11 = -2.d0 / x0 * y1 - (max(y0, 0.d0))**n + w
  endif

  k20 = y1 + dh*k11
  k21 = -2.d0 / xh * k20 - (max(y0 + dh*k10, 0.d0))**n

  k30 = y1 + dh*k21
  k31 = -2.d0 / xh * k30 - (max(y0 + dh*k20, 0.d0))**n

  k40 = y1 + dx*k31
  k41 = -2.d0 / (x0+dx) * k40 - (max(y0 + dx*k30,0.d0))**n

  z0 = y0 + dx*(k10 + 2.d0 * (k20 + k30) + k40) * p16
  z1 = y1 + dx*(k11 + 2.d0 * (k21 + k31) + k41) * p16

end subroutine rk4

I use a custom build package that does not work with meson; its scripts are lengthy, so I won't include them here. (Sadly, the usually very helpful NumPy developers have so far refused, on principle, to include fixes for obvious bugs in the build script, as well as a small patch that would let my script continue to work. I had posted patches for both.)

The issue seems to be that there is no interface in module lanedata. If I include an interface in the definition of module lanedata so it becomes, e.g.,

module lanedata

  use types, only: &
       real64, int32

  implicit none

  save

  integer(kind=int32), parameter :: &
       maxdata = 2**20-1

  real(kind=real64), dimension(:,:), allocatable :: &
       theta

  integer(kind=int32) :: &
       ndata

contains

  subroutine phoney
  end subroutine phoney
  
end module lanedata

then the module is included and an interface is generated. So the new behaviour appears to be to skip modules without interface contents (adding just the contains line with an empty section does not suffice).

Maybe the NumPy 1.x behaviour can be restored, as it is common to have plain data modules without any functions or subroutines.

It is also possible that I missed a flag / behaviour change, my apologies in this case and please advise.

Error message:

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
Cell In[4], line 1
----> 1 from laneemden._solver import lanedata

ImportError: cannot import name 'lanedata' from 'laneemden._solver' (/home/alex/python/source/laneemden/_solver.cpython-311-x86_64-linux-gnu.so)

Python and NumPy Versions:

2.1.2
3.11.10 (main, Sep 8 2024, 14:25:06) [GCC 14.2.1 20240801 (Red Hat 14.2.1-1)]

Runtime Environment:

[{'numpy_version': '2.1.2',
'python': '3.11.10 (main, Sep 8 2024, 14:25:06) [GCC 14.2.1 20240801 (Red '
'Hat 14.2.1-1)]',
'uname': uname_result(system='Linux', node='w.2sn.net', release='6.11.3-200.fc40.x86_64', version='#1 SMP PREEMPT_DYNAMIC Thu Oct 10 22:31:19 UTC 2024', machine='x86_64')},
{'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
'found': ['SSSE3',
'SSE41',
'POPCNT',
'SSE42',
'AVX',
'F16C',
'FMA3',
'AVX2',
'AVX512F',
'AVX512CD',
'AVX512_SKX'],
'not_found': ['AVX512_KNL',
'AVX512_KNM',
'AVX512_CLX',
'AVX512_CNL',
'AVX512_ICL']}},
{'architecture': 'SkylakeX',
'filepath': '/home/alex/Python_3.11.10/lib/python3.11/site-packages/numpy.libs/libscipy_openblas64_-ff651d7f.so',
'internal_api': 'openblas',
'num_threads': 16,
'prefix': 'libscipy_openblas',
'threading_layer': 'pthreads',
'user_api': 'blas',
'version': '0.3.27'}]

Context for the issue:

This breaks existing code: module data that was accessible in NumPy < 2.0 can no longer be accessed.

@ngoldbaum
Member

ping @HaoZeke

@DWesl
Contributor

DWesl commented Oct 24, 2024

I ran into a similar issue recently; a workaround is to add

contains
subroutine has_module_lanedata()
end subroutine has_module_lanedata

before the end module lanedata (and similarly before end module types). NumPy assumes that data-only modules used by other Fortran code are for Fortran reference purposes only, and are not to be exported to Python.
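
For sources with many data-only modules, the workaround could be applied mechanically. The snippet below is only a sketch under stated assumptions: `add_dummy_subroutine` is a hypothetical helper (nothing in f2py provides it), it edits the Fortran source as plain text, and it assumes the target module has no `contains` section yet.

```python
import re

# Matches a line such as "end module lanedata" (case-insensitive, any indentation).
_END_MODULE = re.compile(
    r"^[ \t]*end[ \t]+module[ \t]+(?P<name>\w+)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def add_dummy_subroutine(source, module_name):
    """Insert an empty subroutine before `end module <module_name>`.

    Assumption: the module is data-only, i.e. it has no `contains` yet.
    """
    def patch(match):
        # Leave the `end module` lines of other modules untouched.
        if match.group("name").lower() != module_name.lower():
            return match.group(0)
        stub = f"has_module_{module_name}"
        return (
            "contains\n"
            f"  subroutine {stub}()\n"
            f"  end subroutine {stub}\n"
            f"{match.group(0)}"
        )

    return _END_MODULE.sub(patch, source)


example = """module lanedata
  implicit none
  integer :: ndata
end module lanedata
"""

print(add_dummy_subroutine(example, "lanedata"))
```

The patched text can then be written to a temporary file and passed to f2py in place of the original source.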

@2sn
Contributor Author

2sn commented Oct 24, 2024

@DWesl Thank you. Yes, this is a workaround for the current behaviour, as I wrote.

However,

  • this is not what f2py did in the past, so it breaks backward compatibility without need.
  • this assumption is unreasonable: I have far more modules with subroutines that are used only internally than the other way round.

If excluding modules is desired, an f2py directive to hide modules or parts thereof should be introduced.

Things I do not want in the interface I compile into external libraries. There used to be f2py commands to include/exclude specific subroutines/symbols on interface creation. Things I put in the interface, and for which I use f2py to make the pyf file, are there because I explicitly want them in the interface in the first place.

@ngoldbaum
Copy link
Member

FWIW I don't think this was an intended change. f2py is complicated, has poor test coverage, and few people understand the internals. We would have left it alone, but the removal of distutils by Python forced us to make some changes and you're seeing the fallout.

It's not a perfect situation at all. If you want to help to fix this issue that would be very appreciated.

@HaoZeke
Copy link
Member

HaoZeke commented Oct 25, 2024

Thanks everyone for the discussion here, I've been thinking of how best to resolve this and will have a PR up this weekend most likely. As far as I can remember this (poorly documented) breaking change had to do with the way F2PY was extended to handle derived types.

The solution is better docs + an explicit hiding directive to handle declaration only derived type modules.

@ngoldbaum
Copy link
Member

@HaoZeke is one of the people who have a lot of knowledge of f2py internals, so I'll defer to them :)

@2sn
Copy link
Contributor Author

2sn commented Oct 25, 2024

@HaoZeke You have always been very helpful in fixing things swiftly. Thank you so much for that!

My recent small patch suggestions to allow the setup for my f2py build framework to continue to work with Python 3.12 after the discontinuation of distutils (at least one of them seems a real bug, the other was adding a simple option for backward compatibility that would not have broken anything else or had side effects other than having to "maintain" three extra lines of code), however, were rejected, so I found it futile to suggest further patches myself.

What do people do in this case? Maybe make a branch and rebase every time there is a NumPy update? (For matplotlib I also just have a mplfixes module where obviously useful natural extensions were not accepted.)

@HaoZeke
Copy link
Member

HaoZeke commented Oct 25, 2024

> @HaoZeke You have always been very helpful in fixing things swiftly. Thank you so much for that!

Thanks, sorry this one took a while to get to.

> My recent small patch suggestions to allow the setup for my f2py build framework to continue to work with Python 3.12 after the discontinuation of distutils (at least one of them seems a real bug, the other was adding a simple option for backward compatibility that would not have broken anything else or had side effects other than having to "maintain" three extra lines of code), however, were rejected, so I found it futile to suggest further patches myself.

Sorry could you refresh my memory on this? Generally I'd love to try to get either existing working patchsets in or equivalent changes (i.e. those which pass the original test failure)

> What do people do in this case? Maybe make a branch and rebase every time there is a NumPy update? (For matplotlib I also just have a mplfixes module where obviously useful natural extensions were not accepted.)

There's (IMO terrible) precedent set in the form of f90wrap which has a hard dependency on f2py but is technically a different project; and eventually diverges and winds down..

For build systems, it is probable that a separate extension module will be used (since NumPy will not vendor any distutils code anymore), but making a branch and rebasing is not ergonomic (and confusing to everyone using it too).

@2sn
Copy link
Contributor Author

2sn commented Oct 25, 2024

@HaoZeke OK, I may try to cast these into real pull requests.

It is related to the thread #24874 with items
#24874 (comment)
and after
#24874 (comment)
as well as
#24874 (comment)
It is not obvious to me why users should not simply be allowed to pass their own flags if they really want to, even if it is not in the spirit of usual package distribution; some (possibly many) people just want to develop and compile locally for their own use, with no distribution intended.

@HaoZeke
Copy link
Member

HaoZeke commented Nov 3, 2024

I dug into this a bit, and it certainly is an interesting one, thanks for reporting! For starters, it seems to only show up in certain cases, for instance, this works:

module types

  implicit none

  INTEGER, PARAMETER :: int32 = SELECTED_INT_KIND(8)
  INTEGER, PARAMETER :: int64 = SELECTED_INT_KIND(16)
  INTEGER, PARAMETER :: real64 = SELECTED_REAL_KIND(15)
end module types

module lanedata

  use types, only: &
       real64, int32

  implicit none

  save

  integer(kind=int32), parameter :: &
       maxdata = 2**20-1

  real(kind=real64), dimension(:,:), allocatable :: &
       theta

  integer(kind=int32) :: &
       ndata

end module lanedata

in blah.f90 with f2py -m ltest blah.f90 -c works fine:

In [1]: import ltest

In [2]: dir(ltest)
Out[2]: 
['__doc__',
 '__f2py_numpy_version__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 '__version__',
 '_ltest_error',
 'lanedata']

In [3]: dir(ltest.lanedata)
Out[3]: 
['__call__',
 '__class__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'maxdata',
 'ndata']

In [4]: ltest.lanedata.ndata
Out[4]: array(0, dtype=int32)

In [5]: ltest.lanedata.maxdata
Out[5]: array(1048575, dtype=int32)

However, the bug is reproducible if a subroutine outside a module is present (a situation we don't currently have tests for) and there is more than one module in a single file.

module types

  implicit none

  INTEGER, PARAMETER :: int32 = SELECTED_INT_KIND(8)
  INTEGER, PARAMETER :: int64 = SELECTED_INT_KIND(16)
  INTEGER, PARAMETER :: real64 = SELECTED_REAL_KIND(15)

end module types

module lanedata

  use types, only: &
       real64, int32

  implicit none

  save

  integer(kind=int32), parameter :: &
       maxdata = 2**20-1

  real(kind=real64), dimension(:,:), allocatable :: &
       theta

  integer(kind=int32) :: &
       ndata

end module lanedata

subroutine freelanedata()

  use lanedata, only: &
       ndata, theta

  implicit none

  if (allocated(theta)) deallocate(theta)
  ndata = -1

end subroutine freelanedata

Where now, with f2py -m ltest blah.f90 -c:

In [1]: import ltest

In [2]: dir(ltest)
Out[2]: 
['__doc__',
 '__f2py_numpy_version__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 '__version__',
 '_ltest_error',
 'freelanedata']

In [3]: 

There are a few things I need to understand conceptually about what the intended behavior is though. The reason this happens is because when there are subroutines they get exported into a module of the same name as requested on the command line (ltest in this case), and there is likely a bug in the way the modules are (not) being exported. Should have a fix soon.
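
To make the comparison above repeatable, one could list the modules declared in the Fortran source and check them against what the built extension exposes. This is a rough sketch, not f2py functionality: `declared_modules` and `f2py_build_command` are hypothetical helper names, the regular expression only handles plain `module <name>` lines, and the command mirrors the `f2py -m ltest blah.f90 -c` invocation used here via `python -m numpy.f2py`.

```python
import re
import sys

# A line like "module lanedata", optionally followed by a "!" comment.
# "end module ..." and "module procedure ..." lines do not match.
_MODULE_DECL = re.compile(
    r"^[ \t]*module[ \t]+(\w+)[ \t]*(?:!.*)?$",
    re.IGNORECASE | re.MULTILINE,
)


def declared_modules(source):
    """Return the lower-cased names of the Fortran modules declared in `source`."""
    return [name.lower() for name in _MODULE_DECL.findall(source)]


def f2py_build_command(module_name, source_file):
    """Argument list equivalent to `f2py -m <module_name> <source_file> -c`."""
    return [sys.executable, "-m", "numpy.f2py", "-m", module_name, source_file, "-c"]


source = """module types
  implicit none
end module types

module lanedata
  implicit none
  integer :: ndata
end module lanedata

subroutine freelanedata()
end subroutine freelanedata
"""

print(declared_modules(source))
print(" ".join(f2py_build_command("ltest", "blah.f90")[1:]))
```

Passing the command list to `subprocess.run(..., check=True)` in the directory containing `blah.f90` would perform the build, provided NumPy and a Fortran compiler are installed; any name from `declared_modules` that is then absent from `dir(ltest)` was not exported.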

Bizarrely, I think there's something more specific to the modules reported here by @2sn; since this also works:

module datonly

  implicit none

  integer, parameter :: max_value = 100

  real, dimension(:), allocatable :: data_array

end module datonly


module dat

  implicit none

  integer, parameter :: max_= 1009

end module dat

subroutine simple_subroutine(arg)
  integer, intent(inout) :: arg
  arg = arg * 5
end subroutine simple_subroutine

The resulting extension exposes both modules as well as the subroutine:

In [1]: import datonly

In [2]: datonly.datonly.max_value
Out[2]: array(100, dtype=int32)

In [3]: datonly.datonly.max_value
Out[3]: array(100, dtype=int32)

In [4]: datonly.dat.max_
Out[4]: array(1009, dtype=int32)

In [5]: dir(datonly)
Out[5]: 
['__doc__',
 '__f2py_numpy_version__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 '__version__',
 '_datonly_error',
 'dat',
 'datonly',
 'simple_subroutine']

@2sn
Contributor Author

2sn commented Nov 4, 2024

@HaoZeke Thank you very much for looking into this.

I am not sure whether your question about the intent/use case is directed at me. What my Python code does is try to access the data in module lanedata directly. Obviously, one could add a function to return the data, but such calls also tend to be rather expensive.

I do not know how access to module data is done internally by f2py, maybe there is a similar implicit function call in the first place, but it is certainly easier in terms of syntax (unless one creates a fancy Python class wrapping such function calls).
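
For illustration, the "fancy class" alternative could look roughly like the sketch below. `LaneSolution` is a hypothetical wrapper, not something f2py generates, and the stand-in object replaces a real f2py extension (which would expose `ndata` and `theta` as NumPy arrays) so that the snippet runs without a compiler.

```python
from types import SimpleNamespace


class LaneSolution:
    """Hypothetical thin wrapper around the `lanedata` module data."""

    def __init__(self, solver):
        self._solver = solver

    def integrate(self, dx, n, w):
        # The Fortran routine fills lanedata.theta and lanedata.ndata.
        self._solver.lane(dx, n, w)
        return self.theta

    @property
    def theta(self):
        data = self._solver.lanedata
        # Rows 0..ndata are valid (the Fortran array is allocated as 0:ndata).
        return data.theta[: int(data.ndata) + 1]


# Stand-in for the compiled extension (assumption: values are made up).
fake = SimpleNamespace(lanedata=SimpleNamespace(theta=None, ndata=-1))


def _fake_lane(dx, n, w):
    fake.lanedata.theta = [[1.0, 0.0], [0.9, -0.1], [-0.05, -0.2]]
    fake.lanedata.ndata = 2


fake.lane = _fake_lane

solution = LaneSolution(fake).integrate(0.1, 1.5, 0.0)
print(len(solution))
```

The wrapper still depends on `lanedata` being exported, so it is a matter of convenience rather than a way around the missing module.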

Do you plan to have an f2py directive to hide or explicitly include modules (there used to be some f2py command-line flags to include or exclude symbols from wrapping), or just include all? Except for a small bit of overhead at compile time (and maybe a small increase in module size), what would be the reasons not to include all modules and their public members by default? Hiding private implementation details? Looking at your case study, it seems that what is included and what is not is based on guesses about what the user may have intended and desired, but it seems hard to anticipate actual use cases in general.

@HaoZeke
Copy link
Member

HaoZeke commented Nov 4, 2024

> @HaoZeke Thank you very much for looking into this.

> I am not sure whether your question about the intent/use case is directed at me. What my Python code does is try to access the data in module lanedata directly. Obviously, one could add a function to return the data, but such calls also tend to be rather expensive.

> I do not know how access to module data is done internally by f2py, maybe there is a similar implicit function call in the first place, but it is certainly easier in terms of syntax (unless one creates a fancy Python class wrapping such function calls).

> Do you plan to have an f2py directive to hide or explicitly include modules (there used to be some f2py command-line flags to include or exclude symbols from wrapping), or just include all? Except for a small bit of overhead at compile time (and maybe a small increase in module size), what would be the reasons not to include all modules and their public members by default? Hiding private implementation details? Looking at your case study, it seems that what is included and what is not is based on guesses about what the user may have intended and desired, but it seems hard to anticipate actual use cases in general.

Include all should definitely be the default (and is again after #27695). There are tests for the hide directives, so that should be OK.

I'm thinking of adding some examples on using derived types soon, but otherwise, mostly just thinking of bugfixes / docs in the short term.


5 participants