DEP: Deprecate aliases of builtin types in python 3.7+ #14882


Merged (2 commits, Jun 16, 2020)

Conversation


@eric-wieser eric-wieser commented Nov 11, 2019

This:

  • Makes accessing these attributes emit a deprecation warning, such as:

    np.float is a deprecated alias for the builtin float. Use float by itself, which is identical in behavior, to silence this warning. If you specifically wanted the numpy scalar type, use np.float_ here.

  • Removes them from dir(numpy), so as not to emit warnings for users of inspect.getmembers
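A minimal sketch of the mechanism described above, using PEP 562's module-level `__getattr__`/`__dir__` (Python 3.7+). The module and attribute names here are illustrative, not the actual numpy source:

```
# Sketch of the approach: accessing a deprecated alias warns but still
# returns the builtin, and __dir__ omits the alias entirely.
import types
import warnings

mod = types.ModuleType("demo")

_deprecated = {
    "float": (float, "`demo.float` is a deprecated alias for the builtin `float`."),
}

def _getattr(attr):
    # Deprecated aliases warn on access; everything else is a real error.
    try:
        val, msg = _deprecated[attr]
    except KeyError:
        raise AttributeError(f"module 'demo' has no attribute {attr!r}")
    warnings.warn(msg, DeprecationWarning, stacklevel=2)
    return val

def _dir():
    # Deprecated aliases never enter __dict__, so inspect.getmembers stays quiet.
    return [n for n in mod.__dict__ if not n.startswith("_")]

mod.__getattr__ = _getattr
mod.__dir__ = _dir

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    assert mod.float is float                                  # alias still works...
assert issubclass(caught[0].category, DeprecationWarning)      # ...but warns
assert "float" not in dir(mod)                                 # and is hidden from dir()
```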

Fixes #6103


Marking as draft until #14881 is merged

Marking as WIP until gh-14901 is merged

@eric-wieser eric-wieser force-pushed the 6103-alias-__getattr__-deprecation branch 3 times, most recently from 9858131 to 4f41f7f Compare November 12, 2019 01:48
@eric-wieser eric-wieser added 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes component: numpy._core labels Nov 12, 2019
@eric-wieser eric-wieser force-pushed the 6103-alias-__getattr__-deprecation branch 3 times, most recently from 4c06004 to 13a2e0d Compare November 12, 2019 09:54
@eric-wieser eric-wieser reopened this Nov 12, 2019
@eric-wieser eric-wieser force-pushed the 6103-alias-__getattr__-deprecation branch from 13a2e0d to b812d1a Compare November 13, 2019 10:01
@eric-wieser eric-wieser requested a review from njsmith November 13, 2019 10:05
@eric-wieser eric-wieser marked this pull request as ready for review November 13, 2019 10:11
@eric-wieser eric-wieser removed the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Nov 13, 2019
Comment on lines 162 to 190
```
import builtins as _builtins
__deprecated_attrs__.update({
    n: (
        getattr(_builtins, n),
        "`np.{n}` is a deprecated alias for the builtin `{n}`. "
        "Use `np.{n}_` if you meant the numpy type, and use `{n}` by "
        "itself if you meant the builtin.".format(n=n)
    )
    for n in ["bool", "int", "float", "complex", "object", "str"]
})
__deprecated_attrs__.update({
    n: (
        getattr(compat, n),
        "`np.{n}` is a deprecated alias for `np.compat.{n}`. "
        "Use `np.{n}_` if you meant the numpy type, use `np.compat.{n}` "
        "if you support python 2, and otherwise use `{n3}`."
        .format(n=n, n3=n3)
    )
    for n, n3 in [("long", "int"), ("unicode", "str")]
})
```
Member Author:

For reference, here's the original wording from @njsmith, which I'd forgotten about when I wrote this:

                "Writing 'np.{0}' is almost always a mistake. Historically "
                "and currently it is identical to writing plain '{0}' "
                "(i.e., it refers to the Python builtin), and at some point "
                "in the future it will become an error. Replace with '{0}', "
                "or if you want the numpy-specific type, use 'np.{1}'."
                .format(_name, _numpy_equiv)

If anyone prefers parts of this wording, I'd appreciate if they could suggest a combined version as a reply to this comment.

Member:

The majority of uses are likely as a `dtype=` argument, in which case the differences do not really matter (unless we change `int` on Windows). I do not prefer the history/mistake information though; it's cute for super-users but probably confusing for most users looking to update code quickly.

I am considering appending something like "Most users should prefer the Python builtins `{0}(2)` when creating scalars, and `dtype=np.{0}_` when creating arrays."

But I admit that e.g. for `bool` and `object` it does not really matter, and the Python scalars are just as good to signal the dtype.
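For the `dtype=` case discussed above, the builtin and the trailing-underscore NumPy type resolve to the same dtype, which is why either replacement works there. A quick check (`int` vs `np.int_` is the one pair that can differ, since `np.int_` is the platform C `long`):

```
import numpy as np

# For dtype= arguments, the builtin and the NumPy scalar type name the
# same dtype, so either spelling is an equivalent replacement.
assert np.dtype(float) == np.dtype(np.float64)
assert np.dtype(bool) == np.dtype(np.bool_)
assert np.dtype(object) == np.dtype(np.object_)
```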

Member:

I think we maybe should just use the same message in the second case with Py2 gone, except that it needs the two different names as inputs.

Member Author:

Can you elaborate on those comments with a suggested change?

Member:

Sorry, a bit of nitpicking and brainstorming... We are going to spend some "annoying users" chips on this, so I wanted to get it right.

But I have asked a (non-super-user) and rethought... and I think we should just keep it as is. I am not sure the typical user will really understand it. But the main point is: the typical user will not care at all; they will simply choose either of the replacements and be happy that it works, assuming (correctly) that if it likely mattered, we would tell them.

If we start getting questions, we could add a URL to keep the message short while providing details elsewhere.

@eric-wieser (Member Author), Jun 11, 2020:

Perhaps the message could emphasize that the way to make the warning go away without changing anything is to use int instead of np.int. The current warning steers them more towards np.int_, which is marginally more likely to break something.
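The distinction being weighed here can be seen directly: swapping `np.int` for `int` changes nothing, while `np.int_` is a genuine NumPy scalar type. A quick illustration (not from the PR itself):

```
import numpy as np

# int is the Python builtin; np.int_ is a NumPy scalar type, so the two
# suggested replacements are not interchangeable everywhere.
assert int is not np.int_
assert isinstance(np.int_(7), np.integer)   # NumPy scalar
assert not isinstance(7, np.integer)        # a plain Python int is not
```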

Member:

I like that idea. As mentioned above, I think I slightly prefer float64 over float_ as well. But no need to squabble about it.

Member Author:

The reason for mentioning float_ is that people will then hopefully spot the rule that the numpy types are the builtins with a _ appended.

Member:

True, that sounds fine too. I am happy with whatever small changes (or none) you still want to make; I don't have any more concrete proposals that would make a difference.

If anyone else comes up with concrete proposals, that is fine of course.

The DeprecationTest would be the main thing I would want to add before merging, but if it's not there in a few days I may just add it myself (I pinged the mailing list just in case, so I want to wait a tiny bit).

Member Author:

Updated with the test.

@eric-wieser eric-wieser force-pushed the 6103-alias-__getattr__-deprecation branch 2 times, most recently from 7470b51 to 23c0f06 Compare November 13, 2019 10:41
@eric-wieser eric-wieser force-pushed the 6103-alias-__getattr__-deprecation branch from 23c0f06 to e0b2723 Compare November 13, 2019 17:50
@eric-wieser eric-wieser force-pushed the 6103-alias-__getattr__-deprecation branch from e0b2723 to 773ca45 Compare November 14, 2019 12:51
@eric-wieser eric-wieser force-pushed the 6103-alias-__getattr__-deprecation branch 3 times, most recently from 6f157c6 to 7065287 Compare November 14, 2019 16:44
bryanwweber added a commit to bryanwweber/cantera that referenced this pull request Feb 3, 2021
NumPy 1.20 deprecates numpy namespace aliases to built-in types, such as
int, float, and object. Explicit specification of the precision with
float64 is still supported and does not need to be changed. See:
https://github.com/numpy/numpy/releases/tag/v1.20.0 and
numpy/numpy#14882
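The migration described in the commit message above amounts to swapping the alias for the builtin while leaving sized names alone. A hypothetical before/after:

```
import numpy as np

# Before: dtype=np.float (deprecated alias)  ->  After: dtype=float
a = np.array([1.5, 2.5], dtype=float)
# Sized types such as np.float64 are not aliases and need no change.
b = np.array([1.5, 2.5], dtype=np.float64)
assert a.dtype == b.dtype == np.float64
```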
speth pushed a commit to Cantera/cantera that referenced this pull request Feb 3, 2021
PicoCentauri pushed a commit to PicoCentauri/mdanalysis that referenced this pull request Mar 30, 2021
* fixes needed to get full MDA test suite passing
with NumPy `master` branch

* remove usage of pertinent deprecated NumPy type aliases
like `np.int`; see numpy/numpy#14882

* this only removes the deprecated type aliases that cause
MDAnalysis test suite failures; there are still deprecated uses
that do not cause failures, however the volume of warnings
emitted by the MDAnalysis test suite is so large that only
actual test failures are addressed at this time (I will try
to follow-up for the remainder later)

* it may be worthwhile to consider addition of a CI matrix
entry using a NumPy pre-release wheel, although that addition
is not made here
kboone added a commit to kboone/sncosmo that referenced this pull request Sep 8, 2021
srowen pushed a commit to apache/spark that referenced this pull request Mar 3, 2023
…ypes

### Problem description
Numpy has started changing the aliases to some of its data types. This means that users with the latest version of numpy will face either warnings or errors, according to the type that they are using. This affects all users on numpy > 1.20.0.
One of the types was fixed back in September with this [pull request](#37817).

[numpy 1.24.0](numpy/numpy#22607): The scalar type aliases ending in a 0 bit size: np.object0, np.str0, np.bytes0, np.void0, np.int0, np.uint0 as well as np.bool8 are now deprecated and will eventually be removed.
[numpy 1.20.0](numpy/numpy#14882): Using the aliases of builtin types like np.int is deprecated

### What changes were proposed in this pull request?
From numpy 1.20.0 we receive a deprecation warning on np.object (https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations), and from numpy 1.24.0 we receive an attribute error:

```
attr = 'object'

    def __getattr__(attr):
        # Warn for expired attributes, and return a dummy function
        # that always raises an exception.
        import warnings
        try:
            msg = __expired_functions__[attr]
        except KeyError:
            pass
        else:
            warnings.warn(msg, DeprecationWarning, stacklevel=2)

            def _expired(*args, **kwds):
                raise RuntimeError(msg)

            return _expired

        # Emit warnings for deprecated attributes
        try:
            val, msg = __deprecated_attrs__[attr]
        except KeyError:
            pass
        else:
            warnings.warn(msg, DeprecationWarning, stacklevel=2)
            return val

        if attr in __future_scalars__:
            # And future warnings for those that will change, but also give
            # the AttributeError
            warnings.warn(
                f"In the future `np.{attr}` will be defined as the "
                "corresponding NumPy scalar.", FutureWarning, stacklevel=2)

        if attr in __former_attrs__:
>           raise AttributeError(__former_attrs__[attr])
E           AttributeError: module 'numpy' has no attribute 'object'.
E           `np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe.
E           The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
E               https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
```

From numpy version 1.24.0 we receive a deprecation warning on np.object0 and every np.datatype0 and np.bool8:

```
>>> np.object0(123)
<stdin>:1: DeprecationWarning: `np.object0` is a deprecated alias for ``np.object0` is a deprecated alias for `np.object_`. `object` can be used instead.  (Deprecated NumPy 1.24)`.  (Deprecated NumPy 1.24)
```

### Why are the changes needed?
The changes are needed so pyspark can be compatible with the latest numpy and avoid

- attribute errors on data types being deprecated from version 1.20.0: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
- warnings on deprecated data types from version 1.24.0: https://numpy.org/devdocs/release/1.24.0-notes.html#deprecations

### Does this PR introduce _any_ user-facing change?
The change will suppress the warnings introduced in numpy 1.24.0 and the errors caused by the removal, in numpy 1.24.0, of the aliases deprecated in 1.20.0.

### How was this patch tested?
I assume that the existing tests should catch this (see the section "Extra questions").

I found this to be a problem in my work's project, where our unit tests use the toPandas() function, which converts to np.object. Attaching the run result of our test:

```

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/local/lib/python3.9/dist-packages/<my-pkg>/unit/spark_test.py:64: in run_testcase
    self.handler.compare_df(result, expected, config=self.compare_config)
/usr/local/lib/python3.9/dist-packages/<my-pkg>/spark_test_handler.py:38: in compare_df
    actual_pd = actual.toPandas().sort_values(by=sort_columns, ignore_index=True)
/usr/local/lib/python3.9/dist-packages/pyspark/sql/pandas/conversion.py:232: in toPandas
    corrected_dtypes[index] = np.object  # type: ignore[attr-defined]
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

attr = 'object'

    def __getattr__(attr):
        # Warn for expired attributes, and return a dummy function
        # that always raises an exception.
        import warnings
        try:
            msg = __expired_functions__[attr]
        except KeyError:
            pass
        else:
            warnings.warn(msg, DeprecationWarning, stacklevel=2)

            def _expired(*args, **kwds):
                raise RuntimeError(msg)

            return _expired

        # Emit warnings for deprecated attributes
        try:
            val, msg = __deprecated_attrs__[attr]
        except KeyError:
            pass
        else:
            warnings.warn(msg, DeprecationWarning, stacklevel=2)
            return val

        if attr in __future_scalars__:
            # And future warnings for those that will change, but also give
            # the AttributeError
            warnings.warn(
                f"In the future `np.{attr}` will be defined as the "
                "corresponding NumPy scalar.", FutureWarning, stacklevel=2)

        if attr in __former_attrs__:
>           raise AttributeError(__former_attrs__[attr])
E           AttributeError: module 'numpy' has no attribute 'object'.
E           `np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe.
E           The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
E               https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

/usr/local/lib/python3.9/dist-packages/numpy/__init__.py:305: AttributeError
```

Although I cannot provide that code, doing the following in Python should show the problem:
```
>>> import numpy as np
>>> np.object0(123)
<stdin>:1: DeprecationWarning: `np.object0` is a deprecated alias for ``np.object0` is a deprecated alias for `np.object_`. `object` can be used instead.  (Deprecated NumPy 1.24)`.  (Deprecated NumPy 1.24)
123
>>> np.object(123)
<stdin>:1: FutureWarning: In the future `np.object` will be defined as the corresponding NumPy scalar.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.9/dist-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'object'.
`np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
```

I do not have a use-case in my tests for np.object0, but I fixed it following the suggestion from numpy.

### Supported Versions:
I propose this fix to be included in pyspark 3.3 and onwards.

### JIRA
I know a JIRA ticket should be created; I sent an email and am waiting for the answer so I can document the case there as well.

### Extra questions:
By grepping for np.bool and np.object I see that the tests include them. Shall we change them also? Data types with a trailing _ are, I think, not affected.

```
git grep np.object
python/pyspark/ml/functions.py:        return data.dtype == np.object_ and isinstance(data.iloc[0], (np.ndarray, list))
python/pyspark/ml/functions.py:        return any(data.dtypes == np.object_) and any(
python/pyspark/sql/tests/test_dataframe.py:        self.assertEqual(types[1], np.object)
python/pyspark/sql/tests/test_dataframe.py:        self.assertEqual(types[4], np.object)  # datetime.date
python/pyspark/sql/tests/test_dataframe.py:        self.assertEqual(types[1], np.object)
python/pyspark/sql/tests/test_dataframe.py:                self.assertEqual(types[6], np.object)
python/pyspark/sql/tests/test_dataframe.py:                self.assertEqual(types[7], np.object)

git grep np.bool
python/docs/source/user_guide/pandas_on_spark/types.rst:np.bool       BooleanType
python/pyspark/pandas/indexing.py:            isinstance(key, np.bool_) for key in cols_sel
python/pyspark/pandas/tests/test_typedef.py:            np.bool: (np.bool, BooleanType()),
python/pyspark/pandas/tests/test_typedef.py:            bool: (np.bool, BooleanType()),
python/pyspark/pandas/typedef/typehints.py:    elif tpe in (bool, np.bool_, "bool", "?"):
python/pyspark/sql/connect/expressions.py:                assert isinstance(value, (bool, np.bool_))
python/pyspark/sql/connect/expressions.py:                elif isinstance(value, np.bool_):
python/pyspark/sql/tests/test_dataframe.py:        self.assertEqual(types[2], np.bool)
python/pyspark/sql/tests/test_functions.py:            (np.bool_, [("true", "boolean")]),
```

If yes, and given that the earlier bool fix was merged already, should we fix these too?
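For the grepped lines above, the fix in each case is the direct swap, since dtype comparisons accept the builtin. A sketch of the pattern (not the actual pyspark test code):

```
import numpy as np

# Pattern for the grepped test assertions: the builtin compares equal
# to the object/bool dtypes exactly as the deprecated alias did.
types = [np.array([None, "x"]).dtype, np.array([True, False]).dtype]
assert types[0] == object   # was: assertEqual(types[i], np.object)
assert types[1] == bool     # was: assertEqual(types[i], np.bool)
```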

Closes #40220 from aimtsou/numpy-patch.

Authored-by: Aimilios Tsouvelekakis <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
srowen pushed a commit to apache/spark that referenced this pull request Mar 3, 2023
(cherry picked from commit b3c26b8)
Signed-off-by: Sean Owen <[email protected]>
srowen pushed a commit to apache/spark that referenced this pull request Mar 3, 2023
…ypes

### Problem description
Numpy has started changing the alias to some of its data-types. This means that users with the latest version of numpy they will face either warnings or errors according to the type that they are using. This affects all the users using numoy > 1.20.0
One of the types was fixed back in September with this [pull](#37817) request

[numpy 1.24.0](numpy/numpy#22607): The scalar type aliases ending in a 0 bit size: np.object0, np.str0, np.bytes0, np.void0, np.int0, np.uint0 as well as np.bool8 are now deprecated and will eventually be removed.
[numpy 1.20.0](numpy/numpy#14882): Using the aliases of builtin types like np.int is deprecated

### What changes were proposed in this pull request?
From numpy 1.20.0 we receive a deprecattion warning on np.object(https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations) and from numpy 1.24.0 we received an attribute error:

```
attr = 'object'

    def __getattr__(attr):
        # Warn for expired attributes, and return a dummy function
        # that always raises an exception.
        import warnings
        try:
            msg = __expired_functions__[attr]
        except KeyError:
            pass
        else:
            warnings.warn(msg, DeprecationWarning, stacklevel=2)

            def _expired(*args, **kwds):
                raise RuntimeError(msg)

            return _expired

        # Emit warnings for deprecated attributes
        try:
            val, msg = __deprecated_attrs__[attr]
        except KeyError:
            pass
        else:
            warnings.warn(msg, DeprecationWarning, stacklevel=2)
            return val

        if attr in __future_scalars__:
            # And future warnings for those that will change, but also give
            # the AttributeError
            warnings.warn(
                f"In the future `np.{attr}` will be defined as the "
                "corresponding NumPy scalar.", FutureWarning, stacklevel=2)

        if attr in __former_attrs__:
>           raise AttributeError(__former_attrs__[attr])
E           AttributeError: module 'numpy' has no attribute 'object'.
E           `np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe.
E           The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
E               https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
```

From numpy version 1.24.0 we receive a deprecation warning on np.object0 and every np.datatype0 and np.bool8
>>> np.object0(123)
<stdin>:1: DeprecationWarning: `np.object0` is a deprecated alias for ``np.object0` is a deprecated alias for `np.object_`. `object` can be used instead.  (Deprecated NumPy 1.24)`.  (Deprecated NumPy 1.24)

### Why are the changes needed?
The changes are needed so pyspark can be compatible with the latest numpy and avoid

- attribute errors on data types being deprecated from version 1.20.0: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
- warnings on deprecated data types from version 1.24.0: https://numpy.org/devdocs/release/1.24.0-notes.html#deprecations

### Does this PR introduce _any_ user-facing change?
The change will suppress the warning coming from numpy 1.24.0 and the error also coming from numpy 1.24.0, where the aliases deprecated in 1.20.0 were removed.
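For downstream code that cannot migrate immediately, a targeted warnings filter is one stop-gap. This is a sketch only, not part of this patch; the message pattern is an assumption matching the warnings quoted in this description:

```python
import warnings

# Silence only the alias DeprecationWarnings during migration; the
# permanent fix is still to stop using the deprecated aliases.
warnings.filterwarnings(
    "ignore",
    message=r".*is a deprecated alias.*",
    category=DeprecationWarning,
)
```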

### How was this patch tested?
I assume that the existing tests should catch this (see also the Extra questions section).

I found this to be a problem in my work's project, where our unit tests use the toPandas() function, which converts to np.object. Attaching the run result of our test:

```

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/local/lib/python3.9/dist-packages/<my-pkg>/unit/spark_test.py:64: in run_testcase
    self.handler.compare_df(result, expected, config=self.compare_config)
/usr/local/lib/python3.9/dist-packages/<my-pkg>/spark_test_handler.py:38: in compare_df
    actual_pd = actual.toPandas().sort_values(by=sort_columns, ignore_index=True)
/usr/local/lib/python3.9/dist-packages/pyspark/sql/pandas/conversion.py:232: in toPandas
    corrected_dtypes[index] = np.object  # type: ignore[attr-defined]
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

attr = 'object'

    def __getattr__(attr):
        # Warn for expired attributes, and return a dummy function
        # that always raises an exception.
        import warnings
        try:
            msg = __expired_functions__[attr]
        except KeyError:
            pass
        else:
            warnings.warn(msg, DeprecationWarning, stacklevel=2)

            def _expired(*args, **kwds):
                raise RuntimeError(msg)

            return _expired

        # Emit warnings for deprecated attributes
        try:
            val, msg = __deprecated_attrs__[attr]
        except KeyError:
            pass
        else:
            warnings.warn(msg, DeprecationWarning, stacklevel=2)
            return val

        if attr in __future_scalars__:
            # And future warnings for those that will change, but also give
            # the AttributeError
            warnings.warn(
                f"In the future `np.{attr}` will be defined as the "
                "corresponding NumPy scalar.", FutureWarning, stacklevel=2)

        if attr in __former_attrs__:
>           raise AttributeError(__former_attrs__[attr])
E           AttributeError: module 'numpy' has no attribute 'object'.
E           `np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe.
E           The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
E               https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

/usr/local/lib/python3.9/dist-packages/numpy/__init__.py:305: AttributeError
```

Although I cannot share the actual code, running the following in Python shows the problem:
```
>>> import numpy as np
>>> np.object0(123)
<stdin>:1: DeprecationWarning: `np.object0` is a deprecated alias for `np.object_`. `object` can be used instead.  (Deprecated NumPy 1.24)
123
>>> np.object(123)
<stdin>:1: FutureWarning: In the future `np.object` will be defined as the corresponding NumPy scalar.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.9/dist-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'object'.
`np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
```
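For code that must run against several numpy versions, the presence of the alias can also be probed without tripping either the warning or the error. A sketch, under the assumption that a feature test is preferred over a version comparison (the function name is made up):

```python
import warnings

def has_object_alias(module):
    """Return True if `module.object` still exists (pre-removal numpy)."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # older numpy warns on this access
        try:
            getattr(module, "object")
        except AttributeError:
            return False
    return True
```

On numpy >= 1.24 this returns False, so callers can fall back to the builtin `object`.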

I do not have a use case in my tests for np.object0, but I fixed it as the NumPy warning suggests.

### Supported Versions:
I propose this fix be included in pyspark 3.3 and onwards.

### JIRA
I know a JIRA ticket should be created; I sent an email and am waiting for an answer so I can document the case there as well.

### Extra questions:
Grepping for np.bool and np.object shows that the tests still use them. Shall we change them as well? Data types ending in _ should not be affected, I think.

```
git grep np.object
python/pyspark/ml/functions.py:        return data.dtype == np.object_ and isinstance(data.iloc[0], (np.ndarray, list))
python/pyspark/ml/functions.py:        return any(data.dtypes == np.object_) and any(
python/pyspark/sql/tests/test_dataframe.py:        self.assertEqual(types[1], np.object)
python/pyspark/sql/tests/test_dataframe.py:        self.assertEqual(types[4], np.object)  # datetime.date
python/pyspark/sql/tests/test_dataframe.py:        self.assertEqual(types[1], np.object)
python/pyspark/sql/tests/test_dataframe.py:                self.assertEqual(types[6], np.object)
python/pyspark/sql/tests/test_dataframe.py:                self.assertEqual(types[7], np.object)

git grep np.bool
python/docs/source/user_guide/pandas_on_spark/types.rst:np.bool       BooleanType
python/pyspark/pandas/indexing.py:            isinstance(key, np.bool_) for key in cols_sel
python/pyspark/pandas/tests/test_typedef.py:            np.bool: (np.bool, BooleanType()),
python/pyspark/pandas/tests/test_typedef.py:            bool: (np.bool, BooleanType()),
python/pyspark/pandas/typedef/typehints.py:    elif tpe in (bool, np.bool_, "bool", "?"):
python/pyspark/sql/connect/expressions.py:                assert isinstance(value, (bool, np.bool_))
python/pyspark/sql/connect/expressions.py:                elif isinstance(value, np.bool_):
python/pyspark/sql/tests/test_dataframe.py:        self.assertEqual(types[2], np.bool)
python/pyspark/sql/tests/test_functions.py:            (np.bool_, [("true", "boolean")]),
```
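For the np.object occurrences in the grep output above, the assertions keep working with either spelling, since numpy dtype comparison accepts both the scalar type and the builtin. A sketch with made-up values standing in for the real `types` list from the tests:

```python
import numpy as np

# Hypothetical stand-in for the dtypes returned by toPandas() in the tests:
types = [np.dtype("int64"), np.dtype("O")]

assert types[1] == np.object_   # scalar-type spelling, works on all versions
assert types[1] == object       # builtin spelling also compares equal
```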

If yes: the change concerning bool was already merged; should we fix it too?

Closes #40220 from aimtsou/numpy-patch.

Authored-by: Aimilios Tsouvelekakis <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
(cherry picked from commit b3c26b8)
Signed-off-by: Sean Owen <[email protected]>
snmvaughan pushed a commit to snmvaughan/spark that referenced this pull request Jun 20, 2023
…ypes

### Problem description
NumPy has started removing the aliases for some of its data types. This means that users on the latest version of numpy will face either warnings or errors, depending on the type they use. This affects all users on numpy > 1.20.0.
One of the types was already fixed back in September with this [pull request](apache#37817).

[numpy 1.24.0](numpy/numpy#22607): The scalar type aliases ending in a 0 bit size: np.object0, np.str0, np.bytes0, np.void0, np.int0, np.uint0 as well as np.bool8 are now deprecated and will eventually be removed.
[numpy 1.20.0](numpy/numpy#14882): Using the aliases of builtin types like np.int is deprecated

### What changes were proposed in this pull request?
From numpy 1.20.0 we receive a deprecation warning on np.object (https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations) and from numpy 1.24.0 we receive an attribute error; the traceback, reproduction steps, and the rest of this commit message are identical to the entry quoted above.
sanjibansg added a commit to sanjibansg/root that referenced this pull request Aug 23, 2023
….19 or >=1.24

Because of the changed behavior of np.bool and similar aliases for builtin
data types, we need to restrict the numpy version to the stated range for sonnet.

For more information, refer here:
numpy/numpy#14882
numpy/numpy#22607
sanjibansg added a commit to sanjibansg/root that referenced this pull request Aug 23, 2023
….19 or >=1.24

sanjibansg added a commit to sanjibansg/root that referenced this pull request Aug 25, 2023
….19 or >=1.24

sanjibansg added a commit to sanjibansg/root that referenced this pull request Aug 30, 2023
…ed within <=1.19 or >=1.24

fix: definition of OutputGenerated in RModel_Base
sanjibansg added a commit to sanjibansg/root that referenced this pull request Sep 1, 2023
…s) and restricting numpy version

avoid trying to load sonnet and graph_nets if not installed

Co-Authored-By: moneta <[email protected]>

[tmva][sofie-gnn] Suppress warnings for cases other than .dat file in method WriteInitializedTensorsToFile in RModel
sanjibansg added a commit to sanjibansg/root that referenced this pull request Sep 1, 2023
…s) and restricting numpy version

[tmva][sofie-gnn] Fix node update in GNN and size of global features in GraphIndependent

[tmva][sofie-gnn] Fix node update in RModel_GNN generated code

[tmva][sofie-gnn] Fix for correct size of global features in GraphIndependent

fix also the computation of output features in RModel_GNN

Fix dimension of global feature tensor during node update

If the number of nodes is larger than the number of edges, the tensor storing the global features needs to be resized to number of nodes * number of features

[tmva][sofie-gnn] Fix importing _gnn if python version is less than 3.8

Improve also gnn test and address some of the Vincenzo's comments

Changes addressing comments by @vepadulano

Co-authored-by: moneta <[email protected]>
sanjibansg added a commit to sanjibansg/root that referenced this pull request Sep 1, 2023
…s) and restricting numpy version

lmoneta added a commit to root-project/root that referenced this pull request Sep 4, 2023
…s) and restricting numpy version

maksgraczyk pushed a commit to maksgraczyk/root that referenced this pull request Jan 12, 2024
…s) and restricting numpy version
