
Conversation

celestinoxp
Contributor

PyCaret is not compatible with the latest versions of its dependencies, because development has been stalled for a long time. To address this, the following strategy is proposed to resume development in a controlled manner.

Strategy:

  1. Pin dependency versions: Restrict dependencies to a specific point (e.g., ~1 year ago), such as sktime>=0.31,<0.32, to prevent newer, incompatible versions (e.g., 0.36) from being installed and causing errors.
  2. Run tests: Ensure GitHub CI tests pass with these pinned versions to establish a stable baseline.
  3. Gradual updates: Incrementally update dependencies, adapting PyCaret's code as needed and verifying test results at each step.
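Step 1 above could be sketched as a requirements fragment. The version caps below are illustrative placeholders; the real bounds must be taken from the last known-good CI run, not from this sketch:

```
# Hypothetical pins for step 1 (illustrative versions only)
sktime>=0.31,<0.32   # keep out 0.36, which is known to break
joblib>=1.3,<1.4     # this PR thread later traces failures to joblib 1.4
```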

This PR starts by downgrading key dependencies to their last known compatible versions and will evolve as the codebase is brought up to date.

celestinoxp and others added 13 commits February 20, 2025 16:14
- Update fugue to its latest version, 0.9.1.
- Added support for the `report` parameter in FugueBackend, plus tests for compatibility with fugue 0.9.1.
These tests have a bug, but we need to investigate whether it is a shap bug or a PyCaret bug.
I created an issue to discuss and investigate this problem: pycaret#4152
…guments

Correctly pass call_id to _persist_input in FastMemorizedFunc.call to resolve
TypeError: MemorizedFunc._persist_input() missing 1 required positional argument:
'kwargs'. This ensures compatibility with joblib's signature while maintaining
PyCaret's caching optimizations.
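The fix above can be illustrated with a minimal, hypothetical model of the situation; the class and method names echo the commit message, but the bodies are stand-ins, not joblib's real internals:

```python
# Hypothetical stand-ins for joblib's classes, illustrating the TypeError:
# the base method requires a `kwargs` parameter, so an override that calls
# it with an older, shorter argument list breaks.
class MemorizedFunc:
    def _persist_input(self, duration, call_id, args, kwargs):
        # newer versions require both `args` and `kwargs` positionally
        return {"call_id": call_id, "args": args, "kwargs": kwargs}


class FastMemorizedFunc(MemorizedFunc):
    def call(self, *args, **kwargs):
        call_id = "example-call-id"  # placeholder; real ids come from hashing
        # The fix: forward every argument the parent now expects, including
        # `kwargs`, instead of the outdated shorter signature.
        return self._persist_input(0.0, call_id, args, kwargs)
```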
@celestinoxp
Contributor Author

celestinoxp commented Feb 25, 2025

@Smartappli @amotl Would you be available to lend a hand here? I'm having trouble understanding what's happening with the tests. joblib does not display an explanatory message that allows you to get to the root of the problem.

@amotl
Contributor

amotl commented Feb 25, 2025

Run tests: Ensure GitHub CI tests pass with these pinned versions to establish a stable baseline.

Hi @celestinoxp. Thanks a stack for dedicating work to this. 💯

This PR run responded like:

This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.

oO. That's probably one of those super hard errors to debug, aggravated by the fact that it is happening on a remote CI system.

But hey, is it really true that you brought it down to just 1 failed test run? That would be so excellent already!

I don't know what I could contribute to improve the situation here. Recently, I heard colleagues talking about ways to log into GHA's execution context from your workstation, so that you can interactively use the environment where the problem happens. Would it help if I dug out the corresponding information and presented it here, for anyone who would like to take a closer look at what is happening within CI/GHA/PyCaret at the particular spot where the workflow goes south?

Is the test suite actually succeeding on your machine, and just failing on CI/GHA?

@celestinoxp
Contributor Author

celestinoxp commented Feb 26, 2025

@amotl I will fix everything :) The problem is compatibility with joblib 1.4, and I will solve it as soon as I have some time available**.
There are still some other issues to resolve, such as the latest versions of some packages. For now, to avoid future breakage, it is necessary to cap all packages with the "<" operator so they do not break later, e.g. `package < max_supported_version + 1`.
If you want to talk to me directly, we can talk on the PyCaret Slack or the sktime Discord.
** (If I could get funding, I could dedicate myself full time to PyCaret.)

@amotl
Contributor

amotl commented Feb 26, 2025

Hi.

I will fix everything.

Excellent! 🚀

It is necessary to mark all packages with the "<" symbol.

I am doing the same with my libraries where I do not aim for unrestricted downstream havoc. 👍

The problem is in compatibility with joblib 1.4.

All right. Is it possible to also fix it by limiting the upper version boundary like with the other dependencies?

If I could get funding, I could dedicate myself full time to PyCaret.

That sounds interesting, and I think we should pursue it collaboratively. I can also ask about allocating part of our next financing round, but I am not a C-level person, so it is not likely I can make a significant difference in convincing others about it.

With kind regards,
Andreas.

Increase tolerance for execution time in setup performance test

- Increased the tolerance for execution time differences from 0.2 to 0.6 in `test_setup_performance`.
- This change addresses occasional test failures due to small variations in execution time.
- A TODO comment has been added to investigate the root cause of the increased execution time in the future.

@fkiraly fkiraly left a comment


I would suggest:

  • leave lower bounds if they were in the last release
  • add upper bounds to last known stable test run
  • do not use pins (see above)

More generally, the de-facto policy used in sktime is this:
sktime/sktime#1480

I think the defensive policy might also be advisable, due to the larger number of soft deps. But that is a discussion for a later day.

@amotl
Contributor

amotl commented Feb 28, 2025

Hi @celestinoxp. FWIW, I've wired your branch into our CI workflow runs, which invoke a few Jupyter notebooks that use PyCaret.


You may want to inspect a failed test run over here; the errors mostly revolve around this:

AttributeError: 'list' object has no attribute 'get'

However, because this happens in pycaret/loggers/mlflow_logger.py, and we use the software together with our mlflow-cratedb, which is also not up to date, we think it might be a compatibility issue on our side, so PyCaret need not be concerned about it.

@celestinoxp
Contributor Author

@amotl there is a problem with sktime, which probably has a bug that causes a joblib error to appear.
The current error is: "RecursionError: maximum recursion depth exceeded".
I'm investigating the source of the problem; I even made a temporary monkey patch, but it doesn't seem to work well.
If you want to help investigate, it would be a plus, as that is all that is missing for all the tests to pass.

@amotl
Contributor

amotl commented Mar 1, 2025

Hi. Are you able to improve the situation by increasing the recursion limit, using sys.setrecursionlimit()?
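The suggestion above can be tried with a one-liner; note that raising the limit only buys headroom, and a genuinely unbounded recursion (as suspected here) will still crash once the new limit is reached:

```python
import sys

# Raise the interpreter recursion limit, keeping it if already higher.
old_limit = sys.getrecursionlimit()
sys.setrecursionlimit(max(old_limit, 5000))
```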

@celestinoxp
Contributor Author

celestinoxp commented Mar 1, 2025

@amotl I tried a temporary solution with `sys.setrecursionlimit()`, but tests/test_time_series_blending.py is still not passing... because there is an infinite loop (maybe from sktime).

fkiraly pushed a commit to sktime/sktime that referenced this pull request Mar 2, 2025
…etitemError` in `SummaryTransformer` (#7903)

### Issue
The `_custom_showwarning` method in `sktime/utils/warnings.py` causes
infinite recursion when handling certain warnings (e.g., pandas
`LossySetitemError`), resulting in a `RecursionError: maximum recursion
depth exceeded`. Additionally, in
`sktime/transformations/series/summarize.py`, the line `func_dict.loc[:,
"window"] = func_dict["window"].astype("object")` throws a
`LossySetitemError` in pandas 2.x due to incompatible type coercion.

This issue was identified while using PyCaret with sktime, where tests
failed with a crash in a joblib worker due to a `PicklingError` caused
by excessive recursion.

### Solution
- Modified the `_custom_showwarning` method in the
`_SuppressWarningPattern` class to add a guard against recursion using
an `_in_warning` flag, while preserving delegation to
`original_showwarning`.
- Fixed `SummaryTransformer` to use a type-safe conversion with
`func_dict["window"] = func_dict["window"].astype("object",
copy=False)`, avoiding the `LossySetitemError`.
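The re-entrancy guard described above can be sketched as follows; this is an editorial illustration of the `_in_warning` pattern, not sktime's actual `_SuppressWarningPattern` code:

```python
import warnings


class GuardedShowWarning:
    """Wraps a showwarning hook and refuses to re-enter itself."""

    def __init__(self, original_showwarning):
        self.original_showwarning = original_showwarning
        self._in_warning = False  # guard flag against recursive re-entry

    def __call__(self, message, category, filename, lineno, file=None, line=None):
        if self._in_warning:
            return  # already handling a warning; bail out instead of recursing
        self._in_warning = True
        try:
            # custom filtering logic would go here; then delegate onward
            self.original_showwarning(message, category, filename, lineno, file, line)
        finally:
            self._in_warning = False


# Installing the guard (commented out to avoid mutating global state here):
# warnings.showwarning = GuardedShowWarning(warnings.showwarning)
```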

### Changes
- `sktime/utils/warnings.py`: Added protection against infinite
recursion in the `_custom_showwarning` method of
`_SuppressWarningPattern` using an `_in_warning` flag.
- `sktime/transformations/series/summarize.py`: Replaced the problematic
assignment with an explicit conversion of the `"window"` column to
`object` using `copy=False` for greater efficiency.
- `sktime/utils/tests/test_warnings.py`: Added a new test file to verify
that `_custom_showwarning` in `_SuppressWarningPattern` emits warnings
without recursion.
- `sktime/transformations/series/tests/test_summarize.py`: Added a new
test `test_summarize_no_lossy_setitem` to confirm that
`SummaryTransformer.fit` does not raise `LossySetitemError`.

### Related
- Investigated in PyCaret PR
[#4150](pycaret/pycaret#4150)

### Verification
Tested locally with PyCaret and sktime, resolving the crash in the
`test_blend_model_predict` test. The change in `summarize.py` ensures
compatibility with pandas 2.x while maintaining efficiency with
`copy=False`.
@celestinoxp celestinoxp marked this pull request as ready for review March 3, 2025 13:39
@celestinoxp
Contributor Author

celestinoxp commented Mar 3, 2025

@moezali1 @Yard1 @ngupta23 @fkiraly @amotl @PabloJMoreno I made the necessary changes for the tests to pass. From here on, each change should have its own dedicated PR.
Anyone interested in further corrections or improvements should contact me via Slack or email.

@amotl
Contributor

amotl commented Mar 3, 2025

@celestinoxp: This is so sweet. Thank you for your excellent work on this. 🌻

PranavBhatP pushed a commit to PranavBhatP/sktime that referenced this pull request Mar 5, 2025
…etitemError` in `SummaryTransformer` (sktime#7903)

@ngupta23 ngupta23 enabled auto-merge (squash) March 6, 2025 20:05
@ngupta23 ngupta23 merged commit 58ec3c2 into pycaret:master Mar 6, 2025
21 of 22 checks passed
Spinachboul pushed a commit to Spinachboul/sktime that referenced this pull request Mar 23, 2025
…etitemError` in `SummaryTransformer` (sktime#7903)
