Closed
Description
Hello Scikit-learn team,
I am encountering an error when running inference with a VotingClassifier model configured with voting="hard". I found that this issue may be related to the NEP 34 restriction on dtype=object in NumPy, and that the suggested solution is to downgrade to numpy 1.23.1. However, that does not work in my case because of dependency conflicts with pandas and other packages. I'd appreciate it if you could analyze this issue and provide an update when possible.
Traceback (most recent call last):
  File "/home/mtoan65/Documents/Sentiment_Analysis/training.py", line 135, in <module>
    ensemble_model, trained_models, model_results, ensemble_results = main(sparse=False)
                                                                      ^^^^^^^^^^^^^^^^^^
  File "/home/mtoan65/Documents/Sentiment_Analysis/training.py", line 127, in main
    trained_ensemble, ensemble_results = train_ensemble_model(
                                         ^^^^^^^^^^^^^^^^^^^^^
  File "/home/mtoan65/Documents/Sentiment_Analysis/training.py", line 89, in train_ensemble_model
    ensemble_results, trained_ensemble = train_and_evaluate_ensemble(voting_clf, X_train, X_test, y_train, y_test)
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mtoan65/Documents/Sentiment_Analysis/training/ensemble_trainer.py", line 33, in train_and_evaluate_ensemble
    y_pred_ensemble = voting_clf.predict(X_test)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mtoan65/Documents/Sentiment_Analysis/.venv/lib/python3.11/site-packages/sklearn/ensemble/_voting.py", line 443, in predict
    predictions = self._predict(X)
                  ^^^^^^^^^^^^^^^^
  File "/home/mtoan65/Documents/Sentiment_Analysis/.venv/lib/python3.11/site-packages/sklearn/ensemble/_voting.py", line 80, in _predict
    return np.asarray([est.predict(X) for est in self.estimators_]).T
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 2 dimensions. The detected shape was (6, 33810) + inhomogeneous part.
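The last frame stacks the per-estimator predictions with np.asarray. Here is a minimal sketch of how that call can fail, assuming NumPy 1.24 or later, where ragged input raises an error instead of silently producing a dtype=object array. The array shapes and variable names are illustrative only, not taken from my project. One array is 1-D and the other 2-D, which matches the pattern "inhomogeneous shape after 2 dimensions": the number of estimators and the number of samples agree, but at least one prediction seems to carry an extra dimension.

```python
import numpy as np

n_samples = 5
flat = np.zeros(n_samples, dtype=int)          # shape (5,), what hard voting expects
column = np.zeros((n_samples, 1), dtype=int)   # shape (5, 1), an extra trailing axis

# Stacking arrays of different dimensionality is rejected on NumPy >= 1.24.
try:
    np.asarray([flat, column])
    message = None
except ValueError as exc:
    message = str(exc)
print(message)

# Flattening the 2-D prediction makes the stack homogeneous again.
stacked = np.asarray([flat, column.ravel()]).T
print(stacked.shape)
```

If this is what happens here, the likely fix is to make the offending estimator return 1-D labels rather than to downgrade NumPy.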
Steps/Code to Reproduce
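import logging

from sklearn.metrics import classification_report

# Assumption: main_logger is a standard logging.Logger configured elsewhere in the project.
main_logger = logging.getLogger(__name__)

# Function name and arguments as shown in the traceback (ensemble_trainer.py);
# voting_clf is a VotingClassifier(voting="hard") built elsewhere.
def train_and_evaluate_ensemble(voting_clf, X_train, X_test, y_train, y_test):
    try:
        main_logger.info("Training ensemble")
        voting_clf.fit(X_train, y_train)
        main_logger.info("Evaluating ensemble")
        y_pred_ensemble = voting_clf.predict(X_test)  # the ValueError is raised here
        results = classification_report(y_test, y_pred_ensemble, output_dict=True)
        main_logger.info(f"Ensemble Results:\n{classification_report(y_test, y_pred_ensemble)}")
        return results, voting_clf
    except Exception as e:
        main_logger.error(f"Error in ensemble training: {e}")
        raise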
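To narrow down which estimator breaks the stacking step, a diagnostic along these lines might help. It is only a sketch: prediction_shapes and find_shape_outliers are hypothetical helper names, and FlatStub and ColumnStub are stand-ins for fitted estimators so the example runs with NumPy alone.

```python
import numpy as np


def prediction_shapes(named_estimators, X):
    """Map each estimator name to the shape of its predict(X) output."""
    return {name: np.asarray(est.predict(X)).shape for name, est in named_estimators}


def find_shape_outliers(shapes):
    """Return the names whose predictions are not 1-D."""
    return [name for name, shape in shapes.items() if len(shape) != 1]


# Hypothetical stand-ins for fitted estimators (not real scikit-learn classes).
class FlatStub:
    def predict(self, X):
        return np.zeros(len(X), dtype=int)        # 1-D labels


class ColumnStub:
    def predict(self, X):
        return np.zeros((len(X), 1), dtype=int)   # 2-D column of labels


X_demo = np.zeros((4, 3))
shapes = prediction_shapes([("flat", FlatStub()), ("column", ColumnStub())], X_demo)
print(shapes)
print(find_shape_outliers(shapes))
```

With the real model, the same helper could be called after fit as prediction_shapes(voting_clf.named_estimators_.items(), X_test); any name reported by find_shape_outliers would be the estimator to inspect.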
Expected Results
Training and evaluation finish without errors, and voting_clf.predict(X_test) returns the predicted labels.
Actual Results
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 2 dimensions. The detected shape was (6, 33810) + inhomogeneous part.
Versions
1.5.2