Hi, I am a graduate student using sklearn for some data work.
While working with the decision tree module, I ran into an inconvenience:
Neither the classification tree nor the regression tree supports categorical features. That means the decision tree models can only accept continuous features.
For example, a categorical feature such as an app name (google, facebook) cannot be fed into the model, because it cannot be properly transformed into a continuous value. And there is no corresponding algorithm in the decision tree module for splitting on discrete features.
However, the CART algorithm itself accounts for categorical features. So I have modified the decision tree module based on CART and applied the new model to my own work. In my experiments, support for categorical features did improve performance, so I think it is quite necessary for decision trees.
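To make the idea concrete, here is a minimal sketch of how a CART-style split on a categorical feature can be searched: every two-way grouping of the categories is scored by weighted Gini impurity and the lowest one wins. This is only an illustration in plain Python, not the modified code mentioned above and not scikit-learn's implementation. The function names `gini` and `best_categorical_split` and the toy data are assumptions made up for this example.

```python
from collections import Counter
from itertools import combinations


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_categorical_split(values, labels):
    """Return (left_categories, weighted_gini) for the best two-way grouping.

    Hypothetical helper for illustration: it tries every partition of the
    categories into two non-empty groups, which is exponential in the
    number of categories and only practical for small toy examples.
    """
    categories = sorted(set(values))
    n = len(labels)
    best_left, best_score = None, float("inf")
    # Keep the first category on the right side so that each partition
    # is evaluated once rather than twice (once per mirrored side).
    rest = categories[1:]
    for size in range(1, len(rest) + 1):
        for left in combinations(rest, size):
            left_set = set(left)
            left_y = [y for v, y in zip(values, labels) if v in left_set]
            right_y = [y for v, y in zip(values, labels) if v not in left_set]
            score = (len(left_y) * gini(left_y) + len(right_y) * gini(right_y)) / n
            if score < best_score:
                best_left, best_score = frozenset(left_set), score
    return best_left, best_score


# Toy data (made up): app name as the feature, a binary target.
apps = ["google", "google", "facebook", "facebook",
        "twitter", "twitter", "youtube", "youtube"]
target = [1, 1, 0, 0, 0, 0, 1, 1]

left, score = best_categorical_split(apps, target)
print(sorted(left), score)
```

On this toy data the search groups google and youtube on one side and facebook and twitter on the other, which separates the two classes in a single split.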
I am very willing to contribute this to the sklearn community, but I'm new to this community and not so familiar with the procedure.
Could you give some suggestions or comments on this new feature? Or has anyone already worked on this feature? Thank you so much.
Using dummies in decision trees is not the same as categorical handling in
CART, nor is using dummies especially beneficial in decision trees.
#4899 remains the primary proposal for categorical support implementation.
But it has been awaiting review and some work for a long time. Anyone able
to help review it should please do so.
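
As a rough illustration of why dummies differ from categorical handling in CART: a threshold split on a single dummy column can only separate one category from all the others, whereas a categorical split may put several categories in the same child. The sketch below compares the two on made-up data in plain Python. The helper names and the data are assumptions for this example, and it does not call scikit-learn.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def weighted_gini(mask, labels):
    """Weighted Gini impurity after splitting the samples by a boolean mask."""
    left = [y for m, y in zip(mask, labels) if m]
    right = [y for m, y in zip(mask, labels) if not m]
    return (len(left) * gini(left) + len(right) * gini(right)) / len(labels)


# Toy data (made up): app name as the feature, a binary target.
apps = ["google", "google", "facebook", "facebook",
        "twitter", "twitter", "youtube", "youtube"]
target = [1, 1, 0, 0, 0, 0, 1, 1]

# Splitting on one dummy column isolates exactly one category from the rest.
dummy_scores = {
    c: weighted_gini([v == c for v in apps], target)
    for c in sorted(set(apps))
}
best_dummy = min(dummy_scores.values())

# A categorical split can place several categories in one child node.
grouped = weighted_gini([v in {"google", "youtube"} for v in apps], target)

print(best_dummy, grouped)
```

Here no single dummy split separates the classes, while one grouped split does. A tree built on dummies can still reach the same partition with further splits, but it needs extra depth to do so.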
Closing as duplicate