Description
Describe the workflow you want to enable
Do we want to add a transformer for pandas datetimes?
We haven't really added much based on pandas yet but this would be a pretty natural thing to add.
You could argue that you can do something like `FunctionTransformer(lambda X: X.dt.dayofweek)` or similar for the other features (year, hour, minute, month...), but the problem with that is that you don't get feature names, which is terrible for interpretation.
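To make the problem concrete, here is a small sketch of the `FunctionTransformer` approach. The column name `when` and the sample dates are made up for illustration, and the lambda is adapted to take a DataFrame rather than a Series.

```python
import pandas as pd
from sklearn.preprocessing import FunctionTransformer

# Toy data; the column name "when" is an assumption for this example.
X = pd.DataFrame({"when": pd.to_datetime(["2020-01-06", "2021-07-04"])})

# Extract the day of week (Monday=0 ... Sunday=6) from the datetime column.
dow = FunctionTransformer(lambda X: X["when"].dt.dayofweek.to_frame())
out = dow.fit_transform(X)

# The values are correct, but the column is still called "when",
# and the transformer cannot report output feature names.
print(out)
print(hasattr(dow, "get_feature_names_out"))
```

The derived column keeps the input name, so nothing downstream can tell that it now holds a day of week rather than a timestamp.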
Featurizing datetimes is super common (the last ~10 datasets I worked on had it) and I think it's a workflow we should make easier.
Describe your proposed solution
Implement a `DateTimeTransformer` that takes in maybe just a single column and a list of features to derive, like `dayofweek`, `dayofyear`, etc., but which creates meaningful feature names. Taking a single column would work well with `ColumnTransformer`; it is a bit different from other transformers but quite similar to `CountVectorizer`, so if it takes a single column, maybe it should be called `DateTimeVectorizer`.
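A rough sketch of what such a transformer could look like, purely as illustration. The class body, the default `features`, the `<column>_<feature>` naming scheme and the `_as_series` helper are all assumptions here, not an agreed design.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class DateTimeVectorizer(BaseEstimator, TransformerMixin):
    """Hypothetical sketch: derive named features from one datetime column."""

    def __init__(self, features=("year", "month", "dayofweek")):
        # Each entry must be an attribute of the pandas `.dt` accessor.
        self.features = features

    @staticmethod
    def _as_series(X):
        # Accept a Series or a one-column DataFrame (hypothetical helper).
        if isinstance(X, pd.DataFrame):
            if X.shape[1] != 1:
                raise ValueError("Expected exactly one datetime column.")
            X = X.iloc[:, 0]
        return pd.to_datetime(pd.Series(X))

    def fit(self, X, y=None):
        col = self._as_series(X)
        # Remember the input column name to build output feature names.
        self.input_name_ = col.name if col.name is not None else "x0"
        return self

    def transform(self, X):
        col = self._as_series(X)
        derived = {
            f"{self.input_name_}_{feat}": getattr(col.dt, feat)
            for feat in self.features
        }
        return pd.DataFrame(derived, index=col.index)

    def get_feature_names_out(self, input_features=None):
        names = [f"{self.input_name_}_{feat}" for feat in self.features]
        return np.asarray(names, dtype=object)


# Usage on toy data; the column name "when" is made up.
X = pd.DataFrame({"when": pd.to_datetime(["2020-01-06 08:30", "2021-07-04 17:00"])})
vec = DateTimeVectorizer(features=["year", "dayofweek"])
result = vec.fit_transform(X)
print(list(result.columns))  # ['when_year', 'when_dayofweek']
```

In a `ColumnTransformer`, selecting the column as `["when"]` would pass a one-column DataFrame, which this sketch accepts.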
Describe alternatives you've considered, if relevant
An alternative would be to improve attaching feature names to `FunctionTransformer`, but this would still require some non-trivial code for datetimes, and they are just very, very common.