更新时间:2022-02-12 08:35:43
OneHotEncoder
不支持字符串功能,并且使用[(d, OneHotEncoder()) for d in dummies]
会将其应用于所有假人列.使用LabelBinarizer
代替:
OneHotEncoder
doesn't support string features, and with [(d, OneHotEncoder()) for d in dummies]
you are applying it to all dummies columns. Use LabelBinarizer
instead:
mapper = DataFrameMapper(
[(d, LabelBinarizer()) for d in dummies]
)
另一种选择是将LabelEncoder
与第二个OneHotEncoder
步骤一起使用.
An alternative would be to use the LabelEncoder
with a second OneHotEncoder
step.
mapper = DataFrameMapper(
[(d, LabelEncoder()) for d in dummies]
)
lm = PMMLPipeline([("mapper", mapper),
("onehot", OneHotEncoder()),
("regressor", LinearRegression())])