且构网

Using SMOTE with GridSearchCV in scikit-learn

Updated: 2022-11-28 22:52:48

Yes, it can be done, but use imblearn's Pipeline rather than scikit-learn's. imblearn ships its own Pipeline so that samplers are handled correctly. I described this in an answer to a similar question.

When predict() is called on an imblearn.pipeline.Pipeline object, it skips the sampling step and passes the data unchanged to the next transformer. You can confirm this in the source code. The excerpt is from an older release; current releases name the sampler method fit_resample instead of fit_sample:

        if hasattr(transform, "fit_sample"):
            pass
        else:
            Xt = transform.transform(Xt)
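
To make the skip logic concrete, here is a toy sketch in plain Python. It is not imblearn's implementation: `ToySampler`, `ToyScaler` and `transform_for_predict` are hypothetical names invented for illustration, and the check uses `fit_resample`, the method name in current imblearn releases (the excerpt above shows the older `fit_sample`).

```python
class ToySampler:
    """Hypothetical stand-in for a sampler: it resamples and has no transform()."""

    def fit_resample(self, X, y):
        # Naive oversampling by duplicating every row (illustration only, not SMOTE).
        return X + X, y + y


class ToyScaler:
    """Hypothetical stand-in for an ordinary transformer."""

    def transform(self, X):
        return [[value * 10 for value in row] for row in X]


def transform_for_predict(steps, X):
    """Apply the intermediate steps at predict time, skipping samplers."""
    Xt = X
    for step in steps:
        if hasattr(step, "fit_resample"):
            # Samplers only act during fit; at predict time the data passes through.
            continue
        Xt = step.transform(Xt)
    return Xt


X_new = [[1, 2], [3, 4]]
Xt = transform_for_predict([ToySampler(), ToyScaler()], X_new)
print(Xt)  # [[10, 20], [30, 40]]
```

The number of rows is unchanged, so every input sample still gets exactly one prediction. A pipeline that resampled at predict time would return predictions for synthetic rows too.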

So for this to work correctly, you need the following:

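from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Use imblearn's Pipeline, not sklearn's, so SMOTE runs only during fit
model = Pipeline([
        ('sampling', SMOTE()),
        ('classification', LogisticRegression())
    ])

# params is your grid; keys are prefixed with the step name, e.g. 'classification__C'
grid = GridSearchCV(model, params, ...)
grid.fit(X, y)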

Fill in the details as necessary, and the pipeline will take care of the rest.