
Does sklearn's LogisticRegressionCV use all the data for the final model?

Updated: 2023-12-01 23:19:58

You are confusing hyper-parameters and parameters. All scikit-learn estimators with CV in their name, such as LogisticRegressionCV, GridSearchCV, or RandomizedSearchCV, tune hyper-parameters.

Hyper-parameters are not learnt from training on the data. They are set prior to learning, on the assumption that they will contribute to optimal learning. More information is present here:

Hyper-parameters are parameters that are not directly learnt within estimators. In scikit-learn they are passed as arguments to the constructor of the estimator classes. Typical examples include C, kernel and gamma for Support Vector Classifier, alpha for Lasso, etc.

In the case of LogisticRegression, C is a hyper-parameter describing the inverse of the regularization strength: the higher the C, the less regularization is applied during training. It's not that C will be changed during training; it stays fixed.
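A minimal sketch of this point, using a synthetic dataset (the data and the particular C values are arbitrary choices for illustration): C is set once in the constructor, and a smaller C means stronger regularization, which shrinks the learnt weights.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy data purely for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# C is fixed in the constructor and does not change during fit();
# a smaller C applies stronger regularization to the training.
strong_reg = LogisticRegression(C=0.01, max_iter=1000).fit(X, y)
weak_reg = LogisticRegression(C=100.0, max_iter=1000).fit(X, y)

# The heavily regularized model typically ends up with smaller weights.
print(np.abs(strong_reg.coef_).sum())
print(np.abs(weak_reg.coef_).sum())
```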

Now coming to coef_. coef_ contains the coefficients (also called weights) of the features, which are learnt (and updated) during training. Depending on the value of C (and the other hyper-parameters set in the constructor), the values they converge to can vary.

There is another topic on how to get optimal initial values of coef_, so that training is faster and better. That's optimization: some implementations start with random weights between 0 and 1, others start with 0, and so on. But for the scope of your question that is not relevant; LogisticRegressionCV is not used for that.

This is what LogisticRegressionCV does:

  1. Get the candidate values of C from the constructor (in your example you passed 1.0).
  2. For each value of C, cross-validate on the supplied data: a LogisticRegression is fit() on the training data of the current fold and scored on the test data. The test scores from all folds are averaged and become the score for the current C. This is done for all the C values you provided, and the C with the highest average score is chosen.
  3. The chosen C is set as the final C, and LogisticRegression is trained again (by calling fit()) on the whole data (Xdata, ylabels here).
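The three steps above can be sketched by hand and compared with LogisticRegressionCV. This is a rough equivalence on a synthetic dataset, not a line-for-line reimplementation (the exact folds and scoring internals can differ, so the selected C may occasionally disagree):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=5, random_state=0)
Cs = [0.01, 0.1, 1.0, 10.0]  # arbitrary candidate values for illustration

# Steps 1-2: score every candidate C by cross-validation, keep the best.
mean_scores = {C: cross_val_score(LogisticRegression(C=C, max_iter=1000),
                                  X, y, cv=5).mean()
               for C in Cs}
best_C = max(mean_scores, key=mean_scores.get)

# Step 3: refit on ALL the data with the chosen C.
final_model = LogisticRegression(C=best_C, max_iter=1000).fit(X, y)

# LogisticRegressionCV bundles the same procedure (refit=True by default);
# the chosen C lands in C_, and coef_ comes from the final whole-data refit.
clf = LogisticRegressionCV(Cs=Cs, cv=5, max_iter=1000).fit(X, y)
print(best_C, clf.C_)
```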

That's what all the hyper-parameter tuners do, be it GridSearchCV, LogisticRegressionCV, or LassoCV, etc.

The initialization and updating of the coef_ feature weights happens inside the algorithm's fit() function, which is out of scope for hyper-parameter tuning. That optimization part depends on the internal optimization algorithm of the process, for example the solver parameter in the case of LogisticRegression.
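To illustrate that the solver only controls *how* coef_ is found, not *what* is being optimized: two solvers minimizing the same L2-regularized objective should converge to essentially the same weights (again on a throwaway synthetic dataset):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# Same hyper-parameter C, two different internal optimizers; the objective
# is strictly convex, so both converge to (nearly) the same coef_.
lbfgs = LogisticRegression(C=1.0, solver="lbfgs", max_iter=1000).fit(X, y)
newton = LogisticRegression(C=1.0, solver="newton-cg", max_iter=1000).fit(X, y)

print(np.abs(lbfgs.coef_ - newton.coef_).max())  # small difference expected
```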

Hope this makes things clear. Feel free to ask if there is still any doubt.