
Confused by the results of logistic regression in Python

Updated: 2022-05-28 23:30:27

问题"是scikit-learn中的LogisticRegression使用有关逻辑回归的sklearn用户指南,以了解实施细节.

The "problem" is that LogisticRegression in scikit-learn uses L2-regularization (aka Tikhonov regularization, aka Ridge, aka normal prior). Please read sklearn user guide about logistic regression for implementational details.

In practice, this means that LogisticRegression has a parameter C, which defaults to 1. The smaller C is, the more regularization there is: coef_ grows smaller and intercept_ grows larger, which increases numerical stability and reduces overfitting.

If you set C very large, the effect of regularization will vanish. With

lr = LogisticRegression(C=100500000)

you get coef_ and intercept_ respectively

[[ 1.50464535]]
[-4.07771322]

just like in the Wikipedia article.

Some more theory. Overfitting is a problem that arises when there are many features but not many examples. A simple rule of thumb: use a small C if n_obs/n_features is less than 10. In the wiki example there is one feature and 20 observations, so simple logistic regression would not overfit even with a large C.

Another use case for a small C is convergence problems. They may happen if the positive and negative examples can be perfectly separated, or in the case of multicollinearity (which again is more likely if n_obs/n_features is small), and they lead to infinite growth of the coefficients in the non-regularized case.