Updated: 2023-12-01 08:03:58
Another possibility is that one feature level appears only in the training data and not in the test data. This mostly happens after one-hot encoding, which produces a large matrix with one column for each level of each categorical feature. In your case it looks like "f5232" is exclusive to either the training or the test data. In either case, model scoring is likely to throw an error (in most implementations of ML packages) because:
Solution:
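One possible approach, given only as an illustrative sketch and not necessarily the exact fix intended here: one-hot encode both splits, check which columns exist in only one of them, and reindex the test matrix to the training columns. The sketch assumes pandas; the `color` column and its levels are hypothetical toy data, not taken from the question.

```python
import pandas as pd

# Hypothetical toy data: "green" occurs only in training, "blue" only in test.
train = pd.DataFrame({"color": ["red", "green", "red"]})
test = pd.DataFrame({"color": ["red", "blue"]})

# One-hot encode each split separately, which is how the mismatch usually arises.
X_train = pd.get_dummies(train, columns=["color"])
X_test = pd.get_dummies(test, columns=["color"])

# Columns present in only one split are the ones that break scoring.
only_in_train = sorted(set(X_train.columns) - set(X_test.columns))
only_in_test = sorted(set(X_test.columns) - set(X_train.columns))
print("only in train:", only_in_train)
print("only in test:", only_in_test)

# Align the test matrix to the training columns: levels missing from test
# become all-zero columns, and levels unseen in training are dropped.
X_test_aligned = X_test.reindex(columns=X_train.columns, fill_value=0)
print(list(X_test_aligned.columns))
```

After alignment the test matrix has the same columns, in the same order, as the matrix the model was trained on. Dropping test-only levels is a design choice: the model has no learned weight for them, so they cannot contribute to the prediction anyway.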