且构网

RapidMiner SVM: test example set attributes should be equal to, or a superset of, the training example set attributes

Updated: 2023-02-18 12:29:56

The Nominal to Numeric operator creates new attributes whose names are derived from the values of the input attributes. This happens when the coding type parameter is set to dummy coding. If the test data contains a different set of nominal values from the training data, the resulting attributes will differ between the two example sets.
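The effect can be reproduced outside RapidMiner. The following is a minimal sketch in plain Python, not RapidMiner code: the `dummy_encode` helper, the `colour` and `size` attributes, and the `<attribute> = <value>` naming are all assumptions made for illustration.

```python
def dummy_encode(rows, nominal_attr):
    """Replace one nominal attribute with 0/1 columns, one per observed value."""
    # Only values that actually occur in `rows` produce a column.
    values = sorted({row[nominal_attr] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != nominal_attr}
        for value in values:
            new_row[f"{nominal_attr} = {value}"] = 1 if row[nominal_attr] == value else 0
        encoded.append(new_row)
    return encoded


# Hypothetical data: "green" appears only in training, "blue" only in testing.
train = [{"colour": "red", "size": 1.0}, {"colour": "green", "size": 2.0}]
test = [{"colour": "red", "size": 1.5}, {"colour": "blue", "size": 0.5}]

train_attrs = sorted(dummy_encode(train, "colour")[0])
test_attrs = sorted(dummy_encode(test, "colour")[0])

print(train_attrs)
print(test_attrs)
```

Because each set is encoded on its own, the two attribute lists disagree, which is the kind of mismatch a model trained on the first set cannot accept.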

To confirm that this is the problem, set a breakpoint after each Nominal to Numeric operator and examine the attributes of each example set.
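If the attribute names seen at the breakpoints are copied out, the comparison itself is a simple set difference. This is a rough sketch in plain Python; the `compare_attributes` helper and the attribute names are hypothetical and only show what to look for.

```python
def compare_attributes(train_attrs, test_attrs):
    """Report attributes present in one example set but not the other."""
    train_set, test_set = set(train_attrs), set(test_attrs)
    return {
        # Attributes the model was trained on but the test set lacks.
        "missing_in_test": sorted(train_set - test_set),
        # Attributes only the test set has.
        "extra_in_test": sorted(test_set - train_set),
    }


# Hypothetical attribute names as they might appear after dummy coding.
report = compare_attributes(
    ["colour = green", "colour = red", "size"],
    ["colour = blue", "colour = red", "size"],
)
print(report)
```

Since the test attributes must be equal to or a superset of the training attributes, a non-empty `missing_in_test` list is the part that points at the problem.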

You can change how the operator works by setting the coding type parameter to unique integers, but this might not suit the problem you are trying to solve.
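To illustrate the trade-off, here is a small sketch of integer coding in plain Python. It is an analogy rather than RapidMiner's implementation: the `integer_encode` helper, the shared `mapping` dictionary, and the sample data are assumptions.

```python
def integer_encode(rows, nominal_attr, mapping=None):
    """Replace each nominal value with an integer code, reusing `mapping` if given."""
    if mapping is None:
        mapping = {}
    encoded = []
    for row in rows:
        # Unseen values receive the next free integer.
        code = mapping.setdefault(row[nominal_attr], len(mapping))
        encoded.append({**row, nominal_attr: code})
    return encoded, mapping


# Hypothetical data with a value ("blue") that only occurs in the test set.
train = [{"colour": "red", "size": 1.0}, {"colour": "green", "size": 2.0}]
test = [{"colour": "red", "size": 1.5}, {"colour": "blue", "size": 0.5}]

train_enc, mapping = integer_encode(train, "colour")
test_enc, mapping = integer_encode(test, "colour", mapping)

print(sorted(train_enc[0]), sorted(test_enc[0]))
print(mapping)
```

The attribute names now match because no new columns are created. The cost is that the codes imply an order and a distance between values (here `blue` ends up numerically further from `red` than `green` is), which a distance-based learner such as an SVM may treat as meaningful.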

One possible solution is to combine the two data sets and then split them again. Combining them registers every value as an allowed level of each nominal attribute, even in a split that contains no example with that value. Each split can then be passed through the Nominal to Numeric operator, which should create all the required attributes.
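The combine-then-split idea can be sketched in plain Python as follows. This is an illustration under stated assumptions, not a RapidMiner process: the `collect_levels` and `dummy_encode` helpers, the attribute names, and the data are hypothetical.

```python
def collect_levels(rows, nominal_attr):
    """Gather every value of the nominal attribute seen in the combined data."""
    return sorted({row[nominal_attr] for row in rows})


def dummy_encode(rows, nominal_attr, levels):
    """Create one 0/1 column per allowed level, whether or not it occurs in `rows`."""
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != nominal_attr}
        for level in levels:
            new_row[f"{nominal_attr} = {level}"] = 1 if row[nominal_attr] == level else 0
        encoded.append(new_row)
    return encoded


train = [{"colour": "red", "size": 1.0}, {"colour": "green", "size": 2.0}]
test = [{"colour": "red", "size": 1.5}, {"colour": "blue", "size": 0.5}]

# Combine, learn the full set of levels, then split back at the original boundary.
combined = train + test
levels = collect_levels(combined, "colour")
train_enc = dummy_encode(combined[:len(train)], "colour", levels)
test_enc = dummy_encode(combined[len(train):], "colour", levels)

print(sorted(train_enc[0]))
print(sorted(test_enc[0]))
```

Both splits now carry a column for every level, including levels with no examples in that split (those columns are simply all zeros).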