Looking for a C++ implementation of the C4.5 algorithm

Updated: 2022-06-27 01:05:47

I may have found a possible C++ "implementation" of C5.0 (See5.0), but I haven't been able to dig into the source code enough to determine if it really works as advertised.

To reiterate my original concerns, the author of the port states the following about the C5.0 algorithm:

Another drawback with See5Sam [C5.0] is the impossibility to have more than one application tree at the same time. An application is read from files each time the executable is run and is stored in global variables here and there.
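
To make the quoted drawback concrete, here is a minimal sketch. It is not taken from See5Sam or from the port: `g_threshold`, `load_global_model`, `classify_global` and `Model` are hypothetical names, and the "model" is just a single threshold. It contrasts model state kept in a global variable, where loading a second model overwrites the first, with state owned by an object, where two models can be live at once.

```cpp
#include <cassert>
#include <iostream>

// Style 1: model state in a global, roughly what the quote describes.
// (Hypothetical toy model: one threshold on one numeric attribute.)
static double g_threshold = 0.0;

void load_global_model(double threshold) { g_threshold = threshold; }
int classify_global(double x) { return x > g_threshold ? 1 : 0; }

// Style 2: model state owned by an object, so several models can coexist.
class Model {
public:
    explicit Model(double threshold) : threshold_(threshold) {}
    int classify(double x) const { return x > threshold_ ? 1 : 0; }
private:
    double threshold_;
};

// Small demonstration of the difference between the two styles.
void demo() {
    load_global_model(10.0);  // "model A"
    load_global_model(1.0);   // "model B" silently replaces model A
    std::cout << "global:  " << classify_global(5.0) << "\n";  // only B is left

    Model a(10.0), b(1.0);    // both models remain usable side by side
    assert(a.classify(5.0) != b.classify(5.0));
    std::cout << "objects: " << a.classify(5.0) << " " << b.classify(5.0) << "\n";
}
```

A class wrapper such as the one the port provides is the usual way to get from the first style to the second, although whether the port fully removes the shared globals is exactly what still needs checking in the source.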

I will update my answer as soon as I get some time to look into the source code.

It's looking pretty good. Here is the C++ interface:

class CMee5
{
  public:

    /**
      Create a See 5 engine from tree/rules files.
      \param pcFileStem The stem of the See 5 file system. The engine
             initialisation will look for the following files:
              - pcFileStem.names Vanilla See 5 names file (mandatory)
              - pcFileStem.tree or pcFileStem.rules Vanilla See 5 tree or rules
                file (mandatory)
              - pcFileStem.costs Vanilla See 5 costs file (mandatory)
    */
    inline CMee5(const char* pcFileStem, bool bUseRules);

    /**
      Release allocated memory for this engine.
    */
    inline ~CMee5();

    /**
      General classification routine accepting a data record.
    */
    inline unsigned int classifyDataRec(DataRec Case, float* pOutConfidence);

    /**
      Show rules that were used to classify the last case.
      Classify() will have set RulesUsed[] to
      number of active rules for trial 0,
      first active rule, second active rule, ..., last active rule,
      number of active rules for trial 1,
      first active rule, second active rule, ..., last active rule,
      and so on.
    */
    inline void showRules(int Spaces);

    /**
      Open file with given extension for read/write with the actual file stem.
    */
    inline FILE* GetFile(String Extension, String RW);

    /**
      Read a raw case from file Df.

      For each attribute, read the attribute value from the file.
      If it is a discrete valued attribute, find the associated no.
      of this attribute value (if the value is unknown this is 0).

      Returns the array of attribute values.
    */
    inline DataRec GetDataRec(FILE *Df, Boolean Train);
    inline DataRec GetDataRecFromVec(float* pfVals, Boolean Train);
    inline float TranslateStringField(int Att, const char* Name);

    inline void Error(int ErrNo, String S1, String S2);

    inline int getMaxClass() const;
    inline int getClassAtt() const;
    inline int getLabelAtt() const;
    inline int getCWtAtt() const;
    inline unsigned int getMaxAtt() const;
    inline const char* getClassName(int nClassNo) const;
    inline char* getIgnoredVals();

    inline void FreeLastCase(void* DVec);
};
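
The declarations above suggest a call sequence of "build a record, classify it, look up the class name, free the record". The sketch below shows that sequence against a stand-in, since I have not verified the real class. `StubMee5`, its placeholder bodies, the `DataRec` and `Boolean` typedefs, and the helper `classifyOne` are all assumptions made so the example compiles on its own. Only the member names and parameter lists are copied from the interface. Passing `false` for `Train` and calling `FreeLastCase` on the record are also guesses about intended use.

```cpp
#include <cassert>
#include <vector>

// Stand-in types so the sketch is self-contained; the real definitions
// come from the See5 headers and will differ.
typedef float* DataRec;
typedef bool Boolean;

// Hypothetical stub mirroring a few signatures from the interface above.
// The bodies are placeholders, NOT the port's behaviour.
class StubMee5 {
public:
    StubMee5(const char* pcFileStem, bool bUseRules) { (void)pcFileStem; (void)bUseRules; }
    DataRec GetDataRecFromVec(float* pfVals, Boolean Train) { (void)Train; return pfVals; }
    unsigned int classifyDataRec(DataRec Case, float* pOutConfidence) {
        *pOutConfidence = 1.0f;           // placeholder confidence
        return Case[0] > 0.5f ? 1u : 0u;  // placeholder decision rule
    }
    const char* getClassName(int nClassNo) const { return nClassNo == 1 ? "yes" : "no"; }
    void FreeLastCase(void* DVec) { (void)DVec; }  // stub: the caller keeps the buffer
};

// Assumed call order: build a record from raw values, classify it,
// translate the class number to a name, then release the record.
template <class Engine>
const char* classifyOne(Engine& engine, std::vector<float>& values, float* confidence) {
    auto rec = engine.GetDataRecFromVec(values.data(), false);
    unsigned int cls = engine.classifyDataRec(rec, confidence);
    const char* name = engine.getClassName(static_cast<int>(cls));
    engine.FreeLastCase(rec);
    return name;
}
```

Because `classifyOne` is a template, the same call order could be tried against the real `CMee5` once its headers are available, assuming the member functions behave the way their names suggest.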

I would say that this is the best alternative I've found so far.