且构网

C++ backward regex search

Updated: 2023-02-17 21:38:22

I don't have enough reputation to comment XD. I don't see the following as an answer; it's more of an alternative. Nevertheless, I have to post it as an answer, otherwise I won't reach you.

I guess you won't find a trick to make performance independent of the match position (my guess is that the search time grows linearly with the position for such a simple regex, or something like that).

A very simple solution is to replace this horrible regex lib (std::regex) with, e.g., the POSIX regex.h (old but gold ;) or Boost.Regex.

Here is an example:

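#include <chrono>
#include <iostream>
#include <regex>            // std::regex
#include <string>
#include <vector>
#include <regex.h>          // POSIX regex
#include <boost/regex.hpp>  // Boost.Regex

// Timing helpers (inline variables require C++17).
inline auto now = std::chrono::steady_clock::now;
inline auto toMs = [](auto &&x){
    return std::chrono::duration_cast<std::chrono::milliseconds>(x).count();
};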

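// POSIX regex.h: compile the pattern, search once, print elapsed time and capture group 1.
void cregex(std::string const&s, std::string const&p)
{
    auto start = now();
    regex_t r;
    if (regcomp(&r,p.c_str(),REG_EXTENDED) != 0) {
        std::cout << "regcomp failed" << std::endl;
        return;
    }
    std::vector<regmatch_t> m(r.re_nsub+1);
    int rc = regexec(&r,s.c_str(),m.size(),m.data(),0);
    regfree(&r);
    if (rc != 0) { // REG_NOMATCH or an error: m was not filled in
        std::cout << toMs(now()-start) << "ms (no match)" << std::endl;
        return;
    }
    std::cout << toMs(now()-start) << "ms " << std::string(s.cbegin()+m[1].rm_so,s.cbegin()+m[1].rm_eo) << std::endl;
}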

// std::regex: same measurement with the standard library.
void cxxregex(std::string const&s, std::string const&p)
{
    using namespace std;
    auto start = now();
    regex r(p.data(),regex::extended);
    smatch m;
    regex_search(s.begin(),s.end(),m,r);
    std::cout << toMs(now()-start) << "ms " << m[1] << std::endl;
}

// Boost.Regex: same measurement; the interface mirrors std::regex.
void boostregex(std::string const&s, std::string const&p)
{
    using namespace boost;
    auto start = now();
    regex r(p.data(),regex::extended);
    smatch m;
    regex_search(s.begin(),s.end(),m,r);
    std::cout << toMs(now()-start) << "ms " << m[1] << std::endl;
}

int main()
{
    std::string s(100000000,'x');  // 100 MB of filler
    std::string s1 = "yolo" + s;   // match at the very beginning
    std::string s2 = s + "yolo";   // match at the very end
    std::cout << "yolo + ... -> cregex "; cregex(s1,"^(yolo)");
    std::cout << "yolo + ... -> cxxregex "; cxxregex(s1,"^(yolo)");
    std::cout << "yolo + ... -> boostregex "; boostregex(s1,"^(yolo)");
    std::cout << "... + yolo -> cregex "; cregex(s2,"(yolo)$");
    std::cout << "... + yolo -> cxxregex "; cxxregex(s2,"(yolo)$");
    std::cout << "... + yolo -> boostregex "; boostregex(s2,"(yolo)$");
}

This gives:

yolo + ... -> cregex 5ms yolo
yolo + ... -> cxxregex 0ms yolo
yolo + ... -> boostregex 0ms yolo
... + yolo -> cregex 69ms yolo
... + yolo -> cxxregex 2594ms yolo
... + yolo -> boostregex 62ms yolo