且构网


Advantages of ANTLR (compared with lex/yacc/bison)

Last updated: 2022-06-27 00:41:00

Update/warning: This answer may be out of date!


One major difference is that ANTLR generates an LL(*) parser, whereas YACC and Bison (by default) both generate LALR parsers. This distinction matters for a number of applications, the most obvious being operator expressions:

expr ::= expr '+' expr
       | expr '-' expr
       | '(' expr ')'
       | NUM ;

ANTLR 3 and earlier cannot handle this grammar as-is. To use those versions (or any other pure LL parser generator), you need to convert the grammar into one that is not left-recursive. ANTLR 4 does accept directly left-recursive rules like this one by rewriting them internally, though it still rejects indirect left recursion. Bison, by contrast, has no problem with grammars of this form. You would need to declare '+' and '-' as left-associative operators to resolve the ambiguity, but that is not strictly required for left recursion itself. A better example might be dispatch:

expr ::= expr '.' ID '(' actuals ')' ;

actuals ::= actuals ',' expr | expr ;

Notice that both the expr and the actuals rules are left-recursive. The resulting left-leaning AST is much more efficient when it comes time for code generation, because it avoids the need for multiple registers and unnecessary spilling (a left-leaning tree can be collapsed as it is evaluated, whereas a right-leaning tree cannot).
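The register argument can be illustrated with a small sketch. It assumes a deliberately simple model that is not part of the original answer: operands are evaluated strictly left to right, each leaf needs one register, and the left result must stay in a register while the right subtree is evaluated. The helper names (`left_chain`, `right_chain`, `registers_needed`) are made up for this illustration.

```python
def left_chain(n):
    """Build the left-leaning tree ((x0 + x1) + x2) + ... for n leaves."""
    tree = "x0"
    for i in range(1, n):
        tree = ("+", tree, f"x{i}")
    return tree


def right_chain(n):
    """Build the right-leaning tree x0 + (x1 + (x2 + ...)) for n leaves."""
    tree = f"x{n - 1}"
    for i in range(n - 2, -1, -1):
        tree = ("+", f"x{i}", tree)
    return tree


def registers_needed(tree):
    """Registers required under strict left-to-right evaluation (assumed model)."""
    if not isinstance(tree, tuple):
        return 1  # a leaf is loaded into a single register
    _, left, right = tree
    # The left result occupies one register while the right side is computed.
    return max(registers_needed(left), 1 + registers_needed(right))


if __name__ == "__main__":
    for n in (2, 5, 10):
        print(n, registers_needed(left_chain(n)), registers_needed(right_chain(n)))
```

Under this model the left-leaning chain needs a constant two registers regardless of length, while the right-leaning chain needs one register per leaf. A real code generator that is free to reorder operands can do better on right-leaning trees, so this should be read as an illustration of the intuition rather than a general result.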

In terms of personal taste, I think that LALR grammars are a lot easier to construct and debug. The downside is that you have to deal with somewhat cryptic errors like shift-reduce and (the dreaded) reduce-reduce conflicts. Bison catches these when generating the parser, so they do not affect the end-user experience, but they can make the development process a bit more interesting. ANTLR is generally considered to be easier to use than YACC/Bison for precisely this reason.
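To make the left-recursion point concrete, here is a minimal sketch of the usual LL workaround for the arithmetic grammar above: the left-recursive rule is replaced by a loop, `expr := term (('+' | '-') term)*`, and the tree is folded to the left by hand so that '-' stays left-associative. The `term` rule, the tuple-based tree shape, and all function names are assumptions made for this example. It is a hand-written recursive-descent parser, not output from ANTLR or Bison.

```python
import re


def tokenize(text):
    # Numbers and single-character operators/parentheses; any other
    # character is silently skipped in this sketch.
    return re.findall(r"\d+|[-+()]", text)


def parse_term(tokens, pos):
    # term := NUM | '(' expr ')'
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        node, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return node, pos + 1
    if tok.isdigit():
        return int(tok), pos + 1
    raise SyntaxError(f"unexpected token {tok!r}")


def parse_expr(tokens, pos=0):
    # expr := term (('+' | '-') term)*   (iteration instead of left recursion)
    node, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = parse_term(tokens, pos + 1)
        node = (op, node, right)  # fold left to keep left associativity
    return node, pos


def parse(text):
    tokens = tokenize(text)
    node, pos = parse_expr(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return node


if __name__ == "__main__":
    print(parse("1 - 2 - 3"))
```

The explicit left fold in `parse_expr` is the extra work that an LALR generator spares you: with Bison the left-recursive rule builds the same left-leaning tree directly.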