且构网

Syntax error when map() returns a LIST

Updated: 2022-06-20 22:36:31

Why does perl think this should be map EXPR, LIST rather than map BLOCK LIST?

The relevant section of code is in toke.c, Perl's lexer (the below is from Perl 5.22.0):

/* This hack serves to disambiguate a pair of curlies
 * as being a block or an anon hash.  Normally, expectation
 * determines that, but in cases where we're not in a
 * position to expect anything in particular (like inside
 * eval"") we have to resolve the ambiguity.  This code
 * covers the case where the first term in the curlies is a
 * quoted string.  Most other cases need to be explicitly
 * disambiguated by prepending a "+" before the opening
 * curly in order to force resolution as an anon hash.
 *
 * XXX should probably propagate the outer expectation
 * into eval"" to rely less on this hack, but that could
 * potentially break current behavior of eval"".
 * GSAR 97-07-21
 */
t = s;
if (*s == '\'' || *s == '"' || *s == '`') {
    /* common case: get past first string, handling escapes */
    for (t++; t < PL_bufend && *t != *s;)
        if (*t++ == '\\')
            t++;
    t++;
}
else if (*s == 'q') {
    if (++t < PL_bufend
        && (!isWORDCHAR(*t)
            || ((*t == 'q' || *t == 'x') && ++t < PL_bufend
                && !isWORDCHAR(*t))))
    {
        /* skip q//-like construct */
        const char *tmps;
        char open, close, term;
        I32 brackets = 1;

        while (t < PL_bufend && isSPACE(*t))
            t++;
        /* check for q => */
        if (t+1 < PL_bufend && t[0] == '=' && t[1] == '>') {
            OPERATOR(HASHBRACK);
        }
        term = *t;
        open = term;
        if (term && (tmps = strchr("([{< )]}> )]}>",term)))
            term = tmps[5];
        close = term;
        if (open == close)
            for (t++; t < PL_bufend; t++) {
                if (*t == '\\' && t+1 < PL_bufend && open != '\\')
                    t++;
                else if (*t == open)
                    break;
            }
        else {
            for (t++; t < PL_bufend; t++) {
                if (*t == '\\' && t+1 < PL_bufend)
                    t++;
                else if (*t == close && --brackets <= 0)
                    break;
                else if (*t == open)
                    brackets++;
            }
        }
        t++;
    }
    else
        /* skip plain q word */
        while (t < PL_bufend && isWORDCHAR_lazy_if(t,UTF))
             t += UTF8SKIP(t);
}
else if (isWORDCHAR_lazy_if(t,UTF)) {
    t += UTF8SKIP(t);
    while (t < PL_bufend && isWORDCHAR_lazy_if(t,UTF))
         t += UTF8SKIP(t);
}
while (t < PL_bufend && isSPACE(*t))
    t++;
/* if comma follows first term, call it an anon hash */
/* XXX it could be a comma expression with loop modifiers */
if (t < PL_bufend && ((*t == ',' && (*s == 'q' || !isLOWER(*s)))
                   || (*t == '=' && t[1] == '>')))
    OPERATOR(HASHBRACK);
if (PL_expect == XREF)
{
  block_expectation:
    /* If there is an opening brace or 'sub:', treat it
       as a term to make ${{...}}{k} and &{sub:attr...}
       dwim.  Otherwise, treat it as a statement, so
       map {no strict; ...} works.
     */
    s = skipspace(s);
    if (*s == '{') {
        PL_expect = XTERM;
        break;
    }
    if (strnEQ(s, "sub", 3)) {
        d = s + 3;
        d = skipspace(d);
        if (*d == ':') {
            PL_expect = XTERM;
            break;
        }
    }
    PL_expect = XSTATE;
}
else {
    PL_lex_brackstack[PL_lex_brackets-1] = XSTATE;
    PL_expect = XSTATE;
}


Explanation

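If the first term after the opening curly is a string (delimited by ', ", or `, or written as a q//-like construct) or a bareword, and it is followed by =>, the curly is treated as the beginning of an anonymous hash (that's what OPERATOR(HASHBRACK); means). A following comma has the same effect only when the first term begins with q or with a character that is not a lowercase letter, for example a quoted string or a bareword beginning with a capital letter.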
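
To make that rule easier to experiment with, here is a minimal sketch of the same decision as a standalone C function. It is not Perl's code: guess_anon_hash is a hypothetical name, the input is assumed to be a NUL-terminated ASCII string starting just after the opening curly, and the q//-like and UTF-8 handling from the excerpt above is left out.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not part of Perl. Simplified from the toke.c
 * excerpt: ASCII only, no q//-like constructs.
 * s is the text right after the opening curly.
 * Returns true if the lexer heuristic would guess "anonymous hash". */
static bool guess_anon_hash(const char *s)
{
    const char *end;
    const char *t;

    /* the real lexer has already skipped leading whitespace here */
    while (isspace((unsigned char)*s))
        s++;
    end = s + strlen(s);
    if (s == end)
        return false;

    t = s;
    if (*s == '\'' || *s == '"' || *s == '`') {
        /* get past the first string, stepping over backslash escapes */
        for (t++; t < end && *t != *s; t++)
            if (*t == '\\' && t + 1 < end)
                t++;
        if (t < end)
            t++;                /* step over the closing quote */
    }
    else {
        /* skip a bareword, if there is one */
        while (t < end && (isalnum((unsigned char)*t) || *t == '_'))
            t++;
    }

    while (t < end && isspace((unsigned char)*t))
        t++;
    if (t == end)
        return false;

    /* comma counts only if the first term starts with q
     * or with something that is not a lowercase letter */
    if (*t == ',' && (*s == 'q' || !islower((unsigned char)*s)))
        return true;
    /* a fat comma counts after any first term */
    if (*t == '=' && t[1] == '>')
        return true;

    return false;               /* falls through to the block paths */
}
```

Under these assumptions, guess_anon_hash("\"$_\" => 1 }") and guess_anon_hash("Foo, 1 }") return true, while guess_anon_hash("foo, 1 }") and guess_anon_hash("(x => 1) }") return false, because in the last case the first character is neither a quote nor a word character, so neither a comma nor a => is found right after the first term.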

The other cases are a little harder for me to understand. I ran the following program through gdb:

{ (x => 1) }

and ended up in the final else block:

else {
    PL_lex_brackstack[PL_lex_brackets-1] = XSTATE;
    PL_expect = XSTATE;
}

Suffice it to say, the execution path is clearly different; it ends up being parsed as a block.