lex&yacc系列(4)— yacc语法分析探索及calculator实例

  • Post author:
  • Post category:其他




音乐是人生的艺术



1


移进/


规约

When yacc processes a parser, it creates a set of

states

each of which reflects a possible

position

in one or more

partially parsed

rules.

As the parser reads tokens, each time it reads a token that

doesn’t


complete a rule

it pushes the token on an

internal stack

and switches to

a new state

reflecting the token it just read. This action is called a


shift


.

When it has found

all the symbols

that constitute the right-hand side of a rule, it pops the right-hand side symbols off the stack,

pushes

the left-hand side symbol onto the stack, and switches to a new state  reflecting the new symbol on the stack. This action is called a


reduction


, since it usually reduces the number of items on the stack. (

Not always, since it is possible to have rules with empty right-hand sides

.)

Whenever yacc reduces a rule, it executes user code associated with the rule.

This is how you actually do something with the material that the parser parses.

Often the action code builds a parse tree corresponding to the input, so that later code can process a whole statement or even a whole program at a time。—– crow_bar语言就是这个样子的。






lex & yacc


》书上举的一个有




移进





规约




冲突的例子:

Shift指的是将一个token加入堆栈,reduce指的是将堆栈中现存的进行一次替换:

Parser是可以提前向后看一个token的,这时它会提前看到3后面的*,从而有下面的两种选择:

书中给出的解决方案有两个,一个是利用yacc的语法规则去强行指定运算的优先级(书中不推荐使用这种),另一种是用下面类似分层的方法:

为什么用上面这种方法改写了之后,就可以让乘法的优先级更高而消掉不确定性,


需要进一步探索




yacc








实验:用下面的例子没有移进





规约冲突,乘法优先级别高于加法的例子做实验:

编译运行后,输入2+3*4 计算过程:


从运行结果来看,


parser


接受到


3


之后,将


3


规约为


term


之后,就不继续将其规约为


expression


了,为什么呢?这是


yacc


的一个语法规则?我还没发现?

有帖子说 yacc先规约小的规则:



https://blog.csdn.net/nosources/article/details/38943265



, term向对于expression是更小更底层的规则。

另,书《lex and yacc》的第8章中有对移进规约冲突的进一步解释,已经看了一部分,继续看可能有收获。


实验用.y


文件:

%{
#include <stdio.h>
#include <stdlib.h>
#define YYDEBUG 1
%}
%union {
    int          int_value;
    double       double_value;
}
%token <double_value>      DOUBLE_LITERAL
%token ADD SUB MUL DIV CR
%type <double_value> expression term primary
%% 

expression
    :expression ADD term
    {
        $$ = $1 + $3;
	printf("expression: expression ADD term %lf + %lf = %lf\n", $1, $3, $$);
    }
    | expression SUB term
    {
        $$ = $1 - $3;
    }
    | term
    {
        printf("expression: term  %lf\n", $$);
    }
    ;

term
    :term MUL primary
    {
        $$ = $1 * $3;
        printf("term: term MUL primary %lf * %lf = %lf\n", $1, $3, $$);
    }
    | primary{printf("term: primary %lf\n",$$);}
    ;

primary
    :
    DOUBLE_LITERAL{printf("primary: DOUBLE_LITERAL %lf\n",$$);} 
       
%%
int
yyerror(char const *str)
{
    extern char *yytext;
    //fprintf(stderr, "parser error near %s\n", yytext);
    return 0;
}

int main(void)
{
    extern int yyparse(void);
    extern FILE *yyin;

    yyin = stdin;
    if (yyparse()) {
        fprintf(stderr, "Error ! Error ! Error !\n");
        exit(1);
    }
}


实验用.l


文件:

%{
#include <stdio.h>
#include "calcul.tab.h"

int
yywrap(void)
{
    return 1;
}
%}
%%
"+"             {printf("flex found + \n");return ADD;}
"-"             return SUB;
"*"             {printf("flex found * \n");return MUL;}
"/"             return DIV;
"\n"            return CR;
([1-9][0-9]*)|0|([0-9]+\.[0-9]*) {
    double temp;
    sscanf(yytext, "%lf", &temp);
    /* yylval是yacc的变量,对应.y文件中定义的union类型,在yacc生成的calcul.tab.h文件中有     extern YYSTYPE yylval;这个变量用于flex与 yacc之间传递数据,yyval传递到yacc后对应我们使用的$1,$2*/
    yylval.double_value = temp; 
    return DOUBLE_LITERAL;
}
[ \t] ;
. {
    fprintf(stderr, "lexical error.\n");
    exit(1);
}
%%



Ref:


《lex and yacc–second edition


》 –


作者:John R. Levine



版权声明:本文为qq_35865125原创文章,遵循 CC 4.0 BY-SA 版权协议,转载请附上原文出处链接和本声明。