音乐是人生的艺术
1
移进/
规约
When yacc processes a parser, it creates a set of
states
each of which reflects a possible
position
in one or more
partially parsed
rules.
As the parser reads tokens, each time it reads a token that
doesn’t
complete a rule
it pushes the token on an
internal stack
and switches to
a new state
reflecting the token it just read. This action is called a
shift
.
When it has found
all the symbols
that constitute the right-hand side of a rule, it pops the right-hand side symbols off the stack,
pushes
the left-hand side symbol onto the stack, and switches to a new state reflecting the new symbol on the stack. This action is called a
reduction
, since it usually reduces the number of items on the stack. (
Not always, since it is possible to have rules with empty right-hand sides
.)
Whenever yacc reduces a rule, it executes user code associated with the rule.
This is how you actually do something with the material that the parser parses.
Often the action code builds a parse tree corresponding to the input, so that later code can process a whole statement or even a whole program at a time。—– crow_bar语言就是这个样子的。
《
lex & yacc
》书上举的一个有
移进
—
规约
冲突的例子:
Shift指的是将一个token加入堆栈,reduce指的是将堆栈中现存的进行一次替换:
Parser是可以提前向后看一个token的,这时它会提前看到3后面的*,从而有下面的两种选择:
书中给出的解决方案有两个,一个是利用yacc的语法规则去强行指定运算的优先级(书中不推荐使用这种),另一种是用下面类似分层的方法:
为什么用上面这种方法改写了之后,就可以让乘法的优先级更高而消掉不确定性,
需要进一步探索
yacc
。
实验:用下面的例子没有移进
–
规约冲突,乘法优先级别高于加法的例子做实验:
编译运行后,输入2+3*4 计算过程:
从运行结果来看,
parser
接受到
3
之后,将
3
规约为
term
之后,就不继续将其规约为
expression
了,为什么呢?这是
yacc
的一个语法规则?我还没发现?
有帖子说 yacc先规约小的规则:
https://blog.csdn.net/nosources/article/details/38943265
, term向对于expression是更小更底层的规则。
另,书《lex and yacc》的第8章中有对移进规约冲突的进一步解释,已经看了一部分,继续看可能有收获。
实验用.y
文件:
%{
#include <stdio.h>
#include <stdlib.h>
#define YYDEBUG 1
%}
%union {
int int_value;
double double_value;
}
%token <double_value> DOUBLE_LITERAL
%token ADD SUB MUL DIV CR
%type <double_value> expression term primary
%%
expression
:expression ADD term
{
$$ = $1 + $3;
printf("expression: expression ADD term %lf + %lf = %lf\n", $1, $3, $$);
}
| expression SUB term
{
$$ = $1 - $3;
}
| term
{
printf("expression: term %lf\n", $$);
}
;
term
:term MUL primary
{
$$ = $1 * $3;
printf("term: term MUL primary %lf * %lf = %lf\n", $1, $3, $$);
}
| primary{printf("term: primary %lf\n",$$);}
;
primary
:
DOUBLE_LITERAL{printf("primary: DOUBLE_LITERAL %lf\n",$$);}
%%
int
yyerror(char const *str)
{
extern char *yytext;
//fprintf(stderr, "parser error near %s\n", yytext);
return 0;
}
int main(void)
{
extern int yyparse(void);
extern FILE *yyin;
yyin = stdin;
if (yyparse()) {
fprintf(stderr, "Error ! Error ! Error !\n");
exit(1);
}
}
实验用.l
文件:
%{
#include <stdio.h>
#include "calcul.tab.h"
int
yywrap(void)
{
return 1;
}
%}
%%
"+" {printf("flex found + \n");return ADD;}
"-" return SUB;
"*" {printf("flex found * \n");return MUL;}
"/" return DIV;
"\n" return CR;
([1-9][0-9]*)|0|([0-9]+\.[0-9]*) {
double temp;
sscanf(yytext, "%lf", &temp);
/* yylval是yacc的变量,对应.y文件中定义的union类型,在yacc生成的calcul.tab.h文件中有 extern YYSTYPE yylval;这个变量用于flex与 yacc之间传递数据,yyval传递到yacc后对应我们使用的$1,$2*/
yylval.double_value = temp;
return DOUBLE_LITERAL;
}
[ \t] ;
. {
fprintf(stderr, "lexical error.\n");
exit(1);
}
%%
Ref:
《lex and yacc–second edition
》 –
作者:John R. Levine