Manually inlining an %inline nonterminal changes Menhir's behaviour
Hi,
I am currently using Menhir and I noticed some unexpected behaviour. The following file is accepted without any error or warning message when I run the command line menhir file.mly --infer:
%token NEW_LINE
%token LPAR RPAR
%token UNIT
%start<unit> main
%%
main:
| e = expr_or_assign (empty) { e }
expr_or_assign (el):
| e = expr (el) { e }
%inline expr_or_assign_cr:
| e = expr_or_assign (cr) { e }
expr (el):
| el; p = LPAR; e = expr_or_assign_cr; cr; RPAR { e }
| e = UNIT { e }
cr:
| NEW_LINE cr { }
| { }
empty:
| { }
%%
However, Menhir rejects the following file. The only difference is that I have manually inlined the nonterminal expr_or_assign_cr, which was already marked as %inline.
%token NEW_LINE
%token LPAR RPAR
%token UNIT
%start<unit> main
%%
main:
| e = expr_or_assign (empty) { e }
expr_or_assign (el):
| e = expr (el) { e }
expr (el):
| el; p = LPAR; e = expr_or_assign (cr); cr; RPAR { e }
| e = UNIT { e }
cr:
| NEW_LINE cr { }
| { }
empty:
| { }
%%
Here is the error message:
File "test2.mly", line 17, characters 22-36:
Error: mutually recursive definitions must have the same parameters.
This is not the case for expr_or_assign and expr.
To provide some context, I am currently working on formalising R, and in particular I wanted to translate its parser with (if possible) a one-to-one correspondence with its own Bison parser. The original parser can be found at https://github.com/wch/r-source/blob/trunk/src/main/gram.y, and my attempt at translating it at https://github.com/Mbodin/proveR/blob/master/low/parser.mly. The issue is that the original Bison parser uses side effects on its lexer through a global variable EatLines, and also stores stack information in a global variable contextp. I find these two side effects very confusing and I wanted to try removing them by incorporating them into the grammar, thanks to the possibility in Menhir to parameterise nonterminals. I thus added an argument “el”, whose value can either be “cr” (meaning that EatLines is true in the original Bison parser) or “empty” (meaning that EatLines is false).
I guess that this error is here to prevent Menhir from looping when generating the set of states. But in this case, the set of states is quite small (as “el” can only take the values “cr” or “empty”, it at most doubles the number of generated states). In this example, we can easily introduce a nonterminal “expr_or_assign_cr” to avoid the problem. I was then surprised when I added “%inline” to this nonterminal and Menhir still accepted the grammar. I thus have the feeling that either this error message is unnecessary in this case, or that some checks are missing for the “%inline” keyword. Either way, this is confusing.
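To make the size argument concrete, here is a hand-written sketch of what the first grammar looks like once every use of “el” is replaced by one of its two possible values. The names expr_or_assign_empty, expr_empty and expr_cr are made up for illustration, and this is only my assumption about the shape of the expansion, not Menhir's actual output. Only four instantiations are reachable from main: expr_or_assign and expr, each with “empty” and with “cr”.

```menhir
(* Hand-expanded sketch of the first grammar; nonterminal names are hypothetical. *)
%token NEW_LINE
%token LPAR RPAR
%token UNIT
%start<unit> main
%%
main:
| e = expr_or_assign_empty { e }

(* expr_or_assign (el), instantiated with el = empty and with el = cr *)
expr_or_assign_empty:
| e = expr_empty { e }
expr_or_assign_cr:
| e = expr_cr { e }

(* expr (el), instantiated with el = empty and with el = cr.
   Both instances refer back to expr_or_assign_cr only, so no
   further instantiation is ever needed. *)
expr_empty:
| empty; LPAR; e = expr_or_assign_cr; cr; RPAR { e }
| e = UNIT { e }
expr_cr:
| cr; LPAR; e = expr_or_assign_cr; cr; RPAR { e }
| e = UNIT { e }

cr:
| NEW_LINE cr { }
| { }
empty:
| { }
%%
```

Since no nonterminal here takes a parameter, the restriction on mutually recursive parameterised definitions does not apply to this version, which is why I would expect the parameterised form to be expandable as well.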
I have installed Menhir through Opam. Here is some useful information about my system.
$ menhir --version
menhir, version 20171013
$ uname -a
Linux tupungato 4.4.0-101-generic #124-Ubuntu SMP Fri Nov 10 18:29:59 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Best, Martin Bodin.