


  • Removed an undeclared dependency of MenhirSdk on Unix. (Reported and fixed by Frédéric Bour.)


  • Menhir now always places OCaml line number directives in the generated .ml file. (Until now, this was done only when --infer was off.) Thus, if a semantic action contains an assert statement, the file name and line number information carried by the Assert_failure exception should now be correct. (Reported by Helmut Brandl.)
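The effect of a line number directive on assert can be seen in a small OCaml sketch. The file name grammar.mly and the line number 42 are hypothetical, standing in for the location of a semantic action. The directive is written by hand here, whereas Menhir emits it in the generated .ml file.

```ocaml
(* Hypothetical location: pretend the next line is line 42 of grammar.mly. *)
# 42 "grammar.mly"
let action x = assert (x > 0); x

(* Trigger the assertion and capture the location carried by the exception. *)
let reported_location =
  try ignore (action 0); None
  with Assert_failure (file, line, _column) -> Some (file, line)
```

Here reported_location evaluates to Some ("grammar.mly", 42): the exception points at the grammar file rather than at the generated code.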


  • Changed Menhir's license from QPL to GPLv2. MenhirLib remains under LGPLv2, with a linking exception.

  • Moved the repository to

  • Introduced a new command line switch, --cmly, which causes Menhir to create a .cmly file, containing a description of the grammar and automaton. (Suggested by Frédéric Bour.)

  • Introduced a new library, MenhirSdk, which allows reading a .cmly file. The purpose of this library is to allow external tools to take advantage of the work performed by Menhir's front-end. (Suggested by Frédéric Bour.)

  • Introduced new syntax for attributes in a .mly file. Attributes are ignored by Menhir's back-ends, but are written to .cmly files, thus can be exploited by external tools via MenhirSdk. (Suggested by Frédéric Bour.)

  • The definition of a %public nonterminal symbol can now be split into several parts within a single .mly file. (This used to be permitted only over multiple .mly files.) (Suggested by Frédéric Bour.)

  • New functions in the incremental API: shifts, acceptable, current_state_number.

  • New functions in the incremental API and inspection API: top, pop, pop_many, get, equal, force_reduction, feed, input_needed, state_has_default_reduction, production_index, find_production. (Suggested by Frédéric Bour.)

  • New module MenhirLib.ErrorReports. This module is supposed to offer auxiliary functions that help produce good syntax error messages. This module does not yet contain much functionality and is expected to evolve in the future.

  • Incompatible change in the incremental API: the type env becomes 'a env.

  • Incompatible change in the incremental API: the function has_default_reduction is renamed env_has_default_reduction.

  • The type stack and the function stack in the incremental API are deprecated. The new functions top and pop can be used instead to inspect the parser's stack. The module MenhirLib.General is deprecated as well. Deprecated functionality will be removed in the future.

  • Incompatible change in the incremental API: the type of the function print_stack in the result signature of the functor MenhirLib.Printers.Make changes to 'a env -> unit. (Anyway, as of now, MenhirLib.Printers remains undocumented.)

  • Improved the syntax error message that is displayed when a .mly file is incorrect: the previous and next token are shown.

  • Fixed a bug where the module name Basics was shadowed (that is, if the user's project happened to contain a toplevel module by this name, then it could not be referred to from a .mly file). (Reported by François Thiré.)


  • Add $MENHIR_STDLIB as a way of controlling where Menhir looks for the file standard.mly. This environment variable overrides the installation-time default setting, and is itself overridden by the --stdlib command line switch. (Requested by Jonathan Protzenko.)

  • Makefile fix: filter out '\r' in the output of menhir --suggest-ocamlfind, so that the Makefile works when Menhir is compiled as a Windows executable. (Suggested by Jonathan Protzenko.)
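The lookup order for standard.mly described above (command line switch, then environment variable, then installation-time default) can be sketched as follows. This is an illustration of the precedence only, not Menhir's actual code; the function name resolve_stdlib and its labelled arguments are assumptions made for the example.

```ocaml
(* Hypothetical helper: picks the directory in which to look for standard.mly.
   cli     : value of the --stdlib switch, if given
   env     : value of $MENHIR_STDLIB, if set
   default : installation-time default setting *)
let resolve_stdlib ~cli ~env ~default =
  match cli, env with
  | Some dir, _ -> dir        (* --stdlib overrides everything else *)
  | None, Some dir -> dir     (* $MENHIR_STDLIB overrides the default *)
  | None, None -> default
```

In a real program, the env argument could be obtained with Sys.getenv_opt "MENHIR_STDLIB".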


  • Updated the Coq back-end for compatibility with Coq 8.6. (Jacques-Henri Jourdan.)


  • Fix in --only-preprocess-for-ocamlyacc mode: avoid printing newline characters inside a %type declaration, as this is forbidden by ocamlyacc. (Reported by Kenji Maillard.)
  • Fix in --only-preprocess-for-ocamlyacc mode: avoid variable capture caused by ocamlyacc internally translating $i to _i. (Reported by Kenji Maillard.)


  • New command line switch --only-preprocess-for-ocamlyacc, supposed to print the grammar in a form that ocamlyacc can accept. As of now, this feature is incomplete (in particular, support for Menhir's position keywords is missing), untested, and undocumented. It could be removed in the future.


  • Fixes in the output of --only-preprocess:
    • The order of productions is now preserved. (It was not. This matters if there are reduce/reduce conflicts.)
    • %parameter directives are now printed. (They were not.)
    • %on_error_reduce directives are now printed. (They were not.)


  • Makefile fix, undoing a change made on 2016/03/03, which caused installation to fail under (some versions of?) Windows where dynamic linking is not supported. (Reported by Andrew Appel.)


  • %on_error_reduce declarations now have implicit priority levels, so as to tell Menhir what to do when two such declarations are applicable. Also, the well-formedness checks on %type and %on_error_reduce declarations have been reinforced.


  • A small change in the generated code (both in the code and table back-ends) so as to avoid OCaml's warning 41. The warning would arise (when compiling a generated parser with OCaml 4.03) because Menhir's exception Error has the same name as the data constructor Error in OCaml's pervasive library. (Reported by Bernhard Schommer.)


  • Anonymous rules now work also when used inside a parameterized rule. (This did not work until now.) When an anonymous rule is hoisted out of a parameterized rule, it may itself become parameterized. Menhir parameterizes it only over the parameters that it actually needs.


  • In the Coq backend, split the largest definitions into smaller ones. This circumvents a limitation of vm_compute on 32 bit machines. This also enables us to perform sharing between definitions, so that the generated files are much smaller.


  • When printing a grammar (which is done by the --only-preprocess options), remove the leading bar |, for compatibility with yacc and bison.


  • In the code back-end, generate type annotations when extracting a semantic value out of the stack. When working with a semantic value of some function type, OCaml would incorrectly warn that this function does not use its argument. This warning should now be gone.


  • Makefile changes, so as to support ocamlbuild 4.03, which seems to have stricter hygiene rules than previous versions.


  • Prevented an incorrect installation that would take place if USE_OCAMLFIND was given during make all but not during make install. Added a command line directive --suggest-ocamlfind.


  • Fixed a severe bug in Menhir 20151110 which (when using the code back-end) could cause a generated parser to crash. Thanks to ygrek for reporting the bug.

  • The code produced by version XXXXXXXX of menhir --table can now be linked only against a matching version of MenhirLib. If an incorrect version of MenhirLib is installed, the OCaml compiler should complain that MenhirLib.StaticVersion.require_XXXXXXXX is undefined.
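The version check relies on an ordinary name lookup at compile time. The sketch below mimics the idea with two local modules. The version stamp 20990101 and the module GeneratedParser are hypothetical, and StaticVersion is a stand-in for MenhirLib.StaticVersion rather than the real module.

```ocaml
(* Stand-in for MenhirLib.StaticVersion, exporting a hypothetical stamp. *)
module StaticVersion = struct
  let require_20990101 = ()
end

(* Stand-in for a parser produced by the matching version of menhir --table.
   Mentioning the stamped name is enough: if the library does not define it,
   the compiler rejects this module with an "unbound value" error. *)
module GeneratedParser = struct
  let () = StaticVersion.require_20990101
  let version_checked = true
end
```

Renaming require_20990101 in only one of the two modules makes compilation fail, which is the behavior described above for a mismatched MenhirLib.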


  • Optimized the computation of $symbolstartpos, based on a couple of assumptions about the lexer. (See the manual.)


  • Modified the treatment of %inline so that the positions that are computed are the same, regardless of whether %inline is used. This property did not hold until now. It now does. Of course, this means that the positions computed by the new Menhir are not the same as those computed by older versions of Menhir.

  • Fixed a bug in the treatment of %inline that would lead to an incorrect position being computed when the caller and callee had a variable by the same name.

  • Modified Menhir so as to compute the start and end positions in the exact same way as ocamlyacc. (There used to be a difference in the treatment of epsilon productions.) Of course, this means that the positions computed by the new Menhir are not the same as those computed by older versions of Menhir. Added the keyword $symbolstartpos so as to simulate Parsing.symbol_start_pos() in the ocamlyacc world. The keyword $startpos sometimes produces a position that is too far off to the left; $symbolstartpos produces a more accurate position.

  • Incompatible change of the incremental API: instead of a unit argument, the entry points (which are named after the start symbols) now require an initial position, which typically should be lexbuf.lex_curr_p.


  • Fix-fix-and-re-fix the Makefile in an attempt to allow installation under opam/Windows. Thanks to Daniel Weil for patient explanations and testing.


  • MenhirLib is now installed in both binary and source forms. menhir --suggest-menhirLib reports where MenhirLib is installed. This can be used to retrieve a snapshot of MenhirLib in source form and include it in your project (if you wish to use --table mode, yet do not wish to have a dependency on MenhirLib).


  • Allow --list-errors to work on 32-bit machines (with low hard limits). This should fix a problem whereby the 2015/10/23 release could not bootstrap on a 32-bit machine.


  • New declaration %on_error_reduce foo, where foo is a nonterminal symbol. This modifies the automaton as follows. In every state where a production of the form foo -> ... is ready to be reduced, every error action is replaced with a reduction of this production. (If there is a conflict between several productions that could be reduced in this manner, nothing is done.) This does not affect the language that is accepted by the automaton, but delays the detection of an error: more reductions take place before the error is detected.

  • Fixed a bug whereby Menhir would warn about a useless %prec declaration, even though it was useful. This would happen when the declaration was duplicated (by inlining or by macro-expansion) and some but not all of the copies were useful.

  • Added has_default_reduction to the incremental API.

  • Modified the meaning of --canonical to allow default reductions to take place. This implies no loss of precision in terms of lookahead sets, and should allow gaining more contextual information when a syntax error is encountered. (It should also lead to a smaller automaton.)

  • A brand new set of tools to work on syntax errors.

  • New command --list-errors, which produces a list of input sentences which are representative of all possible syntax errors. (Costly.)

  • New command --interpret-error, which confirms that one particular input sentence ends in a syntax error, and prints the number of the state in which this error occurs.

  • New command --compile-errors, which compiles a list of erroneous sentences (together with error messages) to OCaml code.

  • New command --compare-errors, which compares two lists of erroneous sentences to check if they cover the same error states.

  • New command --update-errors, which updates the auto-generated comments in a list of erroneous sentences.

  • New command --echo-errors, which removes all comments and messages from a list of erroneous sentences, and echoes just the sentences.


  • Additions to the incremental API.

    • A supplier is a function that produces tokens on demand.
    • lexer_lexbuf_to_supplier turns a lexer and a lexbuf into a supplier.
    • loop is a ready-made main parsing loop.
    • loop_handle is a variant that lets the user do her own error handling.
    • loop_handle_undo is a variant that additionally allows undoing the last few "spurious" reductions.
    • number maps a state of the LR(1) automaton to its number.
  • Incompatible change of the incremental API: renamed the type 'a result to 'a checkpoint. This is a better name anyway, and should help avoid confusion with the type 'a result introduced in OCaml 4.03.
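As an illustration of the supplier concept introduced above, here is a minimal, self-contained sketch that builds a supplier from a list of tokens instead of a lexer. The token type and the function supplier_of_list are hypothetical; in a real project the token type is generated by Menhir, and lexer_lexbuf_to_supplier would normally be used instead.

```ocaml
(* Hypothetical token type; a real one comes from the %token declarations. *)
type token = INT of int | PLUS | EOF

(* Shape of a supplier: each call yields one token together with its
   start and end positions. *)
type supplier = unit -> token * Lexing.position * Lexing.position

(* Builds a supplier that hands out the given tokens one by one,
   then EOF forever. Positions are dummies in this sketch. *)
let supplier_of_list (tokens : token list) : supplier =
  let remaining = ref tokens in
  fun () ->
    let tok =
      match !remaining with
      | [] -> EOF
      | t :: rest -> remaining := rest; t
    in
    (tok, Lexing.dummy_pos, Lexing.dummy_pos)
```

A supplier of this kind can be handy in tests, where token sequences are written by hand rather than produced by a lexer.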


  • Avoid using $(shell pwd) in Makefile, for better Windows compatibility.


  • Fixed a bug where inconsistent OCaml code was generated when --table and --external-tokens were used together. (Reported by Darin Morrison.)

  • In --infer mode, leave the .ml file around (instead of removing it) if ocamlc fails, so we have a chance to understand what's wrong.


  • Re-established some error messages concerning the mis-use of $i which had disappeared on 2015/06/29.


  • Fixed the mysterious message that would appear when a nonterminal symbol begins with an uppercase letter and --infer is turned on. Clarified the documentation to indicate that a (non-start) nonterminal symbol can begin with an uppercase letter, but this is not recommended.


  • New option --inspection (added last January, documented only now). This generates an inspection API which allows inspecting the automaton's stack, among other things. This API can in principle be used to write custom code for error reporting, error recovery, etc. It is not yet mature and may change in the future.


  • Added the command line options --unused-token <symbol> and --unused-tokens.


  • Changed the treatment of the positional keywords $i. They are now rewritten into variables of the form _i where i is an integer. Users are advised not to use variables of this form inside semantic actions.


  • Added support for anonymous rules. This allows writing, e.g., list(e = expression SEMI { e }) whereas previously one should have written list(terminated(e, SEMI)).


  • Moved all of the demos to ocamlbuild (instead of make).


  • Incompatible change of the incremental API. The incremental API now exposes shift events too.


  • Fixed a couple of bugs in Makefile and src/Makefile which would cause compilation and installation to fail with TARGET=byte. (Reported by Jérémie Courrèges-Anglas and Daniel Dickman.)


  • Incompatible change of the incremental API. The entry point main_incremental is now named Incremental.main.


  • Incompatible change of the incremental API.
    • The API now exposes reduction events.
    • The type 'a result is now private.
    • The type env is no longer parameterized.
    • handle is renamed to resume.
    • offer and resume now expect a result, not an environment.


  • Documented the Coq back-end (designed and implemented by Jacques-Henri Jourdan).


  • New incremental API (in --table mode only), inspired by Frédéric Bour.


  • Menhir now reports an error if one of the start symbols produces either the empty language or the singleton language {epsilon}.

  • Although some people out there actually define a start symbol that recognizes {epsilon} (and use it as a way of initializing or re-initializing some global state), this is considered bad style. Furthermore, by ruling out this case, we are able to simplify the table back-end a little bit.


  • A speed improvement in the code back-end.


  • Menhir now requires OCaml 4.02 (instead of 3.09).


  • Removed support for the $previouserror keyword.
  • Removed support for --error-recovery mode.


  • In the Coq backend, use ' instead of _ as separator in identifiers. Also, correct a serious bug that was inadvertently introduced on 2013/03/01 (r319).


  • Lexer fix so as to support an open variant type [> ...] within a %type<...> declaration.


  • Updated the Makefile so that install no longer depends on all.

  • Updated the demos so that the lexer does not invoke exit 0 when encountering eof. (This should be more intuitive.)


  • Fixed a newline conversion problem that would prevent Menhir from building on Windows when using ocaml 4.01.


  • Switched to ocamlbuild. Many thanks to Daniel Weil for offering very useful guidance.


  • menhir --depend was broken since someone added new whitespace in the output of ocamldep. Fixed.


  • Fixed a compilation problem that would arise when a file produced by Menhir on a 64-bit platform was compiled by ocaml on a 32-bit platform.


  • Performance improvements in the computation of various information about the automaton (module Invariant). The improvements will be noticeable only for very large automata.


  • The option --log-grammar 3 (and above) now causes the FOLLOW sets for terminal symbols to be computed and displayed.


  • Added the flag --canonical, which causes Menhir to produce a canonical LR(1) automaton in the style of Knuth. This means that no merging of states takes place during the construction of the automaton, and that no default reductions are allowed.


  • Fixed a bug whereby a %nonassoc declaration was not respected. This declaration requests that a shift/reduce conflict be reduced in favor of neither shifting nor reducing, that is, a syntax error must occur. However, due to an unforeseen interaction with the default reduction mechanism, this declaration was sometimes ignored and reduction would take place.


  • Changes in the (undocumented) Coq back-end so as to match the ESOP 2012 paper.


  • The Makefile now tests whether Unix or Windows is used (the test is performed by evaluating Sys.os_type under ocaml) and changes a couple settings accordingly:

    • the executable file name is either menhir or menhir.exe
    • the object file suffix is either .o or .obj
  • Added --strict, which causes many warnings about the grammar and about the automaton to be considered errors.

  • The # annotations that are inserted in the generated .ml file now retain their full path. (That is, we no longer use Filename.basename.) This implies that the # annotations depend on how Menhir is invoked -- e.g., menhir foo/bar.mly and cd foo && menhir bar.mly will produce different results. Nevertheless, this seems reasonable and useful (e.g., in conjunction with ocamlbuild and a hierarchy of files). Thanks to Daniel Weil.


  • With the -lg 1 switch, Menhir now indicates whether the grammar is SLR(1).


  • Removed the lock in ocamldep.wrapper. It is the responsibility of the user to avoid interferences with other processes (or other instances of the script) that create and/or remove files.


  • The (internal) computation of the automaton's invariant was broken and has been fixed. Surprisingly, this does not seem to affect the generated code (which was correct), so no observable bug is fixed. Hopefully no bug is introduced!


  • The grammar description files (.mly) are now read in up front and stored in memory while they are parsed. This allows us to avoid the use of pos_in and seek_in, which do not work correctly when CRLF conversion is being performed.


  • Fixed a bug in the type inference module (for parameterized non-terminals) which would cause an infinite loop.


  • Fixed a bug that would cause an assertion failure in the generated parser in some situations where the input stream was incorrect and the grammar involved the error token. The fix might cause grammars that use the error token to behave differently (hopefully more accurately) as of now.


  • Makefile changes: build and install only the bytecode version of MenhirLib when TARGET=byte is set.


  • Fixed ocamldep.wrapper to avoid quoting the name of the ocaml command. This is hoped to fix a compilation problem under MinGW.


  • A Makefile fix to avoid a problem under Windows/Cygwin.
  • Renamed the ocaml-check-version script so as to avoid a warning.


  • OCaml summer project: added --interpret, --table, and --suggest-*.


  • Fixed a problem that would cause the code inliner to abort when a semantic value and a non-terminal symbol happened to have the same name.

  • Removed code sharing.


  • Removed an incorrect assertion that caused failures (, line 134).


  • Disabled code sharing by default, as it is currently broken. (See Yann's message; assertion failure at runtime.)


  • Added an optimization to share code among states that have identical outgoing transition tables.


  • Small Makefile change: create an executable file for check-ocaml-version in order to work around the absence of dynamic loading on some platforms.


  • Made a fundamental change in the construction of the LR(1) automaton in order to eliminate a bug that could lead to spurious conflicts -- thanks to Ketti for submitting a bug report.


  • Added --follow-construction to help understand the construction of the LR(1) automaton (very verbose).


  • Code generation: more explicit qualifications with Pervasives so as to avoid capture when the user redefines some of the built-in operators, such as (+).
  • Added a new demo (calc-param) that shows how to use %parameter.
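The capture problem addressed by the explicit qualifications above can be reproduced in a few lines. This sketch uses Stdlib, the name that recent OCaml versions give to the module formerly called Pervasives; the redefinition of (+) is a deliberately odd, hypothetical user definition.

```ocaml
(* A user-level redefinition, as might appear in the header of a grammar. *)
let ( + ) a b = a * b

(* Unqualified use: silently picks up the user's definition. *)
let unqualified = 2 + 3

(* Qualified use: always refers to the built-in integer addition. *)
let qualified = Stdlib.( + ) 2 3
```

Here unqualified is 6 while qualified is 5, which shows why generated code that must perform real addition should not rely on the unqualified operator.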


  • Makefile improvements (check for PREFIX; bootstrap in bytecode now also available). Slight changes to OMakefile.shared.


  • Portability fix in Makefile and Makefile.shared (avoided which).


  • Portability fix in Makefile.shared (replaced &> with 2>&1 >).


  • Made a slight restriction to Pager's criterion so as to never introduce fake conflict tokens (see Lr0.compatible). This might help make conflict explanations more accurate in the future.


  • Fixed bug that would cause positions to become invalid after inlining.


  • Fixed --depend to be more lenient when analyzing ocamldep's output.
  • Added --raw-depend which transmits ocamldep's output unchanged (for use in conjunction with omake).


  • Fixed bug that would cause --only-preprocess to print %token declarations also for pseudo-tokens.
  • Fixed bug that caused some precedence declarations to be incorrectly reported as useless.
  • Improved things so that useless pseudo-tokens now also cause warnings.
  • Fixed bug that would cause %type directives for terminal symbols to be incorrectly accepted.
  • Fixed bug that would occur when a semantic action containing $i keywords was inlined.


  • Fixed problem that caused some end-of-stream conflicts not to be reported.
  • Fixed Pager's compatibility criterion to avoid creating end-of-stream conflicts.


  • Fixed problem that allowed generating incorrect but apparently well-typed Objective Caml code when a semantic action was ill-typed and --infer was omitted.


  • Improved conflict reports by factoring out maximal common derivation contexts.


  • Fixed bug that could arise when explaining a conflict in a non-LALR(1) grammar.


  • Changed count of reduce/reduce conflicts to allow a comparison with ocamlyacc's diagnostics.
  • When refusing to resolve a conflict, report all diagnostics before dying.


  • Added display of FOLLOW sets when using --log-grammar 2.
  • Added --graph option.
  • Fixed behavior of --depend option.


  • Removed reversed lists from the standard library.