Commit 7fe180b8 authored by POTTIER Francois

Removed --error-recovery mode.

parent b565c874
2014/12/02:
Removed support for the $previouserror keyword.
Removed support for --error-recovery mode.
2014/02/18:
In the Coq backend, use ' instead of _ as separator in identifiers.
......
......@@ -55,14 +55,17 @@ first~\cite{aho-86,appel-tiger-98,hopcroft-motwani-ullman-00}. They are also
invited to have a look at the \distrib{demos} directory in \menhir's
distribution.
At this stage, potential users should be warned about two facts. First,
\menhir's feature set is not stable. There is a tension between preserving a
measure of compatibility with \ocamlyacc, on the one hand, and introducing new
ideas, on the other hand. Some aspects of the tool, such as the error handling
and recovery mechanism, are still potentially subject to incompatible
changes. Second, the present release is \emph{beta}-quality. There is much
room for improvement in the tool and in this reference manual. Bug reports and
suggestions are welcome!
Potential users of Menhir should be warned that \menhir's feature set is not
completely stable. There is a tension between preserving a measure of
compatibility with \ocamlyacc, on the one hand, and introducing new ideas, on
the other hand. Some aspects of the tool, such as the error handling
mechanism, are still potentially subject to incompatible changes: for
instance, in the future, the current error handling mechanism (which is based
on the \error token, see \sref{sec:errors}) could be removed and replaced with
an entirely different mechanism.
There is room for improvement in the tool and in this reference manual. Bug
reports and suggestions are welcome!
% ---------------------------------------------------------------------------------------------------------------------
......@@ -111,13 +114,6 @@ switch.
\docswitch{\odump} This switch causes a description of the automaton
to be written to the file \nt{basename}\automaton.
\docswitch{\oerrorrecovery} This switch causes error recovery code to be
generated. Error recovery, also known as re-synchronization, consists in
dropping tokens off the input stream, after an error has been detected,
until a token that can be shifted in the current state is found. This
behavior is made optional because it is seldom exploited and requires
extra code in the parser. See also \sref{sec:errors}.
\docswitch{\oexplain} This switch causes conflict explanations to be
written to the file \nt{basename}\conflicts. See also \sref{sec:conflicts}.
......@@ -1553,12 +1549,12 @@ dummy positions.
% ---------------------------------------------------------------------------------------------------------------------
\section{Error handling and recovery}
\section{Error handling}
\label{sec:errors}
\paragraph{Error handling}
\menhir's error handling and recovery is inspired by that of \yacc and
\menhir's error handling mechanism is inspired by that of \yacc and
\ocamlyacc, but is not identical. A special \error token is made available
for use within productions. The LR automaton is constructed exactly as if
\error was a regular terminal symbol. However, \error is never produced
......@@ -1577,9 +1573,7 @@ Since the lookahead token is still \error, the automaton remains in error
handling mode.
When a state that can shift on \error is found, the \error token is shifted.
At this point, the parser either enters \emph{error recovery} mode, if the
\oerrorrecovery switch was enabled at compile time, or returns to normal
mode.
At this point, the parser returns to normal mode.
When no state that can act on \error is found on the automaton's stack, the
parser stops and raises the exception \texttt{Error}. This exception carries
......@@ -1588,24 +1582,10 @@ lexical analyzer's environment record.
\paragraph{Error recovery}
Error recovery mode is entered immediately after an \error token was
successfully shifted, and only if \menhir's \oerrorrecovery switch was enabled
when the parser was produced. In error recovery mode, tokens are repeatedly
\ocamlyacc offers an error recovery mode, which is entered immediately after
an \error token was successfully shifted. In this mode, tokens are repeatedly
taken off the input stream and discarded until an acceptable token is found.
A token is acceptable if the current state has an action on that token. When
an acceptable token is found, the parser returns to normal mode and the action
takes place. Error recovery is also known as \emph{re-synchronization}.
Error recovery mode is peculiar, in that it can cause non-termination if the
token stream is infinite. In practice, token streams often \emph{are}
infinite, due to an \ocamllex peculiarity: every \ocamllex-generated analyzer
that maps the \kw{eof} pattern to an \basic{EOF} token will produce an
infinite stream of \basic{EOF} tokens, even if the underlying text that is
being scanned is finite. In order to address this issue, \menhir attributes
special meaning to the token named \basic{EOF}, if there is one in the grammar
specification, when \oerrorrecovery is enabled. It checks that every automaton
state that can be reached when in error recovery mode accepts this token, and
issues a warning otherwise. This ensures that the parser always terminates.
This feature is no longer offered by \menhir.
\paragraph{Error-related keywords}
......@@ -1829,10 +1809,10 @@ The list is roughly sorted by decreasing order of importance.
semantic actions are deprecated. The function \verb+parse_error+ is
deprecated. They are replaced with keywords (\sref{sec:errors}).
\item \menhir's error handling and error recovery mechanisms (\sref{sec:errors}) are inspired
\item \menhir's error handling mechanism (\sref{sec:errors}) is inspired
by \ocamlyacc's, but are not guaranteed to be fully
compatible. Error recovery, also known as re-synchronization, is now
optional.
compatible. Error recovery, also known as re-synchronization, is not
supported by \menhir.
\item The way in which severe conflicts (\sref{sec:conflicts}) are resolved
is not guaranteed to be fully compatible with \ocamlyacc.
......
......@@ -60,7 +60,7 @@ stage1:
# Stage 2.
# Build Menhir using Menhir (from stage 1).
FLAGS := -v -lg 1 -la 1 -lc 1 --comment --infer --error-recovery --stdlib . --strict --fixed-exception
FLAGS := -v -lg 1 -la 1 -lc 1 --comment --infer --stdlib . --strict --fixed-exception
stage2:
@$(OCAMLBUILD) -build-dir _stage2 -tag fancy_parser \
......
......@@ -115,7 +115,7 @@ open Interface
that branch with a simple [assert false]. TEMPORARY do it *)
(* ------------------------------------------------------------------------ *)
(* Here is a description of our error recovery mechanism.
(* Here is a description of our error handling mechanism.
With every state [s], we associate an [error] function.
......@@ -131,19 +131,9 @@ open Interface
cells do not physically hold a state, this description is somewhat
simpler than the truth, but that's the idea.)
When an error is detected in state [s], one of two things happens
(see [initiate]).
a. If [s] can do error recovery and if no token was successfully
shifted since the last [error] token was shifted, then the
current token is discarded and the current state remains
unchanged, that is, the [action] function associated with [s]
is re-entered.
b. Otherwise, the [error] function associated with [s] is
invoked.
In case (b), immediately before invoking the [error] function, the
When an error is detected in state [s], then (see [initiate]) the
[error] function associated with [s] is invoked. Immediately
before invoking the [error] function, the
counter [env.shifted] is reset to -1. By convention, this means
that the current token is discarded and replaced with an [error]
token. The [error] token transparently inherits the positions
......@@ -176,25 +166,7 @@ open Interface
reduction is unable to handle errors.
I note that a state that can handle [error] and has a default
reduction must in fact have a reduction action on [error].
A state that can perform error recovery (that is, a state whose
incoming symbol is [error]) never performs a default reduction. The
reason why this is so is given in [Invariant]. A consequence of
this decision is that reduction is not performed until error
recovery is successful. This behavior could be surprising if it
were the default behavior; however, recall that error recovery is
disabled unless [--error-recovery] was specified.
I note that error recovery, case (a) above, can cause the parser to
enter an infinite loop. Indeed, the token stream is in principle
infinite -- for instance, many lexers will return an EOF token
forever after some finite supply of tokens has been exhausted. If
we hit EOF while in error recovery mode, and if EOF is not accepted
at the current state, we will keep discarding EOF and asking for a
new token. The way out of this situation is to design the grammar
in such a way that it cannot happen. We provide a warning to help
with this task. *)
reduction must in fact have a reduction action on [error]. *)
(* The type of environments. *)
......@@ -909,39 +881,10 @@ let call_error_via_errorcase magic s = (* TEMPORARY document *)
let call_assertfalse =
EApp (EVar assertfalse, [ EVar "()" ])
(* ------------------------------------------------------------------------ *)
(* Emit a warning when a state can do error recovery but does not
accept EOF. This can lead to non-termination if the end of file
is reached while attempting to recover from an error. *)
let check_recoverer covered s =
match Terminal.eof with
| None ->
(* We do not know which token represents the end of file,
so we say nothing. *)
()
| Some eof ->
if not (TerminalSet.mem eof covered) then
(* This state has no (shift or reduce) action at EOF. *)
Error.warning []
(Printf.sprintf
"state %d can perform error recovery, but does not accept EOF.\n\
** Hitting the end of file during error recovery will cause non-termination."
(Lr1.number s))
(* ------------------------------------------------------------------------ *)
(* Code production for the automaton functions. *)
(* Count how many states actually perform error recovery. This figure
is, in general, inferior or equal to the number of states at which
[Invariant.recoverer] is true. Indeed, some of these states have a
default reduction, while some will accept every token; in either
case, error recovery is not performed. *)
let recoverers =
ref 0
(* Count how many states actually can peek at an error recovery. This
(* Count how many states actually can peek at an error token. This
figure is, in general, inferior or equal to the number of states at
which [Invariant.errorpeeker] is true, because some of these states
have a default reduction and will not consult the lookahead
......@@ -1146,15 +1089,9 @@ let errorbookkeeping e =
handle the error token, by a series of reductions followed by a
shift.
In the simplest case, the state [s] cannot do error recovery. In
that case, we initiate error handling, which is done by first
performing the standard bookkeeping described above, then
transferring control to the [error] function associated with [s].
If, on the other hand, [s] can do error recovery, then we check
whether any tokens at all were shifted since the last error
occurred. If none were, then we discard the current token and
transfer control back to the [action] function associated with [s].
We initiate error handling by first performing the standard
bookkeeping described above, then transferring control to the
[error] function associated with [s].
The token is discarded via a call to [discard], followed by
resetting [env.shifted] to zero, to counter-act the effect of
......@@ -1164,30 +1101,7 @@ let initiate covered s =
blet (
[ assertshifted ],
if Invariant.recoverer s then begin
incr recoverers;
check_recoverer covered s;
EIfThenElse (
EApp (EVar "Pervasives.(=)", [ ERecordAccess (EVar env, fshifted); EIntConst 0 ]),
blet (
trace "Discarding last token read (%s)"
[ EApp (EVar print_token, [ ERecordAccess (EVar env, ftoken) ]) ] @
[
PVar token, EApp (EVar discard, [ EVar env ]);
PUnit, ERecordWrite (EVar env, fshifted, EIntConst 0)
],
call_action s
),
errorbookkeeping (call_error_via_errorcase magic s)
)
end
else
errorbookkeeping (call_error_via_errorcase magic s)
errorbookkeeping (call_error_via_errorcase magic s)
)
(* This produces the definitions of the [run] and [action] functions
......@@ -1196,11 +1110,9 @@ let initiate covered s =
The [action] function implements the internal case analysis. It
receives the lookahead token as a parameter. It does not affect the
input stream. It does not set up exception handlers for dealing
with errors. The existence of this internal function is made
necessary by the error recovery mechanism (which discards tokens
when attempting to resynchronize after an error). In many states,
recovery can in fact not be performed, so no self-call to [action]
will be generated and [action] will be inlined into [run]. *)
with errors. *)
(* TEMPORARY I believe [action] could now be inlined into [run] *)
let rec runactiondef s : valdef list =
......@@ -1825,10 +1737,8 @@ let program = {
let () =
Error.logC 1 (fun f ->
Printf.fprintf f
"%d out of %d states can peek at an error.\n\
%d out of %d states can do error recovery.\n"
!errorpeekers Lr1.n
!recoverers Lr1.n)
"%d out of %d states can peek at an error.\n"
!errorpeekers Lr1.n)
let () =
if not !can_die then
......
......@@ -220,14 +220,7 @@ module Make (T : TABLE) = struct
and initiate env : void =
assert (env.shifted >= 0);
if T.recovery && env.shifted = 0 then begin
Log.discarding_last_token (T.token2terminal env.token);
discard env;
env.shifted <- 0;
action env
end
else
errorbookkeeping env
errorbookkeeping env
and errorbookkeeping env =
Log.initiating_error_handling();
......
......@@ -230,13 +230,6 @@ module type TABLE = sig
val semantic_action: production -> semantic_action
(* The LR engine can attempt error recovery. This consists in discarding
tokens, just after an error has been successfully handled, until a token
that can be successfully handled is found. This mechanism is optional.
The following flag enables it. *)
val recovery: bool
(* The LR engine requires a number of hooks, which are used for logging. *)
(* The comments below indicate the conventional messages that correspond
......@@ -276,10 +269,6 @@ module type TABLE = sig
val handling_error: state -> unit
(* Discarding last token read (<terminal>) *)
val discarding_last_token: terminal -> unit
end
end
......
......@@ -236,15 +236,7 @@ module Terminal = struct
Misc.mapi (n-1) f
(* If a token named [EOF] exists, then it is assumed to represent
ocamllex's [eof] pattern, which means that the lexer may
eventually produce an infinite stream of [EOF] tokens. This,
combined with our error recovery mechanism, may lead to
non-termination. We provide a warning against this somewhat
obscure situation.
Relying on the token's name is somewhat fragile, but this saves
introducing an extra keyword for declaring which token represents
[eof], and should not introduce much confusion. *)
ocamllex's [eof] pattern. *)
let eof =
try
......
......@@ -123,8 +123,8 @@ module Terminal : sig
(* This is the programmer-defined [EOF] token, if there is one. It
is recognized based solely on its name, which is fragile, but
this behavior is documented. This token is assumed to represent
[ocamllex]'s [eof] pattern. It is used only in emitting warnings
in [--error-recovery] mode. *)
[ocamllex]'s [eof] pattern. It is used only by the reference
interpreter, and in a rather non-essential way. *)
val eof: t option
......
......@@ -688,49 +688,6 @@ let universal symbol =
universal && (if represented s then SymbolMap.mem symbol (Lr1.transitions s) else true)
) true
(* ------------------------------------------------------------------------ *)
(* Discover which states potentially can do error recovery.
They are the states whose incoming symbol is [error]. At these
states, [env.shifted] is zero, that is, no tokens have been
successfully shifted since the last error token was shifted.
We do not include in this definition the states where [env.shifted]
*may be* zero. That would involve adding in all states reachable
from the above states via reductions. However, error recovery will
never be performed in these states. Indeed, imagine we shift an
error token and enter a state that can do error recovery, according
to the above definition. If, at this point, we consult the
lookahead token [tok] and perform a reduction, then the new state
that we reach is, by construction, able to act upon [tok], so no
error recovery will be performed at that state, even though
[env.shifted] is still zero. However, we must not perform default
reductions at states that can do error recovery, otherwise we break
this reasoning.
If the option [--error-recovery] was not provided on the command
line, then no states will perform error recovery. This makes things
simpler (and saves some code) in the common case where people are
not interested in error recovery. This also disables the warning
about states that can do error recovery but do not accept the EOF
token. *)
let recoverers =
if Settings.recovery then
Lr1.fold (fun recoverers node ->
match Lr1.incoming_symbol node with
| Some (Symbol.T tok)
when Terminal.equal tok Terminal.error ->
Lr1.NodeSet.add node recoverers
| _ ->
recoverers
) Lr1.NodeSet.empty
else
Lr1.NodeSet.empty
let recoverer node =
Lr1.NodeSet.mem node recoverers
(* ------------------------------------------------------------------------ *)
(* Discover which states can peek at an error. These are the states
where [env.shifted] may be -1, that is, where an error token may be
......@@ -782,15 +739,6 @@ let errorpeeker node =
the lookahead token. This saves code, but can alter the parser's
behavior in the presence of errors.
A state that can perform error recovery (that is, a state whose
incoming symbol is [error]) never performs a default
reduction. This is explained above. Actually, we allow one
exception: if the state has a single (reduction) action on "#", as
explained in the next paragraph, then we perform this default
reduction and do not allow error recovery to take place. Error
recovery would not make much sense, since we believe we are at the
end of file.
The check for default actions subsumes the check for the case where
[s] admits a reduce action with lookahead symbol "#". In that case,
it must be the only possible action -- see
......@@ -836,12 +784,9 @@ let (has_default_reduction : Lr1.node -> (Production.index * TerminalSet.t) opti
| Some (_, toks) as reduction
when SymbolMap.purelynonterminal (Lr1.transitions s) ->
if TerminalSet.mem Terminal.sharp toks then
if TerminalSet.mem Terminal.sharp toks then
(* Perform default reduction on "#". *)
reduction
else if recoverer s then
(* Do not perform default reduction. Allow error recovery. *)
None
else begin
(* Perform default reduction, unless [--canonical] has been specified. *)
match Settings.construction_mode with
......
......@@ -9,9 +9,8 @@
need to physically exist on the stack at runtime) and which symbols
need to keep track of (start or end) positions.
It also determines which automaton states could potentially perform
error recovery, and which states could have to deal with an [error]
token. *)
It also determines which automaton states could have to deal with an
[error] token. *)
open Grammar
......@@ -90,11 +89,6 @@ val endp: Symbol.t -> bool
(* ------------------------------------------------------------------------- *)
(* Information about error handling. *)
(* [recoverer s] tells whether state [s] can potentially do error
recovery. *)
val recoverer: Lr1.node -> bool
(* [errorpeeker s] tells whether state [s] can potentially peek at an
error. This is the case if, in state [s], [env.shifted] may be -1,
that is, if an error token may be on the stream. *)
......
......@@ -164,12 +164,6 @@ module T = struct
next = stack
}
(* The reference interpreter performs error recovery if and only if this
is requested via [--recovery]. *)
let recovery =
Settings.recovery
module Log = struct
open Printf
......@@ -227,11 +221,6 @@ module T = struct
fprintf stderr "Handling error in state %d" (Lr1.number s)
)
let discarding_last_token tok =
maybe (fun () ->
fprintf stderr "Discarding last token read (%s)" (Terminal.print tok)
)
end
end
......
......@@ -163,7 +163,7 @@ let options = Arg.align [
"--coq-no-actions", Arg.Set coq_no_actions, " (undocumented)";
"--depend", Arg.Unit (fun () -> depend := OMPostprocess), " Invoke ocamldep and display dependencies";
"--dump", Arg.Set dump, " Describe the automaton in <basename>.automaton";
"--error-recovery", Arg.Set recovery, " Attempt recovery by discarding tokens after errors";
"--error-recovery", Arg.Set recovery, " (no longer supported)";
"--explain", Arg.Set explain, " Explain conflicts in <basename>.conflicts";
"--external-tokens", Arg.String codeonly, "<module> Import token type definition from <module>";
"--fixed-exception", Arg.Set fixedexc, " Declares Error = Parsing.Parse_error";
......@@ -310,8 +310,11 @@ let graph =
let trace =
!trace
let recovery =
!recovery
let () =
if !recovery then begin
fprintf stderr "Error: --error-recovery mode is no longer supported.\n";
exit 1
end
let noprefix =
!noprefix
......
......@@ -44,12 +44,6 @@ val graph: bool
val trace: bool
(* Whether error recovery should be attempted. This consists
in discarding tokens, after the [error] token has been
shifted, until a token that can be accepted is found. *)
val recovery: bool
(* Whether one should stop and print the grammar after joining and
expanding the grammar. *)
......
......@@ -682,7 +682,6 @@ let application = {
lhs;
goto;
semantic_action;
define ("recovery", eboolconst Settings.recovery);
trace;
];
}
......
......@@ -106,10 +106,6 @@ module type TABLES = sig
exception Error
(* The parser indicates whether to perform error recovery. *)
val recovery: bool
(* The parser indicates whether to generate a trace. Generating a
trace requires two extra tables, which respectively map a
terminal symbol and a production to a string. *)
......
......@@ -101,9 +101,6 @@ module Make (T : TableFormat.TABLES)
let semantic_action prod =
T.semantic_action.(prod)
let recovery =
T.recovery
module Log = struct
open Printf
......@@ -160,13 +157,6 @@ module Make (T : TableFormat.TABLES)
| None ->
()
let discarding_last_token token =
match T.trace with
| Some (terminals, _) ->
fprintf stderr "Discarding last token read (%s)\n%!" terminals.(token)
| None ->
()
end
end)
/* This is the crude version of the parser. It is meant to be processed
by ocamlyacc. Its existence is necessary for bootstrapping. It is kept
in sync with [fancy-parser]. The two parsers accept the same language,
but [fancy-parser] performs more refined error recovery. */
but [fancy-parser] performs slightly more refined error handling. */
%{
......
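For readers unfamiliar with the behavior this commit removes, the following is a minimal, self-contained sketch of re-synchronization as the deleted documentation describes it: after an error, tokens are discarded until one is found on which the current state has an action. It is an illustration only, not Menhir's engine code. The names `token`, `resynchronize`, and `lexer_of_list` are hypothetical, and the "current state" is modeled by a plain predicate.

```ocaml
(* Hypothetical, simplified model of error recovery (re-synchronization).
   This is NOT Menhir's engine; all names here are made up for illustration. *)

type token = INT of int | PLUS | SEMI | EOF

(* [resynchronize acceptable next] draws tokens from [next] and discards
   them until one satisfies [acceptable]. It returns that token together
   with the number of tokens that were discarded. *)
let resynchronize (acceptable : token -> bool) (next : unit -> token)
  : token * int =
  let rec loop discarded =
    let tok = next () in
    if acceptable tok then (tok, discarded)
    else loop (discarded + 1)
  in
  loop 0

(* A lexer over a finite list that returns [EOF] forever once the list is
   exhausted. This mimics the ocamllex behavior mentioned in the removed
   text, where the [eof] pattern yields an endless stream of EOF tokens. *)
let lexer_of_list (toks : token list) : unit -> token =
  let remaining = ref toks in
  fun () ->
    match !remaining with
    | [] -> EOF
    | tok :: rest -> remaining := rest; tok

let () =
  let next = lexer_of_list [INT 1; PLUS; PLUS; SEMI; INT 2] in
  (* Assume the current state has actions on SEMI and EOF only. *)
  let acceptable = function SEMI | EOF -> true | _ -> false in
  let tok, discarded = resynchronize acceptable next in
  assert (tok = SEMI);
  assert (discarded = 3)
```

If `acceptable` rejected `EOF`, the loop above would never terminate once the input ran out. That is the non-termination hazard the removed `check_recoverer` warning was meant to flag.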