\chapter{Language Reference}
\label{chap:syntaxref}
%HEVEA\cutname{syntaxref.html}
This chapter gives the grammar and semantics for \why and \whyml input files.
\section{Lexical Conventions}
Lexical conventions are common to \why and \whyml.
% TODO: blanks
\subsection{Comments}
Comments are enclosed by \texttt{(*} and \texttt{*)} and can be nested.
\subsection{Strings}
Strings are enclosed in double quotes (\verb!"!). Double quotes can be
escaped in strings using the backslash character (\verb!\!).
The other special sequences are \verb!\n! for line feed and \verb!\t!
for horizontal tab.
In the following, strings are referred to with the non-terminal
\nonterm{string}{}.
\subsection{Identifiers}
Identifiers are non-empty sequences of characters among letters,
digits, the underscore character and the quote character. They cannot
start with a digit or a quote.
\begin{center}\framebox{\input{./qualid_bnf.tex}}\end{center}
The syntax distinguishes identifiers that start with a lowercase or an
uppercase letter (resp. \nonterm{lident}{} and \nonterm{uident}{}), and
similarly, lowercase and uppercase qualified identifiers.
The restricted classes of identifiers denoted by \nonterm{lident-nq}{}
and \nonterm{uident-nq}{} correspond to identifiers where the quote
character cannot be followed by a letter. Identifiers where a quote is
followed by a letter are reserved and cannot be used as identifier for
declarations introduced by the user (see Section~\ref{sec:rangetypes}).
\subsection{Constants}
The syntax for constants is given in Figure~\ref{fig:bnf:constant}.
Integer and real constants have arbitrary precision.
Integer constants may be given in base 16, 10, 8 or 2.
Real constants may be given in base 16 or 10.
\begin{figure}
\begin{center}\framebox{\input{./constant_bnf.tex}}\end{center}
\caption{Syntax for constants.}
\label{fig:bnf:constant}
\end{figure}
\subsection{Operators}
Prefix and infix operators are built from characters organized in four
categories (\textsl{op-char-1} to \textsl{op-char-4}).
\begin{center}\framebox{\input{./operator_bnf.tex}}\end{center}
Infix operators are classified into 4 categories, according to the
characters they are built from:
\begin{itemize}
\item level 4: operators containing only characters from
\textit{op-char-4};
\item level 3: those containing
characters from \textit{op-char-3} or \textit{op-char-4};
\item level 2: those containing
characters from \textit{op-char-2}, \textit{op-char-3} or
\textit{op-char-4};
\item level 1: all other operators (non-terminal \textit{infix-op-1}).
\end{itemize}
\subsection{Labels}
Identifiers, terms, formulas, program expressions
can all be labeled, either with a string, or with a location tag.
\begin{center}\framebox{\input{./label_bnf.tex}}\end{center}
A location tag consists of a file name, a line number, and starting
and ending characters.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{The \why Language}
\subsection{Terms}
The syntax for terms is given in Figure~\ref{fig:bnf:term}.
The various constructs have the following priorities and
associativities, from lowest to greatest priority:
\begin{center}
\begin{tabular}{|l|l|}
\hline
construct & associativity \\
\hline\hline
\texttt{if then else} / \texttt{let in} & -- \\
label & -- \\
cast & -- \\
infix-op level 1 & left \\
infix-op level 2 & left \\
infix-op level 3 & left \\
infix-op level 4 & left \\
prefix-op & -- \\
function application & left \\
brackets / ternary brackets & -- \\
bang-op & -- \\
\hline
\end{tabular}
\end{center}
Note the curryfied syntax for function application, though partial
application is not allowed (rejected at typing).
\begin{figure}
\begin{center}\framebox{\input{./term_bnf.tex}}\end{center}
\caption{Syntax for terms.}
\label{fig:bnf:term}
\end{figure}
\subsection{Type Expressions}
The syntax for type expressions is the following:
\begin{center}\framebox{\input{./type_bnf.tex}}\end{center}
Built-in types are \texttt{int}, \texttt{real}, and tuple types.
Note that the syntax for type
expressions notably differs from the usual ML syntax (\eg the
type of polymorphic lists is written \texttt{list 'a}, not \texttt{'a list}).
\subsection{Formulas}
The syntax for formulas is given Figure~\ref{fig:bnf:formula}.
The various constructs have the following priorities and
associativities, from lowest to greatest priority:
\begin{center}
\begin{tabular}{|l|l|}
\hline
construct & associativity \\
\hline\hline
\texttt{if then else} / \texttt{let in} & -- \\
label & -- \\
\texttt{->} / \texttt{<->} & right \\
\texttt{by} / \texttt{so} & right \\
\verb!\/! / \verb!||! & right \\
\verb|/\| / \verb!&&! & right \\
\texttt{not} & -- \\
infix level 1 & left \\
infix level 2 & left \\
infix level 3 & left \\
infix level 4 & left \\
prefix & -- \\
\hline
\end{tabular}
\end{center}
Note that infix symbols of level 1 include equality (\texttt{=}) and
disequality (\texttt{<>}).
\begin{figure}
\begin{center}\framebox{\input{./formula_bnf.tex}}\end{center}
\caption{Syntax for formulas.}
\label{fig:bnf:formula}
\end{figure}
Notice that there are two symbols for the conjunction: \verb|/\|
and \verb|&&|, and similarly for disjunction. They are logically
equivalent, but may be treated slightly differently by some
transformations. For instance, \texttt{split} transforms the goal
\verb|A /\ B| into subgoals \verb|A| and \verb|B|, whereas it transforms
\verb|A && B| into subgoals \verb|A| and \verb|A -> B|. Similarly, it
transforms \verb!not (A || B)! into subgoals \verb|not A| and
\verb|not ((not A) /\ B)|.
The \texttt{by}/\texttt{so} connectives are proof indications. They are
logically equivalent to their first argument, but may affect the result
of some transformations. For instance, the \texttt{split\_goal}
transformations interpret those connectives as introduction of logical cuts
(see~\ref{tech:trans:split} for details).
\subsection{Theories}
The syntax for theories is given on Figure~\ref{fig:bnf:theorya} and~\ref{fig:bnf:theoryb}.
\begin{figure}
\begin{center}\framebox{\input{./theory_bnf.tex}}\end{center}
\caption{Syntax for theories (part 1).}
\label{fig:bnf:theorya}
\end{figure}
\begin{figure}
\begin{center}\framebox{\input{./theory2_bnf.tex}}\end{center}
\caption{Syntax for theories (part 2).}
\label{fig:bnf:theoryb}
\end{figure}
\subsubsection{Algebraic types}
TO BE COMPLETED
\subsubsection{Record types}
TO BE COMPLETED
\subsubsection{Range types}
\label{sec:rangetypes}
A declaration of the form \texttt{type r = < range \textit{a b} >}
defines a type that projects into the integer range
\texttt{[\textit{a,b}]}. Note that in order to make such a declaration
the theory \texttt{int.Int} must be imported.
Why3 let you cast an integer literal in a range type
(e.g. \texttt{(42:r)}) and will check at typing that the literal is in
range. Defining such a range type $r$ automatically introduces the
following:
\begin{whycode}
function r'int r : int
constant r'maxInt : int
constant r'minInt : int
\end{whycode}
The function \texttt{r'int} projects a term of type \texttt{r} to its
integer value. The two constants represent the high bound and low
bound of the range respectively.
Unless specified otherwise with the meta \texttt{"keep:literal"} on
\texttt{r}, the transformation \emph{eliminate\_literal} introduces an
axiom
\begin{whycode}
axiom r'axiom : forall i:r. r'minInt <= r'int i <= r'maxInt
\end{whycode}
and replaces all casts of the form \texttt{(42:r)} with a constant and
an axiom as in:
\begin{whycode}
constant rliteral7 : r
axiom rliteral7_axiom : r'int rliteral7 = 42
\end{whycode}
This type is used in the standard library in the theories
\texttt{bv.BV8}, \texttt{bv.BV16}, \texttt{bv.BV32}, \texttt{bv.BV64}.
\subsubsection{Floating-point Types}
A declaration of the form \texttt{type f = < float \textit{eb sb} >}
defines a type of floating-point numbers as specified by the IEEE-754
standard~\cite{ieee754}. Here the literal \texttt{\textit{eb}}
represents the number of bits in the exponent and the literal
\texttt{\textit{sb}} the number of bits in the significand (including
the hidden bit). Note that in order to make such a declaration the
theory \texttt{real.Real} must be imported.
Why3 let you cast a real literal in a float type
(e.g. \texttt{(0.5:f)}) and will check at typing that the literal is
representable in the format. Note that Why3 do not implicitly round a
real literal when casting to a float type, it refuses the cast if the
literal is not representable.
Defining such a type \texttt{f} automatically introduces the following:
\begin{whycode}
predicate f'isFinite f
function f'real f : real
constant f'eb : int
constant f'sb : int
\end{whycode}
As specified by the IEEE standard, float formats includes infinite
values and also a special NaN value (Not-a-Number) to represent
results of undefined operations such as $0/0$. The predicate
\texttt{f'isFinite} indicates whether its argument is neither infinite
nor NaN. The function \texttt{f'real} projects a finite term of type
\texttt{f} to its real value, its result is not specified for non finite
terms.
Unless specified otherwise with the meta \texttt{"keep:literal"} on
\texttt{f}, the transformation \emph{eliminate\_literal} will
introduce an axiom
\begin{whycode}
axiom f'axiom :
forall x:f. f'isFinite x -> -. max_real <=. f'real x <=. max_real
\end{whycode}
where \texttt{max\_real} is the value of the biggest finite float in
the specified format. The transformation also replaces all casts of
the form \texttt{(0.5:f)} with a constant and an axiom as in:
\begin{whycode}
constant fliteral42 : f
axiom fliteral42_axiom : f'real fliteral42 = 0.5 /\ f'isFinite fliteral42
\end{whycode}
This type is used in the standard library in the theories
\texttt{ieee\_float.Float32} and \texttt{ieee\_float.Float64}.
\subsection{Files}
A \why input file is a (possibly empty) list of theories.
\begin{center}\framebox{\input{./why_file_bnf.tex}}\end{center}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\clearpage
\section{The \whyml Language}\label{sec:syntax:whyml}
\subsection{Specification}
The syntax for specification clauses in programs
is given in Figure~\ref{fig:bnf:spec}.
\begin{figure}
\begin{center}\framebox{\input{./spec_bnf.tex}}\end{center}
\caption{Specification clauses in programs.}
\label{fig:bnf:spec}
\end{figure}
Within specifications, terms are extended with new constructs
\verb|old| and \verb|at|:
\begin{center}\framebox{\input{./term_old_at_bnf.tex}}\end{center}
Within a postcondition, $\verb|old|~t$ refers to the value of term $t$
in the prestate. Within the scope of a code mark $L$,
the term $\verb|at|~t~\verb|'|L$ refers to the value of term $t$ at the program
point corresponding to $L$.
\subsection{Expressions}
The syntax for program expressions is given in
Figure~\ref{fig:bnf:expra} and~Figure~\ref{fig:bnf:exprb}.
\begin{figure}
\begin{center}\framebox{\input{./expr_bnf.tex}}\end{center}
\caption{Syntax for program expressions (part 1).}
\label{fig:bnf:expra}
\end{figure}
\begin{figure}
\begin{center}\framebox{\input{./expr2_bnf.tex}}\end{center}
\caption{Syntax for program expressions (part 2).}
\label{fig:bnf:exprb}
\end{figure}
In applications, arguments are evaluated from right to left.
This includes applications of infix operators, with the only exception of
lazy operators \verb|&&| and \verb+||+ that evaluate from left to
right, lazily.
% In the following we describe the informal semantics of each
% constructs, and provide the corresponding rule for computing the
% weakest precondition.
% \subsubsection{Constant Expressions, Unary and Binary Operators}
% Integer and real constants are as in the logic language, as weel as the unary and binary operators.
% \subsubsection{Array accesses and updates, fields access and updates}
% \todo{}
% \subsubsection{Let binding, sequences}
% \todo{}
% \subsubsection{Function definition}
% \todo{fun, let rec}
% \subsubsection{Function call}
% \todo{}
% \subsubsection{Exception throwing and catching}
% \todo{raise, try with end}
% \subsubsection{Conditional expression, pattern matching}
% \todo{if then else. Discuss standard WP versus fast WP}
% \subsubsection{Iteration Expressions}
% There are three kind of expressions for iterating: bounded, unbounded and infinite.
% \begin{itemize}
% \item A bounded iteration has the form
% \begin{flushleft}\ttfamily
% for $i$=$a$ to $b$ do invariant \{ $I$ \} $e$ done
% \end{flushleft}
% Expressions $a$ and $b$ are evaluated first and only once, then expression $e$ si evaluated successively for $i$ from $a$ to $b$ included. Nothing is executed if $a > b$. The invariant $I$ must hold at each iteration including before entering the loop and when leaving it. The rule for computing WP is as follows:
% \begin{eqnarray*}
% WP(\texttt{for} \ldots, Q) &=& I(a) \land \\
% && \forall \vec{w} (\forall i, a \leq i \leq b \land I(i) \rightarrow WP(e,I(i+1))) \land (I(b+1) \rightarrow Q)
% \end{eqnarray*}
% where $\vec{w}$ is the set of references modified by $e$.
% A downward bounded iteration is also available, under the form
% \begin{flushleft}\ttfamily
% for $i$=$a$ downto $b$ do invariant \{ $I$ \} $e$ done
% \end{flushleft}
% \item An unbounded iteration has the form
% \begin{flushleft}\ttfamily
% while $c$ do invariant \{ $I$ \} $e$ done
% \end{flushleft}
% it iterates the loop body $e$ until the condition $c$ becomes false.
% \begin{eqnarray*}
% WP(\texttt{while} \ldots, Q) &=& I \land \\
% && \forall \vec{w} (c \land I \rightarrow WP(e,I)) \land (\neg c \land I \rightarrow Q)
% \end{eqnarray*}
% where $\vec{w}$ is the set of references modified by $e$.
% Additionally, such a loop can be given a variant $v$, a quantity that must decreases ar each iteration, so as to prove its termination.
% \item An infinite iteration has the form
% \begin{flushleft}\ttfamily
% loop invariant \{ $I$ \} $e$ end
% \end{flushleft}
% it iterates the loop forever, hence the only way to exit such a loop is to raise an exception.
% \begin{eqnarray*}
% WP(\texttt{loop} \ldots, Q \mid Exc \Rightarrow R) &=& I \land \\
% && \forall \vec{w} (I \rightarrow WP(e,I)) \land (I \rightarrow WP(e,Exc \Rightarrow R))
% \end{eqnarray*}
% \end{itemize}
% \subsubsection{Assertions, Code Contracts}
% There are several form of expressions for inserting annotations in the code.
% The first form of those are the \emph{assertions} which have the form
% \begin{flushleft}\ttfamily
% \textsl{keyword} \{ \textsl{proposition} \}
% \end{flushleft}
% where \textsl{keyword} is either \texttt{assert}, \texttt{assume} or
% \texttt{check}. They all state that the given proposition holds at the given program point. The differences are:
% \begin{itemize}
% \item \texttt{assert} requires to prove that the proposition holds, and then make it available in the context of the remaining of the code
% \item \texttt{check} requires to prove that the proposition holds, but
% does not make it visible in the remaining
% \item \texttt{assume} assumes that the proposition holds and make it
% visible in the context of the remaining code, without requiring to
% prove it. It acts like an axiom, but within a program.
% \end{itemize}
% The corresponding rules for computing WP are as follows:
% \begin{eqnarray*}
% WP(\texttt{assert} \{ P \}, Q) &=& P \mathop{\&\&} Q = P \land (P \rightarrow Q)\\
% WP(\texttt{check} \{ P \}, Q) &=& P \land Q \\
% WP(\texttt{assume} \{ P \}, Q) &=& P \rightarrow Q
% \end{eqnarray*}
% The other forms of code contracts allow to abstract a piece of code by specifications.
% \begin{itemize}
% \item $\texttt{any}~\{ P \}~\tau~\epsilon~\{ Q \}$ is a
% non-deterministic expression that requires the precondition $P$ to
% hold, then makes some side effects $\epsilon$, and returns any value
% of type $\tau$ such that $Q$ holds. This construct acts as an axiom
% in the sense that it does not check whether there exists any program
% that can effectively establish the post-condition (similarly as the
% introduction of a \texttt{val} at the global level).
% \item $\texttt{abstract}~e~\{ Q \}$ makes sure that the evaluation of
% expression $e$ establishes the post-condition $Q$, and then abstract
% away the program state by the post-condition $Q$ (similarly to the
% \texttt{any} construct).
% \end{itemize}
% The corresponding rules for computing WP are as follows:
% \[
% \begin{array}{l}
% WP(\texttt{any}~\{ P \}~\tau~\epsilon~\{ Q \mid exn_i \Rightarrow R_i \} ,
% Q' exn_i \Rightarrow R'_i) = \\
% \qquad\qquad P \mathop{\&\&} \forall result, \epsilon.
% (Q \rightarrow Q') \land (R_i \rightarrow R'_i) \\
% WP(\texttt{abstract}~e~\{ Q \mid exn_i \Rightarrow R_i \} ,
% Q' \mid exn_i \Rightarrow R'_i) = \\
% \qquad\qquad WP(e,Q \mid exn_i \Rightarrow R_i) \land
% \forall result, \epsilon, Q \rightarrow Q' \land R_i \rightarrow R'_i
% \end{array}
% \]
% \subsubsection{Labels, Operators \texttt{old} and \texttt{at}}
% \todo{Labels, old, at}
\subsection{Modules}
The syntax for modules is given in Figure~\ref{fig:bnf:module}.
\begin{figure}
\begin{center}\framebox{\input{./module_bnf.tex}}\end{center}
\caption{Syntax for modules.}
\label{fig:bnf:module}
\end{figure}
Any declaration which is accepted in a theory is also accepted in a
module. Additionally, modules can introduce record types with mutable
fields and declarations which are specific to programs (global
variables, functions, exceptions).
\subsection{Files}
A \whyml input file is a (possibly empty) list of theories and modules.
\begin{center}\framebox{\input{./whyml_file_bnf.tex}}\end{center}
A theory defined in a \whyml file can only be used within that
file. If a theory is supposed to be reused from other files, be they
\why or \whyml files, it should be defined in a \why file.
\section{The \why Standard Library}
\label{sec:library}\index{standard library}\index{library}
The \why standard library provides general-purpose theories and
modules, to be used in logic and/or programs.
It can be browsed on-line at \url{http://why3.lri.fr/stdlib/}.
Each file contains one or several theories and/or modules.
To \texttt{use} or \texttt{clone} a theory/module \texttt{T} from file
\texttt{file}, use the syntax \texttt{file.T}, since \texttt{file} is
available in \why's default load path. For instance, the theory of
integers and the module of references are imported as follows:
\begin{whycode}
use import int.Int
use import ref.Ref
\end{whycode}
%%% Local Variables:
%%% mode: latex
%%% TeX-PDF-mode: t
%%% TeX-master: "manual"
%%% End: