\chapter{Language Reference} \label{chap:syntaxref} This chapter gives the grammar and semantics for \why and \whyml input files. \section{Lexical conventions} Lexical conventions are common to \why and \whyml. % TODO: blanks \paragraph{Comments.} Comments are enclosed by \texttt{(*} and \texttt{*)} and can be nested. \paragraph{Strings.} Strings are enclosed in double quotes (\verb!"!). Double quotes can be inserted in strings using the backslash character (\verb!\!). In the following, strings are referred to with the non-terminal \nonterm{string}{}. % TODO: escape sequences for strings \paragraph{Identifiers.} The syntax distinguishes lowercase and uppercase identifiers and, similarly, lowercase and uppercase qualified identifiers. \begin{center}\framebox{\input{./qualid_bnf.tex}}\end{center} \paragraph{Constants.} The syntax for constants is given in Figure~\ref{fig:bnf:constant}. Integer and real constants have arbitrary precision. Integer constants may be given in base 16, 10, 8 or 2. Real constants may be given in base 16 or 10. \begin{figure} \begin{center}\framebox{\input{./constant_bnf.tex}}\end{center} \caption{Syntax for constants.} \label{fig:bnf:constant} \end{figure} \paragraph{Operators.} Prefix and infix operators are built from characters organized in four categories (\textsl{op-char-1} to \textsl{op-char-4}). \begin{center}\framebox{\input{./operator_bnf.tex}}\end{center} Infix operators are classified into 4 categories, according to the characters they are built from: \begin{itemize} \item level 4: operators containing only characters from \textit{op-char-4}; \item level 3: those containing characters from \textit{op-char-3} or \textit{op-char-4}; \item level 2: those containing characters from \textit{op-char-2}, \textit{op-char-3} or \textit{op-char-4}; \item level 1: all other operators (non-terminal \textit{infix-op-1}). \end{itemize} \paragraph{Labels.} Identifiers, terms, formulas, program expressions can all be labeled, either with a string, or with a location tag. \begin{center}\framebox{\input{./label_bnf.tex}}\end{center} A location tag consists of a file name, a line number, and starting and ending characters. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Why3 Language} \paragraph{Terms.} The syntax for terms is given in Figure~\ref{fig:bnf:term}. The various constructs have the following priorities and associativities, from lowest to greatest priority: \begin{center} \begin{tabular}{|l|l|} \hline construct & associativity \\ \hline\hline \texttt{if then else} / \texttt{let in} & -- \\ label & -- \\ cast & -- \\ infix-op level 1 & left \\ infix-op level 2 & left \\ infix-op level 3 & left \\ infix-op level 4 & left \\ prefix-op & -- \\ function application & left \\ brackets / ternary brackets & -- \\ bang-op & -- \\ \hline \end{tabular} \end{center} Note the curryfied syntax for function application, though partial application is not allowed (rejected at typing). \begin{figure} \begin{center}\framebox{\input{./term_bnf.tex}}\end{center} \caption{Syntax for terms.} \label{fig:bnf:term} \end{figure} \paragraph{Type Expressions.} The syntax for type expressions is the following: \begin{center}\framebox{\input{./type_bnf.tex}}\end{center} Built-in types are \texttt{int}, \texttt{real}, and tuple types. Note that the syntax for type expressions notably differs from the usual ML syntax (\emph{e.g.} the type of polymorphic lists is written \texttt{list 'a}, not \texttt{'a list}). \paragraph{Formulas.} The syntax for formulas is given Figure~\ref{fig:bnf:formula}. The various constructs have the following priorities and associativities, from lowest to greatest priority: \begin{center} \begin{tabular}{|l|l|} \hline construct & associativity \\ \hline\hline \texttt{if then else} / \texttt{let in} & -- \\ label & -- \\ \texttt{->} / \texttt{<->} & right \\ \verb!\/! / \verb!||! & right \\ \verb|/\| / \verb!&&! & right \\ \texttt{not} & -- \\ infix level 1 & left \\ infix level 2 & left \\ infix level 3 & left \\ infix level 4 & left \\ prefix & -- \\ \hline \end{tabular} \end{center} Note that infix symbols of level 1 include equality (\texttt{=}) and disequality (\texttt{<>}). \begin{figure} \begin{center}\framebox{\input{./formula_bnf.tex}}\end{center} \caption{Syntax for formulas.} \label{fig:bnf:formula} \end{figure} Notice that there are two symbols for the conjunction: \texttt{and} and \verb|&&|, and similarly for disjunction. There are logically equivalent, but may be treated slightly differently by some transformation, \eg{} the \texttt{split} transformation transforms $A~\texttt{and}~B$ into subgoals $A$ and $B$, whereas it transforms $A~\verb|&&|~B$ into subgoals $A$ and $A\rightarrow B$. \paragraph{Theories.} The syntax for theories is given on Figure~\ref{fig:bnf:theorya} and~\ref{fig:bnf:theoryb}. \begin{figure} \begin{center}\framebox{\input{./theory_bnf.tex}}\end{center} \caption{Syntax for theories (part 1).} \label{fig:bnf:theorya} \end{figure} \begin{figure} \begin{center}\framebox{\input{./theory2_bnf.tex}}\end{center} \caption{Syntax for theories (part 2).} \label{fig:bnf:theoryb} \end{figure} \paragraph{Files.} A \why input file is a (possibly empty) list of theories. \begin{center}\framebox{\input{./why_file_bnf.tex}}\end{center} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \clearpage \section{Why3ML Language}\label{sec:syntax:whyml} \subsection{Types} The syntax for program types is given in Figure~\ref{fig:bnf:typev}. \begin{figure} \begin{center}\framebox{\input{./typev_bnf.tex}}\end{center} \caption{Syntax for program types.} \label{fig:bnf:typev} \end{figure} \subsection{Expressions} The syntax for program expressions is given in Figure~\ref{fig:bnf:expra} and~Figure~\ref{fig:bnf:exprb}. \begin{figure} \begin{center}\framebox{\input{./expr_bnf.tex}}\end{center} \caption{Syntax for program expressions (part1).} \label{fig:bnf:expra} \end{figure} \begin{figure} \begin{center}\framebox{\input{./expr2_bnf.tex}}\end{center} \caption{Syntax for program expressions (part2).} \label{fig:bnf:exprb} \end{figure} In the following we describe the informal semantics of each constructs, and provide the corresponding rule for computing the weakest precondition. \subsubsection{Constant expressions} Integer and real constants are as in the logic language TODO \subsubsection{Unary and Binary Operators} \subsubsection{Array accesses and updates, fields access and updates} TODO \subsubsection{Let binding, sequences} TODO \subsubsection{Function definition} TODO: fun, let rec \subsubsection{Function call} TODO \subsubsection{Exception throwing and catching} TODO: raise, try with end \subsubsection{Conditional expression, pattern matching} if then else. Discuss standar WP versus fast WP \subsubsection{Loops} while, loop, for \subsubsection{Assertions, Code Contracts} There are several form of expressions for inserting annotations in the code. The first form of those are the \emph{assertions} which have the form \begin{flushleft}\ttfamily \textsl{keyword} \{ \textsl{proposition} \} \end{flushleft} where \textsl{keyword} is either \texttt{assert}, \texttt{assume} or \texttt{check}. They all state that the given proposition holds at the given program point. The differences are: \begin{itemize} \item \texttt{assert} requires to prove that the proposition holds, and then make it available in the context of the remaining of the code \item \texttt{check} requires to prove that the proposition holds, but does not make it visible in the remaining \item \texttt{assume} assumes that the proposition holds and make it visible in the context of the remiaing code, without requiring to prove it. It acts like an axiom, but within a program. \end{itemize} The corresponding rules for computing WP are as follows: \begin{eqnarray*} WP(\texttt{assert} \{ P \}, Q) &=& P \mathop{\&\&} Q = P \land (P \rightarrow Q)\\ WP(\texttt{check} \{ P \}, Q) &=& P \land Q \\ WP(\texttt{assume} \{ P \}, Q) &=& P \rightarrow Q \end{eqnarray*} The other forms of code contracts allow to abstract a piece of code by specifications. \begin{itemize} \item $\texttt{any}~\{ P \}~\tau~\epsilon~\{ Q \}$ is a non-deterministic expression which requires the precondition $P$ to hold, then makes some side effects $\epsilon$ and returns any value of type $\tau$ such that $Q$ holds. This construct acts as an axiom in the sense that it is not check whether there exists any program that can effectively establish the post-condition (similarly as the introduction of a \texttt{val} at the global level). \item $\texttt{contract}~e~\{ Q \}$ evaluates expression $e$ and the ensures that the post-condition $Q$ holds afterwards. \item $\texttt{abstract}~e~\{ Q \}$ makes sure that the evaluation of expression $e$ establishes the post-condition $Q$, and then abstract away the program state by the post-condition $Q$ (similarly to the \texttt{any} construct). \end{itemize} The corresponding rules for computing WP are as follows: \[ \begin{array}{l} WP(\texttt{any}~\{ P \}~\tau~\epsilon~\{ Q \mid exn_i \Rightarrow R_i \} , Q' exn_i \Rightarrow R'_i) = \\ \qquad\qquad P \mathop{\&\&} \forall result, \epsilon. (Q \rightarrow Q') \land (R_i \rightarrow R'_i) \\ WP(\texttt{contract}~e~\{ Q \mid exn_i \Rightarrow R_i \} , Q' \mid exn_i \Rightarrow R'_i) = \\ \qquad\qquad WP(e,Q \land Q' \mid exn_i \Rightarrow R_i \land R'_i) \\ WP(\texttt{abstract}~e~\{ Q \mid exn_i \Rightarrow R_i \} , Q' \mid exn_i \Rightarrow R'_i) = \\ \qquad\qquad WP(e,Q \mid exn_i \Rightarrow R_i) \land \forall result, \epsilon, Q \rightarrow Q' \land R_i \rightarrow R'_i \end{array} \] TODO ? sandbox: \[ WP(\texttt{sandbox}~e, Q) = WP(e,true) \land Q \] (side-effects in $e$ are "forgotten"...) \subsubsection{Labels, old, at} TODO \subsection{Modules} The syntax for modules is given in Figure~\ref{fig:bnf:module}. \begin{figure} \begin{center}\framebox{\input{./module_bnf.tex}}\end{center} \caption{Syntax for modules.} \label{fig:bnf:module} \end{figure} Any declaration which is accepted in a theory is also accepted in a module. Additionally, modules can introduce record types with mutable fields and declarations which are specific to programs (global variables, functions, exceptions). \subsection{Files} A \whyml input file is a (possibly empty) list of theories and modules. \begin{center}\framebox{\input{./whyml_file_bnf.tex}}\end{center} A theory defined in a \whyml\ file can only be used within that file. If a theory is supposed to be reused from other files, be they \why\ or \whyml\ files, it should be defined in a \why\ file. %%% Local Variables: %%% mode: latex %%% TeX-PDF-mode: t %%% TeX-master: "manual" %%% End: