syntaxref.tex 34 KB
Newer Older
1
\chapter{Language Reference}
MARCHE Claude's avatar
MARCHE Claude committed
2
\label{chap:syntaxref}
3
%HEVEA\cutname{syntaxref.html}
MARCHE Claude's avatar
MARCHE Claude committed
4

5
In this chapter, we describe the syntax and semantics of \whyml.
6

7 8 9 10
%This chapter is not yet fully updated to the new syntax of
%\why 1.00, so it not distributed for the moment.
%
%\endinput
11

Guillaume Melquiond's avatar
Guillaume Melquiond committed
12
\section{Lexical Conventions}
13
\label{sec:lexer}
14

15 16 17 18 19
%Lexical conventions of \whyml are similar to those of OCaml.
%
Blank characters are space, horizontal tab, carriage return,
and line feed. Blanks separate lexemes but are otherwise ignored.
%
20
Comments are enclosed by \texttt{(*} and \texttt{*)} and can be nested.
21
Note that \texttt{(*)} does not start a comment.
Guillaume Melquiond's avatar
Guillaume Melquiond committed
22

23
Strings are enclosed in double quotes (\verb!"!). Double quotes can be
24
escaped inside strings using the backslash character (\verb!\!).
Guillaume Melquiond's avatar
Guillaume Melquiond committed
25 26
The other special sequences are \verb!\n! for line feed and \verb!\t!
for horizontal tab.
27
In the following, strings are referred to with the non-terminal
28 29 30 31 32
\nonterm{string}{}\spacefalse.

%\subsection{Constants}
The syntax for numerical constants is given by the following rules:
%\begin{figure}[ht]
33
\begin{center}\input{./generated/constant_bnf.tex}\end{center}
34 35 36 37
%\caption{Syntax for numerical constants.}
%\label{fig:bnf:constant}
%\end{figure}
%in Figure~\ref{fig:bnf:constant}.
38
Integer and real constants have arbitrary precision.
39 40 41 42 43
Integer constants can be given in base 10, 16, 8 or 2.
Real constants can be given in base 10 or 16.
Notice that the exponent in hexadecimal real constants is written in base 10.

%\subsection{Identifiers}
44 45
Identifiers are composed of letters, digits, underscores,
and primes. %, as shown in Figure~\ref{fig:bnf:ident}.
46
The syntax distinguishes identifiers that start with a lowercase letter
47 48
or an underscore (\nonterm{lident}{}\spacefalse), identifiers that
start with an uppercase letter (\nonterm{uident}{}\spacefalse),
49
and identifiers that start with a prime
50
(\nonterm{qident}{}\spacefalse, used exclusively for type variables):
51
%\begin{figure}[ht]
52
\begin{center}\input{./generated/ident_bnf.tex}\end{center}
53 54 55
%\caption{Syntax for identifiers.}
%\label{fig:bnf:ident}
%\end{figure}
56
Identifiers that contain a prime followed by a letter,
57
such as \texttt{int32'max}, are reserved for symbols
58 59 60 61 62 63 64
introduced by \why and cannot be used for user-defined symbols.

In order to refer to symbols introduced in different namespaces
(\textit{scopes}), we can put a dot-separated ``qualifier prefix''
in front of an identifier (e.g.~\texttt{Map.S.get}).
This allows us to use the symbol \texttt{get}
from the scope \texttt{Map.S} without importing
65
it in the current namespace:
66 67 68 69 70
%\begin{figure}[ht]
\begin{center}\input{./generated/qualid_bnf.tex}\end{center}
%\caption{Syntax for qualified identifiers.}
%\label{fig:bnf:qualid}
%\end{figure}
71 72 73 74 75 76 77 78 79
All parenthesised expressions in \whyml (types,
patterns, logical terms, program expressions)
admit a qualifier before the opening parenthesis,
e.g.~\texttt{Map.S.(get m i)}. This imports
the indicated scope into the current namespace during
the parsing of the expression under the qualifier.
For the sake of convenience, the parentheses can be omitted
when the expression itself is enclosed in parentheses,
square brackets or curly braces.
80 81

%\subsection{Operators}
82
Prefix and infix operators are built from characters organized in four
83
precedence groups (\textsl{op-char-1} to \textsl{op-char-4}), with
84
optional primes at the end:
85 86
%as shown in Figure~\ref{fig:bnf:operator}.
%\begin{figure}[ht]
87
\begin{center}\input{./generated/operator_bnf.tex}\end{center}
88 89 90
%\caption{Syntax for operators.}
%\label{fig:bnf:operator}
%\end{figure}
91 92 93 94 95 96 97
Infix operators from a high-numbered group bind stronger
than the infix operators from a low-numbered group.
For example, infix operator \texttt{.*.} from group 3
would have a higher precedence than infix operator
\texttt{->-} from group 1.
Prefix operators always bind stronger than infix operators.
The so-called ``tight operators'' are prefix operators that have even
98
higher precedence than the juxtaposition (application) operator,
99 100 101 102 103
allowing us to write expressions like \texttt{inv !x}
without parentheses.
%An operator from \textsl{infix-op-4} or \textsl{prefix-op}
%cannot start with \texttt{!} or \texttt{?}: such operators
%are always recognized as tight operators.
104 105 106
%Infix operators from groups 2-4 are left-associative.
%Infix operators from group 1 are non-associative and
%may instead be chained, as explained in Section~\ref{sec:terms}.
107

108
%An operator inside parenthesis can act as a lowercase identifier.
109
%\begin{figure}[ht]
110
%\begin{center}\input{./generated/ident_op_bnf.tex}\end{center}
111 112 113 114 115
%\caption{Syntax for qualified identifiers.}
%\label{fig:bnf:qualid}
%\end{figure}

Finally, any identifier, term, formula, or expression
116 117
in a \whyml source can be tagged either with a string
\textit{attribute} or a location:
118
\begin{center}\input{./generated/attribute_bnf.tex}\end{center}
119
An attribute cannot contain newlines or closing square brackets;
120
leading and trailing spaces are ignored.
121
A location consists of a file name in double quotes,
122
a line number, and starting and ending character positions.
123

124 125 126 127 128 129
\section{Type expressions}
\label{sec:types}

\whyml features an ML-style type system with polymorphic types,
variants (sum types), and records that can have mutable fields.
The syntax for type expressions is the following:
130
\begin{center}\input{./generated/type_bnf.tex}\end{center}
131 132
Built-in types are \texttt{int} (arbitrary precision integers),
\texttt{real} (real numbers), \texttt{bool}, the arrow type
133
(also called the \textit{mapping type}),
134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156
and the tuple types.
The empty tuple type is also called the \textit{unit type}
and can be written as \texttt{unit}.

Note that the syntax for type expressions notably differs from
the usual ML syntax. In particular, the type of polymorphic lists
is written \texttt{list 'a}, and not \texttt{'a list}.

\textit{Snapshot types} are specific to \whyml, they denote
the types of ghost values produced by pure logical functions in
\whyml programs. A snapshot of an immutable type is the type
itself: thus, \texttt{\{int\}} is the same as \texttt{int} and
\texttt{\{list 'a\}} is the same as \texttt{list 'a}.
A snapshot of a mutable type, however, represents a snapshot
value which cannot be modified anymore. Thus, a snapshot array
\texttt{a} of type \texttt{\{array int\}} can be read from
(\texttt{a[42]} is accepted) but not written into
(\texttt{a[42] <- 0} is rejected). Generally speaking,
a program function that expects an argument of a mutable type
will accept an argument of the corresponding snapshot type
as long as it is not modified by the function.

\section{Logical expressions: terms and formulas}
157
\label{sec:terms}
158

159 160 161 162 163 164 165 166 167 168 169 170
\begin{figure}[p!]
\begin{center}\input{./generated/term1_bnf.tex}\end{center}
\caption{\whyml terms (part I).}
\label{fig:bnf:term1}
\end{figure}

\begin{figure}[ht]
\begin{center}\input{./generated/term2_bnf.tex}\end{center}
\caption{\whyml terms (part II).}
\label{fig:bnf:term2}
\end{figure}

171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188
A significant part of a typical \whyml source file is occupied
by non-executable logical content intended for specification
and proof: function contracts, assertions, definitions of
logical functions and predicates, axioms, lemmas, etc.

Logical expressions are called \textit{terms}. Boolean
terms are called \textit{formulas}. Internally, \why distinguishes
the proper formulas (produced by predicate symbols, propositional
connectives and quantifiers) and the terms of type \texttt{bool}
(produced by Boolean variables and logical functions that return
\texttt{bool}). However, this distinction is not enforced on the
syntactical level, and \why will perform the necessary conversions
behind the scenes.

The syntax of \whyml terms is given in
Figures~\ref{fig:bnf:term1}-\ref{fig:bnf:term3}.
The constructions are listed in the order of
decreasing precedence.
189
For example, as was mentioned above,
190
tight operators have the highest precedence of all operators,
191 192 193
so that \texttt{-p.x} denotes the negation of the
record field \texttt{p.x}, whereas \texttt{!p.x}
denotes the field \texttt{x} of a record stored
194 195
in the reference \texttt{p}.

196
An operator in parentheses acts as an identifier
197 198 199 200 201
referring to that operator, for example, in a definition.
To distinguish between prefix and infix operators, an
underscore symbol is appended at the end: for example,
\texttt{(-)} refers to the binary subtraction and
\texttt{(-\_)} to the unary negation.
202
Tight operators cannot be used as infix operators,
203 204 205 206 207 208 209 210
and thus do not require disambiguation.

In addition to prefix and infix operators, \whyml
supports several mixfix bracket operators to
manipulate various collection types: dictionaries,
arrays, sequences, etc. Bracket operators do not have
any predefined meaning and may be used to denote access
and update operations for various user-defined collection types.
211
We can introduce multiple bracket operations in the same scope
212
by disambiguating them with primes after the closing
213 214 215
bracket: for example, \texttt{a[i]} may denote array access
and \texttt{s[i]'} sequence access.
Notice that the in-place update operator \texttt{a[i] <- v}
216 217 218
cannot be used inside logical terms: all effectful operations
are restricted to program expressions. To represent the result
of a collection update, we should use a pure logical update
219 220 221 222 223 224 225
operator \texttt{a[i <- v]} instead.
%Overloading of bracket operations, which would allow to use
%the same bracket operator for different collections,
%is currently not supported in \whyml.

\whyml supports ``associated'' names for operators, obtained
by adding a suffix after the parenthesised operator name.
226 227
For example, an axiom that represents the specification of the
infix operator \texttt{(+)} may be called \texttt{(+)'spec}
228
or \texttt{(+)\_spec}. As with normal identifiers, names
229
with a letter after a prime, such as \texttt{(+)'spec},
230 231
can only be introduced by \why, and not by the user in a \whyml
source.
232 233

The \texttt{at} and \texttt{old} operators are used inside
234
postconditions and assertions to refer to the value of
235 236 237 238 239 240 241 242 243 244 245 246
a mutable program variable at some past moment of execution
(see the next section for details).
These operators have higher precedence than the infix
operators from group 1 (\textsl{infix-op-1}): \texttt{old i > j}
is parsed as \texttt{(old i) > j} and not as \texttt{old (i > j)}.

Infix operators from groups 2-4 are left-associative.
Infix operators from group 1 are non-associative and
can be chained. For example, the term \texttt{0 <= i < j < length a}
is parsed as the conjunction of three inequalities \texttt{0 <= i},
\texttt{i < j}, and \texttt{j < length a}.

247 248 249 250 251 252 253
As with normal identifiers,
we can put a qualifier over a parenthesised operator,
e.g.~\texttt{Map.S.([]) m i}. Also, as noted above,
a qualifier can be put over a parenthesised term,
%e.g.~\texttt{Map.S.(m[i])},
and the parentheses
can be omitted if the term is a record or a record update.
254

255 256 257 258 259 260 261 262 263
The propositional connectives in \whyml formulas are listed in
Figure~\ref{fig:bnf:term2}. The non-standard connectives ---
asymmetric conjunction (\texttt{\&\&}), asymmetric disjunction
(\texttt{||}), proof indication (\texttt{by}), and consequence
indication (\texttt{so}) --- are used to control the goal-splitting
transformations of \why and provide integrated proofs for
\whyml assertions, postconditions, lemmas, etc.
The semantics of these connectives
follows the rules below:
264
\begin{itemize}\setlength{\itemsep}{0ex}
265 266
\item A proof task for \texttt{A \&\& B} is split into
separate tasks for \texttt{A} and \texttt{A -> B}.
267 268 269 270 271 272 273
If \texttt{A \&\& B} occurs as a premise, it behaves
as a normal conjunction.
\item A case analysis over \texttt{A || B} is split into
disjoint cases \texttt{A} and \texttt{not A {/\char92} B}.
If \texttt{A || B} occurs as a goal, it behaves
as a normal disjunction.
\item An occurrence of \texttt{A by B} generates a side condition
274
\texttt{B -> A} (the proof justifies the affirmation).
275 276 277 278 279
When \texttt{A by B} occurs as a premise,
it is reduced to \texttt{A} (the proof is discarded).
When \texttt{A by B} occurs as a goal,
it is reduced to \texttt{B} (the proof is verified).
\item An occurrence of \texttt{A so B} generates a side condition
280
\texttt{A -> B} (the premise justifies the conclusion).
281
When \texttt{A so B} occurs as a premise,
282 283
it is reduced to the conjunction \mbox{\texttt{A {/\char92} B}}
(we use both the premise and the conclusion).
284 285 286 287 288 289 290
When \texttt{A so B} occurs as a goal,
it is reduced to \texttt{A} (the premise is verified).
\end{itemize}
For example, full splitting of the goal
\texttt{(A by (exists x. B so C)) \&\& D}
produces four subgoals:
\texttt{exists x. B} (the premise is verified),
291 292
\texttt{forall x. B -> C} (the premise justifies the conclusion),
\texttt{(exists x. B {/\char92} C) -> A} (the proof justifies the affirmation),
293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339
and finally, \texttt{A -> D} (the proof of \texttt{A} is discarded
and \texttt{A} is used to prove \texttt{D}).

%Figure~\ref{fig:byso} contains more examples of usage of
%\texttt{\&\&}, \texttt{||}, \texttt{by}, and \texttt{so}.
%\begin{figure}[ht]
%\begin{center}
%\begin{tabular}{c|c}
%\multicolumn{1}{c|}{Initial goal} &
%\multicolumn{1}{c}{Goals after full splitting} \\
%\hline
%\texttt{A -> (B {/\char92} C)} & \texttt{A -> B}, \:\: \texttt{A -> C} \\
%\texttt{(A {\char92/} B) -> C} & \texttt{A -> C}, \:\: \texttt{B -> C} \\[1ex]
%\texttt{A -> (B {\&\&} C)} & \texttt{A -> B}, \:\: \texttt{A -> (B -> C)} \\
%\texttt{(A || B) -> C} & \texttt{A -> C}, \:\: \texttt{(not A {/\char92} B) -> C} \\[1ex]
%\texttt{A -> (B by C)} & \texttt{A -> C}, \:\: \texttt{A -> (C -> B)} \\
%\texttt{(A so B) -> C} & \texttt{A -> B}, \:\: \texttt{(A {/\char92} B) -> C} \\[1ex]
%\texttt{A by (B by C)} & \texttt{C}, \:\:
%  \texttt{C -> B}, \:\: \texttt{B -> A} \\
%\texttt{A by (B so C)} & \texttt{B}, \:\:
%  \texttt{B -> C}, \:\: \texttt{(B {/\char92} C) -> A} \\
%\end{tabular}
%\end{center}
%\caption{Non-standard propositional connectives.}
%\label{fig:byso}
%\end{figure}

The behaviour of the splitting transformations is further
controlled by attributes \texttt{[@stop\_split]} and
\texttt{[@case\_split]}. Consult Section~\ref{tech:trans:split}
for details.

Among the propositional connectives,
\texttt{not} has the highest precedence,
\texttt{\&\&} has the same precedence as \texttt{/\char92}
(weaker than negation),
\texttt{||} has the same precedence as \texttt{\char92/}
(weaker than conjunction),
\texttt{by}, \texttt{so}, \texttt{->}, and \texttt{<->}
all have the same precedence (weaker than disjunction).
All binary connectives except equivalence are right-associative.
Equivalence is non-associative and is chained instead:
\texttt{A <-> B <-> C} is transformed into a conjunction
of \texttt{A <-> B} and \texttt{B <-> C}.
To reduce ambiguity, \whyml forbids to place
a non-parenthesised implication at the right-hand side
of an equivalence: \texttt{A <-> B -> C} is rejected.
340

341 342 343 344 345 346
\begin{figure}[ht]
\begin{center}\input{./generated/term3_bnf.tex}\end{center}
\caption{\whyml terms (part III).}
\label{fig:bnf:term3}
\end{figure}

347 348 349 350 351 352 353 354 355 356
In Figure~\ref{fig:bnf:term3}, we find the more advanced
term constructions: conditionals, let-bindings, pattern
matching, and local function definitions,
either via the \texttt{let-in} construction or the
\texttt{fun} keyword. The pure logical functions
defined in this way are called \emph{mappings};
they are first-class values of ``arrow'' type
\texttt{$\tau_1$ -> $\tau_2$}.

The patterns are similar to those of OCaml, though the \texttt{when}
357 358
clauses and numerical constants are not supported. Unlike in OCaml,
\texttt{as} binds stronger than the comma: in the pattern
359 360 361 362 363 364 365 366 367 368 369 370 371
\texttt{($p_1$,$p_2$ as x)}, variable \texttt{x} is bound to
the value matched by pattern $p_2$. Also notice the closing
\texttt{end} after the \texttt{match-with} term.
A \texttt{let-in} construction with a non-trivial pattern is
translated as a \texttt{match-with} term with a single branch.

Inside logical terms, pattern matching must be exhaustive:
\whyml rejects a term like \texttt{let Some x = o in $\dots$},
where \texttt{o} is a variable of an option type.
In program expressions, non-exhaustive pattern matching
is accepted and a proof obligation is generated to show
that the values not covered cannot occur in execution.

372 373 374
The syntax of parameters %(non-terminal
%\nonterm{param}{} in Figure~\ref{fig:bnf:term3})
in user-defined operations---%
375 376
first-class mappings,
top-level logical functions and predicates,
377
and program functions---%
378 379 380 381 382
is rather flexible in \whyml.
Like in OCaml, the user can specify the name of a parameter
without its type and let the type be inferred from the
definition. Unlike in OCaml, the user can also specify
the type of the parameter without giving its name.
383 384
This is convenient when the symbol declaration does not
provide the actual definition or specification of
385 386 387
the symbol, and thus only the type signature is of
relevance.
For example, one can declare an abstract binary function
388 389 390 391 392
that adds an element to a set simply by writing
\texttt{function add 'a (set 'a) : set 'a}.
A standalone non-qualified lowercase identifier without
attributes is treated as a type name when the definition
is not provided, and as a parameter name otherwise.
393 394

Ghost patterns, ghost variables after \texttt{as},
395
and ghost parameters in function definitions are only used
396 397 398 399 400 401
in program code, and not allowed in logical terms.

\section{Program expressions}
\label{sec:exprs}

The syntax of program expressions is given in
402
Figures~\ref{fig:bnf:expr1}-\ref{fig:bnf:expr2}.
403 404 405 406 407 408 409
As before, the constructions are listed in the order of decreasing
precedence. The rules for tight, prefix, infix, and bracket operators
are the same as for logical terms. In particular, the infix operators
from group~1 can be chained. Notice that binary operators \texttt{\&\&}
and \texttt{||} denote here the usual lazy conjunction and disjunction,
respectively.

410 411 412 413 414 415 416 417 418 419 420
\begin{figure}[ht]
\begin{center}\input{./generated/expr1_bnf.tex}\end{center}
\caption{\whyml program expressions (part I).}
\label{fig:bnf:expr1}
\end{figure}

%Two new operators make appearance in Figure~\ref{fig:bnf:expr1}.
Keyword \texttt{ghost} marks the expression as
ghost code added for verification purposes. Ghost code is
removed from the final code intended for execution, and thus
cannot affect the computation of the program results nor the
421 422
content of the observable memory.

423 424 425 426
Assignment updates in place
a mutable record field or an element of a collection.
The former can be done simultaneously
on a tuple of values: \texttt{x.f, y.g <- a, b}. The latter
427 428 429 430
form, \texttt{a[i] <- v}, amounts to a call of the ternary
bracket operator \texttt{([]<-)} and cannot be used in a
multiple assignment.

431
\newpage
432 433 434 435 436 437
\begin{figure}[ht]
\begin{center}\input{./generated/expr2_bnf.tex}\end{center}
\caption{\whyml program expressions (part II).}
\label{fig:bnf:expr2}
\end{figure}

438

439
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
MARCHE Claude's avatar
MARCHE Claude committed
440

441
\section{The \why Language}
Guillaume Melquiond's avatar
Guillaume Melquiond committed
442 443

\subsection{Terms}
444

445
The syntax for terms is given in Figure~\ref{fig:bnf:term1}.
446 447 448 449 450 451 452 453 454 455
The various constructs have the following priorities and
associativities, from lowest to greatest priority:
\begin{center}
  \begin{tabular}{|l|l|}
    \hline
    construct & associativity \\
    \hline\hline
    \texttt{if then else} / \texttt{let in} & -- \\
    label & -- \\
    cast  & -- \\
456 457 458 459 460 461 462 463
    infix-op level 1 & left \\
    infix-op level 2 & left \\
    infix-op level 3 & left \\
    infix-op level 4 & left \\
    prefix-op     & --   \\
    function application & left \\
    brackets / ternary brackets & -- \\
    bang-op       & --   \\
464 465 466 467
    \hline
  \end{tabular}
\end{center}

468 469
Note the curryfied syntax for function application, though partial
application is not allowed (rejected at typing).
470

471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499
\subsection{Formulas}

The syntax for formulas is given Figure~\ref{fig:bnf:formula}.
The various constructs have the following priorities and
associativities, from lowest to greatest priority:
\begin{center}
  \begin{tabular}{|l|l|}
    \hline
    construct & associativity \\
    \hline\hline
    \texttt{if then else} / \texttt{let in} & -- \\
    label & -- \\
    \texttt{->} / \texttt{<->} & right \\
    \texttt{by} / \texttt{so} & right \\
    \verb!\/! / \verb!||! & right \\
    \verb|/\| / \verb!&&! & right \\
    \texttt{not}  & -- \\
    infix level 1 & left \\
    infix level 2 & left \\
    infix level 3 & left \\
    infix level 4 & left \\
    prefix        & --   \\
    \hline
  \end{tabular}
\end{center}
Note that infix symbols of level 1 include equality (\texttt{=}) and
disequality (\texttt{<>}).

\begin{figure}
500
  \begin{center}\framebox{\input{./generated/formula_bnf.tex}}\end{center}
501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523
  \caption{Syntax for formulas.}
\label{fig:bnf:formula}
\end{figure}

Notice that there are two symbols for the conjunction: \verb|/\|
and \verb|&&|, and similarly for disjunction. They are logically
equivalent, but may be treated slightly differently by some
transformations. For instance, \texttt{split} transforms the goal
\verb|A /\ B| into subgoals \verb|A| and \verb|B|, whereas it transforms
\verb|A && B| into subgoals \verb|A| and \verb|A -> B|. Similarly, it
transforms \verb!not (A || B)! into subgoals \verb|not A| and
\verb|not ((not A) /\ B)|.
The \texttt{by}/\texttt{so} connectives are proof indications. They are
logically equivalent to their first argument, but may affect the result
of some transformations. For instance, the \texttt{split\_goal}
transformations interpret those connectives as introduction of logical cuts
 (see~\ref{tech:trans:split} for details).

\subsection{Theories}

The syntax for theories is given on Figure~\ref{fig:bnf:theorya} and~\ref{fig:bnf:theoryb}.

\begin{figure}
524
  \begin{center}\framebox{\input{./generated/theory_bnf.tex}}\end{center}
525 526 527 528 529
  \caption{Syntax for theories (part 1).}
\label{fig:bnf:theorya}
\end{figure}

\begin{figure}
530
  \begin{center}\framebox{\input{./generated/theory2_bnf.tex}}\end{center}
531 532 533 534
  \caption{Syntax for theories (part 2).}
\label{fig:bnf:theoryb}
\end{figure}

535 536 537 538 539 540 541 542 543
\subsubsection{Algebraic types}

TO BE COMPLETED

\subsubsection{Record types}

TO BE COMPLETED

\subsubsection{Range types}
MARCHE Claude's avatar
MARCHE Claude committed
544
\label{sec:rangetypes}
545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580

A declaration of the form \texttt{type r = < range \textit{a b} >}
defines a type that projects into the integer range
\texttt{[\textit{a,b}]}. Note that in order to make such a declaration
the theory \texttt{int.Int} must be imported.

Why3 let you cast an integer literal in a range type
(e.g. \texttt{(42:r)}) and will check at typing that the literal is in
range. Defining such a range type $r$ automatically introduces the
following:
\begin{whycode}
  function  r'int r : int
  constant  r'maxInt : int
  constant  r'minInt : int
\end{whycode}
The function \texttt{r'int} projects a term of type \texttt{r} to its
integer value. The two constants represent the high bound and low
bound of the range respectively.

Unless specified otherwise with the meta \texttt{"keep:literal"} on
\texttt{r}, the transformation \emph{eliminate\_literal} introduces an
axiom
\begin{whycode}
axiom r'axiom : forall i:r. r'minInt <= r'int i <= r'maxInt
\end{whycode}
and replaces all casts of the form \texttt{(42:r)} with a constant and
an axiom as in:
\begin{whycode}
constant rliteral7 : r
axiom rliteral7_axiom : r'int rliteral7 = 42
\end{whycode}

This type is used in the standard library in the theories
\texttt{bv.BV8}, \texttt{bv.BV16}, \texttt{bv.BV32}, \texttt{bv.BV64}.


MARCHE Claude's avatar
MARCHE Claude committed
581
\subsubsection{Floating-point Types}
582 583

A declaration of the form \texttt{type f = < float \textit{eb sb} >}
MARCHE Claude's avatar
MARCHE Claude committed
584
defines a type of floating-point numbers as specified by the IEEE-754
585
standard~\cite{ieee754-2008}. Here the literal \texttt{\textit{eb}}
MARCHE Claude's avatar
MARCHE Claude committed
586
represents the number of bits in the exponent and the literal
587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603
\texttt{\textit{sb}} the number of bits in the significand (including
the hidden bit). Note that in order to make such a declaration the
theory \texttt{real.Real} must be imported.

Why3 let you cast a real literal in a float type
(e.g. \texttt{(0.5:f)}) and will check at typing that the literal is
representable in the format. Note that Why3 do not implicitly round a
real literal when casting to a float type, it refuses the cast if the
literal is not representable.

Defining such a type \texttt{f} automatically introduces the following:
\begin{whycode}
  predicate f'isFinite f
  function  f'real f : real
  constant  f'eb : int
  constant  f'sb : int
\end{whycode}
MARCHE Claude's avatar
MARCHE Claude committed
604 605 606 607 608 609 610
As specified by the IEEE standard, float formats includes infinite
values and also a special NaN value (Not-a-Number) to represent
results of undefined operations such as $0/0$.  The predicate
\texttt{f'isFinite} indicates whether its argument is neither infinite
nor NaN. The function \texttt{f'real} projects a finite term of type
\texttt{f} to its real value, its result is not specified for non finite
terms.
611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629

Unless specified otherwise with the meta \texttt{"keep:literal"} on
\texttt{f}, the transformation \emph{eliminate\_literal} will
introduce an axiom
\begin{whycode}
axiom f'axiom :
  forall x:f. f'isFinite x -> -. max_real <=. f'real x <=. max_real
\end{whycode}
where \texttt{max\_real} is the value of the biggest finite float in
the specified format. The transformation also replaces all casts of
the form \texttt{(0.5:f)} with a constant and an axiom as in:
\begin{whycode}
constant fliteral42 : f
axiom fliteral42_axiom : f'real fliteral42 = 0.5 /\ f'isFinite fliteral42
\end{whycode}

This type is used in the standard library in the theories
\texttt{ieee\_float.Float32} and \texttt{ieee\_float.Float64}.

Guillaume Melquiond's avatar
Guillaume Melquiond committed
630 631
\subsection{Files}

MARCHE Claude's avatar
MARCHE Claude committed
632
A \why input file is a (possibly empty) list of theories.
633
\begin{center}\framebox{\input{./generated/why_file_bnf.tex}}\end{center}
634 635 636


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
MARCHE Claude's avatar
MARCHE Claude committed
637
\clearpage
Guillaume Melquiond's avatar
Guillaume Melquiond committed
638
\section{The \whyml Language}\label{sec:syntax:whyml}
639

640
\subsection{Specification}
641

Guillaume Melquiond's avatar
Guillaume Melquiond committed
642
The syntax for specification clauses in programs
643
is given in Figure~\ref{fig:bnf:spec}.
MARCHE Claude's avatar
MARCHE Claude committed
644
\begin{figure}
645
  \begin{center}\framebox{\input{./generated/spec_bnf.tex}}\end{center}
646 647
  \caption{Specification clauses in programs.}
\label{fig:bnf:spec}
648
\end{figure}
649 650
Within specifications, terms are extended with new constructs
\verb|old| and \verb|at|:
651
\begin{center}\framebox{\input{./generated/term_old_at_bnf.tex}}\end{center}
652 653 654 655
Within a postcondition, $\verb|old|~t$ refers to the value of term $t$
in the prestate. Within the scope of a code mark $L$,
the term $\verb|at|~t~\verb|'|L$ refers to the value of term $t$ at the program
point corresponding to $L$.
656

657 658 659 660
\subsection{Expressions}

The syntax for program expressions is given in
Figure~\ref{fig:bnf:expra} and~Figure~\ref{fig:bnf:exprb}.
MARCHE Claude's avatar
MARCHE Claude committed
661
\begin{figure}
662
  \begin{center}\framebox{\input{./generated/expr_bnf.tex}}\end{center}
MARCHE Claude's avatar
MARCHE Claude committed
663
  \caption{Syntax for program expressions (part 1).}
664 665 666 667
\label{fig:bnf:expra}
\end{figure}

\begin{figure}
668
  \begin{center}\framebox{\input{./generated/expr2_bnf.tex}}\end{center}
MARCHE Claude's avatar
MARCHE Claude committed
669
  \caption{Syntax for program expressions (part 2).}
670
\label{fig:bnf:exprb}
671 672
\end{figure}

673 674 675 676 677 678
In applications, arguments are evaluated from right to left.
This includes applications of infix operators, with the only exception of
lazy operators \verb|&&| and \verb+||+ that evaluate from left to
right, lazily.


Jean-Christophe Filliatre's avatar
Jean-Christophe Filliatre committed
679 680 681
% In the following we describe the informal semantics of each
% constructs, and provide the corresponding rule for computing the
% weakest precondition.
682 683


Jean-Christophe Filliatre's avatar
Jean-Christophe Filliatre committed
684
% \subsubsection{Constant Expressions, Unary and Binary Operators}
685 686


Jean-Christophe Filliatre's avatar
Jean-Christophe Filliatre committed
687
% Integer and real constants are as in the logic language, as weel as the unary and binary operators.
688

689

Jean-Christophe Filliatre's avatar
Jean-Christophe Filliatre committed
690
% \subsubsection{Array accesses and updates, fields access and updates}
691

Jean-Christophe Filliatre's avatar
Jean-Christophe Filliatre committed
692
% \todo{}
693

Jean-Christophe Filliatre's avatar
Jean-Christophe Filliatre committed
694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738
% \subsubsection{Let binding, sequences}

% \todo{}

% \subsubsection{Function definition}

% \todo{fun, let rec}

% \subsubsection{Function call}

% \todo{}

% \subsubsection{Exception throwing and catching}

% \todo{raise, try with end}

% \subsubsection{Conditional expression, pattern matching}

% \todo{if then else. Discuss standard WP versus fast WP}

% \subsubsection{Iteration Expressions}

% There are three kind of expressions for iterating: bounded, unbounded and infinite.

% \begin{itemize}
% \item A bounded iteration has the form
% \begin{flushleft}\ttfamily
%   for $i$=$a$ to $b$ do invariant \{ $I$ \} $e$ done
% \end{flushleft}
% Expressions $a$ and $b$ are evaluated first and only once, then expression $e$ si evaluated successively for $i$ from $a$ to $b$ included. Nothing is executed if $a > b$. The invariant $I$ must hold at each iteration including before entering the loop and when leaving it. The rule for computing WP is as follows:
% \begin{eqnarray*}
%   WP(\texttt{for} \ldots, Q) &=& I(a) \land \\
% && \forall \vec{w} (\forall i, a \leq i \leq b \land I(i) \rightarrow WP(e,I(i+1))) \land (I(b+1) \rightarrow Q)
% \end{eqnarray*}
% where $\vec{w}$ is the set of references modified by $e$.

% A downward bounded iteration is also available, under the form
% \begin{flushleft}\ttfamily
%   for $i$=$a$ downto $b$ do invariant \{ $I$ \} $e$ done
% \end{flushleft}

% \item An unbounded iteration has the form
% \begin{flushleft}\ttfamily
%   while $c$ do invariant \{ $I$ \} $e$ done
% \end{flushleft}
739
% it iterates the loop body $e$ until the condition $c$ becomes false.
Jean-Christophe Filliatre's avatar
Jean-Christophe Filliatre committed
740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814
% \begin{eqnarray*}
%   WP(\texttt{while} \ldots, Q) &=& I \land \\
% && \forall \vec{w} (c \land I \rightarrow WP(e,I)) \land (\neg c \land I \rightarrow Q)
% \end{eqnarray*}
% where $\vec{w}$ is the set of references modified by $e$.

% Additionally, such a loop can be given a variant $v$, a quantity that must decreases ar each iteration, so as to prove its termination.


% \item An infinite iteration has the form
% \begin{flushleft}\ttfamily
%   loop invariant \{ $I$ \} $e$ end
% \end{flushleft}
% it iterates the loop forever, hence the only way to exit such a loop is to raise an exception.
% \begin{eqnarray*}
%   WP(\texttt{loop} \ldots, Q \mid Exc \Rightarrow R) &=& I \land \\
% && \forall \vec{w} (I \rightarrow WP(e,I)) \land (I \rightarrow WP(e,Exc \Rightarrow R))
% \end{eqnarray*}
% \end{itemize}

% \subsubsection{Assertions, Code Contracts}

% There are several form of expressions for inserting annotations in the code.
% The first form of those are the \emph{assertions} which have the form
% \begin{flushleft}\ttfamily
%   \textsl{keyword} \{ \textsl{proposition} \}
% \end{flushleft}
% where \textsl{keyword} is either \texttt{assert}, \texttt{assume} or
% \texttt{check}. They all state that the given proposition holds at the given program point. The differences are:
% \begin{itemize}
% \item \texttt{assert} requires to prove that the proposition holds, and then make it available in the context of the remaining of the code
% \item \texttt{check} requires to prove that the proposition holds, but
%   does not make it visible in the remaining
% \item \texttt{assume} assumes that the proposition holds and make it
%   visible in the context of the remaining code, without requiring to
%   prove it. It acts like an axiom, but within a program.
% \end{itemize}
% The corresponding rules for computing WP are as follows:
% \begin{eqnarray*}
%   WP(\texttt{assert} \{ P \}, Q) &=& P \mathop{\&\&} Q = P \land (P \rightarrow Q)\\
%   WP(\texttt{check} \{ P \}, Q) &=& P \land Q \\
%   WP(\texttt{assume} \{ P \}, Q) &=& P \rightarrow Q
% \end{eqnarray*}

% The other forms of code contracts allow to abstract a piece of code by specifications.
% \begin{itemize}
% \item $\texttt{any}~\{ P \}~\tau~\epsilon~\{ Q \}$ is a
%   non-deterministic expression that requires the precondition $P$ to
%   hold, then makes some side effects $\epsilon$, and returns any value
%   of type $\tau$ such that $Q$ holds. This construct acts as an axiom
%   in the sense that it does not check whether there exists any program
%   that can effectively establish the post-condition (similarly as the
%   introduction of a \texttt{val} at the global level).
% \item $\texttt{abstract}~e~\{ Q \}$ makes sure that the evaluation of
%   expression $e$ establishes the post-condition $Q$, and then abstract
%   away the program state by the post-condition $Q$ (similarly to the
%   \texttt{any} construct).
% \end{itemize}
% The corresponding rules for computing WP are as follows:
% \[
% \begin{array}{l}
%   WP(\texttt{any}~\{ P \}~\tau~\epsilon~\{ Q \mid exn_i \Rightarrow R_i \} ,
%   Q'  exn_i \Rightarrow R'_i) = \\
%   \qquad\qquad P \mathop{\&\&} \forall result, \epsilon.
%   (Q \rightarrow Q') \land (R_i \rightarrow R'_i) \\
%   WP(\texttt{abstract}~e~\{ Q \mid exn_i \Rightarrow R_i \} ,
%   Q' \mid exn_i \Rightarrow R'_i) = \\
%   \qquad\qquad WP(e,Q \mid exn_i \Rightarrow R_i) \land
%   \forall result, \epsilon, Q \rightarrow Q' \land R_i \rightarrow R'_i
% \end{array}
% \]

% \subsubsection{Labels, Operators \texttt{old} and \texttt{at}}

% \todo{Labels, old, at}
815 816 817

\subsection{Modules}

818
The syntax for modules is given in Figure~\ref{fig:bnf:module}.
MARCHE Claude's avatar
MARCHE Claude committed
819
\begin{figure}
820
  \begin{center}\framebox{\input{./generated/module_bnf.tex}}\end{center}
821 822 823
  \caption{Syntax for modules.}
\label{fig:bnf:module}
\end{figure}
824 825 826 827
Any declaration which is accepted in a theory is also accepted in a
module. Additionally, modules can introduce record types with mutable
fields and declarations which are specific to programs (global
variables, functions, exceptions).
828

829 830
\subsection{Files}

MARCHE Claude's avatar
MARCHE Claude committed
831
A \whyml input file is a (possibly empty) list of theories and modules.
832
\begin{center}\framebox{\input{./generated/whyml_file_bnf.tex}}\end{center}
Guillaume Melquiond's avatar
Guillaume Melquiond committed
833
A theory defined in a \whyml file can only be used within that
834
file. If a theory is supposed to be reused from other files, be they
Guillaume Melquiond's avatar
Guillaume Melquiond committed
835
\why or \whyml files, it should be defined in a \why file.
836

MARCHE Claude's avatar
MARCHE Claude committed
837

838 839 840
\section{The \why Standard Library}
\label{sec:library}\index{standard library}\index{library}

841
The \why standard library provides general-purpose
842 843
modules, to be used in logic and/or programs.
It can be browsed on-line at \url{http://why3.lri.fr/stdlib/}.
844 845 846 847
Each file contains one or several modules.
To \texttt{use} or \texttt{clone} a module \texttt{M} from file
\texttt{file}, use the syntax \texttt{file.M}, since \texttt{file} is
available in \why's default load path. For instance, the module of
848 849 850 851 852
integers and the module of references are imported as follows:
\begin{whycode}
  use import int.Int
  use import ref.Ref
\end{whycode}
853 854 855 856 857 858 859 860 861 862
A sub-directory \texttt{mach/} provides various modules to model
machine arithmetic.
For instance, the module of 63-bit integers and the module of arrays
indexed by 63-bit integers are imported as follows:
\begin{whycode}
  use import mach.int.Int63
  use import mach.array.Array63
\end{whycode}
In particular, the types and operations from these modules are mapped
to native OCaml's types and operations when \why code is extracted to
863
OCaml (see Section~\ref{sec:extract}).
864

MARCHE Claude's avatar
MARCHE Claude committed
865 866 867 868 869
%%% Local Variables:
%%% mode: latex
%%% TeX-PDF-mode: t
%%% TeX-master: "manual"
%%% End: