Commit 78a947be authored by POTTIER Francois's avatar POTTIER Francois

Progress up to [nullable].

parent ecbab0ba
......@@ -15,6 +15,8 @@ a library that offers facilities for
constructing (recursive) memoized functions
and for performing least fixed point computations.
<!------------------------------------------------------------------------------>
## From REs to DFAs, via Brzozowski derivatives
Suppose `e` denotes a set of words. Then, its **derivative** `delta a e` is
......@@ -65,6 +67,8 @@ The empty regular expression can be viewed as an empty disjunction.
[The complete code](http://gitlab.inria.fr/fpottier/fix/demos/brz/Brzozowski.ml)
for this demo is also available.
<!------------------------------------------------------------------------------>
## An alphabet
Throughout, I assume that the alphabet is given by a module `Char` whose
......@@ -86,6 +90,8 @@ the function `Char.foreach`, which enumerates all characters.
As an exercise for the reader, this can be used to define an auxiliary
function `exists_char` of type `(Char.t -> bool) -> bool`.
<!------------------------------------------------------------------------------>
## Regular expressions, hash-consed
The syntax of regular expressions (expressions, for short) is naturally
......@@ -151,10 +157,85 @@ let skeleton : regexp -> skeleton =
HashCons.data
```
The function `make` is where the magic takes place: whenever we wish to
construct a new expression, we construct just a skeleton and pass it to
The function `make` is where the magic takes place: whenever one wishes to
construct a new expression, one constructs just a skeleton and passes it to
`make`, which takes care of determining whether this skeleton is already
known (in which case an existing integer identity is re-used) or is new
(in which case a fresh integer identity is allocated).
The function `skeleton`, which converts between an expression and a
skeleton in the reverse direction, is just a pair projection.
Here are two basic examples of the use of `make`:
```
let epsilon : regexp =
make EEpsilon
let zero : regexp =
make (EDisj [])
```
Still using `make`, one can define several *smart constructors* that build
expressions: concatenation `@@`, disjunction, conjunction, iteration,
negation. These smart constructors reduce expressions to a normal form with
respect to the equational theory EQTH. In particular, `disjunction` flattens
nested disjunctions, sorts the disjuncts, and removes duplicate disjuncts, so
that two disjunctions that are equal according to EQTH are indeed recognized
as equal.
```
let (@@) : regexp -> regexp -> regexp = ...
let star : regexp -> regexp = ...
let disjunction : regexp list -> regexp = ...
let conjunction : regexp list -> regexp = ...
let neg : regexp -> regexp = ...
```
<!------------------------------------------------------------------------------>
## Nullability
An expression is nullable if and only if it accepts the empty word.
Determining whether an expression is nullable is a simple matter of
writing a recursive function `nullable` of type `regexp -> bool`.
I memoize this function, so the nullability of an expression is computed at
most once and can be retrieved immediately if requested again. (This is not
mandatory, but it is convenient to be able to call `nullable` without worrying
about its cost. Another approach, by the way, would be to store nullability
information inside each expression.)
The module
[Memoize](https://gitlab.inria.fr/fpottier/fix/blob/master/src/Memoize.mli),
which is part of
[Fix](https://gitlab.inria.fr/fpottier/fix/),
provides facilities for memoization.
To use it, I apply the functor `Memoize.ForHashedType`
to a little module `R` (not shown) which equips the type `regexp` with
equality, comparison, and hashing functions. (Because expressions
carry unique integer identifiers, the definition of `R` is trivial.)
This functor application yields a module `M` which offers
a memoizing fixed-point combinator `M.fix`.
Then, instead of defining `nullable` directly as a recursive function,
I define it as an application of `M.fix`, as follows.
```
let nullable : regexp -> bool =
let module M = Memoize.ForHashedType(R) in
M.fix (fun nullable e ->
match skeleton e with
| EChar _ ->
false
| EEpsilon
| EStar _ ->
true
| ECat (e1, e2) ->
nullable e1 && nullable e2
| EDisj es ->
exists nullable es
| EConj es ->
forall nullable es
| ENeg e ->
not (nullable e)
)
```
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment