Skip to content
GitLab
Menu
Projects
Groups
Snippets
Help
Help
Support
Community forum
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in
Toggle navigation
Menu
Open sidebar
Gabriel Radanne
fix
Commits
33ec6429
Commit
33ec6429
authored
Nov 30, 2018
by
POTTIER Francois
Browse files
Progress up to `nonempty`.
parent
c0390942
Changes
1
Hide whitespace changes
Inline
Side-by-side
misc/post.md
View file @
33ec6429
...
...
@@ -10,7 +10,7 @@ of regular expressions.
In this post,
I describe a concise OCaml implementation of this transformation.
This is an opportunity to illustrate the use of
[
F
ix
](
https://gitlab.inria.fr/fpottier/fix/
)
,
[
f
ix
](
https://gitlab.inria.fr/fpottier/fix/
)
,
a library that offers facilities for
constructing (recursive) memoized functions
and for performing least fixed point computations.
...
...
@@ -136,7 +136,7 @@ and `nonempty` are three examples, and there are more.
The module
[
HashCons
](
https://gitlab.inria.fr/fpottier/fix/blob/master/src/HashCons.mli
)
,
which is part of
[
F
ix
](
https://gitlab.inria.fr/fpottier/fix/
)
,
[
f
ix
](
https://gitlab.inria.fr/fpottier/fix/
)
,
provides facilities for hash-consing.
(The reader is encouraged to take a look at
[
its implementation:
](
https://gitlab.inria.fr/fpottier/fix/blob/master/src/HashCons.ml
)
...
...
@@ -207,7 +207,7 @@ information inside each expression.)
The module
[
Memoize
](
https://gitlab.inria.fr/fpottier/fix/blob/master/src/Memoize.mli
)
,
which is part of
[
F
ix
](
https://gitlab.inria.fr/fpottier/fix/
)
,
[
f
ix
](
https://gitlab.inria.fr/fpottier/fix/
)
,
provides facilities for memoization.
To use it, I apply the functor
`Memoize.ForHashedType`
to a little module
`R`
(not shown) which equips the type
`regexp`
with
...
...
@@ -284,3 +284,73 @@ let delta : Char.t -> regexp -> regexp =
)
)
```
<!------------------------------------------------------------------------------>
## (Non)Emptiness
During the construction of the DFA (which comes up next), it is convenient to
be able to (efficiently) decide whether an expression is nonempty (that is,
whether it accepts at least one word).
Following
[
Owens et al.
](
https://www.cs.kent.ac.uk/people/staff/sao/documents/jfp09.pdf
)
,
I have included all three Boolean operators (disjunction, conjunction,
negation) in the syntax of expressions. The presence of conjunction makes it
apparently nontrivial to determine whether an expression is nonempty.
Fortunately, by exploiting
`nullable`
,
`delta`
, and with the help of
[
fix
](
https://gitlab.inria.fr/fpottier/fix/
)
, I am able to define
an efficient nonemptiness test in four lines of code:
```
let nonempty : regexp -> bool =
let module F = Fix.ForHashedType(R)(Prop.Boolean) in
F.lfp (fun e nonempty ->
nullable e || exists_char (fun a -> nonempty (delta a e))
)
```
This test is based on the following fact:
`e`
is nonempty if and only if
either
`e`
is nullable or there exists a character
`a`
such that
`delta a e`
is nonempty.
This statement, however, cannot be used directly as a recursive definition of
`nonempty`
, because such a definition would not be well-founded. An attempt to
define
`nonempty`
using
`let rec`
or
`M.fix`
would construct a possibly
nonterminating function.
Instead, the above statement must be viewed as
**
an inductive
characterization
** of nonemptiness. That is, `nonempty` is the **
least fixed
point
**
in the Boolean lattice (where
`false`
is less than
`true`
) of the
equation:
```
nonempty e = nullable e || exists_char (fun a -> nonempty (delta a e))
```
The main module of the
[
fix
](
https://gitlab.inria.fr/fpottier/fix/
)
library, named
[
Fix
](
https://gitlab.inria.fr/fpottier/fix/src/Core.mli
)
,
is an algorithm for computing a least fixed point of type
`variable -> property`
,
where the type
`property`
forms a lattice.
The functor application
`Fix.ForHashedType(R)(Prop.Boolean)`
instantiates this algorithm so that the type
`variable`
is
`regexp`
and the type
`property`
is
`bool`
.
Then, the fixed point combinator
`F.lfp`
allows us to give a
straightforward definition of
`nonempty`
.
The function
`nonempty`
thus defined is incremental. Nothing is computed when
the function is constructed; computation takes place only when the function is
invoked. It is also memoized: the nonemptiness of an expression, once
computed, is recorded and is never recomputed. Thus, one can invoke this
function without worrying about its cost.
The reader might note that nonemptiness could also be computed via a graph
traversal: an expression is nonempty if only there is exists a path from this
expression to a nullable expression in the graph whose vertices are
expressions and whose edges are determined by
`delta`
. What I have just done
is exploit the fact that co-accessibility is easily expressed as a least fixed
point.
Write
Preview
Supports
Markdown
0%
Try again
or
attach a new file
.
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment