Commit 78a947be (project fix), authored Nov 29, 2018 by POTTIER Francois
Progress up to [nullable].
parent ecbab0ba

Showing 1 changed file with 83 additions and 2 deletions.

misc/post.md
@@ -15,6 +15,8 @@ a library that offers facilities for
constructing (recursive) memoized functions
and for performing least fixed point computations.
<!------------------------------------------------------------------------------>
## From REs to DFAs, via Brzozowski derivatives
Suppose `e` denotes a set of words. Then, its **derivative** `delta a e` is
...
...
@@ -65,6 +67,8 @@ The empty regular expression can be viewed as an empty disjunction.
[The complete code](http://gitlab.inria.fr/fpottier/fix/demos/brz/Brzozowski.ml) for this demo is also available.
<!------------------------------------------------------------------------------>
## An alphabet
Throughout, I assume that the alphabet is given by a module `Char` whose
...
...
@@ -86,6 +90,8 @@ the function `Char.foreach`, which enumerates all characters.
As an exercise for the reader, this can be used to define an auxiliary function `exists_char` of type `(Char.t -> bool) -> bool`.
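One possible solution to this exercise is sketched below. It assumes that `Char.foreach` has type `(Char.t -> unit) -> unit`, which the text does not state explicitly. The three-letter `Char` module and the exception `Found` are hypothetical stand-ins introduced only to make the sketch self-contained.

```ocaml
(* Hypothetical stand-in for the post's [Char] module: a three-letter
   alphabet. The type of [foreach] is an assumption. *)
module Char = struct
  type t = char
  let foreach (f : t -> unit) : unit =
    List.iter f ['a'; 'b'; 'c']
end

(* Hypothetical exception, used to stop the enumeration early. *)
exception Found

(* [exists_char p] is [true] if and only if some character satisfies [p]. *)
let exists_char (p : Char.t -> bool) : bool =
  try
    Char.foreach (fun c -> if p c then raise Found);
    false
  with Found ->
    true
```

Raising an exception lets the search stop at the first character that satisfies the predicate, instead of enumerating the whole alphabet.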
<!------------------------------------------------------------------------------>
## Regular expressions, hash-consed
The syntax of regular expressions (expressions, for short) is naturally
...
...
@@ -151,10 +157,85 @@ let skeleton : regexp -> skeleton =
HashCons.data
```
-The function `make` is where the magic takes place: whenever we wish to construct a new expression, we construct just a skeleton and pass it to
+The function `make` is where the magic takes place: whenever one wishes to construct a new expression, one constructs just a skeleton and passes it to
`make`, which takes care of determining whether this skeleton is already known (in which case an existing integer identity is re-used) or is new (in which case a fresh integer identity is allocated).
The function `skeleton`, which converts between an expression and a skeleton in the reverse direction, is just a pair projection.
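The lookup-or-allocate behaviour of `make` described above can be illustrated by the following minimal sketch. It is not the post's code: the post's code refers to a `HashCons` module, whereas this sketch uses a plain `Hashtbl`, a record instead of a pair, and hypothetical names `id`, `data`, `table`, and `next_id`. The constructor names are those that appear in the post's `nullable` function; that `EChar` carries a `char` is an assumption.

```ocaml
(* An expression pairs a unique integer identity with a skeleton. *)
type regexp = { id : int; data : skeleton }

and skeleton =
  | EEpsilon
  | EChar of char
  | ECat of regexp * regexp
  | EStar of regexp
  | EDisj of regexp list
  | EConj of regexp list
  | ENeg of regexp

(* Known skeletons. For simplicity this sketch uses structural hashing and
   equality; a real implementation would compare subexpressions by identity. *)
let table : (skeleton, regexp) Hashtbl.t = Hashtbl.create 128

let next_id = ref 0

let make (s : skeleton) : regexp =
  match Hashtbl.find_opt table s with
  | Some e ->
      (* Already known: re-use the existing expression and its identity. *)
      e
  | None ->
      (* New: allocate a fresh identity and record the expression. *)
      let e = { id = !next_id; data = s } in
      incr next_id;
      Hashtbl.add table s e;
      e

(* The reverse direction is a mere projection. *)
let skeleton (e : regexp) : skeleton = e.data
```

With this sketch, two calls to `make` with equal skeletons return the same physical expression, so expressions can be compared by their integer identities.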
Here are two basic examples of the use of `make`:
```
let epsilon : regexp =
  make EEpsilon
let zero : regexp =
  make (EDisj [])
```
Still using `make`, one can define several *smart constructors* that build expressions: concatenation `@@`, disjunction, conjunction, iteration, negation. These smart constructors reduce expressions to a normal form with respect to the equational theory EQTH. In particular, `disjunction` flattens nested disjunctions, sorts the disjuncts, and removes duplicate disjuncts, so that two disjunctions that are equal according to EQTH are indeed recognized as equal.
```
let (@@) : regexp -> regexp -> regexp = ...
let star : regexp -> regexp = ...
let disjunction : regexp list -> regexp = ...
let conjunction : regexp list -> regexp = ...
let neg : regexp -> regexp = ...
```
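As an illustration of the three steps that `disjunction` is said to perform (flatten, sort, remove duplicates), here is a minimal sketch. It is not the post's definition, which is elided above. The record type, the `Hashtbl`-based `make`, and the choice of sorting by integer identity are assumptions made to keep the example self-contained; any other simplifications that the real smart constructor may perform are not shown.

```ocaml
(* Hypothetical hash-consing machinery (a plain Hashtbl, structural keys). *)
type regexp = { id : int; data : skeleton }

and skeleton =
  | EEpsilon
  | EChar of char
  | ECat of regexp * regexp
  | EStar of regexp
  | EDisj of regexp list
  | EConj of regexp list
  | ENeg of regexp

let table : (skeleton, regexp) Hashtbl.t = Hashtbl.create 128
let next_id = ref 0

let make (s : skeleton) : regexp =
  match Hashtbl.find_opt table s with
  | Some e -> e
  | None ->
      let e = { id = !next_id; data = s } in
      incr next_id;
      Hashtbl.add table s e;
      e

let disjunction (es : regexp list) : regexp =
  (* Flatten: inline the disjuncts of any nested disjunction. One level
     suffices if every disjunction was itself built by [disjunction]. *)
  let flat =
    List.concat
      (List.map
         (fun e -> match e.data with EDisj es -> es | _ -> [ e ])
         es)
  in
  (* Sort by integer identity and remove duplicates in one pass. *)
  let sorted = List.sort_uniq (fun e1 e2 -> compare e1.id e2.id) flat in
  make (EDisj sorted)
```

Because the disjuncts end up in a canonical order with no repetitions, `disjunction [a; b]` and `disjunction [b; a; a]` yield the same skeleton and therefore the same hash-consed expression.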
<!------------------------------------------------------------------------------>
## Nullability
An expression is nullable if and only if it accepts the empty word. Determining whether an expression is nullable is a simple matter of writing a recursive function `nullable` of type `regexp -> bool`.
I memoize this function, so the nullability of an expression is computed at most once and can be retrieved immediately if requested again. (This is not mandatory, but it is convenient to be able to call `nullable` without worrying about its cost. Another approach, by the way, would be to store nullability information inside each expression.)
The module [Memoize](https://gitlab.inria.fr/fpottier/fix/blob/master/src/Memoize.mli), which is part of [Fix](https://gitlab.inria.fr/fpottier/fix/), provides facilities for memoization.
To use it, I apply the functor `Memoize.ForHashedType` to a little module `R` (not shown) which equips the type `regexp` with equality, comparison, and hashing functions. (Because expressions carry unique integer identifiers, the definition of `R` is trivial.) This functor application yields a module `M` which offers a memoizing fixed-point combinator `M.fix`. Then, instead of defining `nullable` directly as a recursive function, I define it as an application of `M.fix`, as follows.
```
let nullable : regexp -> bool =
  let module M = Memoize.ForHashedType(R) in
  M.fix (fun nullable e ->
    match skeleton e with
    | EChar _ ->
        false
    | EEpsilon
    | EStar _ ->
        true
    | ECat (e1, e2) ->
        nullable e1 && nullable e2
    | EDisj es ->
        exists nullable es
    | EConj es ->
        forall nullable es
    | ENeg e ->
        not (nullable e)
  )
```
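To give an idea of what a memoizing fixed-point combinator does, here is a minimal sketch built on the standard library's `Hashtbl.Make`. It is not Fix's implementation of `Memoize.ForHashedType`. The names `MemoSketch`, `IntKey`, and `calls` are hypothetical, and the example memoizes a function on integers rather than on expressions so that it remains self-contained.

```ocaml
(* A hypothetical functor in the spirit of a memoizing combinator: given a
   hashed key type, it offers [fix], which ties the recursive knot through
   a hash table of previously computed results. *)
module MemoSketch (K : Hashtbl.HashedType) = struct
  module T = Hashtbl.Make (K)

  let fix (f : (K.t -> 'a) -> K.t -> 'a) : K.t -> 'a =
    let table = T.create 16 in
    let rec g x =
      match T.find_opt table x with
      | Some y ->
          (* Already computed: return the stored result. *)
          y
      | None ->
          (* Compute, passing [g] itself for recursive calls, then store. *)
          let y = f g x in
          T.add table x y;
          y
    in
    g
end

(* A trivial key module; the post's module [R] plays this role for [regexp]. *)
module IntKey = struct
  type t = int
  let equal (x : int) (y : int) = x = y
  let hash (x : int) = Hashtbl.hash x
end

module M = MemoSketch (IntKey)

(* Counts how many times the function body actually runs. *)
let calls = ref 0

let fib : int -> int =
  M.fix (fun fib n ->
    incr calls;
    if n < 2 then n else fib (n - 1) + fib (n - 2))
```

In this sketch, the body of `fib` runs at most once per argument; later requests for the same argument are answered from the table, which is the property exploited by `nullable` above.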