Commit e1d65d4b, authored Nov 30, 2018 by POTTIER Francois
Progress up to the type [dfa].
parent 691737d0
Showing 1 changed file with 69 additions and 3 deletions.

misc/post.md
# A feeling of déjà vu
<!-- TEMPORARY update title -->
There are several ways of compiling a [regular expression](https://en.wikipedia.org/wiki/Regular_expression) (RE)
...
...
@@ -244,9 +245,9 @@ let nullable : regexp -> bool =
## Derivation
- We now reach a key operation: computing the Brzozowski derivative of an expression. If `a` is a character and `e` is an expression, then `delta a e` is the derivative of `e` with respect to `a`.
+ It is now time to define a key operation: computing the Brzozowski derivative of an expression. If `a` is a character and `e` is an expression, then `delta a e` is the derivative of `e` with respect to `a`.
Implementing `delta` is a textbook exercise. A key remark, though, is that this function **must** be memoized in order to ensure good complexity. A
...
...
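To make the memoization remark concrete, here is a minimal sketch. It is not the post's implementation: the `regexp` type, the smart constructors `cat` and `alt`, and the `memo` table are stand-ins introduced for illustration, and the table is a plain `Hashtbl` from the standard library keyed on (character, expression) pairs.

```ocaml
(* Stand-in syntax: the post's own [regexp] type is not shown in this excerpt. *)
type regexp =
  | Empty                       (* matches nothing *)
  | Epsilon                     (* matches only the empty string *)
  | Char of char
  | Cat of regexp * regexp
  | Alt of regexp * regexp
  | Star of regexp

let rec nullable : regexp -> bool = function
  | Empty | Char _ -> false
  | Epsilon | Star _ -> true
  | Cat (e1, e2) -> nullable e1 && nullable e2
  | Alt (e1, e2) -> nullable e1 || nullable e2

(* Hypothetical smart constructors applying a few obvious simplifications. *)
let cat e1 e2 =
  match e1, e2 with
  | Empty, _ | _, Empty -> Empty
  | Epsilon, e | e, Epsilon -> e
  | _ -> Cat (e1, e2)

let alt e1 e2 =
  match e1, e2 with
  | Empty, e | e, Empty -> e
  | _ -> if e1 = e2 then e1 else Alt (e1, e2)

(* Memo table: one entry per (character, expression) pair already derived. *)
let memo : (char * regexp, regexp) Hashtbl.t = Hashtbl.create 128

let rec delta (a : char) (e : regexp) : regexp =
  match Hashtbl.find_opt memo (a, e) with
  | Some d -> d                 (* already computed: reuse the result *)
  | None ->
    let d =
      match e with
      | Empty | Epsilon -> Empty
      | Char b -> if a = b then Epsilon else Empty
      | Cat (e1, e2) ->
        let d1 = cat (delta a e1) e2 in
        if nullable e1 then alt d1 (delta a e2) else d1
      | Alt (e1, e2) -> alt (delta a e1) (delta a e2)
      | Star e1 -> cat (delta a e1) e
    in
    Hashtbl.add memo (a, e) d;
    d
```

Structural hashing and equality keep the sketch short, but each lookup then costs time proportional to the size of the expression. A more careful version would probably want keys that can be compared in constant time.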
@@ -354,3 +355,68 @@ expression to a nullable expression in the graph whose vertices are
expressions and whose edges are determined by `delta`. What I have just done is exploit the fact that co-accessibility is easily expressed as a least fixed point.
<!-- TEMPORARY
Accessibility, too, can be expressed as a least fixed point.
However, to do so, one must have access to the predecessors
of each vertex.
-->
<!------------------------------------------------------------------------------>
## Constructing a DFA
The tools are now at hand to convert an expression to a deterministic finite-state automaton.

I must first settle on a representation of such an automaton as a data structure in memory. I choose to represent a state as an integer in the range of `0` to `n-1`, where `n` is the number of states. An automaton can then be described as follows:
```
type state =
  int

type dfa = {
  n: int;
  init: state option;
  decode: state -> regexp;
  transition: state -> Char.t -> state option;
}
```
`init` is the initial state. If it is absent, then the automaton rejects every input.

The function `decode` maps every state to the expression that this state accepts. This expression is guaranteed to be nonempty. This state is a final state if and only if this expression is nullable.

The function `transition` maps every state and character to an optional target state.
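As an illustration of how these three fields work together, here is a sketch of running such an automaton on a string. The `dfa` type is copied from above. Everything else is an assumption made for the example: the cut-down `regexp` type, its `nullable` function, the function `accepts`, and the hand-built automaton `ab_star`.

```ocaml
(* Stand-in expression syntax: the post's own [regexp] type and [nullable]
   function are defined outside this excerpt, so these are assumptions. *)
type regexp =
  | Epsilon
  | Char of char
  | Cat of regexp * regexp
  | Star of regexp

let rec nullable : regexp -> bool = function
  | Epsilon | Star _ -> true
  | Char _ -> false
  | Cat (e1, e2) -> nullable e1 && nullable e2

type state = int

type dfa = {
  n: int;
  init: state option;
  decode: state -> regexp;
  transition: state -> Char.t -> state option;
}

(* [accepts a input] runs [a] on [input], taking one transition per character. *)
let accepts (a : dfa) (input : string) : bool =
  let rec run (s : state) (i : int) : bool =
    if i = String.length input then
      (* End of input: [s] is final iff its expression is nullable. *)
      nullable (a.decode s)
    else
      match a.transition s input.[i] with
      | None -> false            (* no target state: reject *)
      | Some s' -> run s' (i + 1)
  in
  match a.init with
  | None -> false                (* no initial state: reject every input *)
  | Some s -> run s 0

(* A hand-built two-state automaton for the expression a b*. *)
let ab_star : dfa =
  let b_star = Star (Char 'b') in
  {
    n = 2;
    init = Some 0;
    decode = (fun s -> if s = 0 then Cat (Char 'a', b_star) else b_star);
    transition = (fun s c ->
      match s, c with
      | 0, 'a' -> Some 1
      | 1, 'b' -> Some 1
      | _ -> None);
  }
```

With these definitions, `accepts ab_star "abbb"` holds, whereas `accepts ab_star "ba"` does not, because state `0` has no transition on `'b'`.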
Now, how does one construct a DFA for an expression `e`? The answer is simple, really. Consider the infinite graph whose vertices are nonempty expressions and whose edges are determined by `delta`. The fragment of this graph that is reachable from `e` is guaranteed to be finite, and is exactly the desired automaton.
<!-- TEMPORARY can we point to a proof of finiteness? -->
There are several ways of approaching the construction of this finite graph fragment. I choose to first perform a forward graph traversal in which I discover the vertices of this graph, number them from `0` to `n-1`, and record the bijective correspondence between vertices (that is, expressions) and state numbers. Once this is done, completing the construction of a data structure of type `dfa` is easy.
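The traversal just described might be sketched as follows. This is not the post's code. The `regexp` syntax, the unmemoized `delta`, the structural `nonempty` test, and the function `construct` are assumptions made so that the example is self-contained. In particular, `alt` normalizes alternatives up to associativity, commutativity and idempotence, which is my attempt to keep the set of reachable expressions finite for this small syntax.

```ocaml
(* Stand-in syntax; the post's own [regexp] type is not shown in this excerpt. *)
type regexp =
  | Empty | Epsilon | Char of char
  | Cat of regexp * regexp | Alt of regexp * regexp | Star of regexp

let rec nullable = function
  | Empty | Char _ -> false
  | Epsilon | Star _ -> true
  | Cat (e1, e2) -> nullable e1 && nullable e2
  | Alt (e1, e2) -> nullable e1 || nullable e2

(* For this small syntax, nonemptiness can be decided structurally. *)
let rec nonempty = function
  | Empty -> false
  | Epsilon | Char _ | Star _ -> true
  | Cat (e1, e2) -> nonempty e1 && nonempty e2
  | Alt (e1, e2) -> nonempty e1 || nonempty e2

let cat e1 e2 =
  match e1, e2 with
  | Empty, _ | _, Empty -> Empty
  | Epsilon, e | e, Epsilon -> e
  | _ -> Cat (e1, e2)

(* Alternation, kept as a sorted, duplicate-free, left-nested chain. *)
let rec alternatives = function
  | Alt (e1, e2) -> alternatives e1 @ alternatives e2
  | Empty -> []
  | e -> [e]

let alt e1 e2 =
  match List.sort_uniq compare (alternatives e1 @ alternatives e2) with
  | [] -> Empty
  | first :: rest -> List.fold_left (fun acc x -> Alt (acc, x)) first rest

(* Unmemoized derivative, for brevity only. *)
let rec delta a e =
  match e with
  | Empty | Epsilon -> Empty
  | Char b -> if a = b then Epsilon else Empty
  | Cat (e1, e2) ->
    let d = cat (delta a e1) e2 in
    if nullable e1 then alt d (delta a e2) else d
  | Alt (e1, e2) -> alt (delta a e1) (delta a e2)
  | Star e1 -> cat (delta a e1) e

type state = int

type dfa = {
  n: int;
  init: state option;
  decode: state -> regexp;
  transition: state -> Char.t -> state option;
}

let construct (e : regexp) : dfa =
  let number : (regexp, state) Hashtbl.t = Hashtbl.create 64 in
  let discovered = ref [] in            (* expressions, most recent first *)
  let n = ref 0 in
  let queue = Queue.create () in
  (* [visit e] returns the state of [e], numbering it when first discovered. *)
  let visit e =
    if not (nonempty e) then None
    else
      match Hashtbl.find_opt number e with
      | Some s -> Some s
      | None ->
        let s = !n in
        incr n;
        Hashtbl.add number e s;
        discovered := e :: !discovered;
        Queue.add e queue;
        Some s
  in
  let init = visit e in
  (* Forward traversal: follow every character out of every new vertex. *)
  while not (Queue.is_empty queue) do
    let e = Queue.pop queue in
    for c = 0 to 255 do ignore (visit (delta (Char.chr c) e)) done
  done;
  let exprs = Array.of_list (List.rev !discovered) in
  (* Tabulate the transitions so that [delta] is not needed at runtime. *)
  let table =
    Array.map (fun e ->
      Array.init 256 (fun c -> Hashtbl.find_opt number (delta (Char.chr c) e)))
      exprs
  in
  { n = !n; init;
    decode = (fun s -> exprs.(s));
    transition = (fun s c -> table.(s).(Char.code c)) }
```

For example, `construct (Cat (Char 'a', Star (Char 'b')))` discovers two states: state `0` for the whole expression and state `1` for `Star (Char 'b')`. Taking a transition then amounts to two array lookups.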
<!-- TEMPORARY tabulating the transition function:
we could choose not to tabulate it,
but `delta` would then be invoked at automaton runtime,
every time a transition is taken.
By tabulating this function,
taking a transition at runtime becomes a simple matter
of doing two table lookups. -->