Generic Interpreter 0.9
Private API

gi
Class Lexicon

java.lang.Object
  |
  +--gi.Lexicon
Direct Known Subclasses:
Grammar

public class Lexicon
extends Object

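This class implements a Lexicon, which associates terminals with Expressions denoting regular languages and grabs terminals from a source character stream using a lexical NFA.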

Version:
0.9
Author:
© 1999-2000 Craig A. Rich <carich@acm.org>
See Also:
Source code

Inner Class Summary
(package private) static class Lexicon.Alphabet
          This class implements an Expression denoting a set of characters.
protected static class Lexicon.Concatenation
          This class implements an Expression denoting the concatenation of two regular languages.
protected  class Lexicon.Exception
          This class implements an Exception.
(package private) static class Lexicon.Expression
          This class implements an Expression denoting a regular language.
protected static class Lexicon.Match
          This class implements an Expression denoting the set of characters in a string.
protected static class Lexicon.NonMatch
          This class implements an Expression denoting the set of characters not in a string.
protected static class Lexicon.PosixClass
          This class implements an Expression denoting the set of characters in a POSIX class.
protected static class Lexicon.Range
          This class implements an Expression denoting the set of characters in a range.
protected static class Lexicon.Repetition
          This class implements an Expression denoting the repetition of a regular language.
(package private) static class Lexicon.Set
          This class implements a Set.
protected static class Lexicon.Singleton
          This class implements an Expression denoting the set containing a string.
protected static class Lexicon.UnicodeCategory
          This class implements an Expression denoting the set of characters in a Unicode category.
protected static class Lexicon.Union
          This class implements an Expression denoting the union of two regular languages.
 
Field Summary
private  Map accepts
          The mapping from an NFA accept state to the terminal it recognizes in this Lexicon.
protected static String END_OF_SOURCE
          The terminal matching the character at the end of a source stream.
private static Lexicon.Expression END_OF_SOURCE_EXPRESSION
          The Expression denoting the set containing the character at the end of a source stream.
private  Lexicon.Set initial
          The initial state of this Lexicon.
private static int size
          The number of NFA states in the lexical NFA.
private  Lexicon.Set[] states
          The states through which this Lexicon transitions.
private  Map terminals
          The terminals put into this Lexicon.
private static Lexicon.Set transitions
          The transition function of the lexical NFA.
private  StringBuffer word
          The StringBuffer containing the word most recently grabbed.
 
Constructor Summary
protected Lexicon()
          Constructs an empty Lexicon.
(package private) Lexicon(Lexicon lexicon)
          Constructs a Lexicon that is a shallow copy of lexicon.
 
Method Summary
private static Lexicon.Set closure(Lexicon.Set from)
          Computes a null-closure using the lexical NFA.
protected static Lexicon.Expression expression(String string)
          Creates an Expression by interpreting a POSIX extended regular expression (ERE), as used in egrep.
 Object grab(BufferedReader source)
          Grabs a terminal from a source character stream using this Lexicon.
private  Lexicon.Set initial()
          Returns the initial state of this Lexicon.
private static void put(Integer from, Lexicon.Alphabet on, Integer to)
          Puts a transition into the lexical NFA.
protected  void put(Object terminal, Lexicon.Expression expression)
          Puts a terminal and associated Expression into this Lexicon.
private  Object recognize(Lexicon.Set state)
          Computes the terminal recognized by a state in this Lexicon.
private static Integer state()
          Creates a new state in the lexical NFA.
(package private)  boolean terminal(Object symbol)
          Indicates whether a symbol is a terminal in this Lexicon.
private static Lexicon.Set transition(Lexicon.Set from, char on, Lexicon.Set to)
          Computes a transition using the lexical NFA.
 String word()
          Returns the word most recently grabbed using this Lexicon.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, registerNatives, toString, wait, wait, wait
 

Field Detail

accepts

private final Map accepts

The mapping from an NFA accept state to the terminal it recognizes in this Lexicon. When this map is empty, the current NFA accept states need to be computed. They are computed only on demand, by initial().


END_OF_SOURCE

protected static final String END_OF_SOURCE

The terminal matching the character at the end of a source stream.


END_OF_SOURCE_EXPRESSION

private static final Lexicon.Expression END_OF_SOURCE_EXPRESSION

The Expression denoting the set containing the character at the end of a source stream.


initial

private final Lexicon.Set initial

The initial state of this Lexicon. When this set is empty, the current initial state needs to be computed. It is computed only on demand, by initial().


size

private static int size

The number of NFA states in the lexical NFA.


states

private final Lexicon.Set[] states

The states through which this Lexicon transitions.


terminals

private final Map terminals

The terminals put into this Lexicon. It is a mapping from a terminal to the NFA initial state recognizing the language denoted by the associated Expression.


transitions

private static final Lexicon.Set transitions

The transition function of the lexical NFA.


word

private final StringBuffer word

The StringBuffer containing the word most recently grabbed.

Constructor Detail

Lexicon

protected Lexicon()

Constructs an empty Lexicon.


Lexicon

Lexicon(Lexicon lexicon)

Constructs a Lexicon that is a shallow copy of lexicon. The fields of the new Lexicon refer to the same objects as those in lexicon.

Parameters:
lexicon - the Lexicon to be copied.
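
To illustrate what "shallow copy" means here, the sketch below uses a hypothetical stand-in class, ShallowCopySketch, with a single Map field. It is not the gi.Lexicon source, and the field name and types are assumptions. Because the copy's field refers to the same Map object, a change made through one instance is visible through the other.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a class with a shallow-copy constructor.
final class ShallowCopySketch {
    final Map<Object, Integer> terminals;

    ShallowCopySketch() {
        this.terminals = new HashMap<>();
    }

    // Shallow copy: share the same Map object rather than cloning it.
    ShallowCopySketch(ShallowCopySketch other) {
        this.terminals = other.terminals;
    }

    public static void main(String[] args) {
        ShallowCopySketch a = new ShallowCopySketch();
        ShallowCopySketch b = new ShallowCopySketch(a);
        a.terminals.put("ID", 0);
        System.out.println(b.terminals.containsKey("ID")); // true: same Map
    }
}
```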
Method Detail

closure

private static Lexicon.Set closure(Lexicon.Set from)

Computes a null-closure using the lexical NFA. The null-closure is computed in place by a breadth-first search that expands the set from.

Parameters:
from - the state whose null-closure is computed.
Returns:
the reflexive transitive closure of from under null transitions.
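
A minimal sketch of an in-place, breadth-first null-closure follows. It assumes states are Integers and that null transitions are kept in a map named NULL_MOVES; both are illustrative choices, not the representation used by Lexicon.Set or the transitions field.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a null-closure; not the gi.Lexicon source.
final class ClosureSketch {
    // Assumed representation: state -> states reachable by one null transition.
    static final Map<Integer, List<Integer>> NULL_MOVES = new HashMap<>();

    // Expands `from` in place by breadth-first search and returns it.
    static Set<Integer> closure(Set<Integer> from) {
        Deque<Integer> queue = new ArrayDeque<>(from);
        while (!queue.isEmpty()) {
            Integer state = queue.remove();
            List<Integer> nexts = NULL_MOVES.getOrDefault(state, Collections.<Integer>emptyList());
            for (Integer next : nexts) {
                if (from.add(next)) {   // newly reached: explore it later
                    queue.add(next);
                }
            }
        }
        return from;
    }

    public static void main(String[] args) {
        NULL_MOVES.put(0, Arrays.asList(1, 2));
        NULL_MOVES.put(2, Arrays.asList(3));
        NULL_MOVES.put(3, Arrays.asList(0));   // a cycle is harmless
        Set<Integer> start = new TreeSet<>(Arrays.asList(0));
        System.out.println(closure(start));    // [0, 1, 2, 3]
    }
}
```

The set doubles as the "visited" record, which is why the search terminates on cyclic null transitions and why the result is reflexive (the starting states are never removed).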

expression

protected static Lexicon.Expression expression(String string)
                                        throws Lexicon.Exception

Creates an Expression by interpreting a POSIX extended regular expression (ERE), as used in egrep. The syntax and semantics of EREs are formally specified by the ERE Grammar. This method is a convenient way to construct an Expression, at the cost of an LR(1) parse. Implementations seeking maximum speed should avoid this method and use explicit Expression subclass constructors; for example,

new Union(new NonMatch("0"), new Singleton("foo"))
instead of
Lexicon.expression("[^0]|foo")
Parameters:
string - the POSIX extended regular expression (ERE) to be interpreted.
Returns:
the Expression constructed by interpreting string.
Throws:
Lexicon.Exception - if a syntax error occurs.
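
To make the equivalence in the example above concrete, the sketch below builds the same tree shape from hypothetical stand-in classes and tests which words it denotes. The nested classes share names with the Lexicon inner classes only for readability; their matches method and their internals are assumptions and are not part of this API.

```java
// Hypothetical stand-ins mirroring the shape of the example above;
// they are not the gi.Lexicon inner classes.
final class ExpressionSketch {
    interface Expr {
        boolean matches(String word);   // is word in the denoted language?
    }

    // Any single character that does not occur in `excluded`.
    static final class NonMatch implements Expr {
        private final String excluded;
        NonMatch(String excluded) { this.excluded = excluded; }
        public boolean matches(String word) {
            return word.length() == 1 && excluded.indexOf(word.charAt(0)) < 0;
        }
    }

    // Exactly one string.
    static final class Singleton implements Expr {
        private final String string;
        Singleton(String string) { this.string = string; }
        public boolean matches(String word) { return word.equals(string); }
    }

    // Either of two languages.
    static final class Union implements Expr {
        private final Expr left, right;
        Union(Expr left, Expr right) { this.left = left; this.right = right; }
        public boolean matches(String word) {
            return left.matches(word) || right.matches(word);
        }
    }

    public static void main(String[] args) {
        Expr e = new Union(new NonMatch("0"), new Singleton("foo")); // [^0]|foo
        System.out.println(e.matches("7"));   // true
        System.out.println(e.matches("0"));   // false
        System.out.println(e.matches("foo")); // true
        System.out.println(e.matches("fo"));  // false
    }
}
```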

grab

public Object grab(BufferedReader source)
            throws Lexicon.Exception

Grabs a terminal from a source character stream using this Lexicon. The word returned by word() is set to the longest nonempty prefix of the remaining source characters that matches an Expression in this Lexicon. If no nonempty prefix matches an Expression, a Lexicon.Exception is thrown. If the longest matching prefix matches more than one Expression, the terminal associated with the most recently constructed Expression is returned. This method blocks until a character is available, an I/O error occurs, or the end of the source stream is reached.

Parameters:
source - the source character stream.
Returns:
the terminal grabbed from source.
Throws:
Lexicon.Exception - if an I/O or lexical error occurs.
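
The longest-match rule and its tie-break can be sketched as follows. This is an illustration under stated assumptions, not the gi implementation: java.util.regex stands in for the lexical NFA, the input is a String rather than a BufferedReader, "most recently constructed" is approximated by "most recently put", and the class GrabSketch and its members are hypothetical. The patterns deliberately avoid alternation, because Matcher.lookingAt() returns the regex engine's preferred match, which is not always the longest one.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of longest-match selection; not the gi.Lexicon source.
final class GrabSketch {
    // Insertion order records which terminal was added most recently.
    final Map<String, Pattern> terminals = new LinkedHashMap<>();
    String word = "";

    void put(String terminal, String regex) {
        terminals.remove(terminal);   // re-adding makes it the most recent entry
        terminals.put(terminal, Pattern.compile(regex));
    }

    // Returns the terminal whose pattern matches the longest nonempty prefix
    // of `rest`; on a tie, the most recently added terminal wins.
    String grab(String rest) {
        String best = null;
        int bestEnd = 0;
        for (Map.Entry<String, Pattern> e : terminals.entrySet()) {
            Matcher m = e.getValue().matcher(rest);
            if (m.lookingAt() && m.end() > 0 && m.end() >= bestEnd) {
                best = e.getKey();
                bestEnd = m.end();
            }
        }
        if (best == null) {
            throw new IllegalStateException("no nonempty prefix matches");
        }
        word = rest.substring(0, bestEnd);
        return best;
    }

    public static void main(String[] args) {
        GrabSketch lexer = new GrabSketch();
        lexer.put("ID", "[a-z]+");
        lexer.put("IF", "if");
        System.out.println(lexer.grab("ifx = 1") + " " + lexer.word); // ID ifx
        System.out.println(lexer.grab("if (x)") + " " + lexer.word);  // IF if
    }
}
```

In the first call the identifier pattern wins because it matches a longer prefix; in the second both patterns match "if", so the terminal added later is returned.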

initial

private Lexicon.Set initial()

Returns the initial state of this Lexicon.

Returns:
initial, after computing it and accepts if the current initial state and NFA accept states need to be computed.
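
The on-demand computation described for initial and accepts can be sketched as a cache that an update empties and an accessor refills. The class below is a hypothetical illustration: it tracks only an initial set (accepts is omitted for brevity), uses Integers for NFA states, and counts recomputations purely for demonstration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of on-demand recomputation; not the gi.Lexicon source.
final class LazyInitialSketch {
    // Assumed representation: terminal -> NFA start state for its Expression.
    private final Map<Object, Integer> terminals = new HashMap<>();
    // An empty set means the initial state must be recomputed.
    private final Set<Integer> initial = new TreeSet<>();
    int recomputations = 0;   // for demonstration only

    void put(Object terminal, Integer startState) {
        terminals.put(terminal, startState);
        initial.clear();      // invalidate the cached initial state
    }

    Set<Integer> initial() {
        if (initial.isEmpty() && !terminals.isEmpty()) {
            initial.addAll(terminals.values());   // a full lexer would also take the null-closure
            recomputations++;
        }
        return initial;
    }

    public static void main(String[] args) {
        LazyInitialSketch lexicon = new LazyInitialSketch();
        lexicon.put("ID", 0);
        lexicon.put("NUM", 5);
        lexicon.initial();
        lexicon.initial();                            // served from the cache
        System.out.println(lexicon.recomputations);   // 1
        lexicon.put("IF", 9);                         // empties the cache
        System.out.println(lexicon.initial());        // [0, 5, 9]
    }
}
```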

put

private static void put(Integer from,
                        Lexicon.Alphabet on,
                        Integer to)

Puts a transition into the lexical NFA.

Parameters:
from - the state from which the transition is made.
on - the Alphabet on which the transition is made.
to - the state to which the transition is made.

put

protected void put(Object terminal,
                   Lexicon.Expression expression)

Puts a terminal and associated Expression into this Lexicon. The Expression supersedes any previously associated with the terminal.

Parameters:
terminal - the terminal to be added.
expression - the Expression associated with terminal. When grabbing, a word in the language denoted by expression is recognized as terminal.

recognize

private Object recognize(Lexicon.Set state)

Computes the terminal recognized by a state in this Lexicon.

Parameters:
state - the state.
Returns:
the highest priority terminal associated with an NFA accept state in state. Returns null if state contains no NFA accept states.
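
A minimal sketch of choosing a terminal from a set of NFA states follows. The page does not define "priority", so the sketch makes a loud assumption: a larger state number means a more recently constructed Expression and therefore a higher priority, in line with the tie-break described for grab. The class RecognizeSketch and its parameters are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of terminal recognition; not the gi.Lexicon source.
final class RecognizeSketch {
    // accepts maps an NFA accept state to the terminal it recognizes.
    // ASSUMPTION: a larger state number has a higher priority.
    static Object recognize(Set<Integer> state, Map<Integer, Object> accepts) {
        Object terminal = null;
        int best = -1;
        for (Integer s : state) {
            if (accepts.containsKey(s) && s > best) {
                best = s;
                terminal = accepts.get(s);
            }
        }
        return terminal;   // null if state contains no accept state
    }

    public static void main(String[] args) {
        Map<Integer, Object> accepts = new HashMap<>();
        accepts.put(3, "ID");
        accepts.put(7, "IF");
        Set<Integer> both = new HashSet<>(Arrays.asList(1, 3, 7));
        Set<Integer> none = new HashSet<>(Arrays.asList(1, 2));
        System.out.println(recognize(both, accepts)); // IF
        System.out.println(recognize(none, accepts)); // null
    }
}
```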

state

private static Integer state()

Creates a new state in the lexical NFA.

Returns:
the new state in the lexical NFA.

terminal

boolean terminal(Object symbol)

Indicates whether a symbol is a terminal in this Lexicon.

Parameters:
symbol - the symbol whose status is requested.
Returns:
true if symbol is a terminal in this Lexicon; false otherwise.

transition

private static Lexicon.Set transition(Lexicon.Set from,
                                      char on,
                                      Lexicon.Set to)

Computes a transition using the lexical NFA.

Parameters:
from - the state from which the transition is made.
on - the character on which the transition is made.
to - the state to which the transition is made.
Returns:
the state to which the transition is made.
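
One way to read this signature is that to acts as an output buffer that is filled and then returned. The sketch below follows that reading, which is an assumption, as are the Edge class, the EDGES list, and the decision to clear to first. It also omits any null-closure step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a set-to-set NFA transition; not the gi.Lexicon source.
final class TransitionSketch {
    // Assumed edge representation: from --[any character in `on`]--> to.
    static final class Edge {
        final int from;
        final String on;
        final int to;
        Edge(int from, String on, int to) { this.from = from; this.on = on; this.to = to; }
    }

    static final List<Edge> EDGES = new ArrayList<>();

    // Fills `to` with every state reachable from a state in `from` on `on`.
    static Set<Integer> transition(Set<Integer> from, char on, Set<Integer> to) {
        to.clear();   // ASSUMPTION: `to` is reused as an output buffer
        for (Edge e : EDGES) {
            if (from.contains(e.from) && e.on.indexOf(on) >= 0) {
                to.add(e.to);
            }
        }
        return to;
    }

    public static void main(String[] args) {
        EDGES.add(new Edge(0, "abc", 1));
        EDGES.add(new Edge(0, "a", 2));
        EDGES.add(new Edge(1, "b", 3));
        Set<Integer> from = new TreeSet<>(Arrays.asList(0, 1));
        System.out.println(transition(from, 'b', new TreeSet<>())); // [1, 3]
        System.out.println(transition(from, 'a', new TreeSet<>())); // [1, 2]
    }
}
```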

word

public String word()

Returns the word most recently grabbed using this Lexicon.

Returns:
the word most recently grabbed by grab(source).

 
