TextTokenizer

aminePlatform.util.parserGenerator
Class TextTokenizer

java.lang.Object
  java.io.StreamTokenizer
      aminePlatform.util.parserGenerator.TextTokenizer

All Implemented Interfaces: AmineConstants

public class TextTokenizer
extends java.io.StreamTokenizer
implements AmineConstants

Title: parserGenerator.TextTokenizer Class

Description: TextTokenizer is responsible for the first step of text analysis: the lexical analysis of an Amine object, a CG, or a Prolog+CG program. It scans the text and tokenizes it; the resulting sequence of tokens (with their types) is stored in the vector vctTokenTokenType, which is used by the second step (syntactic analysis).

Field Summary

Fields inherited from class java.io.StreamTokenizer

nval, sval, TT_EOF, TT_EOL, TT_NUMBER, TT_WORD

Fields inherited from interface aminePlatform.util.AmineConstants

ANALOGY, B_ASSIGN, B_DSPLY_WT_DELAY, B_DSPLY_WTT_DELAY, B_TRIGGER, B_WTT_DSPLY, BLOCK_BACKWARD_PROPAGATION, BLOCK_FORWARD_PROPAGATION, CANON, CGIF, CGRAPHIC, CHECK_PRECONDITIONS, COMPARE, COMPOSED_GOAL, CONCEPT_TYPE_IDENT, CONTEXT, COVERED_BY, CPLTE_CONTRACT, DEFINITION, EQ_OR_MORE_SPCFQ, EQUAL, EXPAND, FALSE_FOCUS_LIST, FUNCTIONAL, GENERALISE, GENERALIZE, HAVE_AN_INTERSECTION, ID_ADD, ID_DIV, ID_EQ, ID_INF, ID_IS, ID_MESSAGE, ID_MUL, ID_NOT, ID_NULL, ID_OPER_AND, ID_OPER_OR, ID_SUB, ID_SUP, IN_ACTIVATION, IN_MODE, IN_MODE2, INDIVIDUAL, INDIVIDUAL_IDENT, INTEGRATED, IS_CANONIC, KEY_GLOBAL_RULE, LC_ADD, LC_AMINE_BOOLEAN, LC_AMINE_DOUBLE, LC_AMINE_INTEGER, LC_AND, LC_BOOLEAN, LC_CG, LC_CLOSE_BRKT, LC_CLOSE_PARENT, LC_CLOSE_SET, LC_COMMA, LC_COMMA_SEMI, LC_CONCEPT, LC_CONSTRUCTOR, LC_CS, LC_CUT, LC_DIFF, LC_DIV, LC_DOUBLE, LC_DSBL_BKWRD_PRPGTN, LC_DSBL_FRWRD_PRPGTN, LC_EOF, LC_EQ, LC_FOUR_POINTS, LC_IDENTIFIER, LC_IF, LC_INF, LC_INTEGER, LC_INTEROG, LC_IS, LC_JAVA_OBJECT, LC_LEFT_ARROW, LC_LIST, LC_NULL, LC_OPEN_BRKT, LC_OPEN_PARENT, LC_OPEN_SET, LC_OPER_AND, LC_OPER_OR, LC_POINT, LC_RELATION, LC_RGHT_ARROW, LC_SEMI_COMMA, LC_SET, LC_STAR, LC_STATE, LC_STRING, LC_SUB, LC_SUP, LC_TERM, LC_TWO_POINTS, LC_VAR_LIST_CONSTRUCTOR, LC_VARIABLE, LF, MAXIMAL_JOIN, MORE_GENERAL, MORE_SPECIFIC, NOTHING_TO_INTEGRATE, OPERS_WITH_RSLT, OUT_MODE, OUT_MODE2, PARTIAL_CONTRACT, PARTIAL_SUBSUME, PRJCT_OPERS, PROJECT, READ, READ_SENTENCE, RELATION_TYPE_IDENT, S_AND, S_BOOLEAN, S_CG, S_CLOSE_BRKT, S_CLOSE_PARENT, S_CLOSE_SET, S_COMMA, S_CONCEPT, S_CONSTRUCTOR, S_CUT, S_DIFF, S_DOUBLE, S_EOF, S_EQUAL, S_EXPAND, S_FALSE, S_FOUR_POINTS, S_GENERALISE, S_GENERALIZE, S_IDENTIFIER, S_IF, S_INTEGER, S_INTEROG, S_IS, S_IS_CANONIC, S_LEFT_ARROW, S_LIST, S_MAXIMAL_JOIN, S_OPEN_BRKT, S_OPEN_PARENT, S_OPEN_SET, S_POINT, S_RGHT_ARROW, S_SEMI_COMMA, S_SOURCE, S_SPECIALIZE, S_STATE, S_STRING, S_SUBSUME, S_SUBSUME_WITH_RESULT, S_SUPER, S_TARGET, S_TERM, S_THIS, S_TRUE, S_TWO_POINTS, S_UNIFY, S_VARIABLE, 
SITUATION, SPECIALIZE, STEADY, SUBSUME, SUBSUME_WITH_RSLT, TRIGGER, UNCOMPARABLE, UNIFY, VAR_SUPER, WAIT_ASSIGNMENT, WAIT_END_OF_ASSIGNMENT, WAIT_PRECONDITIONS, WAIT_VALUE

Constructor Summary
`TextTokenizer(java.lang.String s)` Creates a tokenizer and tokenizes the text passed as argument.

Method Summary
`void`	`back(int i)` Moves back i elements in the vector of token/tokenType couples.
`void`	`changeToken(byte newTokenType)`
`boolean`	`endOfStream()` Tests whether the next token is the end of the stream, i.e. the end of the vector of token/tokenType couples.
`void`	`finalize()`
`int`	`getCursor()` Gets the value of the cursor, i.e. the index in the vector of token/tokenType couples.
`int`	`getIndexOfCurrentToken()`
`java.lang.String`	`getTextToParse()`
`java.lang.String`	`getToken()` Get the current token
`java.lang.String`	`getTokenAhead(int n)`
`byte`	`getTokenType()` Get the token type
`java.lang.String`	`getVctTokenTokenType()`
`(package private) boolean`	`isTokenTowChars()`
`static boolean`	`isVariableIdentifier(java.lang.String token)` A variable identifier should start with a letter followed optionally by a digit or underscore.
`void`	`lexicalAnalysis()` This method performs the lexical analysis of the current text and corresponds to a loop that calls the method nxtToken1() at each iteration.
`byte`	`lookAhead(int n)` lookAhead(n) is used by syntactic analysis to look at the type of the n-th token, starting from the current one.
`java.lang.String`	`nameOfTknType(byte tknType)` Return the name of the specified token type, given as a byte.
`int`	`numberOfTokens()` Get the number of tokens in the vector of token/tokenType couples
`void`	`nxtToken()` This method assumes that the lexical analysis was done and that the tokens and their types are now stored in the vector of token/tokenType couples.
`(package private) void`	`nxtToken1()` This method calls the method nextToken() to read the next token and determines its type (i.e. its lexical category).
`(package private) void`	`recognizeIdentifier()` The current token begins with a letter or an underscore '_'.
`void`	`recognizeToken(byte pTokenType)` Like the method nxtToken(), this method assumes that the lexical analysis was done and that the tokens and their types are now stored in the vector of token/tokenType couples.
`void`	`recognizeToken(byte pTokenType1, byte pTokenType2)` Like the method nxtToken(), this method assumes that the lexical analysis was done and that the tokens and their types are now stored in the vector of token/tokenType couples.
`void`	`setCursor(int i)` Sets the cursor at index i in the vector of token/tokenType couples.
`int`	`tokenLength()` Get the length of the current token

Methods inherited from class java.io.StreamTokenizer

commentChar, eolIsSignificant, lineno, lowerCaseMode, nextToken, ordinaryChar, ordinaryChars, parseNumbers, pushBack, quoteChar, resetSyntax, slashSlashComments, slashStarComments, toString, whitespaceChars, wordChars

Methods inherited from class java.lang.Object

clone, equals, getClass, hashCode, notify, notifyAll, wait, wait, wait

Constructor Detail

TextTokenizer

public TextTokenizer(java.lang.String s)

The constructor creates a tokenizer and tokenizes the text passed as argument. The tokenizer is initialized as follows:

- the parameter s is the string to tokenize,

- underscore '_' is considered alphabetic, i.e. part of an identifier,

- Java-style one-line comments are recognized,

- Java-style multi-line comments are recognized,

- Prolog-style comments, which start with '%', are recognized too,

- Characters '.', '/' and '-' are considered as ordinary characters
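
The settings listed above map onto methods inherited from java.io.StreamTokenizer. The following is a minimal sketch of such a configuration, not the actual Amine source: the class ConfigSketch and its helper methods are hypothetical, and the real constructor may set further options.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the real TextTokenizer constructor may differ.
class ConfigSketch {

    // Build a StreamTokenizer configured as the list above describes.
    static StreamTokenizer configure(String s) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(s));
        st.wordChars('_', '_');        // underscore is part of an identifier
        st.slashSlashComments(true);   // Java-style one-line comments
        st.slashStarComments(true);    // Java-style multi-line comments
        st.commentChar('%');           // Prolog-style comments
        st.ordinaryChar('.');          // '.', '/' and '-' become
        st.ordinaryChar('/');          // single-character tokens
        st.ordinaryChar('-');
        return st;
    }

    // Collect the tokens as strings, for illustration only.
    static List<String> tokens(String s) {
        StreamTokenizer st = configure(s);
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    out.add(String.valueOf(st.nval));
                } else if (st.sval != null) {
                    out.add(st.sval);                       // word or quoted string
                } else {
                    out.add(String.valueOf((char) st.ttype)); // ordinary character
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

With this configuration, an input such as "a.b" yields three tokens (a, '.', b) rather than the single word that the StreamTokenizer defaults would produce, because '.' and '-' are no longer treated as numeric characters that may continue a word.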

Parameters:

s - the string to tokenize

Method Detail

getVctTokenTokenType

public java.lang.String getVctTokenTokenType()

finalize

public void finalize()

getToken

public java.lang.String getToken()

Get the current token

Returns: the current token

getTokenAhead

public java.lang.String getTokenAhead(int n)

getTextToParse

public java.lang.String getTextToParse()

getTokenType

public byte getTokenType()

Get the token type

Returns: the current token type

getIndexOfCurrentToken

public int getIndexOfCurrentToken()

setCursor

public void setCursor(int i)

Sets the cursor at index i in the vector of token/tokenType couples.

Parameters: i - the new value for the cursor associated with the vector of token/tokenType couples

getCursor

public int getCursor()

Gets the value of the cursor, i.e. the index in the vector of token/tokenType couples.

Returns: the value of the cursor

back

public void back(int i)

Moves back i elements in the vector of token/tokenType couples.

numberOfTokens

public int numberOfTokens()

Get the number of tokens in the vector of token/tokenType couples

Returns: the number of tokens in the vector of token/tokenType couples

nameOfTknType

public java.lang.String nameOfTknType(byte tknType)

Returns the name of the specified token type, given as a byte. The recognized token types are: S_BOOLEAN, S_IDENTIFIER, S_VARIABLE, S_STRING, S_INTEGER, S_DOUBLE, S_LIST, S_TERM, S_SUP, S_INF, S_COMMA, S_SEMI_COMMA, S_TWO_POINTS, S_CUT, S_INTEROG, S_EQ, S_OPEN_BRKT, S_CLOSE_BRKT, S_OPEN_SET, S_CLOSE_SET, S_OPEN_PARENT, S_CLOSE_PARENT, S_CONSTRUCTOR, S_POINT, S_ADD, S_SUB, S_STAR, S_DIV, S_DIFF, S_IF, S_FOUR_POINTS, S_RGHT_ARROW, S_LEFT_ARROW, S_EOF. These constants are defined in the interface util.AmineConstants.

Parameters: tknType - the token type as a byte
Returns: the name of the specified token type

lexicalAnalysis

public void lexicalAnalysis()
                     throws ParsingException

This method performs the lexical analysis of the current text. It is a loop that calls the method nxtToken1() at each iteration; nxtToken1() calls the method nextToken() to read the next token and determines its type (i.e. its lexical category).

Throws: ParsingException
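
The loop described above can be sketched with a plain java.io.StreamTokenizer. This is an illustration under assumptions, not the Amine implementation: the class ScanLoopSketch and the type Couple are hypothetical, and the raw StreamTokenizer ttype stands in for Amine's byte-valued lexical categories.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a scan loop that stores token/tokenType couples.
class ScanLoopSketch {

    // One token/tokenType couple (the type is the raw ttype, an assumption).
    static final class Couple {
        final String token;
        final int type;

        Couple(String token, int type) {
            this.token = token;
            this.type = type;
        }
    }

    static List<Couple> scan(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        List<Couple> couples = new ArrayList<>();
        try {
            // Each iteration reads one token and records it with its type.
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                String token;
                if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    token = String.valueOf(st.nval);
                } else if (st.sval != null) {
                    token = st.sval;                        // word or quoted string
                } else {
                    token = String.valueOf((char) st.ttype); // ordinary character
                }
                couples.add(new Couple(token, st.ttype));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return couples;
    }
}
```

Storing the couples up front is what lets the syntactic analysis move forward and backward over the tokens without re-reading the text.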

nxtToken1

void nxtToken1()
         throws java.io.IOException

This method calls the method nextToken() to read the next token and determines its type (i.e. its lexical category).

Throws: java.io.IOException

nxtToken

public void nxtToken()

This method assumes that the lexical analysis was done and that the tokens and their types are now stored in the vector of token/tokenType couples. This method reads the current element from this vector and assigns, in the two attributes token and tokenType, the current couple of token/tokenType. This method and the two attributes (token/getToken() and tokenType/getTokenType()) are used by the syntactic analysis process and constitute the main interface between the lexical analysis, done by this class, and syntactic analysis (of CG or Prolog+CG program).

changeToken

public void changeToken(byte newTokenType)

endOfStream

public boolean endOfStream()

Tests whether the next token is the end of the stream, i.e. the end of the vector of token/tokenType couples.

lookAhead

public byte lookAhead(int n)

lookAhead(n) is used by the syntactic analysis to look at the type of the n-th token, starting from the current one. The lookup is done in the vector of token/tokenType couples.
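
The cursor-based methods (nxtToken, endOfStream, lookAhead, back) can be pictured as operations on an index into the stored couples. The sketch below is a hypothetical model, not the Amine code: the class CursorSketch, its array-based storage, and in particular the convention that lookAhead(1) designates the couple at the cursor are all assumptions.

```java
// Hypothetical model of the token/tokenType vector and its cursor.
class CursorSketch {
    private final String[] tokens;
    private final byte[] types;
    private int cursor = 0;   // index of the next couple to read

    String token;             // current token
    byte tokenType;           // current token type

    CursorSketch(String[] tokens, byte[] types) {
        this.tokens = tokens;
        this.types = types;
    }

    // True when every couple has been read.
    boolean endOfStream() {
        return cursor >= tokens.length;
    }

    // Copy the couple at the cursor into token/tokenType, then advance.
    void nxtToken() {
        token = tokens[cursor];
        tokenType = types[cursor];
        cursor++;
    }

    // Type of the n-th couple counted from the cursor (n = 1 is the next
    // couple to be read); this indexing convention is an assumption.
    byte lookAhead(int n) {
        return types[cursor + n - 1];
    }

    // Move the cursor back by i couples, never below zero.
    void back(int i) {
        cursor = Math.max(0, cursor - i);
    }
}
```

A parser built on such a model can peek with lookAhead to choose a grammar rule and call back to retry after a failed attempt.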

recognizeIdentifier

void recognizeIdentifier()

The current token begins with a letter or an underscore '_'. An identifier is either a variable identifier, a boolean or a constant identifier.

isVariableIdentifier

public static boolean isVariableIdentifier(java.lang.String token)

A variable identifier should start with a letter followed optionally by a digit or underscore. After the digit or the underscore, a variable can have any sequence of characters. A variable identifier can also begin with an underscore followed optionally by any sequence of characters. There are also special cases of identifiers, "super", "this", "x_source", and "y_target", that are considered as variables.

Returns: true if the specified token is a variable, and false otherwise
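
One literal reading of the rule above is sketched here. It is an interpretation, not the Amine source: the class VariableRuleSketch is hypothetical, and the use of Character.isLetter and Character.isDigit for "letter" and "digit" is an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the variable-identifier rule as described above.
class VariableRuleSketch {

    // Identifiers that the description lists as always being variables.
    private static final List<String> SPECIAL =
            Arrays.asList("super", "this", "x_source", "y_target");

    static boolean isVariableIdentifier(String token) {
        if (token == null || token.isEmpty()) {
            return false;
        }
        if (SPECIAL.contains(token)) {
            return true;
        }
        char first = token.charAt(0);
        if (first == '_') {
            return true;               // underscore, then anything
        }
        if (!Character.isLetter(first)) {
            return false;
        }
        if (token.length() == 1) {
            return true;               // a single letter
        }
        char second = token.charAt(1);
        // Letter followed by a digit or underscore, then anything.
        return Character.isDigit(second) || second == '_';
    }
}
```

Under this reading, "x", "x1", "x_val" and "_tmp" are variables, while "Man" is not, since its second character is neither a digit nor an underscore.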

isTokenTowChars

boolean isTokenTowChars()

Returns: true if the current token is composed of two characters

tokenLength

public int tokenLength()

Get the length of the current token

Returns: the length of the current token

recognizeToken

public void recognizeToken(byte pTokenType)
                    throws ParsingException

Like the method nxtToken(), this method assumes that the lexical analysis was done and that the tokens and their types are now stored in the vector of token/tokenType couples. The method reads the next token and determines its type, from the vector of token/tokenType couples, and checks that the tokenType is equal to the specified tokenType.

Parameters: pTokenType - the tokenType that should be recognized
Throws: ParsingException - if the tokenType of the current token is not identical to the specified tokenType

recognizeToken

public void recognizeToken(byte pTokenType1,
                           byte pTokenType2)
                    throws ParsingException

Parameters: pTokenType1 - a tokenType; pTokenType2 - a tokenType
Throws: ParsingException - if the tokenType of the current token is identical to neither pTokenType1 nor pTokenType2
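
The two recognizeToken overloads follow the usual "expect" pattern of hand-written parsers: read the next couple, then fail if its type is not the expected one. The sketch below illustrates that pattern only. The class ExpectSketch and the exception SketchParsingException are hypothetical stand-ins, and whether the real methods advance the cursor before checking is an assumption based on the description.

```java
// Hypothetical sketch of the "expect" pattern behind recognizeToken.
class ExpectSketch {

    // Stand-in for Amine's ParsingException, whose API is not shown here.
    static class SketchParsingException extends Exception {
        SketchParsingException(String message) {
            super(message);
        }
    }

    private final String[] tokens;
    private final byte[] types;
    private int cursor = 0;

    String token;     // current token
    byte tokenType;   // current token type

    ExpectSketch(String[] tokens, byte[] types) {
        this.tokens = tokens;
        this.types = types;
    }

    // Read the next couple from the stored tokens.
    private void advance() throws SketchParsingException {
        if (cursor >= tokens.length) {
            throw new SketchParsingException("unexpected end of input");
        }
        token = tokens[cursor];
        tokenType = types[cursor];
        cursor++;
    }

    // Read the next couple and require its type to equal the expected one.
    void recognizeToken(byte expected) throws SketchParsingException {
        advance();
        if (tokenType != expected) {
            throw new SketchParsingException(
                    "expected type " + expected + " but found " + tokenType);
        }
    }

    // Read the next couple and require its type to be one of two types.
    void recognizeToken(byte expected1, byte expected2)
            throws SketchParsingException {
        advance();
        if (tokenType != expected1 && tokenType != expected2) {
            throw new SketchParsingException(
                    "expected type " + expected1 + " or " + expected2
                            + " but found " + tokenType);
        }
    }
}
```

The two-type overload is convenient where the grammar allows alternatives at one position, for example either a comma or a closing bracket after a list element.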
