Class: LEMTokenizer

LEMTokenizer consumes a string of characters and divides it into a list of tokens, according to the following grammar specified in Extended Backus-Naur Form. Note that whitespace (as defined in the grammar) is not considered a token and will not be returned by this tokenizer.

In the following definition, [] denotes an option and {} denotes repetition.

input =
    [ (token | whitespace), input ]
    
whitespace = 
    ' ' | '\t' | '\n' | '\r'
    
token = 
    one of the token types defined in LEMTokenType

Constructor

new LEMTokenizer(query)

Constructs a LEMTokenizer to tokenize the provided query.
Parameters:
    query (String): The raw query string to tokenize.
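
A minimal usage sketch is shown below. The query string 'foo \t bar' is purely illustrative, and the actual Token values produced depend on the token types defined in LEMTokenType.

    // Illustrative only: tokenize a hypothetical query and collect its tokens.
    const tokenizer = new LEMTokenizer('foo \t bar');
    const tokens = [];
    while (tokenizer.hasNext()) {
        tokens.push(tokenizer.next());
    }
    // tokens now holds the tokens produced for 'foo' and 'bar' only; the
    // whitespace between them was consumed but, per the grammar above, is
    // never returned as a token.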

Methods

getNumCharsConsumed() → {Integer}

Gets the number of characters consumed by the tokenizer so far.
Returns:
    The number of characters consumed.
    Type: Integer
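
One way a caller might use this method, sketched below, is to point at how far into the query the tokenizer has read, for example when reporting a problem. The helper function is hypothetical and not part of this class.

    // Hypothetical helper: print the query with a caret just past the
    // characters consumed so far.
    function showProgress(query, tokenizer) {
        const consumed = tokenizer.getNumCharsConsumed();
        console.log(query);
        console.log(' '.repeat(consumed) + '^');
    }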

hasNext() → {Boolean}

Returns true if there are more tokens to be read.
Returns:
    True if there are more tokens.
    Type: Boolean

next() → {Token}

Gets the next token and removes it from the stream of produced tokens. If there is no next token, this method returns null.
Returns:
    The next token, or null if there is no next token.
    Type: Token
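
The sketch below illustrates the null return once the stream is exhausted; the query is hypothetical and assumed to produce at least one token.

    // Drain the stream, then observe that next() returns null.
    const tokenizer = new LEMTokenizer('foo');
    while (tokenizer.hasNext()) {
        tokenizer.next();
    }
    console.log(tokenizer.next()); // null: no tokens remain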

peek() → {Token}

Gets the next token but does not remove it from the stream. If there is no next token, this method returns null.
Returns:
    The next token, or null if there is no next token.
    Type: Token
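
Because peek() does not advance the stream, a subsequent next() returns the same token. The sketch below assumes a hypothetical query that produces at least one token.

    // Look ahead without consuming, then consume.
    const tokenizer = new LEMTokenizer('foo bar');
    const lookahead = tokenizer.peek(); // does not remove the token
    const consumed = tokenizer.next();  // removes the same token
    // lookahead and consumed refer to the same token.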