Simple stream-oriented parser for efficient basic recognition of incoming data.
Class Tokenizer( [seps],[options],[tokLen],[source] )
seps | A string representing the separators. |
options | Tokenization options. |
tokLen | Maximum length of returned tokens. |
source | The string to be tokenized, or a stream to be read for tokens. |
The Tokenizer class provides simple and efficient logic to parse incoming data (mainly data coming from strings).
The source can also be set later with the Tokenizer.parse method. seps defaults to " " if not given.
The options parameter can be a bitwise combination of the following values:
- Tokenizer.groupsep: Groups consecutive separators into one. If not given, when a separator immediately follows another, an empty field is returned.
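As a minimal sketch of how the constructor parameters fit together, the snippet below builds a tokenizer with an explicit separator and the groupsep option, then supplies the source through Tokenizer.parse. The input string is made up for illustration, and the sketch assumes that groupsep treats a run of adjacent separators as a single one.

```falcon
// Positional parameters follow the documented order: seps, options.
t = Tokenizer( " ", Tokenizer.groupsep )

// The source is supplied later; this input (with repeated spaces) is hypothetical.
t.parse( "alpha  beta   gamma" )

while t.hasCurrent()
   > "Token: ", t.token()
   t.next()
end
```

Dropping the Tokenizer.groupsep argument from the constructor call is a quick way to compare the two behaviors on the same input.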
Methods | |
hasCurrent | Returns true if the tokenizer has a current token. |
next | Advances the tokenizer up to the next token. |
nextToken | Returns the next token from the tokenizer. |
parse | Changes or sets the source data for this tokenizer. |
rewind | Resets the status of the tokenizer. |
token | Gets the current token. |
Returns true if the tokenizer has a current token.
Tokenizer.hasCurrent()
Return | True if a token is now available, false otherwise. |
Unlike with iterators, Tokenizer.next must be called at least once before calling this method.
See also: Tokenizer.
Advances the tokenizer up to the next token.
Tokenizer.next()
Return | True if a new token is now available, false otherwise. |
Raise |
|
For example:
    t = Tokenizer( source|"A string to be tokenized" )
    while t.hasCurrent()
       > "Token: ", t.token()
       t.next()
    end
See also: Tokenizer.
Returns the next token from the tokenizer.
Tokenizer.nextToken()
Return | A string or nil at the end of the tokenization. |
Raise |
|
This method is actually a combination of Tokenizer.next followed by Tokenizer.token.
Sample usage:
    t = Tokenizer( source|"A string to be tokenized" )
    while (token = t.nextToken()) != nil
       > "Token: ", token
    end
Note: When looping, remember to check the value of the returned token against nil, as empty strings can be legally returned multiple times, and they are considered false in logic checks.
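The note above can be illustrated with a short sketch. It assumes default options (no Tokenizer.groupsep), so the doubled space in the made-up input is expected to yield an empty token. The variable count is introduced here only for illustration.

```falcon
// Hypothetical input with two adjacent separators.
t = Tokenizer( source|"first  second" )
count = 0

// Compare against nil explicitly: an empty string is a valid token,
// but it would end a loop that tests the token's truth value.
while (token = t.nextToken()) != nil
   if token == ""
      > "(empty token)"
   else
      > "Token: ", token
   end
   count += 1
end

> "Tokens read: ", count
```

Writing the loop condition as a plain truth test on the token would stop at the first empty field instead of at the end of the input.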
Changes or sets the source data for this tokenizer.
Tokenizer.parse( source )
source | A string or a stream to be used as a source for the tokenizer. |
Raise |
|
The first token is immediately read and set as the current token. If at least one token can be read, Tokenizer.hasCurrent returns true and Tokenizer.token returns its value.
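Because parse replaces the source and loads the first token right away, one tokenizer can be reused across several inputs. The following is a minimal sketch under that reading. The comma separator and the sample records are assumptions made for illustration.

```falcon
// One tokenizer, configured once with a comma as separator.
t = Tokenizer( "," )

// Hypothetical records to split into fields.
records = [ "a,b,c", "d,e" ]

for record in records
   // parse() sets the new source; the first token becomes current immediately.
   t.parse( record )
   while t.hasCurrent()
      > "Field: ", t.token()
      t.next()
   end
end
```

Reusing the instance keeps the separator and option settings in one place instead of repeating them for every input.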
Resets the status of the tokenizer.
Tokenizer.rewind()
Raise |
|
Gets the current token.
Tokenizer.token()
Return | The current token. |
Raise |
|
This method returns the current token.