Tokens

A token is a sequence of characters that forms a coherent text unit in a document.

Token

This class is the base class for all token classes.

Inheritance diagram:

Inheritance diagram of pyTokenizer.SuperToken, pyTokenizer.ValuedToken, pyTokenizer.StartOfDocumentToken, pyTokenizer.CharacterToken, pyTokenizer.SpaceToken, pyTokenizer.DelimiterToken, pyTokenizer.NumberToken, pyTokenizer.StringToken

class pyTokenizer.Token(previousToken, start, end=None)

Bases: object

SuperToken

class pyTokenizer.SuperToken(startToken, endToken=None)

Bases: pyTokenizer.Token

ValuedToken

class pyTokenizer.ValuedToken(previousToken, value, start, end=None)

Bases: pyTokenizer.Token

StartOfDocumentToken

A token stream starts with a StartOfDocumentToken.

class pyTokenizer.StartOfDocumentToken

Bases: pyTokenizer.ValuedToken
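
The constructor signatures on this page suggest that each token keeps a reference to its predecessor, so a stream would form a backward-linked chain rooted at a StartOfDocumentToken. The following is a minimal sketch of that idea using stand-in classes; it is not pyTokenizer's implementation, and the attribute names (previous_token, start, end, value), the default for end, and the helper walk_back are all assumptions made for illustration.

```python
class Token:
    """Stand-in base class: remembers the preceding token and a position range."""

    def __init__(self, previous_token, start, end=None):
        self.previous_token = previous_token
        self.start = start
        # Assumption: a missing end means a one-character token.
        self.end = start if end is None else end


class ValuedToken(Token):
    """Stand-in for a token that also carries the text it covers."""

    def __init__(self, previous_token, value, start, end=None):
        super().__init__(previous_token, start, end)
        self.value = value


class StartOfDocumentToken(ValuedToken):
    """Stand-in root of the chain: no predecessor and no value."""

    def __init__(self):
        super().__init__(previous_token=None, value=None, start=0)


def walk_back(token):
    """Hypothetical helper: collect tokens from `token` back to the root."""
    chain = []
    while token is not None:
        chain.append(token)
        token = token.previous_token
    return chain


sod = StartOfDocumentToken()
first = ValuedToken(sod, "abc", start=0, end=2)
second = ValuedToken(first, "42", start=3, end=4)
chain = walk_back(second)
```

Walking back from any token in such a chain always ends at the start-of-document token, which gives consumers a well-defined sentinel instead of a None check at the first real token.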

CharacterToken

class pyTokenizer.CharacterToken(previousToken, value, start)

Bases: pyTokenizer.ValuedToken

SpaceToken

class pyTokenizer.SpaceToken(previousToken, value, start, end=None)

Bases: pyTokenizer.ValuedToken

DelimiterToken

class pyTokenizer.DelimiterToken(previousToken, value, start, end=None)

Bases: pyTokenizer.ValuedToken

NumberToken

A NumberToken represents a number (RegExp: [0-9]+).

class pyTokenizer.NumberToken(previousToken, value, start, end=None)

Bases: pyTokenizer.ValuedToken
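
To make the pattern [0-9]+ concrete, the sketch below applies it with Python's standard re module. It only illustrates what the pattern matches; the function find_numbers is hypothetical and does not call pyTokenizer, and the end offset shown here is exclusive, which may differ from the library's convention.

```python
import re

# The pattern quoted in the NumberToken description.
NUMBER_PATTERN = re.compile(r"[0-9]+")


def find_numbers(text):
    """Return (value, start, end) for each maximal run of ASCII digits."""
    return [(m.group(), m.start(), m.end()) for m in NUMBER_PATTERN.finditer(text)]


numbers = find_numbers("port 8080, retry 3")
signed_decimal = find_numbers("-1.5")
```

Note that the pattern covers unsigned digit runs only: in "-1.5" the sign and the decimal point are not part of any match, so the input yields two separate runs, "1" and "5".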

StringToken

A StringToken represents a word (RegExp: [a-zA-Z]+).

class pyTokenizer.StringToken(previousToken, value, start, end=None)

Bases: pyTokenizer.ValuedToken
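
As a rough illustration of how the two documented patterns partition an input, the sketch below labels each run of text by the pattern it matches. The function classify and the catch-all label "other" are assumptions for this example; how pyTokenizer itself assigns the remaining characters to SpaceToken, DelimiterToken, or CharacterToken is not described on this page.

```python
import re

# Named groups reuse the documented patterns; "other" is a hypothetical
# catch-all for any single character that neither pattern matches.
SCANNER = re.compile(
    r"(?P<NumberToken>[0-9]+)|(?P<StringToken>[a-zA-Z]+)|(?P<other>.)",
    re.DOTALL,
)


def classify(text):
    """Return (label, value) pairs covering every character of `text`."""
    return [(m.lastgroup, m.group()) for m in SCANNER.finditer(text)]


kinds = classify("abc123 x")
```

Because [a-zA-Z]+ is limited to ASCII letters, a word containing an accented character would be split at that character under these patterns.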