Lexer
Lexica is a whitespace-sensitive language: indentation determines lexical scope. The lexer must therefore translate indentation tokens into logical block open and block close tokens.
The lexer is a pipeline that turns a string into a stream of tokens. Unlike a traditional iterator, this stream always produces a token: end of input is denoted by a special token variant rather than by the absence of a value. This simplifies parsing, because the parser never has to handle a missing token.
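To illustrate the "always produces a token" idea, here is a minimal sketch in Rust. The names `Token`, `TokenStream`, and `next_token` are hypothetical and chosen only for this example; they are not taken from Lexica's source, and the whitespace-based splitting stands in for the real tokenizer.

```rust
// Hypothetical token type for illustration; Lexica's real token type may differ.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Word(String),
    // Special variant that marks the end of input.
    Eof,
}

// Hypothetical stream type: holds pre-split tokens and a cursor.
struct TokenStream {
    tokens: Vec<Token>,
    pos: usize,
}

impl TokenStream {
    // Assumption: tokens are separated by whitespace, to keep the sketch small.
    fn new(source: &str) -> Self {
        let tokens = source
            .split_whitespace()
            .map(|s| Token::Word(s.to_string()))
            .collect();
        TokenStream { tokens, pos: 0 }
    }

    // Unlike Iterator::next, this returns a Token rather than Option<Token>.
    // Once the input is exhausted, every further call yields Token::Eof.
    fn next_token(&mut self) -> Token {
        match self.tokens.get(self.pos) {
            Some(token) => {
                self.pos += 1;
                token.clone()
            }
            None => Token::Eof,
        }
    }
}
```

With this shape, a parser can match on the returned token directly and treat `Token::Eof` as just another case, instead of unwrapping an `Option` at every step.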
Each stage in the pipeline transforms the provided input. An intermediary LexerToken type is used to thread Indent tokens through to further stages.
Stage | Description |
Source Split | Splits a string into string slices with span information |
Lexer Tokenize | Maps each string slice into a |
Indent Lexer | Maintains the indentation level and produces logical blocks |
Space Lexer | Modifies logical tokens based on surrounding context |
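The indentation stage can be sketched roughly as follows. This is a simplified illustration rather than Lexica's implementation: `BlockToken` and `indent_blocks` are hypothetical names, the function works on whole lines instead of a token stream, indentation is assumed to consist of spaces only, and blank lines are skipped. It keeps a stack of open indentation levels and emits a block open token when the level increases and one block close token for each level that is left.

```rust
// Hypothetical output type for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum BlockToken {
    BlockOpen,
    BlockClose,
    Line(String),
}

// Converts leading-space indentation into logical block tokens.
// Assumptions: spaces only (no tabs); blank lines are ignored.
fn indent_blocks(source: &str) -> Vec<BlockToken> {
    // Stack of currently open indentation widths; level 0 is always present.
    let mut stack: Vec<usize> = vec![0];
    let mut out = Vec::new();

    for line in source.lines() {
        let content = line.trim_start_matches(' ');
        if content.trim().is_empty() {
            continue;
        }
        let indent = line.len() - content.len();
        let current = *stack.last().unwrap();

        if indent > current {
            // Deeper indentation opens a new block.
            stack.push(indent);
            out.push(BlockToken::BlockOpen);
        } else {
            // Shallower indentation closes every block that is deeper than it.
            // The loop stops at level 0 at the latest, so the stack never empties.
            while indent < *stack.last().unwrap() {
                stack.pop();
                out.push(BlockToken::BlockClose);
            }
        }
        out.push(BlockToken::Line(content.trim_end().to_string()));
    }

    // Close any blocks still open at the end of input.
    while stack.len() > 1 {
        stack.pop();
        out.push(BlockToken::BlockClose);
    }
    out
}
```

This sketch does not report an error when a line dedents to a width that was never opened; a real lexer would likely want to diagnose that case.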