Lexer
Lexica is a whitespace-delimited language: indentation determines lexical scope. The lexer must therefore convert indentation tokens into logical block-open and block-close tokens.
The lexer is a pipeline that turns a string into a stream of tokens. Unlike a traditional iterator, this stream always produces a token; end of input is denoted by a special token variant rather than by the stream ending. This simplifies parsing the language.
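The always-produces-a-token behaviour can be sketched as follows. This is a minimal illustration, not the Lexica implementation: the names `Token`, `TokenStream` and `next_token` are assumptions, and splitting on whitespace stands in for the real lexing stages.

```rust
/// Hypothetical token type; the variant names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Word(String),
    /// Marks the end of input, instead of the stream returning `None`.
    Eof,
}

/// A stream that always yields a token. Once the input is exhausted
/// it keeps returning `Token::Eof`.
pub struct TokenStream {
    tokens: std::vec::IntoIter<Token>,
}

impl TokenStream {
    /// Assumption: whitespace splitting stands in for the real pipeline.
    pub fn new(source: &str) -> Self {
        let tokens: Vec<Token> = source
            .split_whitespace()
            .map(|s| Token::Word(s.to_string()))
            .collect();
        TokenStream { tokens: tokens.into_iter() }
    }

    /// Unlike `Iterator::next`, this returns `Token`, not `Option<Token>`.
    pub fn next_token(&mut self) -> Token {
        self.tokens.next().unwrap_or(Token::Eof)
    }
}
```

A parser built on such a stream can match on `Token::Eof` like any other variant and never has to unwrap an `Option` at each step.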
Each stage in the pipeline transforms the provided input. An intermediary LexerToken type is used to thread Indent tokens through to further stages.
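One way to picture an intermediary token type that carries indentation forward is sketched below. The shape of `LexerToken`, the `Span` struct and the `tokenize` function are all assumptions made for illustration. The sketch also assumes space-only indentation and `\n` line endings, and it skips blank lines.

```rust
/// Byte range of a slice within the source (hypothetical shape).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// Intermediate token: a raw indentation measurement or a text slice.
#[derive(Debug, Clone, PartialEq)]
pub enum LexerToken<'a> {
    /// Number of leading spaces on a non-blank line.
    Indent(usize),
    /// A slice of the source together with its byte span.
    Text(&'a str, Span),
}

/// Splits `source` into slices with spans and maps them to `LexerToken`s,
/// emitting one `Indent` token at the start of every non-blank line.
pub fn tokenize(source: &str) -> Vec<LexerToken<'_>> {
    let mut out = Vec::new();
    let mut offset = 0; // byte offset of the current line's first character
    for line in source.split('\n') {
        let trimmed = line.trim_start_matches(' ');
        let indent = line.len() - trimmed.len();
        if !trimmed.trim().is_empty() {
            out.push(LexerToken::Indent(indent));
            let mut pos = offset + indent;
            for word in trimmed.split(' ') {
                if !word.is_empty() {
                    let span = Span { start: pos, end: pos + word.len() };
                    out.push(LexerToken::Text(word, span));
                }
                pos += word.len() + 1; // skip the word and one separator
            }
        }
        offset += line.len() + 1; // account for the '\n'
    }
    out
}
```

Later stages can then consume the `Indent` variants and leave the `Text` variants untouched.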
| Stage | Description |
| --- | --- |
| Source Split | Splits a string into string slices with span information |
| Lexer Tokenize | Maps each string slice into a LexerToken |
| Indent Lexer | Maintains the indentation level and produces logical blocks |
| Space Lexer | Modifies logical tokens based on surrounding context |
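The Indent Lexer stage can be illustrated with the common indentation-stack technique. This is a sketch under assumptions rather than Lexica's actual code: the `Block` enum and `blocks` function are hypothetical, the input is simply the indentation width of each line, and a dedent to a level that was never opened is not reported as an error.

```rust
/// Hypothetical logical tokens produced from indentation levels.
#[derive(Debug, Clone, PartialEq)]
pub enum Block {
    Open,
    Close,
    /// Index of the source line that follows the open/close tokens.
    Line(usize),
}

/// Converts per-line indentation widths into block open/close tokens
/// by maintaining a stack of currently open indentation levels.
pub fn blocks(indents: &[usize]) -> Vec<Block> {
    let mut stack = vec![0usize]; // the base level is never popped
    let mut out = Vec::new();
    for (i, &level) in indents.iter().enumerate() {
        if level > *stack.last().unwrap() {
            // Deeper indentation opens exactly one new block.
            stack.push(level);
            out.push(Block::Open);
        } else {
            // Shallower indentation closes every block deeper than `level`.
            while level < *stack.last().unwrap() {
                stack.pop();
                out.push(Block::Close);
            }
        }
        out.push(Block::Line(i));
    }
    // Close any blocks still open at end of input.
    while stack.len() > 1 {
        stack.pop();
        out.push(Block::Close);
    }
    out
}
```

For example, indentation widths `[0, 2, 4, 0]` open two nested blocks and close both before the final line.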