Module pl.lexer
Lexical scanner for creating a sequence of tokens from text.
lexer.scan(s) returns an iterator over all tokens found in the string s. This iterator returns two values: a token type string (such as ‘string’ for a quoted string, ‘iden’ for an identifier) and the value of the token.
Versions specialized for Lua and C are available; these also handle block comments and classify keywords as ‘keyword’ tokens. For example:
    > s = 'for i=1,n do'
    > for t,v in lexer.lua(s) do print(t,v) end
    keyword for
    iden    i
    =       =
    number  1
    ,       ,
    iden    n
    keyword do
See the Guide for further discussion.
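For comparison, a plain scan knows nothing about Lua keywords, so (as a sketch of the expected output) 'for' and 'do' come back as ordinary identifiers:

    > for t,v in lexer.scan(s) do print(t,v) end
    iden    for
    iden    i
    =       =
    number  1
    ,       ,
    iden    n
    iden    do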
Functions

scan (s, matches[, filter[, options]]) | create a plain token iterator from a string or file-like object.
insert (tok, a1, a2) | insert tokens into a stream.
getline (tok) | get everything in a stream up to a newline.
lineno (tok) | get the current line number.
getrest (tok) | get the rest of the stream.
get_keywords () | get the Lua keywords as a set-like table.
lua (s[, filter[, options]]) | create a Lua token iterator from a string or file-like object.
cpp (s[, filter[, options]]) | create a C/C++ token iterator from a string or file-like object.
get_separated_list (tok[, endtoken=')'[, delim=',']]) | get a list of parameters separated by a delimiter from a stream.
skipws (tok) | get the next non-space token from the stream.
expecting (tok, expected_type, no_skip_ws) | get the next token, which must be of the expected type.
Functions
- scan (s, matches[, filter[, options]])

  create a plain token iterator from a string or file-like object.

  Parameters:
  - s (string or file): a string, or a file-like object with a :read() method returning lines.
  - matches (tab): an optional match table, an array of token descriptions. A token is described by a {pattern, action} pair, where pattern should match the token body and action is a function called when a token of the described type is found.
  - filter (tab): a table of token types to exclude; by default {space=true}. (optional)
  - options (tab): a table of options; by default {number=true,string=true}, which means convert numbers and strip string quotes. (optional)
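  A minimal sketch of a custom match table; the patterns here are illustrative, and the actions yield a (type, value) pair via coroutine.yield, as the built-in actions do. An empty filter is passed so no token type is excluded:

    local lexer = require 'pl.lexer'
    local yield = coroutine.yield

    -- illustrative match table: {pattern, action} pairs, tried in order
    local matches = {
      {'^%s+',         function(tok) return yield('space', tok) end},
      {'^%d+',         function(tok) return yield('number', tonumber(tok)) end},
      {'^[%a_][%w_]*', function(tok) return yield('iden', tok) end},
      {'^.',           function(tok) return yield(tok, tok) end},
    }

    -- empty filter table: keep all tokens, including space
    for t,v in lexer.scan('x1 = 42', matches, {}) do
      print(t,v)
    end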
- insert (tok, a1, a2)

  insert tokens into a stream.

  Parameters:
  - tok: a token stream
  - a1: a string is the type, a table is a token list, and a function is assumed to be a token-like iterator (returning type and value)
  - a2 (string): the value
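  A usage sketch, assuming the inserted token is returned ahead of the remaining input:

    local lexer = require 'pl.lexer'
    local tok = lexer.lua 'x + 1'
    print(tok())                    --> iden  x
    lexer.insert(tok, 'iden', 'y')  -- a1 is a type string, a2 the value
    print(tok())                    --> iden  y  ('+' and 1 then follow)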
- getline (tok)

  get everything in a stream up to a newline.

  Parameters:
  - tok: a token stream

  Returns:
  - a string
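  A sketch of typical use, grabbing the unscanned remainder of the current line:

    local lexer = require 'pl.lexer'
    local tok = lexer.lua('local x = 1\nlocal y = 2')
    tok()                      -- consume 'local'
    print(lexer.getline(tok))  -- roughly ' x = 1'; scanning resumes on line 2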
- lineno (tok)

  get the current line number.

  Parameters:
  - tok: a token stream

  Returns:
  - the line number. If the input source is a file-like object, the column is also returned.
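  A sketch (the exact values depend on how much of the stream has been consumed):

    local lexer = require 'pl.lexer'
    local tok = lexer.lua('a = 1\nb = 2')
    tok(); tok(); tok()       -- consume 'a', '=' and 1, still on line 1
    print(lexer.lineno(tok))  -- 1; a file-like source would also yield a column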
- getrest (tok)

  get the rest of the stream.

  Parameters:
  - tok: a token stream

  Returns:
  - a string
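  A sketch; getrest drains whatever has not been scanned yet and returns it as one string:

    local lexer = require 'pl.lexer'
    local tok = lexer.lua('local x = 1')
    tok()                      -- consume 'local'
    print(lexer.getrest(tok))  -- roughly 'x = 1'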
- get_keywords ()

  get the Lua keywords as a set-like table, so res["and"] etc. would be true.

  Returns:
  - a table
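  A quick illustration:

    local lexer = require 'pl.lexer'
    local kw = lexer.get_keywords()
    print(kw["and"], kw["while"], kw["foo"])  --> true  true  nil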
- lua (s[, filter[, options]])

  create a Lua token iterator from a string or file-like object. Will return the token type and value.

  Parameters:
  - s (string): the string
  - filter (tab): a table of token types to exclude; by default {space=true,comments=true}. (optional)
  - options (tab): a table of options; by default {number=true,string=true}, which means convert numbers and strip string quotes. (optional)
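  A sketch of overriding the default filter so that comments are kept; only space tokens are excluded here, and comment tokens are assumed to have type 'comment':

    local lexer = require 'pl.lexer'
    for t,v in lexer.lua('x = 1 -- set x', {space=true}) do
      print(t,v)   -- iden x, = =, number 1, then the trailing comment
    end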
- cpp (s[, filter[, options]])

  create a C/C++ token iterator from a string or file-like object. Will return the token type and value.

  Parameters:
  - s (string): the string
  - filter (tab): a table of token types to exclude; by default {space=true,comments=true}. (optional)
  - options (tab): a table of options; by default {number=true,string=true}, which means convert numbers and strip string quotes. (optional)
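  A sketch; C keywords such as 'int' are classified as 'keyword', and comments are excluded by the default filter:

    local lexer = require 'pl.lexer'
    for t,v in lexer.cpp('int x = 42; // answer') do
      print(t,v)   -- keyword int, iden x, = =, number 42, ; ;
    end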
- get_separated_list (tok[, endtoken=')'[, delim=',']])

  get a list of parameters separated by a delimiter from a stream.

  Parameters:
  - tok: the token stream
  - endtoken (string): end of list. Can be '\n'. (default ')')
  - delim (string): separator (default ',')

  Returns:
  - a list of token lists.
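  A sketch, assuming the stream is already positioned after an opening '(':

    local lexer = require 'pl.lexer'
    local tok = lexer.lua('10, 20+1, 30)')
    local ltl = lexer.get_separated_list(tok)
    -- ltl is a list of token lists; e.g. ltl[2] would hold the tokens of '20+1'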
- skipws (tok)

  get the next non-space token from the stream.

  Parameters:
  - tok: the token stream
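  A sketch, keeping space tokens in the stream by passing an empty filter:

    local lexer = require 'pl.lexer'
    local tok = lexer.lua('  x = 1', {})
    print(lexer.skipws(tok))   --> iden  x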
- expecting (tok, expected_type, no_skip_ws)

  get the next token, which must be of the expected type. Throws an error if the type does not match!

  Parameters:
  - tok: the token stream
  - expected_type (string): the token type
  - no_skip_ws (bool): whether to skip whitespace
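  A sketch; on success the token's value is assumed to be returned, and whitespace is skipped unless no_skip_ws is set:

    local lexer = require 'pl.lexer'
    local tok = lexer.lua('x = 42')
    local name = lexer.expecting(tok, 'iden')  -- 'x'
    lexer.expecting(tok, '=')                  -- punctuation uses the character itself as type
    local n = lexer.expecting(tok, 'number')   -- 42
    -- lexer.expecting(tok, 'string') would raise an error here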