Module scan
A Lexical Scanner.
Lexical scanners are a smarter and cleaner alternative to the primitive strtok
function.
Each time you call scan_next, the scanner finds the next token:

    ScanState *ts = scan_new_from_string("hello = (10,20,30)");
    scan_next(ts);
    char *name = scan_get_str(ts);     // will be "hello"
    char ch = scan_next(ts);           // will be '='
    scan_next(ts);                     // skip '('
    scan_next(ts);
    double val1 = scan_get_number(ts); // 10
    scan_next(ts);                     // skip ','
    scan_next(ts);
    double val2 = scan_get_number(ts); // 20
At any point, ts->type tells you the type of the token the scanner is currently on.
Note that by default this scanner skips whitespace.
A convenient higher-level function is scan_scanf; the equivalent of the above code is simply:

    scan_scanf(ts,"%s %c (%f,%f",&name,&ch,&val1,&val2);
See test-scan.c for examples of various uses.
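To make the token-at-a-time model concrete, here is a minimal standalone sketch of a scanner loop. It is not this module's implementation: the names MiniScan, MiniType, mini_init and mini_next are hypothetical, and it only handles identifiers, plain numbers and single characters.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical token kinds for this sketch (not the module's T_* values). */
typedef enum { MINI_END, MINI_IDEN, MINI_NUMBER, MINI_CHAR } MiniType;

typedef struct {
    const char *pos;   /* read position in the input string */
    MiniType type;     /* kind of the current token */
    char text[64];     /* identifier text, or the single character */
    double number;     /* value when type == MINI_NUMBER */
} MiniScan;

void mini_init(MiniScan *ms, const char *src) {
    assert(ms != NULL && src != NULL);
    ms->pos = src;
    ms->type = MINI_END;
    ms->text[0] = '\0';
    ms->number = 0.0;
}

/* Advance to the next token, skipping whitespace, and return its kind. */
MiniType mini_next(MiniScan *ms) {
    unsigned char c;
    while (isspace((unsigned char)*ms->pos)) ms->pos++;
    c = (unsigned char)*ms->pos;
    if (c == '\0') {
        ms->type = MINI_END;
    } else if (isalpha(c) || c == '_') {
        size_t n = 0;
        while (isalnum((unsigned char)*ms->pos) || *ms->pos == '_') {
            if (n < sizeof ms->text - 1) ms->text[n++] = *ms->pos;
            ms->pos++;
        }
        ms->text[n] = '\0';
        ms->type = MINI_IDEN;
    } else if (isdigit(c)) {
        char *end;
        ms->number = strtod(ms->pos, &end);  /* consumes at least one digit */
        ms->pos = end;
        ms->type = MINI_NUMBER;
    } else {
        ms->text[0] = (char)c;               /* any other character stands alone */
        ms->text[1] = '\0';
        ms->pos++;
        ms->type = MINI_CHAR;
    }
    return ms->type;
}
```

Unlike strtok, this kind of loop leaves the input string unmodified and reports punctuation such as '=' and '(' as tokens instead of discarding them as delimiters.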
Tables
ScanState | Scanner type. |
Configuration
scan_set_flags (ts, flags) | set flags. |
scan_set_line_comment (ts, cc) | line comment (either one or two characters). |
scan_force_line_mode (ts) | tell the scanner not to grab the next line automatically. |
scan_push_back (ts) | tell the scanner not to advance on following scan_next. |
Constructing
scan_new_from_string (str) | scanner from a string. |
scan_new_from_file (fname) | scanner from a file. |
scan_new_from_stream (stream) | scanner from an existing file stream. |
Grabbing
scan_fetch_line (ts, skipws) | fetch a new line from the stream, if defined. |
scan_getch (ts) | get the next character. |
scan_advance (ts, offs) | Move the scan reader position directly with an offset. |
scan_peek (ts, offs) | look at character ahead |
scan_get_upto (ts, target, buff, bufsz) | grab a string up to (but not including) a final target string. |
scan_numbers (ts, values, sz) | grab up to sz numbers from the stream. |
Skipping
scan_skip_whitespace (ts) | skip white space, reading new lines if necessary. |
scan_skip_space (ts) | skip white space and single-line comments. |
scan_skip_digits (ts) | skip digits. |
scan_skip_until (ts, type) | skip until a token is found with type. |
scan_next_number (ts, val) | fetch the next number, skipping any other tokens. |
scan_next_iden (ts, buff, len) | fetch the next word, skipping other tokens. |
scan_next_item (ts, type, buff, sz) | fetch the next item, skipping other tokens. |
Scanning
scan_next (ts) | advance to the next token. |
Getting
scan_get_tok (ts, tok, len) | copy the current token to a buffer. |
scan_get_str (ts) | get current token as string. |
scan_scanf (ts, fmt, ...) | Formatted reading from the scanner, like scanf. |
scan_get_line (ts, buff, len) | get the rest of the current line. |
scan_next_line (ts) | fetch the next line and force line mode. |
scan_get_number (ts) | get the current token as a number. |
Tables
- ScanState
-
Scanner type.
Fields:
- line int current line in file, if not just parsing a string.
- type
int
One of the following:
T_END, T_EOF=0, T_TOKEN, T_IDEN=1, T_NUMBER, T_STRING, T_CHAR, T_NADA
- int_type
int
One of
T_DOUBLE, T_INT, T_HEX, T_OCT
Configuration
- scan_set_flags (ts, flags)
-
set flags.
- C_IDEN words may contain underscores
- C_NUMBER instead of T_NUMBER, return T_INT, T_HEX and T_DOUBLE
- C_STRING parse C string escapes
- C_WSPACE don’t skip whitespace
Parameters:
- ts ScanState *
- flags int
- scan_set_line_comment (ts, cc)
-
line comment (either one or two characters).
Parameters:
- ts ScanState *
- cc const char *
- scan_force_line_mode (ts)
-
tell the scanner not to grab the next line automatically.
Parameters:
- ts ScanState *
- scan_push_back (ts)
-
tell the scanner not to advance on following scan_next.
Parameters:
- ts ScanState *
Constructing
- scan_new_from_string (str)
-
scanner from a string.
Parameters:
- str const char *
Returns:
-
ScanState *
- scan_new_from_file (fname)
-
scanner from a file.
Parameters:
- fname const char *
Returns:
-
ScanState *
- scan_new_from_stream (stream)
-
scanner from an existing file stream.
Parameters:
- stream FILE *
Returns:
-
ScanState *
Grabbing
- scan_fetch_line (ts, skipws)
-
fetch a new line from the stream, if defined.
Advances the line count – not used if the scanner has
been given a string directly.
Parameters:
- ts ScanState *
- skipws int
Returns:
-
bool
- scan_getch (ts)
-
get the next character.
Parameters:
- ts ScanState *
Returns:
-
char
- scan_advance (ts, offs)
-
Move the scan reader position directly with an offset.
Parameters:
- ts ScanState *
- offs int
- scan_peek (ts, offs)
-
look at character ahead
Parameters:
- ts ScanState *
- offs int
Returns:
-
char
- scan_get_upto (ts, target, buff, bufsz)
-
grab a string up to (but not including) a final target string.
Advances the scanner (use scan_advance with a negative offset to back off).
Parameters:
- ts ScanState *
- target const char *
- buff char *
- bufsz int
Returns:
-
int
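The "up to but not including" behaviour can be sketched on a plain string with strstr. This is a standalone illustration under stated assumptions, not this module's code: grab_upto is a hypothetical helper, and both its return value (characters copied, or -1 when the target is absent) and its choice to move the position just past the target are assumptions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copy text from *pos up to (not including) target
   into buff, then move *pos just past the target.
   Returns the number of characters copied, or -1 if target is absent. */
int grab_upto(const char **pos, const char *target, char *buff, int bufsz) {
    const char *hit;
    size_t n;
    assert(pos && *pos && target && buff && bufsz > 0);
    hit = strstr(*pos, target);
    if (hit == NULL) return -1;                         /* position unchanged */
    n = (size_t)(hit - *pos);
    if (n > (size_t)bufsz - 1) n = (size_t)bufsz - 1;   /* truncate to fit */
    memcpy(buff, *pos, n);
    buff[n] = '\0';
    *pos = hit + strlen(target);
    return (int)n;
}
```

A typical use is reading the body of a block comment: position the reader after the opening marker, then grab up to the closing "*/".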
- scan_numbers (ts, values, sz)
-
grab up to sz numbers from the stream. scan_next_line can be used to limit this to the current line only.
Parameters:
- ts ScanState *
- values double *
- sz int
Returns:
-
int
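The idea of collecting at most sz numbers while ignoring everything else can be sketched with the standard strtod. This is a standalone illustration, not this module's implementation; grab_numbers is a hypothetical name, and it works character by character rather than token by token.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: collect up to sz numbers from text, skipping
   anything that does not parse as a number. Returns the count stored.
   Caveat: strtod also accepts "inf", "nan" and hex literals, and a word
   such as "x1" yields 1 here, whereas a real tokenizer would see one word. */
int grab_numbers(const char *text, double *values, int sz) {
    int n = 0;
    assert(text != NULL && values != NULL && sz >= 0);
    while (*text != '\0' && n < sz) {
        char *end;
        double v = strtod(text, &end);
        if (end == text) {
            text++;              /* no number here: skip one character */
        } else {
            values[n++] = v;     /* store the value and jump past it */
            text = end;
        }
    }
    return n;
}
```

The return value tells the caller how many slots of values were filled, which may be fewer than sz when the input runs out first.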
Skipping
- scan_skip_whitespace (ts)
-
skip white space, reading new lines if necessary.
Parameters:
- ts ScanState *
Returns:
-
bool
- scan_skip_space (ts)
-
skip white space and single-line comments.
Parameters:
- ts ScanState *
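Skipping whitespace together with single-line comments can be sketched as a loop over a plain string. This is a standalone illustration, not this module's code: skip_space_and_comments is a hypothetical helper, and cc plays the role of the one- or two-character comment marker set with scan_set_line_comment.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the first character that is
   neither whitespace nor inside a line comment introduced by cc.
   An empty cc disables comment skipping. */
const char *skip_space_and_comments(const char *p, const char *cc) {
    size_t n;
    assert(p != NULL && cc != NULL);
    n = strlen(cc);
    for (;;) {
        while (isspace((unsigned char)*p)) p++;
        if (n == 0 || strncmp(p, cc, n) != 0) return p;
        while (*p != '\0' && *p != '\n') p++;   /* drop the rest of the line */
    }
}
```

The outer loop matters because a comment line is often followed by more whitespace and further comment lines before the next real token.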
- scan_skip_digits (ts)
-
skip digits.
Parameters:
- ts ScanState *
- scan_skip_until (ts, type)
-
skip until a token is found with type. May return false if the scanner ran out.
Parameters:
- ts ScanState *
- type ScanTokenType
Returns:
-
bool
- scan_next_number (ts, val)
-
fetch the next number, skipping any other tokens.
Parameters:
- ts ScanState *
- val double *
Returns:
-
bool
- scan_next_iden (ts, buff, len)
-
fetch the next word, skipping other tokens.
Parameters:
- ts ScanState *
- buff char *
- len int
Returns:
-
char *
- scan_next_item (ts, type, buff, sz)
-
fetch the next item, skipping other tokens.
Parameters:
- ts ScanState *
- type ScanTokenType
- buff char *
- sz int
Returns:
-
bool
Scanning
- scan_next (ts)
-
advance to the next token.
Usually this skips whitespace, and single-line comments if defined.
Parameters:
- ts ScanState *
Returns:
-
ScanTokenType
Getting
- scan_get_tok (ts, tok, len)
-
copy the current token to a buffer.
Parameters:
- ts ScanState *
- tok char *
- len int
Returns:
-
char *
- scan_get_str (ts)
-
get current token as string.
Parameters:
- ts ScanState *
Returns:
-
char *
- scan_scanf (ts, fmt, ...)
-
Formatted reading from the scanner, like scanf. Flags start with ‘%’, and ‘%%’ encodes a literal ‘%’.
- v value
- s identifier
- l rest of line
- q quoted string
- i int
- f double
- c char
- . don’t care!
Parameters:
- ts ScanState *
- fmt const char *
- ...
Returns:
-
bool
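For comparison, the same kind of input can be parsed with the standard sscanf. This is a sketch in plain C, not a use of this module; parse_pair is a hypothetical helper. Note the differences from the flags listed above: standard sscanf needs %lf to store into a double, and a scanset stands in for the identifier flag.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parse input shaped like "name = (x,y" with
   standard sscanf. Returns the number of fields converted
   (4 on full success). */
int parse_pair(const char *line, char name[32], char *ch,
               double *val1, double *val2) {
    assert(line && name && ch && val1 && val2);
    /* %31[A-Za-z_] reads a word of at most 31 characters;
       a space in the format skips any run of whitespace. */
    return sscanf(line, "%31[A-Za-z_] %c (%lf,%lf", name, ch, val1, val2);
}
```

Checking the return value against the expected field count is the sscanf counterpart of testing the bool that scan_scanf returns.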
- scan_get_line (ts, buff, len)
-
get the rest of the current line.
This trims any leading whitespace.
Parameters:
- ts ScanState *
- buff char *
- len int
Returns:
-
char *
- scan_next_line (ts)
-
fetch the next line and force line mode.
After this, the scanner will regard end-of-line as end of input.
Parameters:
- ts ScanState *
Returns:
-
const char *
- scan_get_number (ts)
-
get the current token as a number.
Parameters:
- ts ScanState *
Returns:
-
double