re2c --version (return code: 0)
re2c 1.0.1
re2c --help (return code: 0)
-? -h --help
Show a short help screen.
-b --bit-vectors
Implies -s. Use bit vectors as well to try to coax better code
out of the compiler. Most useful for specifications with more
than a few keywords (e.g., for most programming languages).
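The bit-vector idea can be illustrated with a small hand-written sketch in C. It is not re2c output: the table name class_bits, the bit assignments, and the helper functions are hypothetical, and the range loops assume an ASCII-compatible character set. Each table entry holds one bit per character class, so a class test is a single lookup and mask instead of a chain of comparisons.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical class bits; re2c chooses its own table layout and names. */
enum { BIT_ALPHA = 1 << 0, BIT_DIGIT = 1 << 1 };

static unsigned char class_bits[256];

/* Fill the table once: each entry holds one bit per character class.
   Assumes an ASCII-compatible character set (contiguous letter ranges). */
static void init_class_bits(void)
{
    int c;
    for (c = 0; c < 256; ++c) class_bits[c] = 0;
    for (c = 'a'; c <= 'z'; ++c) class_bits[c] |= BIT_ALPHA;
    for (c = 'A'; c <= 'Z'; ++c) class_bits[c] |= BIT_ALPHA;
    class_bits['_'] |= BIT_ALPHA;
    for (c = '0'; c <= '9'; ++c) class_bits[c] |= BIT_DIGIT;
}

/* Length of the identifier at the start of s, or 0 if there is none.
   The terminating NUL has no class bits set, so the loop stops there. */
static size_t ident_length(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    assert(s != NULL);
    if (!(class_bits[*p] & BIT_ALPHA)) return 0;
    do { ++p; } while (class_bits[*p] & (BIT_ALPHA | BIT_DIGIT));
    return (size_t)(p - (const unsigned char *)s);
}
```

Call init_class_bits() once before the first call to ident_length().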
-c --conditions
Used for (f)lex-like condition support.
-d --debug-output
Creates a parser that dumps information about the current position
and the state the parser is in. This is useful for debugging
parser issues and states. If you use this switch, you need
to define a YYDEBUG macro, which will be called like a function
with two parameters: void YYDEBUG (int state, char current).
The first parameter receives the state or -1 and the second
parameter receives the input at the current cursor.
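A minimal sketch of a YYDEBUG definition in C, following the two-parameter shape described above. The helper yydebug_trace, the counter yydebug_calls, and the function trace_input are hypothetical; trace_input only stands in for generated code so that the macro has a caller, and its state numbers are made up.

```c
#include <assert.h>
#include <stdio.h>

static int yydebug_calls = 0;   /* hypothetical counter, for illustration only */

/* Hypothetical helper: print the state and the character at the cursor. */
static void yydebug_trace(int state, char current)
{
    ++yydebug_calls;
    fprintf(stderr, "state=%d char=0x%02x\n",
            state, (unsigned)(unsigned char)current);
}

/* The macro is invoked like a function with (state, current). */
#define YYDEBUG(state, current) yydebug_trace((state), (current))

/* Stand-in for generated code: invokes YYDEBUG once per input character
   and returns how many times it was invoked. */
static int trace_input(const char *cursor)
{
    int state = -1;             /* the help text says the state may be -1 */
    int before = yydebug_calls;
    assert(cursor != NULL);
    while (*cursor != '\0') {
        YYDEBUG(state, *cursor);
        state = (state < 0) ? 1 : state + 1;   /* made-up state numbering */
        ++cursor;
    }
    return yydebug_calls - before;
}
```

In a real build the macro would have to be defined before the code generated with -d is compiled.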
-D --emit-dot
Emit Graphviz dot data, which can then be processed with e.g.,
dot -Tpng input.dot > output.png. Please note that scanners with
many states may crash dot.
-e --ecb
Generate a parser that supports EBCDIC. The generated code can
deal with any character up to 0xFF. In this mode, re2c assumes
an input character size of 1 byte. This switch is incompatible
with -w, -x, -u, and -8.
-f --storable-state
Generate a scanner with support for storable state.
-F --flex-syntax
Partial support for flex syntax. When this flag is active, named
definitions must be surrounded by curly braces and can be
defined without an equal sign and the terminating semicolon.
Instead, names are treated as direct double quoted strings.
-g --computed-gotos
Generate a scanner that utilizes GCC's computed-goto feature.
That is, re2c generates jump tables whenever a decision is of
certain complexity (e.g., a lot of if conditions would be otherwise
necessary). This is only usable with compilers that support
this feature. Note that this implies -b and that the complexity
threshold can be configured using the cgoto:threshold inplace
configuration.
-i --no-debug-info
Do not output #line information. This is useful when you want to
use a CMS tool with re2c's output. You might want to do this if
you do not want to impose re2c as a build requirement for your
source.
-o OUTPUT --output=OUTPUT
Specify the OUTPUT file.
-r --reusable
Allows reuse of scanner definitions with /*!use:re2c */ after
/*!rules:re2c */. In this mode, no /*!re2c */ block and exactly
one /*!rules:re2c */ must be present. The rules are saved and
used by every /*!use:re2c */ block that follows. These blocks
can contain inplace configurations, especially re2c:flags:e,
re2c:flags:w, re2c:flags:x, re2c:flags:u, and re2c:flags:8.
That way it is possible to create the same scanner multiple
times for different character types, different input mechanisms,
or different output mechanisms. The /*!use:re2c */ blocks can
also contain additional rules that will be appended to the set
of rules in /*!rules:re2c */.
-s --nested-ifs
Generate nested ifs for some switches. Many compilers need this
assist to generate better code.
-t HEADER --type-header=HEADER
Create a HEADER file that contains types for the (f)lex-like
condition support. This can only be activated when -c is in use.
-T --tags
Enable submatch extraction with tags.
-P --posix-captures
Enable submatch extraction with POSIX-style capturing groups.
-u --unicode
Generate a parser that supports UTF-32. The generated code can
deal with any valid Unicode character up to 0x10FFFF. In this
mode, re2c assumes an input character size of 4 bytes. This
switch is incompatible with -e, -w, -x, and -8. This implies -s.
-v --version
Show version information.
-V --vernum
Show the version as a number in the MMmmpp (Major, minor,
patch) format.
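A short C sketch of how a script or build helper might split such a number into its parts. It assumes that each component occupies exactly two decimal digits, as the MMmmpp pattern suggests; the struct and function names are hypothetical. The text is parsed in base 10 explicitly, because a leading zero would make a C integer literal octal.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical holder for the three components. */
struct version_parts { int major; int minor; int patch; };

/* Decode an MMmmpp string, assuming two decimal digits per component. */
static struct version_parts decode_vernum(const char *text)
{
    struct version_parts v;
    long n = strtol(text, NULL, 10);    /* base 10: "010001" is not octal */
    assert(n >= 0 && n <= 999999L);
    v.major = (int)(n / 10000);
    v.minor = (int)((n / 100) % 100);
    v.patch = (int)(n % 100);
    return v;
}
```

Under that assumption, the string 010001 decodes to major 1, minor 0, patch 1.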
-w --wide-chars
Generate a parser that supports UCS-2. The generated code can
deal with any valid Unicode character up to 0xFFFF. In this
mode, re2c assumes an input character size of 2 bytes. This
switch is incompatible with -e, -x, -u, and -8. This implies -s.
-x --utf-16
Generate a parser that supports UTF-16. The generated code can
deal with any valid Unicode character up to 0x10FFFF. In this
mode, re2c assumes an input character size of 2 bytes. This
switch is incompatible with -e, -w, -u, and -8. This implies -s.
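The reason 2-byte code units can reach 0x10FFFF under UTF-16, but only 0xFFFF under UCS-2, is the surrogate pair. The following C sketch shows the standard UTF-16 combination formula; the function name is hypothetical and the code is an illustration, not what re2c generates.

```c
#include <assert.h>
#include <stdint.h>

/* Combine a high surrogate (0xD800-0xDBFF) and a low surrogate
   (0xDC00-0xDFFF) into one code point in 0x10000-0x10FFFF. */
static uint32_t combine_surrogates(uint16_t high, uint16_t low)
{
    assert(high >= 0xD800 && high <= 0xDBFF);
    assert(low >= 0xDC00 && low <= 0xDFFF);
    return 0x10000u
         + (((uint32_t)high - 0xD800u) << 10)
         + ((uint32_t)low - 0xDC00u);
}
```

The largest pair, 0xDBFF followed by 0xDFFF, yields 0x10FFFF, which matches the upper bound stated for -x.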
-8 --utf-8
Generate a parser that supports UTF-8. The generated code can
deal with any valid Unicode character up to 0x10FFFF. In this
mode, re2c assumes an input character size of 1 byte. This
switch is incompatible with -e, -w, -x, and -u.
--case-insensitive
Makes all strings case insensitive. This makes double-quoted
expressions behave like single-quoted expressions.
--case-inverted
Invert the meaning of single and double quoted strings. With
this switch, single quotes are case sensitive and double quotes
are case insensitive.
--no-generation-date
Suppress date output in the generated file.
--no-lookahead
Use TDFA(0) instead of TDFA(1). This option only has effect
with --tags or --posix-captures options.
--no-optimize-tags
Suppress optimization of tag variables (mostly used for
debugging).
--no-version
Suppress version output in the generated file.
--encoding-policy POLICY
Specify how re2c must treat Unicode surrogates. POLICY can be
one of the following: fail (abort with an error when a surrogate
is encountered), substitute (silently replace surrogates with
the error code point 0xFFFD), ignore (treat surrogates as normal
code points). By default, re2c ignores surrogates (for backward
compatibility). The Unicode standard says that standalone
surrogates are invalid code points, but different libraries and
programs treat them differently.
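The three policies can be pictured with a small C sketch. This only illustrates the behavior described above on a single code point; it is not re2c's implementation, and the enum, function names, and return convention are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for the three documented policies. */
enum surrogate_policy { POLICY_FAIL, POLICY_SUBSTITUTE, POLICY_IGNORE };

/* Surrogates occupy the range 0xD800-0xDFFF. */
static int is_surrogate(uint32_t cp)
{
    return cp >= 0xD800u && cp <= 0xDFFFu;
}

/* Apply a policy to one code point.
   Returns 0 and stores the result in *out, or returns -1 when the
   policy is "fail" and cp is a surrogate (in which case *out is untouched). */
static int apply_policy(enum surrogate_policy policy, uint32_t cp, uint32_t *out)
{
    assert(out != NULL);
    if (!is_surrogate(cp) || policy == POLICY_IGNORE) {
        *out = cp;              /* ignore: treat as a normal code point */
        return 0;
    }
    if (policy == POLICY_SUBSTITUTE) {
        *out = 0xFFFDu;         /* substitute: error code point */
        return 0;
    }
    return -1;                  /* fail: report an error */
}
```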
--input INPUT
Specify re2c's input API. INPUT can be either default or custom.
-S --skeleton
Instead of embedding re2c-generated code into C/C++ source,
generate a self-contained program for the same DFA. Most useful for
correctness and performance testing.
--empty-class POLICY
What to do if the user uses an empty character class. POLICY can
be one of the following: match-empty (match empty input: pretty
illogical, but this is the default for backwards compatibility
reasons), match-none (fail to match on any input), error
(compilation error). Note that there are various ways to construct an
empty class, e.g., [], [^\x00-\xFF], [\x00-\xFF]\[\x00-\xFF].
--dfa-minimization <table | moore>
The internal algorithm used by re2c to minimize the DFA
(defaults to moore). Both the table filling algorithm and the
Moore algorithm should produce the same DFA (up to state
relabeling). The table filling algorithm is much simpler and
slower; it serves as a reference implementation.
--eager-skip
This option controls when the generated lexer advances to the
next input symbol (that is, increments YYCURSOR or invokes
YYSKIP). By default this happens after the transition to the next
state, but the --eager-skip option overrides the default behavior
and advances the input position immediately after reading an input
symbol. This option is implied by --no-lookahead.
--dump-nfa
Generate .dot representation of NFA and dump it on stderr.
--dump-dfa-raw
Generate .dot representation of DFA under construction and dump
it on stderr.
--dump-dfa-det
Generate .dot representation of DFA immediately after
determinization and dump it on stderr.
--dump-dfa-tagopt
Generate .dot representation of DFA after tag optimizations and
dump it on stderr.
--dump-dfa-min
Generate .dot representation of DFA after minimization and dump
it on stderr.
--dump-adfa
Generate .dot representation of DFA after tunneling and dump it
on stderr.
-1 --single-pass
Deprecated. Does nothing (single pass is the default now).
-W
Turn on all warnings.
-Werror
Turn warnings into errors. Note that this option alone doesn't
turn on any warnings; it only affects those warnings that have
been turned on so far or will be turned on later.
-W<warning>
Turn on a warning.
-Wno-<warning>
Turn off a warning.
-Werror-<warning>
Turn on a warning and treat it as an error (this implies
-W<warning>).
-Wno-error-<warning>
Don't treat this particular warning as an error. This doesn't
turn off the warning itself.
-Wcondition-order
Warn if the generated program makes implicit assumptions about
condition numbering. You should use either the -t, --type-header
option or the /*!types:re2c*/ directive to generate a mapping of
condition names to numbers and then use the autogenerated
condition names.
-Wempty-character-class
Warn if a regular expression contains an empty character class.
Rationally, trying to match an empty character class makes no
sense: it should always fail. However, for backwards compatibility
reasons, re2c allows empty character classes and treats them
as empty strings. Use the --empty-class option to change the
default behavior.
-Wmatch-empty-string
Warn if a regular expression in a rule is nullable (matches an
empty string). If the DFA runs in a loop and an empty match is
unintentional (the input position is not advanced manually), the
lexer may get stuck in an infinite loop.
-Wswapped-range
Warn if the lower bound of a range is greater than its upper
bound. The default behavior is to silently swap the range
bounds.
-Wundefined-control-flow
Warn if some input strings cause undefined control flow in the
lexer (the faulty patterns are reported). This is the most
dangerous and most common mistake. It can be easily fixed by adding
the default rule (*) (this rule has the lowest priority, matches
any code unit, and consumes exactly one code unit).
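The effect of a default rule can be sketched with a hand-written scanner in C. This is not re2c output and uses no re2c syntax; the token names and next_token are hypothetical. The final branch plays the role described above: it has the lowest priority, accepts any code unit, and consumes exactly one, so every input has a defined path and the scanner always makes progress.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token kinds for the illustration. */
enum token { TOK_DIGITS, TOK_OTHER, TOK_END };

/* Scan one token starting at *cursor and advance *cursor past it. */
static enum token next_token(const char **cursor)
{
    const char *p;
    assert(cursor != NULL && *cursor != NULL);
    p = *cursor;

    if (*p == '\0') return TOK_END;         /* end of input: nothing consumed */

    if (*p >= '0' && *p <= '9') {           /* a run of decimal digits */
        do { ++p; } while (*p >= '0' && *p <= '9');
        *cursor = p;
        return TOK_DIGITS;
    }

    /* Counterpart of the default rule: lowest priority, any code unit,
       consumes exactly one code unit. */
    *cursor = p + 1;
    return TOK_OTHER;
}
```

Without the last branch, an input such as a letter would fall through with no defined result, which is the situation this warning reports.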
-Wunreachable-rules
Warn about rules that are shadowed by other rules and will never
match.
-Wuseless-escape
Warn if a symbol is escaped when it shouldn't be. By default,
re2c silently ignores such escapes, but this may well indicate
a typo or error in the escape sequence.
-Wnondeterministic-tags
Warn if a tag has n-th degree of nondeterminism, where n is
greater than 1.