Start Conditions

Start conditions allow the lexer to switch between different sets of rules. This is useful for handling comments, strings, and other context-dependent lexing.

Declaring Start Conditions

Use %x for exclusive conditions or %s for inclusive conditions:

%x COMMENT
%x STRING
%s SPECIAL
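  • Exclusive (%x): While the condition is active, only rules prefixed with it can match; rules without a condition prefix are disabled.
  • Inclusive (%s): While the condition is active, rules prefixed with it AND rules without any condition prefix can match.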
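
The difference between the two kinds of condition can be modelled in a few lines of C. This is only a sketch of the selection logic, not how a generated scanner is implemented: the bitmask representation, the IN macro, and the rule_is_active function are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: every start condition gets a small integer id, and
   every rule carries a bitmask of the conditions named in its <...> prefix.
   A mask of 0 means the rule has no prefix at all. */
enum { INITIAL = 0, COMMENT = 1, STRING = 2, SPECIAL = 3, NUM_CONDITIONS = 4 };

/* true for conditions declared with %x, false for %s and for INITIAL. */
static const bool is_exclusive[NUM_CONDITIONS] = { false, true, true, false };

#define IN(cond) (1u << (cond))

/* Decide whether a rule may match while `current` is the active condition. */
bool rule_is_active(unsigned rule_mask, int current)
{
    assert(current >= 0 && current < NUM_CONDITIONS);
    if (rule_mask == 0)                  /* rule without a prefix */
        return !is_exclusive[current];   /* on in INITIAL and in %s conditions */
    return (rule_mask & IN(current)) != 0;
}
```

With this model, rule_is_active(0, SPECIAL) is true because SPECIAL is inclusive, while rule_is_active(0, COMMENT) is false because COMMENT is exclusive.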

Using Start Conditions

Applying to Rules

Prefix a rule with the condition name in angle brackets:

<COMMENT>.      { /* inside comment */ }
<STRING>[^"]+   { /* inside string */ }

Multiple Conditions

Specify multiple conditions separated by commas:

<COMMENT,STRING>.   { /* in comment or string */ }

Initial Condition

Rules without a condition prefix are active in the INITIAL state (and in any inclusive condition):

[a-z]+      { return WORD; }   /* applies in INITIAL */

Or explicitly, which restricts the rule to INITIAL only (it is no longer active in inclusive conditions):

<INITIAL>[a-z]+     { return WORD; }

Switching Conditions

Call BEGIN(condition) inside a rule's action to switch; BEGIN(INITIAL) returns to the default state:

"/*"            { BEGIN(COMMENT); }
<COMMENT>"*/"   { BEGIN(INITIAL); }

Example: C-Style Comments

%x COMMENT

%%

"/*"            { BEGIN(COMMENT); }
<COMMENT>"*/"   { BEGIN(INITIAL); }
<COMMENT>\n     { yylineno++; /* omit if the scanner already tracks yylineno, e.g. flex's %option yylineno */ }
<COMMENT>.      { /* skip comment content */ }

%%

Example: String Literals

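%x STRING

%{
#include <string.h>   /* strdup */

/* Assumed buffer size; the rules below do not check for overflow. */
#define MAX_STR_CONST 1024
char string_buf[MAX_STR_CONST];
char *string_buf_ptr;
%}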

%%

\"              { BEGIN(STRING); string_buf_ptr = string_buf; }
<STRING>\"      { 
    BEGIN(INITIAL);
    *string_buf_ptr = '\0';
    yylval.str = strdup(string_buf);
    return STRING_LITERAL;
}
<STRING>\\n     { *string_buf_ptr++ = '\n'; }
<STRING>\\t     { *string_buf_ptr++ = '\t'; }
<STRING>\\\\    { *string_buf_ptr++ = '\\'; }
<STRING>\\.     { *string_buf_ptr++ = yytext[1]; }
<STRING>[^\\\"]+ {
    char *p = yytext;
    while (*p) *string_buf_ptr++ = *p++;
}

%%
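
The rules above write into a fixed buffer without checking its capacity. The following C sketch shows the same escape handling with a bounds check added. It is an illustration only: decode_string, its parameters, and its return convention are hypothetical and not part of any lexer API.

```c
#include <assert.h>
#include <stddef.h>

/* Decode the body of a string literal. `src` points just past the opening
   quote. At most cap - 1 decoded bytes plus a NUL are written to `dst`.
   Returns the number of source characters consumed, including the closing
   quote, or -1 if the literal is unterminated or does not fit in `dst`. */
long decode_string(const char *src, char *dst, size_t cap)
{
    assert(src != NULL && dst != NULL && cap > 0);
    size_t out = 0;

    for (size_t i = 0; src[i] != '\0'; i++) {
        char c = src[i];
        if (c == '"') {                  /* closing quote: <STRING>\" */
            dst[out] = '\0';
            return (long)(i + 1);
        }
        if (c == '\\') {                 /* escape sequence */
            c = src[++i];
            if (c == '\0')
                return -1;               /* backslash at end of input */
            if (c == 'n')
                c = '\n';
            else if (c == 't')
                c = '\t';
            /* Any other escaped character stands for itself. Unlike the
               `.` pattern, this also accepts an escaped newline. */
        }
        if (out + 1 >= cap)
            return -1;                   /* no room left for c plus the NUL */
        dst[out++] = c;
    }
    return -1;                           /* input ended before closing quote */
}
```

A scanner action could apply the same capacity test before each write to string_buf_ptr and report an error when the literal is too long.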

Example: Nested Comments

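For languages whose comments can nest, such as OCaml's (* ... *), a single start condition is not enough to find the matching close. Track the nesting depth with a counter and leave the condition only when it returns to zero: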

%x COMMENT

%{
int comment_depth = 0;
%}

%%

"(*"            { comment_depth++; BEGIN(COMMENT); }
<COMMENT>"(*"   { comment_depth++; }
<COMMENT>"*)"   { 
    if (--comment_depth == 0) BEGIN(INITIAL);
}
<COMMENT>.      { /* skip */ }
<COMMENT>\n     { yylineno++; /* omit if the scanner already tracks yylineno, e.g. flex's %option yylineno */ }

%%
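
The depth counter can also be traced outside the lexer. The C function below is a minimal sketch of the same logic over an in-memory string; skip_nested_comment and its return convention are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

/* `src` must begin with "(*". Returns the index just past the "*)" that
   closes the outermost comment, or -1 if the comment is never closed. */
long skip_nested_comment(const char *src)
{
    assert(src != NULL && src[0] == '(' && src[1] == '*');
    int depth = 0;
    size_t i = 0;

    while (src[i] != '\0') {
        if (src[i] == '(' && src[i + 1] == '*') {
            depth++;                     /* "(*" opens one more level */
            i += 2;
        } else if (src[i] == '*' && src[i + 1] == ')') {
            i += 2;
            if (--depth == 0)            /* outermost comment closed */
                return (long)i;
        } else {
            i++;                         /* ordinary comment content */
        }
    }
    return -1;                           /* input ended inside a comment */
}
```

The -1 result corresponds to reaching end of input while still in the COMMENT condition, a case the rules above do not handle explicitly.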