Actions and Return Values

Each lexer rule has an action block that executes when the pattern matches.

Basic Actions

Return a token type:

[0-9]+      { return NUMBER; }
"+"         { return PLUS; }

Skip input without returning:

[ \t\n]+    { /* skip whitespace */ }

Available Variables

yytext

The matched text as a string:

[a-z]+      { printf("Matched: %s\n", yytext); return WORD; }

yyleng

Length of the matched text:

[a-z]+      { printf("Length: %d\n", yyleng); return WORD; }

yylval

Semantic value passed to the parser. Set in the action:

[0-9]+      { yylval.ival = atoi(yytext); return NUMBER; }
[0-9]+\.[0-9]+  { yylval.dval = atof(yytext); return FLOAT; }

Multi-Statement Actions

Use braces for multiple statements:

[0-9]+      {
    int val = atoi(yytext);
    if (val > 1000) {
        fprintf(stderr, "Warning: large number\n");
    }
    yylval.ival = val;
    return NUMBER;
}

Language-Specific Actions

C Actions

[a-z]+      {
    yylval.str = strdup(yytext);
    return IDENTIFIER;
}

Python Actions

[a-z]+      {
    self.yylval = self.yytext
    return self.IDENTIFIER
}

Java Actions

[a-z]+      {
    yylval = yytext;
    return IDENTIFIER;
}

Token Constants

Token constants should match those declared in the grammar file:

%{
/* From parser.h or defined here */
#define NUMBER 258
#define PLUS   259
#define MINUS  260
%}

%%

[0-9]+  { return NUMBER; }
"+"     { return PLUS; }
"-"     { return MINUS; }

%%

Error Handling

The . pattern matches any single character. Use it as a catch-all:

.       {
    fprintf(stderr, "Unexpected character: %s\n", yytext);
    return ERROR;
}