Techknow_Study

Lexical Analysis

Overview

_ Main task: read input characters and group them into
tokens.

_ Secondary tasks:

_ Skip comments and whitespace;
_ Correlate error messages with the source program (e.g., the line number of an error).
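
The secondary tasks above can be sketched as a small helper. This is only an illustration under stated assumptions: the function name `skip_ignorable` is hypothetical, and the comment syntax (`#` to end of line) is assumed, since the notes do not name a source language.

```python
def skip_ignorable(text, pos, line):
    """Advance past whitespace and comments.

    Returns (new_pos, new_line) so the caller can report errors
    against the correct source line.
    Assumption: a comment starts with '#' and runs to end of line.
    """
    while pos < len(text):
        ch = text[pos]
        if ch == "\n":
            # Count newlines so error messages can cite a line number.
            line += 1
            pos += 1
        elif ch.isspace():
            pos += 1
        elif ch == "#":
            # Skip the comment body; the newline is handled above.
            while pos < len(text) and text[pos] != "\n":
                pos += 1
        else:
            break
    return pos, line
```

A scanner would call this before reading each token and keep the returned line number for any error message it emits.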




Lexical Analysis: Terminology

_ token: a name for a set of input strings with related
structure.

Example: “identifier,” “integer constant”

_ pattern: a rule describing the set of strings
associated with a token.

Example: “a letter followed by zero or more letters, digits, or
underscores.”

_ lexeme: the actual input string that matches a
pattern.
Example: count
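
One way to make the pattern/lexeme distinction concrete is to write the identifier pattern quoted above as a regular expression and test candidate lexemes against it. This is a minimal sketch: the name `is_identifier` is hypothetical, and "letter" is assumed to mean an ASCII letter.

```python
import re

# Pattern: "a letter followed by zero or more letters, digits, or underscores."
# Assumption: letters are ASCII A-Z and a-z.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

def is_identifier(lexeme):
    """Return True if the whole lexeme matches the identifier pattern."""
    return IDENTIFIER.fullmatch(lexeme) is not None
```

Under these assumptions, `count` and `x_1` are lexemes of the token "identifier", while `9lives` is not, because it does not start with a letter.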

Examples

Input: count = 123
Tokens:
identifier    : Rule: “letter followed by …”
                Lexeme: count
assg_op       : Rule: =
                Lexeme: =
integer_const : Rule: “digit followed by …”
                Lexeme: 123
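
The example can be reproduced with a small regex-driven scanner. This is a sketch, not a definitive implementation: the function name `scan` is hypothetical, and the integer rule is elided in the notes, so "one or more digits" is assumed here.

```python
import re

# (token name, pattern) pairs. The token names come from the example above.
TOKEN_PATTERNS = [
    ("identifier", r"[A-Za-z][A-Za-z0-9_]*"),
    ("integer_const", r"[0-9]+"),  # assumption: one or more digits
    ("assg_op", r"="),
]

# One named group per token, so a match reports which token it belongs to.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_PATTERNS))

def scan(text):
    """Return a list of (token, lexeme) pairs for the input text."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # whitespace separates tokens and is not itself a token
            continue
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

print(scan("count = 123"))
# [('identifier', 'count'), ('assg_op', '='), ('integer_const', '123')]
```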

Attributes for Tokens

_ If more than one lexeme can match the pattern for a
token, the scanner must indicate the actual lexeme
that matched.

_ This information is given using an attribute
associated with the token.
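
A token with an attribute might be represented as follows. The names `Token`, `make_token`, and `NEEDS_ATTRIBUTE` are hypothetical, and storing the lexeme itself as the attribute is one simple choice among several (a real scanner might store a symbol-table entry or a numeric value instead).

```python
from dataclasses import dataclass
from typing import Optional

# Tokens whose pattern matches more than one lexeme (assumed set,
# taken from the earlier example).
NEEDS_ATTRIBUTE = {"identifier", "integer_const"}

@dataclass(frozen=True)
class Token:
    kind: str                        # token name, e.g. "identifier"
    attribute: Optional[str] = None  # the lexeme that matched, if needed

def make_token(kind, lexeme):
    """Attach the lexeme only when the token alone does not determine it."""
    if kind in NEEDS_ATTRIBUTE:
        return Token(kind, lexeme)
    # assg_op has exactly one lexeme ("="), so no attribute is required.
    return Token(kind)
```

With this representation, `count` becomes `Token("identifier", "count")`, while `=` becomes `Token("assg_op")` with no attribute.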
