Dana Vrajitoru
C311 Programming Languages
Parser Implementation
LL(1) Table-Based Parsing
- The table is similar to the table for the scanner.
- It contains a row for every possible non-terminal and a column for
every possible terminal.
- The cells of the table store the rule number to be applied when
the non-terminal (row) must be expanded and the terminal (column) is
present at the current position in the input string.
- The non-terminals in the processing sequence are stored in a stack.
- The one at the top is processed next. The symbols at the
right-hand side of the rule used to expand it are pushed on the stack
in reverse order (right to left).
Example
- Considering the grammar:
S => F 1
S => ( S + F ) 2
F = > 1 3
- or in BNF form:
S => F | (S + F ) 1|2
- This recognizes strings of the form:
1 or (1+1) or ((1+1)+1), etc
- The table:
| ( | ) | 1 | + | else
|
S | 2 | - | 1 | - | -
|
F | - | - | 3 | - | -
|
Table Parsing
Table for the Expression
| + | - | * | / | ( | ) | id | nr | else
|
E | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | -
|
TT | 2 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | -
|
T | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | -
|
FT | 6 | 6 | 5 | 5 | 6 | 6 | 6 | 6 | -
|
F | - | - | - | - | 7 | - | 8 | 9 | -
|
AO | 10 | 11 | - | - | - | - | - | - | -
|
MO | - | - | 12 | 13 | - | - | - | - | -
|
Stack-based Parsing
- The parser keeps a stack of the part of the parsed string to the
right of (and including) the first non processed non terminal from the
left.
- The top of the stack is processed at each step.
- If it contains a non terminal, then the current terminal in the
input string and the table give us the rule to apply to expand it. The
non terminal is popped from the stack and is replaced on it with the
right hand side of the rule, from right to left.
- If it contains a terminal, it must be matched by the current
terminal in the input string and the cursor moves forward in the input
string if it’s true.
- Stack-based parser example
Recursive Descent
- Recursive descent parsers have one function for each non-terminal.
- The function is responsible for checking all of the discriminating
terminals for that particular non-terminal.
- Applying a production rule means checking all the terminal symbols
and calling the function for each non-terminals.
- If the grammar is LL or LR, there is no need for backtracking or
checking. The algorithm is linear.
Implementation
- The lisp code of type check-S, check-T is a recursive descent
implementation.
- If there is more than one non-terminal in right hand side of the
rule or if the non-terminal is followed by some terminal symbols, the
function needs to return the position in the input string where the
parsing stopped along with success or failure.
- We can store the input in a global variable instead.
; input is a global variable
(defun move-input ()
(pop input) t)
; factor => ( expr ) | id | number
(defun check-factor ()
(cond ((symbolp (car input))
(pop input) t)
((numberp (car input))
(pop input) t)
((equal (car input) "(")
(move-input)
(and (check-expr)
(equal (car input) ")")
(move-input)))
(t nil)))
Building the Tree
- The next step from the parsing is building the parse tree that can
be used in the evaluation or translation into another language.
- When we expand a non-terminal symbol, the result is a subtree. The
non-terminal becomes the root of the subtree and the right-hand side
its children.
- The function that parses the symbol in recursive descent must
return the subtree.
(defun parse-factor ()
(let ((token (car input)) (Exp nil))
(cond ((or (symbolp token)
(numberp token))
(move-input)
(list 'factor token))
((equal token "(")
(move-input)
(if (and (setq Expr (parse-expr))
(equal (car input) ")")
(move-input)))
(list 'factor "(" Expr ")"))
(t nil))))
Example
Partial lisp parsing of the grammar
interpreting an arithmetic expression.
Evaluate the Parse Tree
- A simple function that checks for the symbol of the root, then
performs some operation based on what the operands are.
- It needs to call itself recursively in most cases.
- If a child is a subtree, then the function will be called
recursively on it and the result will be used in the operation.
- Otherwise the root of the tree usually defines the type or
operation to be performed (+, -, etc).
Example
(defun eval-tree (Tree)
(let ((root (car Tree)))
(cond
((equal root 'factor)
(if (= (length Tree) 2)
; number or symbol
(eval (cadr Tree))
(eval-tree (caddr Tree)))))))
Separate Functions
; '(factor 2)
; '(factor a)
; '(factor "(" (E ...) ")")
(defun eval1-factor (Tree Prev)
(if (or (not Tree)
(not (equal root 'factor)))
nil
(if (= (length Tree) 2) ; number or symbol
(eval (cadr Tree))
(eval1-E (elt Tree 2)))))
Direct Evaluation
When all the operands involved in an operation are direct children of
the root, as well as the terminal identifying the operation, the
function can simply evaluate all the children, then return the
operation applied to the results.
E => T AO TT
(eval-E) means:
(setq r1 (eval-T))
(setq r2 (eval-TT))
(+ r1 r2)
; E => T
; E => T AO TT
; TT => T
; TT => T AO TT
; '(E (T ...))
; '(E (T ...) "+" (TT ...))
(defun eval-E (Tree)
(if (equal (length Tree) 1)
(eval-T (cadr Tree))
(let ((r1 (eval-T (cadr Tree)) )
(r2 (eval-TT (elt Tree 3))))
(cond ((equal (elt Tree 2) "+")
(+ r1 r2))
((equal (elt Tree 2) "-")
(- r1 r2))
(t nil)))))
More Evaluation
Example
Complete Elisp implementation for the
simple grammar recognizing strings like (1 + 1 ).
Evaluating an expression
Parameter passed forward
Lambda expression returned