Post on 07-Mar-2018
Compiler: Top-Down Parsing
Zhang Zhizheng, seu_zzz@seu.edu.cn
School of Computer Science and Engineering, Software College
Southeast University
2013/11/12 Zhang Zhizheng, Southeast University
Left recursion: infinite loop. Method: eliminating left recursion
Backtracking: inefficient. Methods: prediction
1. Lifting common factors
2. Eliminating ambiguity
Solution: rewriting the grammar. E.g.
stmt → if expr then stmt | if expr then stmt else stmt | other
==>
stmt → matched-stmt | unmatched-stmt
matched-stmt → if expr then matched-stmt else matched-stmt | other
unmatched-stmt → if expr then stmt | if expr then matched-stmt else unmatched-stmt
Problems in T-D Approach
E.g. Consider the following grammar, and parse the string id+id*id#
1. E → TE`     2. E` → +TE`
3. E` → ε      4. T → FT`
5. T` → *FT`   6. T` → ε
7. F → id      8. F → (E)
A Case without left recursion,
left common factor, ambiguity
1) FIRST & FOLLOW
FIRST:
• If α is any string of grammar symbols, let FIRST(α) be the set of terminals that begin the strings derived from α.
• If α ⇒* ε, then ε is also in FIRST(α).
• That is: for α ∈ V*, FIRST(α) = { a | α ⇒* a…, a ∈ VT }
FOLLOW:
• For a non-terminal A, let FOLLOW(A) be the set of terminals a that can appear immediately to the right of A in some sentential form.
• That is: FOLLOW(A) = { a | S ⇒* …Aa…, a ∈ VT }
• If S ⇒* …A, then # ∈ FOLLOW(A).
2) Computing FIRST
(1) To compute FIRST(X) for all grammar symbols X, apply the following rules until no FIRST set changes:
• If X is a terminal, then FIRST(X) is {X}.
• If X → ε is a production, then add ε to FIRST(X).
• If X → a… is a production, then add a to FIRST(X).
• If X is a non-terminal and X → Y1Y2…Yk, Yj ∈ (VN ∪ VT), 1 ≤ j ≤ k, then
{ j=1; FIRST(X)=FIRST(X) ∪ (FIRST(Y1)-{ε}); //initialize with Y1
while ( j<k and ε ∈ FIRST(Yj)) {
j=j+1
FIRST(X)=FIRST(X) ∪ (FIRST(Yj)-{ε})
}
IF (j=k and ε ∈ FIRST(Yk))
FIRST(X)=FIRST(X) ∪ {ε}
}
(2) To compute FIRST(α) for any string α = X1X2…Xn, Xi ∈ (VN ∪ VT), 1 ≤ i ≤ n
{ i=1; FIRST(α)=FIRST(X1)-{ε}; //initialize with X1
while ( i<n and ε ∈ FIRST(Xi)) {
i=i+1
FIRST(α)=FIRST(α) ∪ (FIRST(Xi)-{ε})
}
IF (i=n and ε ∈ FIRST(Xn))
FIRST(α)=FIRST(α) ∪ {ε}
}
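The two procedures above can be sketched in Python as a fixed-point iteration. This is a minimal sketch, not the lecture's own code: the names `EPS`, `GRAMMAR`, `first_of_string` and `compute_first` are illustrative choices, and the grammar encoded is the expression grammar used in these slides, with `i` for the identifier token and an empty list standing for ε.

```python
EPS = "ε"  # marker for the empty string (a naming choice for this sketch)

# Expression grammar from the slides; an empty body stands for ε.
GRAMMAR = {
    "E":  [["T", "E`"]],
    "E`": [["+", "T", "E`"], []],
    "T":  [["F", "T`"]],
    "T`": [["*", "F", "T`"], []],
    "F":  [["i"], ["(", "E", ")"]],
}


def first_of_string(symbols, first):
    """FIRST of a string X1 X2 ... Xn, given FIRST of each non-terminal."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})  # FIRST of a terminal is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result  # this symbol cannot vanish, so stop here
    result.add(EPS)  # every Xi (or the empty string itself) derives ε
    return result


def compute_first(grammar):
    """Apply the production rule repeatedly until no FIRST set changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


FIRST = compute_first(GRAMMAR)
```

Under these assumptions, `FIRST["E"]` comes out as the set containing `(` and `i`, and `FIRST["E`"]` as the set containing `+` and ε.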
3) Computing FOLLOW(A)
(1) Place # in FOLLOW(S), where S is the start symbol and # is the input right end-marker.
(2) If there is a production A → αBβ in G, then add FIRST(β)-{ε} to FOLLOW(B).
(3) If there is a production A → αB, or A → αBβ where FIRST(β) contains ε, then add FOLLOW(A) to FOLLOW(B).
E.g. Construct FIRST and FOLLOW for each non-terminal of the following grammar:
1. E → TE`     2. E` → +TE`
3. E` → ε      4. T → FT`
5. T` → *FT`   6. T` → ε
7. F → i       8. F → (E)
Answer:
First(E)=First(T)=First(F)={(, i}
First(E`)={+, ε}
First(T`)={*, ε}
Follow(E)= Follow(E`)={),#}
Follow(T)= Follow(T`)={+,),#}
Follow(F)={*,+,),#}
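The FOLLOW sets in this answer can be checked mechanically. The sketch below applies the three FOLLOW rules until nothing changes, taking the FIRST sets from the answer as given input. The names `GRAMMAR`, `FIRST`, `first_of_string` and `compute_follow` are illustrative, not part of the lecture.

```python
EPS, END = "ε", "#"  # empty-string marker and input end-marker

GRAMMAR = {
    "E":  [["T", "E`"]],
    "E`": [["+", "T", "E`"], []],
    "T":  [["F", "T`"]],
    "T`": [["*", "F", "T`"], []],
    "F":  [["i"], ["(", "E", ")"]],
}

# FIRST sets copied from the answer above.
FIRST = {
    "E": {"(", "i"}, "T": {"(", "i"}, "F": {"(", "i"},
    "E`": {"+", EPS}, "T`": {"*", EPS},
}


def first_of_string(symbols):
    """FIRST of a string of grammar symbols, using the FIRST table above."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})  # FIRST of a terminal is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                      # rule (1)
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for pos, sym in enumerate(body):
                    if sym not in grammar:
                        continue                # FOLLOW is for non-terminals only
                    rest = first_of_string(body[pos + 1:])
                    new = rest - {EPS}          # rule (2)
                    if EPS in rest:             # rule (3), also covers empty β
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow


FOLLOW = compute_follow(GRAMMAR, "E")
```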
4) Construction of Predictive Parsing Tables
Main Idea: Suppose A → α is a production with a in FIRST(α). Then the parser will expand A by α when the current input symbol is a. If α ⇒* ε, we should again expand A by α if the current input symbol is in FOLLOW(A), or if the # on the input has been reached and # is in FOLLOW(A).
Method.
1. For each production A → α, do steps 2 and 3.
2. For each terminal a in FIRST(α), add A → α to M[A,a].
3. If ε is in FIRST(α), add A → α to M[A,b] for each terminal b in FOLLOW(A). If ε is in FIRST(α) and # is in FOLLOW(A), add A → α to M[A,#].
4. Make each undefined entry of M be error.
Parsing table M
   | id    | +       | *       | (     | )    | #
E  | E→TE` |         |         | E→TE` |      |
E` |       | E`→+TE` |         |       | E`→ε | E`→ε
T  | T→FT` |         |         | T→FT` |      |
T` |       | T`→ε    | T`→*FT` |       | T`→ε | T`→ε
F  | F→id  |         |         | F→(E) |      |
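The construction method can be sketched as follows, taking the FIRST and FOLLOW sets of the expression grammar as given (with `id` as the identifier token, as in the table). The names `PRODUCTIONS`, `first_of_string` and `build_table` are illustrative assumptions. Each table cell holds a list of production bodies, so a multiply-defined entry would show up as a list with more than one element.

```python
EPS, END = "ε", "#"

# Productions 1-8; an empty body stands for ε.
PRODUCTIONS = [
    ("E", ["T", "E`"]), ("E`", ["+", "T", "E`"]), ("E`", []),
    ("T", ["F", "T`"]), ("T`", ["*", "F", "T`"]), ("T`", []),
    ("F", ["id"]), ("F", ["(", "E", ")"]),
]
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E`": {"+", EPS}, "T`": {"*", EPS},
}
FOLLOW = {
    "E": {")", END}, "E`": {")", END},
    "T": {"+", ")", END}, "T`": {"+", ")", END},
    "F": {"*", "+", ")", END},
}


def first_of_string(symbols):
    """FIRST of a production body, using the FIRST table above."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})  # FIRST of a terminal is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def build_table(productions):
    table = {}
    for head, body in productions:             # step 1
        body_first = first_of_string(body)
        lookaheads = body_first - {EPS}        # step 2
        if EPS in body_first:                  # step 3 (# is a member of FOLLOW)
            lookaheads |= FOLLOW[head]
        for a in lookaheads:
            table.setdefault((head, a), []).append(body)
    return table                               # step 4: missing keys mean error


TABLE = build_table(PRODUCTIONS)
```

For this grammar every cell ends up with exactly one production, which agrees with the table above.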
LL(1) Algorithm
X: the symbol on top of the stack;
a: the current input symbol
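1. If X=a=#, the parser halts and announces successful completion of parsing.
2. If X=a!=#, the parser pops X off the stack and advances the input pointer to the next input symbol.
3. If X is a non-terminal, the program consults entry M[X,a] of the parsing table M. This entry will be either an X-production of the grammar or an error entry. For a production X → Y1Y2…Yk, the parser replaces X on top of the stack by Yk…Y2Y1, with Y1 on top; for an error entry, parsing fails.
4. If X is a terminal and X!=a, parsing fails.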
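A minimal sketch of this driver loop in Python follows. The table is the one for the expression grammar, written out by hand. The function name `parse`, the token-list input format and the use of `SyntaxError` for error entries are assumptions of this sketch. The input is expected to end with the `#` end-marker.

```python
END = "#"
NONTERMINALS = {"E", "E`", "T", "T`", "F"}

# Parsing table M for the expression grammar; [] stands for an ε-body.
TABLE = {
    ("E", "id"): ["T", "E`"], ("E", "("): ["T", "E`"],
    ("E`", "+"): ["+", "T", "E`"], ("E`", ")"): [], ("E`", "#"): [],
    ("T", "id"): ["F", "T`"], ("T", "("): ["F", "T`"],
    ("T`", "+"): [], ("T`", "*"): ["*", "F", "T`"],
    ("T`", ")"): [], ("T`", "#"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}


def parse(tokens, start="E"):
    """Return the productions applied, in order; raise SyntaxError on failure."""
    stack = [END, start]          # the top of the stack is the end of the list
    pos = 0
    applied = []
    while True:
        X, a = stack[-1], tokens[pos]
        if X == a == END:
            return applied        # successful completion
        if X not in NONTERMINALS:
            if X != a:
                raise SyntaxError(f"expected {X!r}, found {a!r}")
            stack.pop()           # match the terminal and advance the input
            pos += 1
        else:
            body = TABLE.get((X, a))
            if body is None:      # error entry
                raise SyntaxError(f"no entry M[{X}, {a}]")
            stack.pop()
            stack.extend(reversed(body))  # push Yk ... Y1, leaving Y1 on top
            applied.append((X, body))


steps = parse(["id", "+", "id", "*", "id", "#"])
```

The returned list is the leftmost derivation of id+id*id, starting with E → TE` and ending with E` → ε.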
E.g. Consider the following grammar and construct a predictive parsing table for it.
S → iEtSS` | a
S` → eS | ε
E → b
Definition
A grammar whose parsing table has no multiply-defined entries is said to be LL(1).
The first “L” stands for scanning the input from left to right.
The second “L” stands for producing a leftmost derivation.
“1” stands for using one input symbol of lookahead at each step to make parsing action decisions.
(1) No ambiguous grammar can be LL(1).
(2) No left-recursive grammar can be LL(1).
(3) A grammar G is LL(1) if and only if, whenever A → α | β are two distinct productions of G:
I. For no terminal a do both α and β derive strings beginning with a.
II. At most one of α and β can derive the empty string.
III. If β ⇒* ε, then α does not derive any string beginning with a terminal in FOLLOW(A).
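Conditions I to III can be checked pair by pair. The sketch below does so for the grammar S → iEtSS` | a, S` → eS | ε, E → b from the earlier example, with its FIRST and FOLLOW sets written out by hand. The function name `ll1_conflicts` and the result format are assumptions of this sketch. Condition III is applied in both directions, since either alternative of a pair may be the one deriving ε.

```python
from itertools import combinations

EPS, END = "ε", "#"

GRAMMAR = {
    "S":  [["i", "E", "t", "S", "S`"], ["a"]],
    "S`": [["e", "S"], []],
    "E":  [["b"]],
}
FIRST = {"S": {"i", "a"}, "S`": {"e", EPS}, "E": {"b"}}
FOLLOW = {"S": {"e", END}, "S`": {"e", END}, "E": {"t"}}


def first_of_string(symbols):
    """FIRST of a production body, using the FIRST table above."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})  # FIRST of a terminal is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result


def ll1_conflicts(grammar):
    """Return (non-terminal, violated condition) pairs; empty means LL(1)."""
    conflicts = []
    for head, bodies in grammar.items():
        for alpha, beta in combinations(bodies, 2):
            fa, fb = first_of_string(alpha), first_of_string(beta)
            if (fa & fb) - {EPS}:                           # condition I
                conflicts.append((head, "I"))
            if EPS in fa and EPS in fb:                     # condition II
                conflicts.append((head, "II"))
            if EPS in fb and (fa - {EPS}) & FOLLOW[head]:   # condition III
                conflicts.append((head, "III"))
            if EPS in fa and (fb - {EPS}) & FOLLOW[head]:   # III, roles swapped
                conflicts.append((head, "III"))
    return conflicts


CONFLICTS = ll1_conflicts(GRAMMAR)
```

Here the only violation is condition III for S`: the alternative eS begins with e, and e is also in FOLLOW(S`), so M[S`, e] would be multiply defined.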
Forms of left recursion
A grammar is left-recursive if it contains productions of the following kinds.
• P → Pα | β   Immediate recursion
or
• P → Aa, A → Pb   Indirect recursion
Eliminate Left Recursions
The Main Idea of Algorithm
(1) Elimination of immediate left recursion
P → Pα | β
=> P ⇒* βα*
=> P → βP`   P` → αP` | ε
(2) Elimination of indirect left recursion
Convert it into immediate left recursion first, according to a specific order of the non-terminals, then eliminate the resulting immediate left recursion.
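Step (1) can be sketched as a small Python function. Production bodies are lists of symbols and an empty list stands for ε. The function name `eliminate_immediate` and the convention of naming the new non-terminal by appending a backquote are assumptions of this sketch, and the backquoted name is assumed not to be in use already.

```python
def eliminate_immediate(head, bodies):
    """Rewrite P → Pα | β as P → βP`, P` → αP` | ε."""
    new_head = head + "`"  # assumed fresh name for the new non-terminal
    alphas = [body[1:] for body in bodies if body and body[0] == head]
    betas = [body for body in bodies if not body or body[0] != head]
    if not alphas:
        return {head: bodies}  # no immediate left recursion, nothing to do
    return {
        head: [beta + [new_head] for beta in betas],
        new_head: [alpha + [new_head] for alpha in alphas] + [[]],
    }


# Illustrative input (not taken from the slides): E → E+T | T
RESULT = eliminate_immediate("E", [["E", "+", "T"], ["T"]])
```

For the illustrative input, the result is E → TE` and E` → +TE` | ε, which has the same shape as productions 1 to 3 of the expression grammar.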
Algorithm:
– (1) Arrange the non-terminals of G in some order P1, P2, …, Pn, and do step 2 for each of them.
– (2) for (i=1; i<=n; i++)
{ for (k=1; k<=i-1; k++)
  { replace each production of the form Pi → Pk γ
    by Pi → δ1 γ | δ2 γ | … | δn γ,
    where Pk → δ1 | δ2 | … | δn are all the current Pk-productions
  }
  // eliminate the immediate left recursion:
  change Pi → Pi α1 | Pi α2 | … | Pi αm | β1 | β2 | … | βn into
    Pi → β1 Pi` | β2 Pi` | … | βn Pi`
    Pi` → α1 Pi` | α2 Pi` | … | αm Pi` | ε
}
– (3) Simplify the grammar.
E.g. Eliminate all left recursion in the following grammar:
(1) S → Qc | c   (2) Q → Rb | b   (3) R → Sa | a
Answer:
1) Arrange the non-terminals in the order R, Q, S.
2) For R: no action.
   For Q: Q → Rb | b becomes Q → Sab | ab | b
   For S: S → Qc | c becomes S → Sabc | abc | bc | c;
   then we get S → (abc | bc | c)S`
               S` → abcS` | ε
3) Because R and Q are not reachable from S, delete them.
So the grammar is:
S → (abc | bc | c)S`
S` → abcS` | ε
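The worked example can be reproduced with a short sketch of the algorithm: substitute earlier non-terminals, remove immediate left recursion, then drop unreachable non-terminals. The function names and the backquote naming for new non-terminals are assumptions of this sketch. Bodies are lists of symbols and an empty list stands for ε.

```python
def eliminate_immediate(head, bodies):
    """Rewrite P → Pα | β as P → βP`, P` → αP` | ε."""
    new_head = head + "`"  # assumed fresh name for the new non-terminal
    alphas = [b[1:] for b in bodies if b and b[0] == head]
    betas = [b for b in bodies if not b or b[0] != head]
    if not alphas:
        return {head: bodies}
    return {
        head: [beta + [new_head] for beta in betas],
        new_head: [alpha + [new_head] for alpha in alphas] + [[]],
    }


def eliminate_left_recursion(grammar, order):
    g = {nt: [list(b) for b in bodies] for nt, bodies in grammar.items()}
    for i, pi in enumerate(order):
        for pk in order[:i]:
            expanded = []
            for body in g[pi]:
                if body and body[0] == pk:   # Pi → Pk γ: substitute for Pk
                    expanded += [delta + body[1:] for delta in g[pk]]
                else:
                    expanded.append(body)
            g[pi] = expanded
        g.update(eliminate_immediate(pi, g[pi]))
    return g


def reachable_only(grammar, start):
    """Step (3): keep only non-terminals reachable from the start symbol."""
    seen, todo = set(), [start]
    while todo:
        nt = todo.pop()
        if nt in seen:
            continue
        seen.add(nt)
        for body in grammar[nt]:
            todo += [sym for sym in body if sym in grammar]
    return {nt: bodies for nt, bodies in grammar.items() if nt in seen}


GRAMMAR = {
    "S": [["Q", "c"], ["c"]],
    "Q": [["R", "b"], ["b"]],
    "R": [["S", "a"], ["a"]],
}
RESULT = reachable_only(eliminate_left_recursion(GRAMMAR, ["R", "Q", "S"]), "S")
```

With the order R, Q, S the sketch yields S → abcS` | bcS` | cS` and S` → abcS` | ε, the same grammar as in the answer. A different ordering of the non-terminals generally gives a different but equivalent grammar.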
If the grammar contains productions like A → δβ1 | δβ2 | … | δβn,
change them into A → δA`
A` → β1 | β2 | … | βn
Lift the Common Factor
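One pass of this transformation can be sketched in Python. Alternatives are grouped by their first symbol, and each group with more than one member is factored on its longest common prefix δ. The function names, the backquote naming scheme and the unfactored input S → iEtS | iEtSeS | a are assumptions of this sketch. A complete left-factoring procedure would repeat the pass on the new non-terminals until no common prefixes remain.

```python
def common_prefix(bodies):
    """Longest sequence of symbols shared at the start of all bodies."""
    prefix = []
    for symbols in zip(*bodies):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix


def left_factor(head, bodies):
    """One pass: A → δβ1 | … | δβn becomes A → δA`, A` → β1 | … | βn."""
    groups = {}
    for body in bodies:
        key = body[0] if body else None    # group alternatives by first symbol
        groups.setdefault(key, []).append(body)
    result = {head: []}
    fresh = head
    for key, group in groups.items():
        if key is None or len(group) == 1:
            result[head] += group          # nothing to factor in this group
            continue
        fresh += "`"                       # hypothetical naming for new symbols
        delta = common_prefix(group)
        result[head].append(delta + [fresh])
        result[fresh] = [body[len(delta):] for body in group]
    return result


# Assumed unfactored form of the if-then-else grammar: S → iEtS | iEtSeS | a
RESULT = left_factor("S", [
    ["i", "E", "t", "S"],
    ["i", "E", "t", "S", "e", "S"],
    ["a"],
])
```

For this input the result is S → iEtSS` | a and S` → ε | eS, the grammar used in the predictive parsing table example.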
A. Left recursion is a fatal flaw
B. To improve top-down parsing by prediction (LL(1)):
① Eliminating ambiguities manually
② Eliminating left recursion
③ Lifting maximal common factors
④ Constructing a predictive parsing table
A Survey