Introduction
This article describes an implementation of a particular method of constructing a parse table for an LR (left to right bottom up) parser called an SLR parser or Simple-LR parser.
Main things that lead to i am start to program this project is export pars Table in reusable format for other parsing program.
Background
Parsing
Syntax Analysis . Recognize sentences in a language. Discover the structure of a document/program.
Construct (implicitly or explicitly) a tree (called as a parse tree) to represent the structure.
LR Parsers
Recognize CFGs for most programming languages. No need to rewrite grammar.
Most general O(n) shift-reduce method.
Optimal syntax error detection.
LR grammars superset of LL grammars.
Bottom-Up Parsing
Construct parse tree from the leaves to the root.
Corresponds to a rightmost derivation in reverse.
Shift-Reduce Parsing
- Parser performs actions based on parse table information:
- Shift. Shift the next input symbol onto the top of the stack.
- Reduce. The right end of the string to be reduced must be at the top of the stack. Locate the left end of the string within the stack and decide with what nonterminal to replace the string.
- Accept. Announce successful completion of parsing.
- Error. Discover a syntax error and call an error recovery routine.
Consider following example grammar and steps to create Parse Table :
Grammar :
States of this grammar :
state creation rules :
1- State 0 build from extra grammar Law S' -> S $ that S is all start symbol of grammar
and one Dot ( . ) before S symbol.
2- if Dot symbol is before a nonterminal Add grammar Laws that this nonterminal is in Left Hand Side of that Law and set Dot in before of first part of Right Hand Side.
3-if state is exist ( a state with this Laws and same Dot position ) use that instead
4- for one state that step 4 not check to now find Set of terminal and nonterminal that Dot exist in before
5- if Step 4 Set is Nonempty go to 5 else go to 7
5- for each terminal/nonterminal in set step 4 create new state by using all grammar law that Dot position is before of that terminal/nonterminal in refrence state by increasing Dot Point to Next Part in Right Hand Side of that Laws
6- goto step 2
7-end of state building
fill the parse table by SLR algorithm rules:
Goto : in parse table in row StateNumber SN and Column nonterminal V set GX when exist an export nonterminal V from state SN to state SX
Reduce : in parse table in row StateNumber SN and Column terminal T set RX when exist Law number X in state SN that have Dot point in end of law and T is last part of Right Hand Side of law number X ( terminal T is before Dot )
Shift : in parse table in row StateNumber SN and Column terminal T set SY when exist a Law in state SN that have Dot point before terminal T and state SN exports T to State number Y
Accept : In parse Table in row StateNumber SN and Column $ ( Extra terminal added to real Terminals ) set Accept when S' -> S Dot ( Dot after Start nonterminal ) member of StateNumber SN Laws
Error : all empty parse table cells are Parse Errors.
grammar can ambiguity between shift and reduce when both condition is true and set parse table Row SN to RX/SY
Using the code
for avoid of constraint in count of grammar production rule and nonterminal and terminal and ... all classes implemented in linked list data structure.
by this logic all Classes implemented in 3 level :
1- Item fields Struct
2- Node Class
3- List Class
Classes :
Class Diagram exist in download list
Classes Definitions:
1-ParsHead
this Class Top Level of All Detail Classes and Contain main Properties:
first(Head) of nonterminal list object
All terminal count
All nonterminal count
a stack for terminal symbols
2-NonTerminals
for manage Nonterminals. Nonterminal Parent is ParsHead
sample method description :
Grammar Laws proccessed and LHS nonterminal and LawNumber send to this method, nonterminal pure name build from split # and rest of name after that check that Nonterminals list is empty if it was Add first node that is extraNonTerm and add law number 0 ( added law ) and exit from method.
if first item is not null , searching for Nonterminal node that it's name is equall to searching nonterminal , if not found add new NonterminalNode and add it in List.
finally add LawNum to LawLink of Nonterminal
public void checkAdd(string nonTerm,int lawNum)
{
try
{
nonTerminalNode thisNode=null;
string []part=nonTerm.Split(' ');
string NonTerm=part[0].Substring(1,part[0].Length-1);
nonTerminalNode temp;
if(first==null)
{
first = new nonTerminalNode(nonTerminals.ExtraNonTerm,this);
first.lawLink.add(0,nonTerminals.ExtraNonTerm+" #"+NonTerm);
this.checkAdd(nonTerm,lawNum);
return;
}
else
{
temp=first;
while(temp.next!=null)
{
if(temp.item.Name==NonTerm)
{
thisNode=temp;
break;
}
temp=temp.next;
}
if(thisNode==null)
{
if(temp.item.Name!=NonTerm)
{
temp.next = new nonTerminalNode(NonTerm,this);
thisNode=temp.next;
count++;
}
else
thisNode=temp;
}
}
thisNode.lawLink.add(lawNum,nonTerm);
}
catch(Exception e1)
{
MessageBox.Show("in nonTerminals.CheckAdd : "+e1.Message);
}
}
3-Laws
for Manage Laws . Laws Parent class is Nonterminals
4-LawsParts
for Manage Laws Parts . LawsParts Parent class is Laws
5-State
for manage States that connected to StateExports and StateLaws
6-StateLaws
List of Laws in State , StateLaws Parent is State
7-StateExport
export symbol and next state number and detemine export symbol is terminal or Nonterminal
8- ParsTable
for building parsTable that have head of all List above
9-ParsTableElement
Shift ( S ) , Reduce ( R ) , Goto ( G ) for saving in ParseTable Cells
10-TermNonTermList
building means of laws part that consist Number of parts in Term and type of them.
consider the MakeStates_Click method of MainForm ( main method for using other objects):
step of using classes :
1- create ParsHead object .
2- from loaded input in RichTextBox Reading line by line ( each Line at most one law) and load items to parsHead object by load method
3- parsHead now ready for use ( nonterminal and terminal and laws and relation between is created)
4- by parsHead create State Class
5- Call Create member function of State Class for Creation of States
6-Now it is ready to create ParsTable object by previous information and object :
parsTable=new ParsTable(states.StateCount , ParsHead.terminals , parsHead.Head , this.dgTabel );
states.StateCount : Count of states created by input grammar by state.create method
ParsHead.terminals : a stack object that contain terminal symbols
parsHead.Head : root (first) of nonterminal list object
this.dgTabel : current form Datagridview object for filling by created parse table
7- Clear Result Listbox
8- Show output to result listbox
9- Change tabcontrol selected index to parse table tab ( 3th tab )
10- Fill form datagridview by writeGrid method of parsTable
11- Ask for output file path for saving parse table
12- Export to file by saveTo method in parseTable
private void MakeStates_Click(object sender, System.EventArgs e)
{
try
{
parsHead=new ParsHead();
foreach(string s in content.Lines)
parsHead.load(s);
states=new State(parsHead);
states.create();
parsTable=new ParsTable(states.StateCount,ParsHead.terminals,parsHead.Head,this.dgTabel);
states.makeParsTable(this.parsTable);
result.Items.Clear();
parsTable.showContents(result);
tabControl1.SelectedIndex=2;
parsTable.writeGrid();
SaveFileDialog saveDlg=new SaveFileDialog();
saveDlg.Filter="(*.prt)|*.prt";
if(saveDlg.ShowDialog()==DialogResult.OK)
{
StreamWriter sw=new StreamWriter(saveDlg.FileName);
parsTable.saveTo(sw);
}
}
catch(Exception ex)
{
MessageBox.Show("Grammar is incorrect : please check your grammar , example : There is one NonTerminal in Right side that never apear in left side!!!");
}
}
Program is Tab Paged and exists following 4 Tabs:
1-Grammar Input :
Get grammar from input file or direct typing
Input grammar syntax (.grm):
1-every noterminal start with #
sample :
#Vars
2-Nonterminals haven't # in Start
sample : phrase a is terminal
#B a
3- alone nonterminal show empty grammar
sample :
#B
4- All nonterminal in right hand side of grammars must be at least one grammar left hand side
sample :
#B #A p
#A a
5- delimiter for phrase in grammar is One Space
grammar Sample Input included in article download links.
2-First & Follow
Set of First and Follow of nonterminals by its definitions.
3- Parse Table
Show the export version of parse table (space word not exist in export file and just for show space character ):
N=TeminalCount+NonTeminalCount
parse table (.prt) format :
In First Line :
StateCount Space N
in Second Line :
TerminalCount Space NonTerminalCount
in line 3 (braces just for reading purpose and it not exist in exported file):
[terminal symbols separated by Space] Space [nonterminal symbols separated by space]
In Line 4 to StateCount+3 :
StateNumber: Space ParseCell1Value Space ParseCell2Value ... ParseCellNValue
If Parse Table in One Cell have multiple value or ambiguity Cell values separated by delimiter /
for example :
S107/R77
For empty parse table cell in file used dash symbol -
4- Grid View
This tab Show Grid View of Parse
Points of Interest
Simply this program can build LALR and CLR parsing. Because that algorithm is very near to SLR , my interest point for distributing this source code is other programmers add other LR family parsers to current source.
History
I implemented this project in 2005 and updated in 2013.
References:
The project code implements the SLR algorithm described in the dragon book "Compilers: Principles, Techniques, and Tools" (1986) by Aho, Sethi, and Ullman.
http://www.cs.uaf.edu/~cs631/notes/parse/node5.html