Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / C#

Generate SLR Parse Table From CFG Grammar

4.92/5 (6 votes)
23 Apr 2013CPOL7 min read 50.3K   4.3K  
this article content for Build Simple-LR ( SLR ) parse table from input grammar and find First and Follows of nonterminals

Introduction

This article describes an implementation of a particular method of constructing a parse table for an LR (left to right bottom up) parser called an SLR parser or Simple-LR parser.

Main things that lead to i am start to program this project is export pars Table in reusable format for other parsing program.

Background

Parsing

Syntax Analysis . Recognize sentences in a language. Discover the structure of a document/program.

Construct (implicitly or explicitly) a tree (called as a parse tree) to represent the structure.

LR Parsers

Recognize CFGs for most programming languages. No need to rewrite grammar.
Most general O(n) shift-reduce method.
Optimal syntax error detection.
LR grammars superset of LL grammars.

Bottom-Up Parsing

Construct parse tree from the leaves to the root.
Corresponds to a rightmost derivation in reverse.

Shift-Reduce Parsing
Parser performs actions based on parse table information:

  1. Shift. Shift the next input symbol onto the top of the stack.
  2. Reduce. The right end of the string to be reduced must be at the top of the stack. Locate the left end of the string within the stack and decide with what nonterminal to replace the string.
  3. Accept. Announce successful completion of parsing.
  4. Error. Discover a syntax error and call an error recovery routine.

Image 1

Consider following example grammar and steps to create Parse Table :

Grammar :

Image 2

States of this grammar :

state creation rules :

1- State 0 build from extra grammar Law S' -> S $ that S is all start symbol of grammar

and one Dot ( . ) before S symbol.

2- if Dot symbol is before a nonterminal Add grammar Laws that this nonterminal is in Left Hand Side of that Law and set Dot in before of first part of Right Hand Side.

3-if state is exist ( a state with this Laws and same Dot position ) use that instead

4- for one state that step 4 not check to now find Set of terminal and nonterminal that Dot exist in before

5- if Step 4 Set is Nonempty go to 5 else go to 7

5- for each terminal/nonterminal in set step 4 create new state by using all grammar law that Dot position is before of that terminal/nonterminal in refrence state by increasing Dot Point to Next Part in Right Hand Side of that Laws

6- goto step 2

7-end of state building Image 3

fill the parse table by SLR algorithm rules:

Goto : in parse table in row StateNumber SN and Column nonterminal V set GX when exist an export nonterminal V from state SN to state SX

Reduce : in parse table in row StateNumber SN and Column terminal T set RX when exist Law number X in state SN that have Dot point in end of law and T is last part of Right Hand Side of law number X ( terminal T is before Dot )

Shift : in parse table in row StateNumber SN and Column terminal T set SY when exist a Law in state SN that have Dot point before terminal T and state SN exports T to State number Y

Accept : In parse Table in row StateNumber SN and Column $ ( Extra terminal added to real Terminals ) set Accept when S' -> S Dot ( Dot after Start nonterminal ) member of StateNumber SN Laws

Error : all empty parse table cells are Parse Errors.

grammar can ambiguity between shift and reduce when both condition is true and set parse table Row SN to RX/SY

Image 4

Using the code

for avoid of constraint in count of grammar production rule and nonterminal and terminal and ... all classes implemented in linked list data structure.

by this logic all Classes implemented in 3 level :

1- Item fields Struct

2- Node Class

3- List Class

Classes :

Class Diagram exist in download list

Classes Definitions:

1-ParsHead

this Class Top Level of All Detail Classes and Contain main Properties:

first(Head) of nonterminal list object

All terminal count

All nonterminal count

a stack for terminal symbols

2-NonTerminals

for manage Nonterminals. Nonterminal Parent is ParsHead

sample method description :

Grammar Laws proccessed and LHS nonterminal and LawNumber send to this method, nonterminal pure name build from split # and rest of name after that check that Nonterminals list is empty if it was Add first node that is extraNonTerm and add law number 0 ( added law ) and exit from method.

if first item is not null , searching for Nonterminal node that it's name is equall to searching nonterminal , if not found add new NonterminalNode and add it in List.

finally add LawNum to LawLink of Nonterminal

C#
public void checkAdd(string nonTerm,int lawNum)
{
 try
 {
  nonTerminalNode thisNode=null;
  string []part=nonTerm.Split(' ');
  string NonTerm=part[0].Substring(1,part[0].Length-1);
  nonTerminalNode temp;
  if(first==null)
  {
   first = new nonTerminalNode(nonTerminals.ExtraNonTerm,this);
   first.lawLink.add(0,nonTerminals.ExtraNonTerm+" #"+NonTerm);
   this.checkAdd(nonTerm,lawNum);
   return;
  }
  else
  {
   temp=first;
   while(temp.next!=null)
   {
    if(temp.item.Name==NonTerm)
    {
     thisNode=temp;
     break;
    }
    temp=temp.next;
   }
   if(thisNode==null)
   {
    if(temp.item.Name!=NonTerm)
    {
     temp.next = new nonTerminalNode(NonTerm,this);
     thisNode=temp.next;
     count++;
    }
    else
     thisNode=temp;
   }
  }
  thisNode.lawLink.add(lawNum,nonTerm);
 }
 catch(Exception e1)
 {
  MessageBox.Show("in nonTerminals.CheckAdd : "+e1.Message);
 }
}

3-Laws

for Manage Laws . Laws Parent class is Nonterminals

4-LawsParts

for Manage Laws Parts . LawsParts Parent class is Laws

5-State

for manage States that connected to StateExports and StateLaws

6-StateLaws

List of Laws in State , StateLaws Parent is State

7-StateExport

export symbol and next state number and detemine export symbol is terminal or Nonterminal

8- ParsTable

for building parsTable that have head of all List above

9-ParsTableElement

Shift ( S ) , Reduce ( R ) , Goto ( G ) for saving in ParseTable Cells

10-TermNonTermList

building means of laws part that consist Number of parts in Term and type of them.

consider the MakeStates_Click method of MainForm ( main method for using other objects):

step of using classes :

1- create ParsHead object .

2- from loaded input in RichTextBox Reading line by line ( each Line at most one law) and load items to parsHead object by load method

3- parsHead now ready for use ( nonterminal and terminal and laws and relation between is created)

4- by parsHead create State Class

5- Call Create member function of State Class for Creation of States

6-Now it is ready to create ParsTable object by previous information and object :

parsTable=new ParsTable(states.StateCount , ParsHead.terminals , parsHead.Head , this.dgTabel ); 

states.StateCount : Count of states created by input grammar by state.create method

ParsHead.terminals : a stack object that contain terminal symbols

parsHead.Head : root (first) of nonterminal list object

this.dgTabel : current form Datagridview object for filling by created parse table

7- Clear Result Listbox

8- Show output to result listbox

9- Change tabcontrol selected index to parse table tab ( 3th tab )

10- Fill form datagridview by writeGrid method of parsTable

11- Ask for output file path for saving parse table

12- Export to file by saveTo method in parseTable

C#
  private void MakeStates_Click(object sender, System.EventArgs e)
  {
   try
   {
    parsHead=new ParsHead();
    foreach(string s in content.Lines)
     parsHead.load(s);
    states=new State(parsHead);
    states.create();
    parsTable=new ParsTable(states.StateCount,ParsHead.terminals,parsHead.Head,this.dgTabel);    
    states.makeParsTable(this.parsTable);
    result.Items.Clear();
    parsTable.showContents(result);
    tabControl1.SelectedIndex=2;
    parsTable.writeGrid();
    SaveFileDialog saveDlg=new SaveFileDialog();
    saveDlg.Filter="(*.prt)|*.prt";
    if(saveDlg.ShowDialog()==DialogResult.OK)
    {
     StreamWriter sw=new StreamWriter(saveDlg.FileName);
     parsTable.saveTo(sw);
    }
   }
   catch(Exception ex)
   {
//    MessageBox.Show(ex.Message);
    MessageBox.Show("Grammar is incorrect : please check your grammar , example : There is one NonTerminal in Right side that never apear in left side!!!");
   }
  }

Program is Tab Paged and exists following 4 Tabs:

1-Grammar Input :

Get grammar from input file or direct typing

Input grammar syntax (.grm):

1-every noterminal start with #

sample :

#Vars

2-Nonterminals haven't # in Start

sample : phrase a is terminal

#B a

3- alone nonterminal show empty grammar

sample :

#B

4- All nonterminal in right hand side of grammars must be at least one grammar left hand side

sample :

#B #A p

#A a

5- delimiter for phrase in grammar is One Space

grammar Sample Input included in article download links.

Image 5

2-First & Follow

Set of First and Follow of nonterminals by its definitions.

Image 6

Image 7

3- Parse Table

Show the export version of parse table (space word not exist in export file and just for show space character ):

N=TeminalCount+NonTeminalCount

parse table (.prt) format :

In First Line :

StateCount Space N

in Second Line :

TerminalCount Space NonTerminalCount

in line 3 (braces just for reading purpose and it not exist in exported file):

[terminal symbols separated by Space] Space [nonterminal symbols separated by space]

In Line 4 to StateCount+3 :

StateNumber: Space ParseCell1Value Space ParseCell2Value ... ParseCellNValue

If Parse Table in One Cell have multiple value or ambiguity Cell values separated by delimiter /

for example :

S107/R77

For empty parse table cell in file used dash symbol -

Image 8

4- Grid View

This tab Show Grid View of Parse

Image 9

Points of Interest

Simply this program can build LALR and CLR parsing. Because that algorithm is very near to SLR , my interest point for distributing this source code is other programmers add other LR family parsers to current source.

History

I implemented this project in 2005 and updated in 2013.

References:

The project code implements the SLR algorithm described in the dragon book "Compilers: Principles, Techniques, and Tools" (1986) by Aho, Sethi, and Ullman.

http://www.cs.uaf.edu/~cs631/notes/parse/node5.html

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)