Table of Contents
- What's a Programming Language?
- Why We Need Another Programming Language
- JavaCC
- Java Reflection
- Eclipse Configuration
- Programming Language Example (Name: St4tic)
- 6.0- Grammar
- 6.1- Code Generation
- 6.2- Using Reflection
- 6.3- Core Creation
- 6.4- Making Interpreter
- System:out:println(1 + var)
- Summary
- Reference
1- What's a Programming Language?
A programming language is an artificial language designed to express computations that can be performed by a machine, particularly a computer. Why do we need one? Programming languages are used to create programs that control the behavior of a machine, to express algorithms precisely, and as a mode of human communication, because it is hard for humans to type raw numbers like “1001011001...” when creating very large algorithms or programs such as your operating system.
In reality, a programming language is just a vocabulary and a set of grammatical rules for instructing a computer to perform specific tasks. The term programming language usually refers to high-level languages such as C/C++, Perl, Java, and Pascal. In theory, each language has a unique set of keywords (words that it understands) and a special syntax for organizing program instructions, but several languages can share the same vocabulary and grammar, like “Ruby” and “JRuby”.
Regardless of what language we use, we eventually need to convert our program into machine language so that the computer can understand it. There are two ways to do this:
- Compile the program (like C/C++)
- Interpret the program (like Perl)
In this article, we take the second way and build an interpreted language, like Perl or Ruby, called “St4tic” for demonstration.
2- Why We Need Another Programming Language
Really, why do we need another? We already have many programming languages, as we can see in a Wiki list.
But how do you create your own? Even if you have this idea, you might say, "Creating a programming language is impossible for me. I'm not crazy, it's very hard!" Yes, creating a programming language from scratch is hard: you don't have any libraries or any source code to follow. It's as hard as setting the M.U.G.E.N configuration to “Level: hard 8” and “Speed: fast 6”.
But now, we have many tools like Yacc, JavaCC, etc. for generating source code for us.
Personally, I created my own programming language called Alef++ [http://alefpp.sourceforge.net/] just for fun, and to better understand: What is a programming language? How does it work? Can I create my own? What's the difference between mine and the others?
Read on if you're not discouraged yet!
3- JavaCC
"JavaCC (Java Compiler Compiler) is an open source parser generator for the Java programming language. JavaCC is similar to Yacc in that it generates a parser for a formal grammar provided in EBNF notation, except the output is Java source code. Unlike Yacc, however, JavaCC generates top-down parsers, which limits it to the LL(k) class of grammars (in particular, left recursion cannot be used). The tree builder that accompanies it, JJTree, constructs its trees from the bottom up."
Briefly, JavaCC is a tool that generates a parser as Java source code from rules you define in a grammar. The parser then checks source code syntax, somewhat like a regular expression checks a string. Don't worry, a JavaCC grammar looks like Java source code, so you only need to be familiar with Java.
4- Java Reflection
The name "Java reflection" isn't quite accurate; perhaps "mirror reflection" is closer to the truth. Let me explain why:
In reality, reflection in Java, Ruby, or .NET is a way of hacking around and breaking the rules of object-oriented programming (OOP), like looking at a class through a mirror.
- If in Java we can't access private members and methods of another class, with Java reflection we can do it easily.
- If we need to use an external library not imported in compiled code (“import something.*”), we can import it dynamically.
- If we need to use a class instance not declared in compiled code, we can create a class instance dynamically.
- Etc.
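The first and third points can be tried with a few lines of the standard java.lang.reflect API. This is only a minimal sketch, assuming Java 8 or later; the Vault class and its secret field are made up for the illustration and are not part of St4tic.

```java
import java.lang.reflect.Field;

// Hypothetical class used only for this illustration.
class Vault {
    private String secret = "s3cret";
}

class ReflectionBasics {
    // Reads a private field that ordinary Java code could not access.
    static String readSecret(Vault vault) {
        try {
            Field field = Vault.class.getDeclaredField("secret");
            field.setAccessible(true); // bypass the private modifier
            return (String) field.get(vault);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Loads a class by name at run time and creates an instance of it.
    static Object newInstanceOf(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readSecret(new Vault()));
        System.out.println(newInstanceOf("java.util.ArrayList").getClass().getName());
    }
}
```

Neither the field name nor the class name has to be known when the code is compiled; both are plain strings, which is exactly what an interpreter needs.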
Now you see why! If you know Java persistence: it can read from a database and return a list of objects, where each object is a row in the database, yet you have only defined the table structure in one class and Java persistence does all the work for you. So now you no longer have to ask, "How does this work?"
Back to our game analogy: now that the team is complete, with Java reflection as our second player, we just need to choose a battle area and start the fight!
5- Eclipse Configuration
Eclipse, Eclipse and Eclipse... why? If you're lazy like me, creating a text file and writing grammar without any syntax colorization can be discouraging, and people just want it done like a Wizard/Setup - "Next, Next, Finish!"
Okay, let's configure a battle area. If you don't have Eclipse, download it from here.
Next, follow this setup from SourceForge for configuring JavaCC in Eclipse.
6- Programming Language Example ( Name : St4tic )
Ready!? St4tic is a very small programming language (a nano programming language) designed to be easy for beginners to understand, and anyone can modify it without much effort, because I created it just for a demonstration.
St4tic can do only arithmetic operations (+, -, /, *) on integers, has two control structures, “IF” and “WHILE”, supports importing Java packages and declaring variables, and executes ONLY public static methods such as System.out.println.
Not bad?
Before viewing the St4tic grammar, just remember that St4tic is an interpreted language like Perl or Python: it reads text (source code) from a file, parses it, and creates an object tree that is then interpreted (its instructions are executed).
Fight!
Example file text:
require java lang.
"I'm comment
def var = 13.
while var > 0 do
System:out:println( var ).
var = var - 1.
stop
6.0- Grammar
Open your eyes wide and follow my steps. Remember how I said I'm lazy? I prefer using JTB (Java Tree Builder) to generate all the needed source code without much effort. That's what makes this wizard so nice.
First, we create a JTB file. Do you know how? In theory, you have installed JavaCC in your Eclipse by following the setup steps, so you should be good. If you got lost, that's no problem: you still have 98 credits and can go back and restart.
If your JDK version doesn't support generics (templates), try setting your project's Java compiler compatibility to 1.5 (Java 5).
Okay, now we divide the grammar into three big groups: options, tokens, and rules.
Options
options {
JDK_VERSION = "1.5";
STATIC = false;
}
We target Java Development Kit 1.5 (Java 5, JDK_VERSION = "1.5") for compilation compatibility, and we ask for instance methods rather than static ones in the generated parser (STATIC = false).
Tokens
SKIP :
{
" "
| "\t"
| "\n"
| "\r"
| <"\"" (~["\n","\r"])* ("\n"|"\r"|"\r\n")>
}
The first four entries skip spaces between keywords, tabs, and newlines or carriage returns; the last one skips comments. Where Java uses // for a one-line comment, St4tic uses a double quote ("):
"Comment here...
TOKEN :
{
< REQUERE: "require" >
| < IF: "if" >
| < WHILE: "while" >
| < DO: "do" >
| < STOP: "stop" >
| < DEF : "def" >
}
At a glance, we can see that these are the St4tic reserved keywords! St4tic has only six of them.
The "require" keyword is used to import a Java library, like "import" in Java:
require java lang.
This imports all “java.lang” classes.
The def keyword is like my in Perl: it declares a variable, and we can't declare any variable without using def.
def myVar = 1.
def num13 = 13.
“if” and “while” are the classical if-condition and while-loop.
if 1 > 0 do
"do something …
stop
while 1 > 0 do
"repeat in infinite loop …
stop
TOKEN :
{
< DOT: "." >
| < COLON: ":" >
| < EQ: "==" >
| < GT: ">" >
| < LT: "<" >
| < GE: ">=" >
| < LE: "<=" >
| < NE: "!=" >
| < PLUS: "+">
| < MINUS: "-" >
| < MUL: "*" >
| < DIV: "/" >
| < MOD: "%" >
| < ASSIGN: "=" >
}
Here, we can group the symbols into “Math Operation Symbols” (+, -, *, /, %) and “Math Relational Symbols” (>, <, ==, >=, <=, !=).
TOKEN :
{
< INTEGER_LITERAL: ["1"-"9"] (["0"-"9"])* | "0" >
}
Literals are what we might call a “value” or “data” in St4tic, for example:
def myAge = 24.
def var = 666.
if 11 > 10 do
…
stop
The values "24, 666, 11, 10" are parsed as literals.
TOKEN :
{
< IDENTIFIER: <LETTER> (<LETTER>|<DIGIT>)* >
| < #LETTER: ["_","a"-"z","A"-"Z"] >
| < #DIGIT: ["0"-"9"] >
}
Identifiers, unlike literals, are just names, such as the variable names “myAge”, “var”, etc. Now we have completed the tokens, which were not hard at all =). We just need imagination to come up with keywords and symbols, though we can also borrow existing keywords from other programming languages.
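To get a feeling for what the keyword, literal, and identifier tokens accept, the same patterns can be written with ordinary Java regular expressions. This is a hand-written sketch for single words only, not the tokenizer that JavaCC generates; the class and method names are invented for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class TokenKindDemo {
    // The six reserved keywords from the first TOKEN section.
    private static final Set<String> KEYWORDS = new HashSet<>(
            Arrays.asList("require", "if", "while", "do", "stop", "def"));

    // Classifies one word in the same order as the TOKEN sections:
    // keywords first, then integer literals, then identifiers.
    static String kindOf(String word) {
        if (KEYWORDS.contains(word)) {
            return "KEYWORD";
        }
        if (word.matches("[1-9][0-9]*|0")) {
            return "INTEGER_LITERAL";
        }
        if (word.matches("[_a-zA-Z][_a-zA-Z0-9]*")) {
            return "IDENTIFIER";
        }
        return "UNKNOWN";
    }

    public static void main(String[] args) {
        for (String word : new String[] {"def", "myAge", "24", "while", "_tmp1"}) {
            System.out.println(word + " -> " + kindOf(word));
        }
    }
}
```

Checking keywords before identifiers matters: “def” also fits the identifier pattern, and it is only the earlier keyword definition that keeps it reserved.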
Rules
Here is the big challenge: since we want a new programming language, we could adopt a different or even revolutionary organization for parsing. Hmm... but it may be hard to understand if we use a hard organization (syntax). I preferred something easy, like Pascal or Visual Basic.
Before starting:
“If a rule is written as “1 + 1” and you write “1 - 1”, JavaCC throws an exception, because the parser expected “1” followed by “+” and “1”, but found “-” in place of “+” and can't continue.”
Is this understood?
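If not, here is the same idea as a tiny hand-written check. A parser generated by JavaCC reports such a mismatch with its own ParseException; this sketch uses IllegalArgumentException instead, and the class and method names are made up.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ExpectDemo {
    // Checks that the tokens are exactly "1", "+", "1",
    // and fails at the first token that is not the expected one.
    static void parseOnePlusOne(List<String> tokens) {
        Iterator<String> it = tokens.iterator();
        for (String expected : Arrays.asList("1", "+", "1")) {
            String found = it.hasNext() ? it.next() : "<EOF>";
            if (!expected.equals(found)) {
                throw new IllegalArgumentException(
                        "Expected \"" + expected + "\" but found \"" + found + "\"");
            }
        }
    }

    // Convenience wrapper: true when the rule matches, false otherwise.
    static boolean accepts(List<String> tokens) {
        try {
            parseOnePlusOne(tokens);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts(Arrays.asList("1", "+", "1"))); // true
        System.out.println(accepts(Arrays.asList("1", "-", "1"))); // false
    }
}
```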
void Start():{}
{
(
Require() "."
)+
(
StatementExpression()
)*
}
This is the entry point for St4tic parsing; without it, the parser can't start. This rule makes it mandatory to specify at least one “require” (notice the “+”, one or many), followed by the program instructions (notice the “*”, zero or many):
void Require():{}
{
"require"
(
< IDENTIFIER >
)+
}
Here, a package import can be one word after “require” or many, like:
require java .
require java lang .
...
After the imports, we can write the St4tic script itself, the “statement expression”:
void StatementExpression():{}
{
VariableDeclaration()
| LOOKAHEAD(2) VariableAssign()
| JavaStaticMethods()
| IfExpression()
| WhileExpression()
}
A “statement expression” is the program body or algorithm. It can contain many variable declarations, variable assignments, logical tests (if, while), or Java method calls (remember, in St4tic only public static methods).
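The LOOKAHEAD(2) in this rule is there because VariableAssign and JavaStaticMethods both begin with an identifier, so one token is not enough to choose between them. A hand-written sketch of the same decision (not the code JavaCC generates; names are invented) could look like this:

```java
class LookaheadDemo {
    // Picks the StatementExpression alternative from the first two tokens.
    static String choose(String first, String second) {
        if (first.equals("def")) {
            return "VariableDeclaration";
        }
        if (first.equals("if")) {
            return "IfExpression";
        }
        if (first.equals("while")) {
            return "WhileExpression";
        }
        // Both remaining alternatives start with an identifier,
        // so the second token has to decide.
        if (second.equals("=")) {
            return "VariableAssign";
        }
        if (second.equals(":")) {
            return "JavaStaticMethods";
        }
        throw new IllegalArgumentException(
                "No statement starts with: " + first + " " + second);
    }

    public static void main(String[] args) {
        System.out.println(choose("var", "="));    // VariableAssign
        System.out.println(choose("System", ":")); // JavaStaticMethods
    }
}
```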
void VariableDeclaration():{}
{
"def" VariableName() "=" MathExpression() "."
}
void VariableAssign():
{}
{
VariableName() "=" MathExpression() "."
}
As you can see, a variable declaration and a variable assignment are identical, except that a declaration must start with “def” to define the variable.
void JavaStaticMethods():{}
{
< IDENTIFIER >
(
":" < IDENTIFIER >
)+
"(" MathExpression() ( "," MathExpression() )* ")" "."
}
As its name says =), this rule invokes only static methods, following this pattern:
ClassName:[Method|Members]( number ), for example “System:out:println(1)”. Like Java?
Yes, just with the dot [.] replaced by a colon [:].
void IfExpression():{}
{
"if" RelationalExprssion() "do"
(
StatementExpression()
) *
"stop"
}
void WhileExpression():{}
{
"while" RelationalExprssion() "do"
(
StatementExpression()
) *
"stop"
}
Easy and simple “IF” and “WHILE” rules. Finally, here is the full grammar source code. For now it's just an empty parser that checks the syntax without interpreting it (no result). In the next chapters, we add an interpreter for it.
options {
JDK_VERSION = "1.5";
STATIC = false;
}
PARSER_BEGIN(St4tic)
package st4tic;
import st4tic.syntaxtree.*;
import st4tic.visitor.*;
public class St4tic
{
public static void main(String args[]) {
try {
Start start = new St4tic(new java.io.StringReader(
"require java lang.\n" +
"def var = 13.\n" +
"while var > 0 do\n" +
"System:out:println( var ).\n" +
"var = var - 1.\n" +
"stop\n"
) ).Start();
start.accept( new DepthFirstVisitor () );
System.out.println("Right! No errors found! =)");
} catch (Exception e) {
System.out.println("Oops.");
System.out.println(e.getMessage());
}
}
}
PARSER_END(St4tic)
SKIP :
{
" "
| "\t"
| "\n"
| "\r"
| <"\"" (~["\n","\r"])* ("\n"|"\r"|"\r\n")>
}
TOKEN :
{
< REQUERE: "require" >
| < IF: "if" >
| < WHILE: "while" >
| < DO: "do" >
| < STOP: "stop" >
| < DEF : "def" >
}
TOKEN :
{
< DOT: "." >
| < COLON: ":" >
| < EQ: "==" >
| < GT: ">" >
| < LT: "<" >
| < GE: ">=" >
| < LE: "<=" >
| < NE: "!=" >
| < PLUS: "+">
| < MINUS: "-" >
| < MUL: "*" >
| < DIV: "/" >
| < MOD: "%" >
| < ASSIGN: "=" >
}
TOKEN :
{
< INTEGER_LITERAL: ["1"-"9"] (["0"-"9"])* | "0" >
}
TOKEN :
{
< IDENTIFIER: <LETTER> (<LETTER>|<DIGIT>)* >
| < #LETTER: ["_","a"-"z","A"-"Z"] >
| < #DIGIT: ["0"-"9"] >
}
void Start():{}
{
(
Require() "."
)+
(
StatementExpression()
)*
}
void Require():{}
{
"require"
(
< IDENTIFIER >
)+
}
void MathExpression():{ }
{
AdditiveExpression()
}
void AdditiveExpression():{}
{
MultiplicativeExpression() ( ( "+" | "-" )
MultiplicativeExpression() )*
}
void MultiplicativeExpression():{}
{
UnaryExpression() ( ( "*" | "/" | "%" ) UnaryExpression() )*
}
void UnaryExpression():{}
{
"(" MathExpression() ")" | < INTEGER_LITERAL > | VariableName()
}
void RelationalExprssion():{}
{
RelationalEqualityExpression()
}
void RelationalEqualityExpression():{}
{
RelationalGreaterExpression()
(
(
"==" | "!="
)
RelationalGreaterExpression()
)*
}
void RelationalGreaterExpression():{}
{
RelationalLessExpression()
(
(
">" | ">="
)
RelationalLessExpression()
)*
}
void RelationalLessExpression():{}
{
UnaryRelational()
(
(
"<" | "<="
)
UnaryRelational()
)*
}
void UnaryRelational():{}
{
< INTEGER_LITERAL > |
VariableName()
}
void IfExpression():{}
{
"if" RelationalExprssion() "do"
(
StatementExpression()
) *
"stop"
}
void WhileExpression():{}
{
"while" RelationalExprssion() "do"
(
StatementExpression()
) *
"stop"
}
void VariableDeclaration():{}
{
"def" VariableName() "=" MathExpression() "."
}
void VariableAssign():
{
}
{
VariableName() "=" MathExpression() "."
}
void VariableName():{}
{
< IDENTIFIER >
}
void JavaStaticMethods():{}
{
< IDENTIFIER >
(
":" < IDENTIFIER >
)+
"(" MathExpression() ( "," MathExpression() )* ")" "."
}
void StatementExpression():{}
{
VariableDeclaration()
| LOOKAHEAD(2) VariableAssign()
| JavaStaticMethods()
| IfExpression()
| WhileExpression()
}
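The MathExpression, AdditiveExpression, MultiplicativeExpression, and UnaryExpression rules in the listing above are layered so that “*”, “/”, and “%” bind tighter than “+” and “-”. To make that visible, here is a small hand-written evaluator with one method per rule. It is a sketch of the idea only, not the generated parser: it handles integer literals and parentheses but no variables, does no error reporting, and every name in it is invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MathExpressionDemo {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    // Splits the text into integer literals, operators, and parentheses.
    private MathExpressionDemo(String text) {
        Matcher m = Pattern.compile("[0-9]+|[-+*/%()]").matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
    }

    static int evaluate(String text) {
        return new MathExpressionDemo(text).additive();
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "";
    }

    // AdditiveExpression: Multiplicative ( ("+" | "-") Multiplicative )*
    private int additive() {
        int value = multiplicative();
        while (peek().equals("+") || peek().equals("-")) {
            String op = tokens.get(pos++);
            int right = multiplicative();
            value = op.equals("+") ? value + right : value - right;
        }
        return value;
    }

    // MultiplicativeExpression: Unary ( ("*" | "/" | "%") Unary )*
    private int multiplicative() {
        int value = unary();
        while (peek().equals("*") || peek().equals("/") || peek().equals("%")) {
            String op = tokens.get(pos++);
            int right = unary();
            if (op.equals("*")) {
                value *= right;
            } else if (op.equals("/")) {
                value /= right;
            } else {
                value %= right;
            }
        }
        return value;
    }

    // UnaryExpression: "(" MathExpression ")" | INTEGER_LITERAL
    private int unary() {
        String token = tokens.get(pos++);
        if (token.equals("(")) {
            int value = additive();
            pos++; // skip the closing ")"
            return value;
        }
        return Integer.parseInt(token);
    }

    public static void main(String[] args) {
        System.out.println(evaluate("1 + 2 * 3"));   // 7
        System.out.println(evaluate("(1 + 2) * 3")); // 9
    }
}
```

Because additive() calls multiplicative() for each operand, a product is always computed before it is added, which is the same effect the nested grammar rules have on the syntax tree.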
6.1- Code Generation
It is simple and easy. You just need to click “compile with JavaCC” in the context menu, or just save the file (if you have auto-compilation enabled). The results look like this:
If you copy and paste it, you may be shocked by a gentle error if you created your JTB file in another package.
If that happens, change the package name in the JTB file and also in this secret place:
“Project-Properties > JavaCC Options > Tab JTB Option > in p (default) = your new package name”. Now that I have given you the secret solution, I hope I'm no longer responsible for your errors.
If you're lost here, no worries. You still have many credits to restart, maybe now 97 credits!
To try the parser, you just need to run the St4tic.java file. The result is:
“Right! No errors found!”
To test other code, edit the source string here:
Start start = new St4tic(new java.io.StringReader(
"require java lang.\n" +
"def var = 13.\n" +
"while var > 0 do\n" +
"System:out:println( var ).\n" +
"var = var - 1.\n" +
"stop\n"
) ).Start();
6.2- Using Reflection
To use Java reflection, we create a small class with these static methods:
- static method full-identifier : parameters (string : class-name ) : return string
- static method exists-field : parameters ( object : class-instance, string : field-name) : return boolean
- static method get-field-object : parameters ( object : class-instance, string : field-name) : return object
- static method exists-method : parameters( object : class-instance, string : method-name, st4tic-value[] : args) : return boolean
- static method invoke-static-method : parameters ( object : class-instance, string : method-name, st4tic-value[] : args ) : return object
- static method push-package : parameters( string : package-name) : return void
- static method make-object : parameters ( string : class-name ) : return class
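The source of this class is not reproduced in the article, so the following is only a guess at how push-package and full-identifier might cooperate: each “require” adds a package to a list, and a simple class name is resolved by trying every package in turn. The class name and the list are assumptions; only Class.forName is a real API.

```java
import java.util.ArrayList;
import java.util.List;

class PackageResolverDemo {
    // Packages registered so far, in the order they were required.
    private static final List<String> PACKAGES = new ArrayList<>();

    // Remembers a package named by "require", e.g. "java.lang".
    static void pushPackage(String packageName) {
        PACKAGES.add(packageName);
    }

    // Returns the fully qualified name for a simple class name,
    // or null when no registered package contains such a class.
    static String fullIdentifier(String className) {
        for (String packageName : PACKAGES) {
            String candidate = packageName + "." + className;
            try {
                Class.forName(candidate);
                return candidate;
            } catch (ClassNotFoundException e) {
                // Not in this package; try the next one.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        pushPackage("java.lang");
        System.out.println(fullIdentifier("System"));
    }
}
```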
The complicated method is the invocation of static methods (or of methods in general), because in this step we need to choose the right parameter types ourselves, unlike the compiler, which can automatically convert Java primitive types (int to double, long to float, etc.).
Maybe you don't see a real problem, but imagine if you have
- class X;
- and class Z extends X;
- and you have a method myMethod( X x );
If you pass an instance of class Z to myMethod and compile it, your code is accepted with no errors.
But if you use reflection to do it, this is where the mother of all errors shows itself, such as "method does not exist" or "error in object type", because reflection does no automatic casting. You need to do it yourself.
Okay, in our case there's nothing to worry about, as we have a simple method:
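The X and Z situation can be reproduced in a few lines. In this sketch (class and helper names are invented), the exact lookup with Class.getMethod fails for Z, while a manual search using Class.isAssignableFrom finds the method, which is one possible way of doing the cast yourself.

```java
import java.lang.reflect.Method;

class CastingDemo {
    static class X { }
    static class Z extends X { }

    public static String myMethod(X x) {
        return "called with " + x.getClass().getSimpleName();
    }

    // Exact lookup: getMethod wants the declared parameter type,
    // so asking with a subclass type fails.
    static boolean exactLookupWorks(Class<?> argType) {
        try {
            CastingDemo.class.getMethod("myMethod", argType);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Manual "cast": accept any one-argument method with this name
    // whose parameter type can hold a value of the argument type.
    static Method findCompatible(String name, Class<?> argType) {
        for (Method m : CastingDemo.class.getMethods()) {
            Class<?>[] params = m.getParameterTypes();
            if (m.getName().equals(name) && params.length == 1
                    && params[0].isAssignableFrom(argType)) {
                return m;
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(exactLookupWorks(X.class)); // true
        System.out.println(exactLookupWorks(Z.class)); // false
        Method m = findCompatible("myMethod", Z.class);
        System.out.println(m.invoke(null, new Z()));
    }
}
```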
// Needs java.lang.reflect.Method, java.lang.reflect.InvocationTargetException
// and java.util.LinkedList to be imported.
@SuppressWarnings("unchecked")
public static Object invokeStaticSubroutine(
        Object classInstance, String methodName, St4ticValue ... args){
    try{
        // Accept either a Class object or an ordinary instance.
        Class clazz = classInstance instanceof Class ?
            (Class)classInstance : classInstance.getClass();
        if( args != null ){
            // The parameter types are needed to look the method up.
            LinkedList<Class> params = new LinkedList<Class>();
            for( St4ticValue arg : args ){
                params.add( arg.getType() );
            }
            Method method = clazz.getMethod(methodName,
                params.toArray(new Class[]{}));
            // The argument values are needed for the call itself.
            LinkedList<Object> values = new LinkedList<Object>();
            for( St4ticValue arg : args ){
                values.add(arg.getValue());
            }
            return method.invoke(classInstance,
                values.toArray(new Object[]{}));
        }
        else{
            // No arguments: look up and call the parameterless method.
            Method method = clazz.getMethod(methodName, new Class[]{});
            return method.invoke(classInstance, new Object[]{});
        }
    // Every failure is swallowed here, so the caller only sees null.
    } catch (SecurityException se) {
    } catch (NoSuchMethodException nsme) {
    } catch (IllegalArgumentException iae) {
    } catch (IllegalAccessException iae) {
    } catch (InvocationTargetException ate) {
    }
    return null;
}
6.3- Core Creation
The core package is the heart of St4tic data manipulation; it has just four classes.
They are very simple classes, and you can view them in the source code: just getters, setters, and child lookup.
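The core classes themselves are not listed here, so the following is only a rough idea of what a scope with child lookup might look like. The interpreter code in this article does create a St4ticScope from a parent scope and calls pushChild on it; everything else in this sketch, including the ScopeDemo name, the findChild method, and the use of plain integers as values, is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

class ScopeDemo {
    private final ScopeDemo parent; // null for the outermost scope
    private final Map<String, Integer> children = new HashMap<>();

    ScopeDemo(ScopeDemo parent) {
        this.parent = parent;
    }

    // Registers a variable in this scope.
    void pushChild(String name, int value) {
        children.put(name, value);
    }

    // Looks in this scope first, then walks up through the parents.
    Integer findChild(String name) {
        for (ScopeDemo scope = this; scope != null; scope = scope.parent) {
            Integer value = scope.children.get(name);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ScopeDemo global = new ScopeDemo(null);
        global.pushChild("var", 13);
        ScopeDemo loopBody = new ScopeDemo(global);
        loopBody.pushChild("tmp", 1);
        System.out.println(loopBody.findChild("var")); // found in the parent
        System.out.println(global.findChild("tmp"));   // not visible: null
    }
}
```

With this shape, an inner block can read the variables of the blocks around it, while its own variables disappear as soon as nothing refers to its scope any more.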
6.4- Making Interpreter
To keep the interpreter easy, I separated it into another package called “interpreter”, created an interface called “Interpret” containing all the needed methods, and finally implemented it in a class called “Interpreter”.
The methods in the “Interpret” interface were copied from the interface st4tic.visitor.Visitor with their signatures changed, as in Alef++,
from public void visit(Require n);
to public Object visit(Require node, St4ticScope scope, Object ... objects);
Only visit(Start node) keeps its original parameter list, because this method is the entry point of the St4tic interpreter.
public Object visit(Start node) throws Exception {
Enumeration importedPackagesEnum = node.f0.elements();
while( importedPackagesEnum.hasMoreElements() )
{
NodeSequence ns = (NodeSequence) importedPackagesEnum.nextElement();
St4ticReflection.pushPackage( this.visit
( (Require) ns.elementAt( 0 ), null).toString() );
}
if( node.f1.size() > 0 )
{
St4ticScope parent = new St4ticScope( null );
Enumeration statement = node.f1.elements();
while( statement.hasMoreElements() )
{
this.visit
( (StatementExpression)statement.nextElement() , parent);
}
}
return null;
}
The second important method handles variable declaration and the variable's life cycle: if a scope is destroyed, all of its children are lost with it, without any explicit cleanup.
public Object visit(VariableDeclaration node, St4ticScope scope,
Object... objects) throws Exception {
St4ticVariable var = new St4ticVariable();
var.setVariableName( this.visit( node.f1 , scope, objects).toString() ) ;
var.setVariableValue( (St4ticValue) this.visit(node.f3, scope, objects) );
scope.pushChild( var.getVariableName() , var );
return null;
}
Finally, a method for invoking a Java static method.
public Object visit(JavaStaticMethods node, St4ticScope scope,
Object... objects) throws Exception {
String identifier = St4ticReflection.fullIdentifier( node.f0.tokenImage ) ;
if( identifier != null )
{
Object currentObject = St4ticReflection.makeObject ( identifier );
if( currentObject != null ){
Enumeration e = node.f1.elements();
while( e.hasMoreElements() )
{
NodeSequence ns = (NodeSequence) e.nextElement();
if( St4ticReflection.existsField
( currentObject , ns.elementAt( 1 ).toString() ) )
{
currentObject = St4ticReflection.getFieldObject(
currentObject , ns.elementAt( 1 ).toString() );
}
else
{
LinkedList<St4ticValue> params =
new LinkedList<St4ticValue>();
params.add( (St4ticValue) this.visit
(node.f3, scope, objects) );
Enumeration eVal = node.f4.elements();
while( eVal.hasMoreElements() )
{
NodeSequence nsVal =
(NodeSequence) eVal.nextElement();
params.add( (St4ticValue) this.visit(
(MathExpression) nsVal.elementAt(1) ,
scope,
objects) );
}
if( St4ticReflection.existsSubroutine( currentObject ,
ns.elementAt( 1 ).toString() , params.toArray(
new St4ticValue[]{}
)) )
{
return St4ticReflection.invokeStaticSubroutine(
currentObject ,
ns.elementAt( 1 ).toString() ,
params.toArray( new St4ticValue[]{}
)) ;
}
break;
}
}
}
}
return null;
}
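Stripped of the syntax tree handling, the visitor above walks from a class to a field and then to a method. For System:out:println the equivalent plain reflection calls look roughly like this. It is a standalone sketch that uses only the standard java.lang.reflect API and does not call St4ticReflection; the class name is made up.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

class PrintlnViaReflection {
    // Does what System:out:println( value ). asks for, one step per name.
    static void println(int value) {
        try {
            // "System": resolve the class from its full name.
            Class<?> system = Class.forName("java.lang.System");
            // "out": a public static field, so no instance is needed (null).
            Field out = system.getField("out");
            Object stream = out.get(null);
            // "println": pick the overload that takes an int.
            Method println = out.getType().getMethod("println", int.class);
            println.invoke(stream, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        println(1 + 2);
    }
}
```

Note that the overload is chosen with int.class, not Integer.class: as described in the reflection chapter, the parameter types passed to getMethod have to match the declared ones exactly.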
7- System:out:println( 1 + var ).
Now is the moment of truth. Go to Eclipse or your favorite text editor, create a text file called "my-first-programming-language.st4", and type on the first line:
require java lang .
On the second line:
def var = 2 .
On the last line:
System:out:println( 1 + var ) .
Go to the Eclipse “Run...” properties, add “my-first-programming-language.st4” to the arguments, and finally press “Run”. Or, if you use the binary (JAR file), you can just type in your console:
$: java -jar st4tic.jar my-first-programming-language.st4
And you get a very nice output:
3
Congratulations! You win. Thank you for playing; this article is over.
8- Summary
I wanted to write a fun and educational article, because this topic is very broad, and classical articles on it can be discouraging and hard to follow. I hope you are now familiar with JavaCC and Java reflection.
- JavaCC: a tool like Yacc that generates a parser as Java source code from a grammar.
- Java reflection: a library for accessing and manipulating Java objects dynamically.
- Keywords: you can use any language for your keywords (Arabic, Russian, etc.).
Hope you have enjoyed the article. Please give your suggestions and feedback for further improvement. Again, thanks for reading.