Introduction
The Internet revolution rekindled the need for platform
independence. C/C++ were not suited to the task, and that is when Sun's
Java came into the picture. Java was officially launched by Sun in 1995 and
was introduced as a platform-independent, truly object-oriented language. The
objective of this article is to explain how Java achieves its platform
independence. This cannot be understood completely without a good
understanding of compiled and interpreted languages, so the article also
sheds some light on compilers and interpreters. It then goes on to
explain the JVM (Java Virtual Machine) and the CLR (Common Language Runtime,
the runtime environment of Microsoft's .NET technology). Any discussion of
the JVM and the CLR would not be meaningful without discussing Just-In-Time (JIT)
compilation, a concept also covered in the article.
The article is not intended to be an exhaustive study on JVM,
CLR, JIT, Compilers or Programming languages. It is intended to give a big picture
of how these bits and pieces are glued together to achieve platform independence.
Interested readers should consult the references given at the end to find out
more about the topics discussed in this article.
How do computers work?
Computer hardware is like any other machinery. You can switch it on, and
electrons will start flowing through it. That is all a computer can do on its
own. A computer needs to be told specifically what it should do, and
computer programs are the tool for telling it.
Computers understand only one language: machine code.
Machine code is a sequence of binary digits (1s and 0s). The microprocessor manufacturer
(the microprocessor is the heart of a computer) decides which sequence of bits means what.
Imagine that you want to construct your own microprocessor. You will build
various tasks into it, and you need a unique code for each task. A
computer program will issue these codes to initiate the required task in the
microprocessor.
Let us consider the very basic task of moving a value into a
register (a register may be thought of as the microprocessor's extremely fast
internal memory). This task requires the microprocessor to read the value from a
specific memory address and to put it in a specific register. Your microprocessor
thus needs to know the following:
- The specific operation code
- The memory address from which to read the value
- The register number in which to put the value.
Assume that our microprocessor, like many real ones, can apply operations only
to the contents of registers and not to memory directly. It is for this
reason that we have to move numbers from memory into registers before we
can apply operations such as addition, subtraction or division to them.
The move operation is therefore one of the most basic and most frequently
used operations in a microprocessor.
Let us assign a suitable operation code to the move operation:
"0001". Do you see any problems with this code? There is nothing
wrong with our choice of code except that it is only 4 bits wide. This would
allow our microprocessor to have only 2^4 = 16 operations. Obviously
we would need many more than 16 operations to make our microprocessor
commercially viable. So let us change our code for the move operation and
widen it to 12 bits: "0000 0000 0001". Now our processor can handle 2^12 = 4096
operations.
Having decided on the operation code (op code), we need to decide on the
memory address.
If the computer has 512 bytes of RAM (far too small by today's standards, but
enough for illustration purposes), and a single location of RAM is 2
bytes (16 bits) wide, then we have 512 bytes / 2 bytes = 256 addressable
memory locations, and since 2^8 = 256 we need at least 8 bits to
represent a memory address.
Let us assume that our microprocessor has 16 internal registers. Since
2^4 = 16, we need 4 bits to specify a register.
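The sizing arithmetic above can be checked with a few lines of Java. This is only a sketch: the class name FieldWidths and the helper bitsNeeded are made up for this illustration and are not part of any library.

```java
// Sketch: how many bits are needed to give each of `count` items a unique code.
class FieldWidths {
    // Smallest n such that 2^n >= count (count is assumed to be at least 1).
    static int bitsNeeded(int count) {
        int bits = 0;
        while ((1L << bits) < count) {
            bits++;
        }
        return bits;
    }

    public static void main(String[] args) {
        System.out.println("4-bit op code  -> " + (1 << 4) + " operations");
        System.out.println("12-bit op code -> " + (1 << 12) + " operations");
        System.out.println("256 locations  -> " + bitsNeeded(512 / 2) + " address bits");
        System.out.println("16 registers   -> " + bitsNeeded(16) + " register bits");
    }
}
```

Running it reproduces the numbers used in the text: 16 and 4096 operations, 8 address bits and 4 register bits.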
With the above three design decisions, if a program wants a number to be
moved from RAM into a register, it has to issue
the following machine code to our microprocessor:
0000 0000 0001 0000 0011 0010
The first 12 bits identify the op code, the next 8 bits identify
the memory address and the last 4 bits identify the register number. The
above machine code will move the number kept in memory location
"3" into register number "2".
Remember that machine language is the only language that the microprocessor
(and hence the computer) understands. So ALL applications and software
ultimately have to be translated into machine language before they can run on a
computer.
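As a rough illustration of the instruction layout designed above (12-bit op code, 8-bit memory address, 4-bit register number), the following Java sketch packs the three fields into one 24-bit word. The class and method names are hypothetical and exist only for this example.

```java
// Sketch: packing the three fields of our made-up "move" instruction into 24 bits.
class MoveEncoder {
    static final int OP_MOVE_TO_REGISTER = 0b0000_0000_0001; // 12-bit op code from the text

    // Layout: [12-bit op code][8-bit memory address][4-bit register number]
    static int encode(int opCode, int address, int register) {
        return (opCode << 12) | (address << 4) | register;
    }

    // Render the 24 bits in groups of four, the way the article writes them.
    static String toBits(int word) {
        StringBuilder sb = new StringBuilder();
        for (int bit = 23; bit >= 0; bit--) {
            sb.append((word >> bit) & 1);
            if (bit % 4 == 0 && bit != 0) {
                sb.append(' ');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int word = encode(OP_MOVE_TO_REGISTER, 3, 2);
        System.out.println(toBits(word)); // 0000 0000 0001 0000 0011 0010
    }
}
```

Encoding memory location 3 and register 2 gives back exactly the bit pattern shown earlier in this section.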
Although our microprocessor is a simple one, today's commercial microprocessors
are all built on the same principle. Every microprocessor has its own op codes
(like the move op code of our simple microprocessor) and its own addressing
schemes. At the time of writing, Apple computers are built around Motorola
microprocessors, while IBM and IBM-compatible computers are built around Intel
processors.
The microprocessor that we have built in this section is about as simple as
one can be. See [1] to find out about the latest microprocessor
developments.
What are compilers?
Let us use our simple microprocessor (the one we built in the
"How do computers work?" section) and try to do the simplest of tasks, namely adding
two numbers. This task requires the following steps:
- move the first number from memory to register #1
- move the second number from memory to register #2
- add the contents of register #1 to the contents of register #2
and put the result in register #3
- move the contents of register #3 to memory.
We designed a sample machine code for the move operation in
the section above. That move operation could move a number from memory to a
register. We also need another move operation that does the opposite,
i.e. moves a number kept in a register to a memory location. We will use the
following machine code for such an operation:
0000 0000 0010 0001 0000 0011
The first 12 bits are the op code, the next 4 bits are the
register number and the remaining 8 bits are the memory location. Note that the
op code of this move operation is 0000 0000 0010 (equal to "2").
Next we need to design a code for the add operation:
0000 0000 0011 0001 0010 0011
Here the first 12 bits are the op code (equal to "3"), the next two groups of
4 bits are the registers to add, and the last 4 bits are the register that
receives the result.
With the above machine codes we can instruct our computer to perform the
addition task as given below:
Instruction 1
0000 0000 0001 0000 0011 0001
(moving the number from memory location "3" to register #1)
Instruction 2
0000 0000 0001 0000 0100 0010
(moving the number from memory location "4" to register #2)
Instruction 3
0000 0000 0011 0001 0010 0011
(adding register #1 and register #2, and putting the result in register #3)
Instruction 4
0000 0000 0010 0011 0000 0100
(moving the contents of register #3 to memory location "4")
The above machine code will instruct the computer to add two numbers.
The CPU will execute the first instruction and will then increment its Program
Counter register (this register keeps the memory address of the next instruction
to execute), so that the register now points to the next instruction (i.e.
instruction 2).
Now the CPU will fetch that instruction from RAM into its
cache/registers and execute it. Once it has executed, the same process is repeated
for the next instruction, until the complete program (i.e. all the machine
language instructions) has been executed, at which point control is given
back to the operating system. This sequence of operations is called the fetch-execute
cycle and is a characteristic of the von Neumann architecture (the architecture
around which all of today's PCs are built). It should be noted that the execution
of an instruction usually takes far less time than the fetch. This
is because execution is implemented in hardware, while the fetch involves
moving data back and forth between memory and the cache/registers. So in
compiled code the bottleneck is the fetch operation.
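The fetch-execute cycle can be made concrete with a small simulator for the imaginary microprocessor of this article. The sketch below is written in Java purely for illustration. It makes two simplifying assumptions that are not in the text: the program is kept in its own array rather than in the same RAM as the data, and the program counter counts instructions rather than bytes. The class name TinyCpu is invented for this example.

```java
// Sketch: a fetch-execute loop for the article's imaginary microprocessor.
class TinyCpu {
    static final int MOVE_TO_REG = 1, MOVE_TO_MEM = 2, ADD = 3; // op codes from the text

    final int[] memory = new int[256];   // 256 addressable locations
    final int[] registers = new int[16]; // 16 registers
    int programCounter = 0;

    void run(int[] program) {
        while (programCounter < program.length) {
            int word = program[programCounter]; // fetch
            programCounter++;                   // now points at the next instruction
            int opCode = word >>> 12;           // top 12 bits
            int operands = word & 0xFFF;        // low 12 bits
            switch (opCode) {                   // execute
                case MOVE_TO_REG: // [8-bit address][4-bit register]
                    registers[operands & 0xF] = memory[operands >> 4];
                    break;
                case MOVE_TO_MEM: // [4-bit register][8-bit address]
                    memory[operands & 0xFF] = registers[operands >> 8];
                    break;
                case ADD:         // [source register][source register][destination register]
                    registers[operands & 0xF] =
                        registers[operands >> 8] + registers[(operands >> 4) & 0xF];
                    break;
                default:
                    throw new IllegalStateException("Unknown op code " + opCode);
            }
        }
    }

    public static void main(String[] args) {
        TinyCpu cpu = new TinyCpu();
        cpu.memory[3] = 20;
        cpu.memory[4] = 22;
        int[] program = {
            0b0000_0000_0001_0000_0011_0001, // memory[3] -> register 1
            0b0000_0000_0001_0000_0100_0010, // memory[4] -> register 2
            0b0000_0000_0011_0001_0010_0011, // register 1 + register 2 -> register 3
            0b0000_0000_0010_0011_0000_0100, // register 3 -> memory[4]
        };
        cpu.run(program);
        System.out.println(cpu.memory[4]); // 42
    }
}
```

The four words in the program array are the four instructions listed above, so the simulator ends with the sum stored in memory location 4.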
Every microprocessor has its own machine code. Our extremely simple microprocessor
had its own machine code, Intel would have its own code and so would Motorola.
For a human being it is almost impossible to remember machine codes and to
develop even a small application with them. That is where higher
level languages come in.
In C/C++, to add two numbers you would write the following code:
int i = 2;
int j = 3;
int k;
k = i + j;
Compare the C code with the machine code. It is far more practical
to write applications in a high level language like C/C++ than to write the same
application in machine language. The problem is that the computer does not
understand anything but machine language. What we need is some sort of
translator that will take the high level C/C++ code and translate it into
machine code. Such a "translator" is called a compiler.
The compiler is a program that takes in the C/C++ file(s) as an
input and outputs an executable file, that can then be directly run on the host
computer.
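To give a feel for what "translation" means here, the following Java sketch is a toy compiler that understands exactly one statement shape, x = a + b;, and emits the four 24-bit instructions of the imaginary microprocessor used in this article. It is a teaching sketch, not how a real C compiler works: the class name, the memory addresses chosen for the variables and the fixed choice of registers 1, 2 and 3 are all assumptions made for the example.

```java
import java.util.Map;

// Sketch: a "compiler" that understands exactly one statement shape, `x = a + b;`.
class ToyCompiler {
    // Translates the statement into four 24-bit words for the article's imaginary CPU.
    static int[] compile(String statement, Map<String, Integer> addresses) {
        String[] parts = statement.replace(";", "").trim().split("\\s*[=+]\\s*");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected the form x = a + b;");
        }
        int target = addresses.get(parts[0]);
        int left = addresses.get(parts[1]);
        int right = addresses.get(parts[2]);
        return new int[] {
            (1 << 12) | (left << 4) | 1,         // move memory[left]  -> register 1
            (1 << 12) | (right << 4) | 2,        // move memory[right] -> register 2
            (3 << 12) | (1 << 8) | (2 << 4) | 3, // add register 1, register 2 -> register 3
            (2 << 12) | (3 << 8) | target,       // move register 3 -> memory[target]
        };
    }

    public static void main(String[] args) {
        // Hypothetical memory layout chosen for the variables i, j and k.
        Map<String, Integer> addresses = Map.of("i", 3, "j", 4, "k", 5);
        for (int word : compile("k = i + j;", addresses)) {
            // Print each instruction as 24 binary digits.
            System.out.println(
                String.format("%24s", Integer.toBinaryString(word)).replace(' ', '0'));
        }
    }
}
```

A different microprocessor would need a different table of op codes and field layouts, which is exactly why each platform needs its own compiler.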
As I have already mentioned, there are multiple (and incompatible)
microprocessors, which means that we need separate compilers for separate
hardware. Thus the same C code has to be compiled using a C compiler for the
Apple Macintosh in order to run on an Apple computer. If you want the same
C code to run on Microsoft Windows running on the Intel platform, then you
have to compile your C code using a C compiler for Windows.
Simply put, a compiler converts a source code file (which is a
simple text file) into an executable file that can be run on the host computer.
Those familiar with C/C++ will realize that this is an oversimplification.
Your C/C++ code is not directly converted into an .exe file; it is first
converted into an intermediate file called an object file (.obj). If you have five
C/C++ files in your project then the compiler generates five .obj files, one
for each C/C++ file, but only one .exe file. The object file is at a slightly
higher level than the raw .exe file. In an object file the memory references are
local, and the .obj file is not yet linked to the other .obj, .dll and .lib files that your
C/C++ program uses.
When you use the include statement #include <myfile.h>
in your C/C++ code, the compiler checks for the existence of the file myfile.h.
If it does not find the file you are given an error message and the compilation
fails. Imagine that the file myfile.h exists, and you have
used a function addNumber(int, int) that is supposed to be declared in
myfile.h. The compiler will check whether the
function has indeed been declared in myfile.h. If the declaration does not exist then the
compilation will fail with an error message. If the function has been
declared in myfile.h, the compiler will finish compilation successfully,
unless there is some other error.
After successful compilation, the compiler will generate an .obj
file and will initiate the linker. The linker is a program that takes
in all the .obj files in your project and looks for all the cross-referenced
files and all the needed libraries. In our example above, the compiler ensures
that myfile.h exists. The linker ensures that the .lib file that goes with myfile.h
also exists. The .lib file is the file that contains the code of all the
functions declared in myfile.h. Another important task of the linker is
to resolve operating system API (Application Programming Interface) calls to the appropriate memory addresses. Many
operating systems provide I/O APIs, so the programmer need not reinvent the
wheel; in our programs we simply make function calls to the operating
system's API functions. The linker works out where the code of these functions resides and translates
the function calls into the appropriate memory addresses within the operating system's
memory space.
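The core of the linker's job, turning function names into addresses and complaining when a name cannot be found, can be sketched as follows. This Java example works on made-up data: the function sizes, the resulting addresses and the error text are all invented for illustration and do not correspond to any real linker.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: symbol resolution, the core job of a linker, on made-up data.
class ToyLinker {
    // Places the functions one after another and records where each one starts.
    static Map<String, Integer> layOut(LinkedHashMap<String, Integer> functionSizes) {
        Map<String, Integer> addressOf = new LinkedHashMap<>();
        int nextAddress = 0;
        for (Map.Entry<String, Integer> function : functionSizes.entrySet()) {
            addressOf.put(function.getKey(), nextAddress);
            nextAddress += function.getValue();
        }
        return addressOf;
    }

    // Replaces every called name with its address, or fails if a name is unknown.
    static List<Integer> resolve(List<String> calls, Map<String, Integer> addressOf) {
        List<Integer> resolved = new ArrayList<>();
        for (String name : calls) {
            Integer address = addressOf.get(name);
            if (address == null) {
                throw new IllegalStateException("unresolved symbol: " + name);
            }
            resolved.add(address);
        }
        return resolved;
    }

    public static void main(String[] args) {
        LinkedHashMap<String, Integer> functionSizes = new LinkedHashMap<>();
        functionSizes.put("main", 40);      // code from the program's own .obj file
        functionSizes.put("addNumber", 16); // code from the .lib file behind myfile.h
        Map<String, Integer> addressOf = layOut(functionSizes);
        System.out.println(resolve(List.of("addNumber"), addressOf)); // [40]
        try {
            resolve(List.of("subtractNumber"), addressOf);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // the .lib file did not contain this function
        }
    }
}
```

The second call fails in the same spirit as a link error: the header may have declared the function, but no library supplied its code.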
[Diagram: a simplified view of what a compiler does]
I will not discuss lexical, syntax and semantic analyzers, or the code
generator. Interested readers should see the reference section for details on
these topics.
What are interpreters?
If compilers are one extreme in running programming languages,
then pure interpreters are the other extreme. Pure interpreters do not do any
code translation of the kind done by compilers. These interpreters take the source code
(which is written in a high level language) and start executing the statements on
the host machine, one by one. Pure interpreters are unable to do any code
optimization at all. They are also unable to check the syntax of the whole program in advance,
as compilers do. Examples of pure interpreters are the scripting languages
that come with operating systems. Shell scripts in Unix/Linux, and
batch files (.bat) and command files (.cmd) in Microsoft Windows, are all
examples of purely interpreted languages. When you make a batch file you simply
write the high level code and save the file with a .bat extension.
To run your .bat file you simply type the name of the file at the command prompt.
The operating system reads the first line of the file and (tries to) execute
the first statement. If the execution is successful you get the desired results;
if the execution cannot be carried out due to a syntax error, you will see a
"Bad command or file name" error message in the command prompt window.
The same applies to shell scripts written in Unix/Linux.
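The behaviour described above, where a mistake is noticed only when execution reaches it, can be sketched with a tiny interpreter for a made-up script language that knows a single command, echo. The Java class below is an illustration only and is not how any real shell is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a pure interpreter for a made-up one-command script language.
class PureInterpreter {
    // Executes the script line by line and returns everything it printed.
    static List<String> run(String script) {
        List<String> output = new ArrayList<>();
        for (String line : script.split("\n")) {
            String[] words = line.trim().split("\\s+", 2);
            switch (words[0]) {
                case "echo":
                    output.add(words.length > 1 ? words[1] : "");
                    break;
                case "":
                    break; // blank line, nothing to do
                default:
                    // The mistake is only discovered when execution reaches it.
                    output.add("Bad command or file name: " + words[0]);
                    return output;
            }
        }
        return output;
    }

    public static void main(String[] args) {
        String script = "echo first\necho second\nehco third\necho fourth";
        run(script).forEach(System.out::println);
    }
}
```

The first two lines of the sample script run normally before the misspelt third line is reported. A compiler would have rejected the whole file before running anything.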
Some commercial programming languages are known as
interpreted languages, e.g. BASIC, Java and Tcl/Tk. And yet these languages do not behave
quite like the description given above. The reason is simple: none of the popular
modern programming languages is based on a pure interpreter. They are either compiled
(like C/C++) or adopt a hybrid approach (like Java, BASIC, Tcl/Tk). The pure
and hybrid approaches are illustrated by the following two diagrams:
[Diagram: Pure interpreter]
[Diagram: Hybrid compiler-interpreter]
As the diagrams show, today's popular interpreted
languages are not purely interpreted. They use "compilation"
to produce an intermediate code (e.g. Microsoft's Intermediate Language,
MSIL, or Sun's Java byte code). It is this intermediate language that the
interpreter works on, and not the original high level source code. This approach
removes many of the problems inherent in purely interpreted languages, and gives
many of the advantages of fully compiled languages.
Readers should note that both interpreters and compilers eventually convert
the source code to machine language; after all, the computer can only run a program
in machine language. A compiler does this conversion off-line and in one go
(as discussed in the "What are compilers?" section),
whereas an interpreter does this conversion one program statement at a time. A
compiled program runs in a fetch-execute cycle, whereas an interpreted
program runs in a decode-fetch-execute cycle. The decoding is done by
the interpreter, whereas the fetch and execute operations are done by the CPU.
In an interpreter the bottleneck is the decoding phase, and hence an interpreted
program may be 30-100% slower than a compiled program, depending on the workload.
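The hybrid approach can be sketched in a few dozen lines of Java. In the example below a "compiler" translates a small made-up source language into a compact intermediate code once, checking the whole program up front, and a "virtual machine" then decodes and executes that intermediate code one instruction at a time. The source language, the op codes and the class name are all invented for this illustration. Real byte code formats such as Java byte code or MSIL are far richer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch: source is translated once into intermediate code, which a small
// virtual machine then executes.
class HybridRunner {
    static final int PUSH = 0, ADD = 1, MUL = 2; // made-up intermediate op codes

    // "Compiler": checks the whole program up front and emits intermediate code.
    static int[] compile(String source) {
        List<Integer> code = new ArrayList<>();
        for (String line : source.split("\n")) {
            String[] words = line.trim().split("\\s+");
            switch (words[0]) {
                case "push": code.add(PUSH); code.add(Integer.parseInt(words[1])); break;
                case "add":  code.add(ADD); break;
                case "mul":  code.add(MUL); break;
                default: throw new IllegalArgumentException("Syntax error: " + line);
            }
        }
        return code.stream().mapToInt(Integer::intValue).toArray();
    }

    // "Virtual machine": decodes and executes one intermediate instruction at a time.
    static int execute(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (int pc = 0; pc < code.length; pc++) {
            switch (code[pc]) {
                case PUSH: stack.push(code[++pc]); break;
                case ADD:  stack.push(stack.pop() + stack.pop()); break;
                case MUL:  stack.push(stack.pop() * stack.pop()); break;
                default: throw new IllegalStateException("Bad op code " + code[pc]);
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        int[] code = compile("push 2\npush 3\nadd\npush 4\nmul");
        System.out.println(execute(code)); // 20
    }
}
```

Because compile sees the whole program before anything runs, a misspelt command is rejected before execution starts, while execute still pays a decoding cost for every instruction it runs.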
[Flowcharts: execution of a compiled program compared with execution of an
interpreted program]
It is evident from the flowcharts that an interpreted program
has the overhead of decoding each statement one by one; thus in an interpreted
program the bottleneck is the decoding process.
Readers may be asking themselves an obvious question: "Why
are some languages developed as interpreted and others as compiled? What
are the advantages and disadvantages of both these approaches?" This is the topic of the next section.
Pros and cons of compiled and interpreted languages
Languages can be implemented as fully compiled, purely interpreted,
or hybrid compiled-interpreted. As a matter of fact, most current programming
languages have both compiled and interpreted versions available.
Both the compiled and the interpreted approach have their advantages
and disadvantages. I will start with compiled languages.
Compiled languages
- One of the biggest advantages of compiled languages is their execution speed.
A program written in C/C++ typically runs 30-70% faster than an equivalent program
written in Java.
- Compiled code also takes less memory than an interpreted program.
- On the down side, a compiler is much more difficult to write than an interpreter.
- A compiler does not provide much help in debugging a program. How many
times has your C code crashed on a null pointer, leaving you to
spend hours trying to figure out where in your source code the
crash occurred?
- The executable compiled code is much bigger than the equivalent interpreted
code, e.g. a C/C++ .exe file is much bigger than an equivalent Java .class
file.
- Compiled programs are targeted at a particular platform and hence are platform dependent.
- Compiled programs do not allow security to be enforced within the code,
e.g. a compiled program can access any area of memory, and can do whatever
it wants with your PC (most viruses are written in compiled languages).
- Due to loose security and platform dependence, a compiled language is not
particularly suited to developing Internet or web-based applications.
Interpreted languages
- Interpreted languages provide excellent debugging support. A Java programmer
spends only a few minutes fixing a "Null pointer exception", because the
Java runtime not only specifies the nature of the exception but also gives the
exact line number and function call sequence (the famous stack trace information)
where the exception occurred. A compiled program does not offer this
facility out of the box.
- Another advantage is that interpreters are much easier to build than compilers.
- One of the biggest advantages of interpreters is that they make platform independence
possible.
- Interpreted languages also allow a high degree of security, something badly
needed for an Internet application.
- Intermediate language code is much smaller than compiled executable
code.
- Platform independence and tight security are the two most important factors
that make an interpreted language ideally suited for Internet and web-based
applications.
- Interpreted languages have some serious drawbacks. Interpreted applications
take up more memory and CPU resources. This is because in order to run a program
written in an interpreted language, the corresponding interpreter must be run
first. Interpreters are sophisticated, intelligent and resource-hungry programs,
and they take up a lot of CPU cycles and RAM.
- Due to the decode-fetch-execute cycle of interpreted applications, they are much
slower than compiled programs.
- Interpreters also do a lot of code optimization and security violation checking
at run time; these extra steps take up even more resources and slow
the application down further.
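The debugging support mentioned in the list above is easy to see for yourself. The following small Java program deliberately uses a null reference and then prints what the runtime reports. The class and method names are made up for the example.

```java
// Sketch: what the Java runtime reports when a null reference is used.
class StackTraceDemo {
    static int lengthOf(String text) {
        return text.length(); // throws NullPointerException when text is null
    }

    // Returns the name of the method in which the exception was raised.
    static String failingMethod() {
        try {
            lengthOf(null);
            return "no exception";
        } catch (NullPointerException e) {
            StackTraceElement top = e.getStackTrace()[0];
            return top.getMethodName();
        }
    }

    public static void main(String[] args) {
        System.out.println("Exception raised in: " + failingMethod());
        try {
            lengthOf(null);
        } catch (NullPointerException e) {
            e.printStackTrace(); // exception type plus the call sequence with line numbers
        }
    }
}
```

The printed stack trace names the exception, the method in which it occurred and the chain of calls that led there, each with its source line number.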
Platform dependence issues in compiled languages
C/C++ is a compiled language, i.e. it follows the fully compiled
approach described above (although at least
one interpreter for C/C++ exists as well). Your C/C++
source file(s) are converted to .obj code, and then a linker converts that to
executable code. This executable code may be run on the host computer. Both
the .obj and the executable code are machine/platform dependent. The .exe file
can only be run on a particular hardware and on a particular operating system.
There are compilers available for almost every known combination of operating
system and hardware. If you have Linux running on Intel then the required compiler
usually comes as part of the Linux installation package. If you have Windows
running on Intel, then you can use one of many compilers such as Borland's or Microsoft's C++ compilers.
Similarly, C/C++ compilers exist for the Apple Macintosh as well. So the only thing
in your C/C++ program that seems to be portable and platform independent is
the actual source code. Sorry to disappoint you here: even this statement
is only partially correct. Your C/C++ code will only be portable if you have
used nothing but ANSI C standards. With the various vendor-specific extensions of C/C++,
it is highly unlikely that your C/C++ code would automatically compile on all
platforms. So if you want to ensure that your code compiles on ALL platforms,
then before using any API or function you should ensure that it is standard
and not vendor specific. The GUI functions available in C/C++/VC++ are
almost always platform dependent. So the simple MessageBox( ) API that you are so accustomed
to in VC++ will not work in Unix. As a matter of fact much of what you
code in VC++ will not work on any other platform; even Windows NT applications may not
run on Windows 2000 and vice versa. So although C/C++ produces some of the
most efficient executables, it falls flat on its face when it comes to platform independence.
While this shortcoming of C/C++ was well known to all, it did not pose any problem
until the Internet became a household tool. The Internet brought with it
the need to have a single application run on multiple platforms without
any changes. This is when Sun rose to the occasion and developed Java.
How does a Java program work?
A Java programmer writes code in a file with the extension
.java. The source file will import several Java
framework classes/packages/libraries, e.g. java.lang, java.util etc. In order
to turn a .java file into a runnable program, the programmer must have the JDK (Java Development
Kit) installed on his/her computer. The JDK is a comprehensive set of software that includes
all the bits and pieces required for developing Java applications. These
include the JRE (Java Runtime Environment), the JVM (Java Virtual Machine; the
JVM is actually a part of the JRE), the Java packages and framework classes, javac (the Java
compiler), and the Java debugger.
Once the program is complete the programmer compiles the
Java source code using the Java compiler. The output of the compiler is a .class
file.
So if you have put your code in a file named Test.java, you would use the javac
program (the Java compiler) to compile your source file into a class file named Test.class.
Test.java is a Java source text file, while
Test.class contains Java byte code: the
machine-independent intermediate code that can be executed on
any computer with the JRE installed.
To run your Test.class file you use the Java Runtime
Environment, through the java command.
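A minimal Test.java for trying out this cycle might look like the sketch below. The greeting method is just an example; the only thing that matters is that the same Test.class can be copied to another platform and run there unchanged.

```java
// Test.java - a minimal program for trying out the compile-and-run cycle.
// Compile:  javac Test.java   (produces Test.class, the byte code)
// Run:      java Test         (the JVM loads and executes Test.class)
class Test {
    static String greeting() {
        // os.name is a standard system property describing the host platform.
        return "Hello from " + System.getProperty("os.name");
    }

    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```

The output changes with the operating system the program runs on, even though the byte code in Test.class stays the same.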
Given above is an extremely simplified discussion of how to run
a Java program. Before you can run your Java programs you may have to set
your CLASSPATH (an environment variable) to point to all the referenced
libraries/packages. You may also have to use javac with appropriate switches
and arguments to properly compile your Test.java file.
The basic idea is that in your Java program you will use Java
framework classes/packages/libraries or even third party packages (e.g. "import
com.wrq.apptrieve.*" will tell the compiler that you will be referencing
the classes in this package). The compiler needs to know the location of
these packages in order to compile Test.java successfully. Once the program is
compiled, the JRE also needs access to these external packages to be able
to run it. The JRE comes with the basic framework
classes/packages, so it is already aware of those; however, for
third party/external packages you will have to tell the JRE where to find them by
setting the CLASSPATH properly.
Once the JRE locates all the necessary packages/files/libraries
it can run your program.
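If you are unsure which locations the JRE is actually searching, a program can ask the JVM directly. The sketch below prints the entries of the class path that the running JVM was started with. The class name ShowClassPath is made up for this example.

```java
import java.io.File;

// Sketch: printing the locations the running JVM searches for classes.
class ShowClassPath {
    // Splits the JVM's class path into its individual entries.
    static String[] entries() {
        String classPath = System.getProperty("java.class.path", "");
        return classPath.split(File.pathSeparator);
    }

    public static void main(String[] args) {
        for (String entry : entries()) {
            System.out.println(entry);
        }
    }
}
```

Besides the CLASSPATH environment variable, the java and javac commands accept a -cp option that sets the class path for a single run, for example java -cp .:lib/thirdparty.jar ShowClassPath on Unix/Linux (the jar name here is hypothetical, and Windows uses ; instead of : as the separator).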
What gives Java its platform independence is the ubiquity of the JRE.
JREs are available for most of the popular commercial platforms. What this
means to a programmer is that he/she needs to code only once and the same program
will run on any of these platforms. This is unlike a program written in Visual
C++/Visual Basic etc., which can only run on the targeted platform.
What is a Java virtual machine?
Before I discuss the JVM in detail, let me clarify a few related terms.
- Java Development Kit (JDK): This includes ALL the basic
Java framework packages, a compiler (javac), the JRE, a JVM, a debugger etc.; in short,
all you need to develop, debug, compile and run your Java program.
- Java Runtime Environment (JRE): This is a subset of the JDK.
It does not include the debugger or the compiler. It includes
the bare minimum that a computer needs in order to run a .class
file: the JVM and the basic framework classes.
- Java Virtual Machine (JVM): The JVM is a part of the JRE. The
.class file is passed over to the JVM, which then runs the program. The
JRE ensures that the code does not violate any of the security restrictions.
Remember that the byte code (.class file) is not directly run on the
host machine; it needs to be converted to the host machine's language. This
conversion is done by the JVM. While converting, the JVM enforces security and
may also optimize the code. There are many commercial JVMs available in the
market; different JVMs have different capabilities and varying degrees of
performance. In order to produce efficient code with minimum delay, a JVM
needs to have a great amount of intelligence built into it, which also makes
the JVM larger. Remember that for a Java program to run, the JVM must be
loaded into memory, and a large JVM obviously needs
more computer resources than a compact one. So there has to be a fine balance
between the size of a JVM and its capabilities. This is one reason why a Java program is
typically 30-70% slower than an equivalent C++ program.
The initial JVMs were extremely slow and resource hungry,
which explains the constant churning of your hard disk when you ran a Java
program. In recent years many efficient JVMs have surfaced. These JVMs use
different compilation techniques to produce efficient machine code in as little
time as possible. One such technique is called Just-In-Time (JIT) compilation.
This technique is also used in .NET.
Just-In-Time (JIT) compilation: A detailed discussion of
Just-In-Time compilation may be found in the references of this article. I will
only discuss JIT briefly.
Just-in-time (JIT) compilers promise to improve the performance
of Java applications. Rather than letting the JVM interpret byte code, a JIT compiler
translates the code into the host machine's native language. Thus, applications gain
the performance of compiled code while maintaining Java's
portability [8].
[Diagram: how JIT compilation works [5]]
A simple JVM without the JIT enhancement receives the
Java byte code (.class file), converts one instruction at a time to the
host machine's machine code and runs it; the overhead and
delay in this approach are obvious and have already been discussed in this
article. When a JIT is used, the JIT compiler converts the byte code
of the .class file directly into the host machine's native machine language
and runs that, thus reducing the overhead. JVMs in use today generally
have the JIT enhancement switched on by default; if you do not want the JIT,
you need to tell the JRE so explicitly by using the appropriate switches
when running your programs.
Although the JIT compiler provides a great improvement in a program's
execution speed, it involves the overhead of converting the byte code to native
code at run time. It is for this reason that, despite the JIT, Java programs are
still slower than equivalent C/C++ programs.
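One way to get a feel for the effect of the JIT is to time a hot loop with and without it. The program below is a rough sketch for such an experiment, not a rigorous benchmark: the loop is deliberately trivial, and the timings depend entirely on your JVM and machine. The class name JitDemo is made up for this example.

```java
// Sketch: a hot loop whose run time can be compared with and without the JIT.
class JitDemo {
    // Adds up the numbers 1..n in a plain loop.
    static long sumTo(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        for (int round = 1; round <= 5; round++) {
            long start = System.nanoTime();
            long result = sumTo(50_000_000);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println("round " + round + ": " + result + " in " + millis + " ms");
        }
    }
}
```

On HotSpot-based JVMs, running it normally (java JitDemo) and then again with the -Xint switch (java -Xint JitDemo), which forces interpreted-only execution, lets you compare the two modes on your own machine.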
A Java Applet is a special Java program that is only allowed to
run inside a browser window. When you embed a Java Applet in your web page, the
browser sees the Applet tag and downloads the byte code (the .class
file) for the applet from the specified location. Once the byte code is
downloaded, the browser uses the JVM (included in the browser itself) to run
the Applet, ensuring that the Applet does not call any insecure APIs, mainly
the APIs that access the client machine's hardware.
Given the concept of the JVM, it is obvious that any programming
language that compiles into Java byte code can use the JVM for running its programs.
We are all aware of how Java code (.java) is converted into byte code (.class),
which is then run by the JVM on the host machine. What if we write a
C++ compiler that converts a C++ source file (.cpp) into a Java byte code file
(.class) rather than into an .obj file? Theoretically it is possible;
whether it is practical or not is a different issue altogether. In fact
many languages have compilers that produce Java byte code which
can then be run by the JVM. Details of such languages can be found in [9]. This
undermines Microsoft's claim that the CLR is the only platform to support language
independence. The JVM can also be (and in fact already is) used by different languages.
What is the CLR?
What is Microsoft's Common Language Runtime (CLR)? It is the
lifeline of .NET applications. Before I describe the CLR, let me explain what is meant
by a runtime. A runtime is an environment in which programs are
executed. The CLR is therefore an environment in which we can run our .NET
applications that have been compiled to IL. Java programmers are familiar with the
JRE (Java Runtime Environment). Consider the CLR as an equivalent of the JRE.
[Diagram: components of the CLR]
Let's discuss each component in detail; [12] has an in-depth analysis.
The Common Type System (CTS) is responsible for mapping data types onto a
common format, e.g. how many bytes an integer occupies.
The second component, the IL compiler, takes in the IL
code and converts it to the host machine language. The execution support is
similar to a language runtime (e.g. in VB the runtime was VBRunxxx.dll;
with VB.NET we do not need individual language runtimes anymore).
The security component of the CLR ensures that the assembly (the program being executed)
has permission to execute certain functions. The garbage collector is similar
to the garbage collector found in Java. Its function is to reclaim memory
when an object is no longer in use; this avoids memory leaks and dangling
pointers. The class loader component is similar to the class loader found in
Java. Its sole purpose is to load the classes needed by the executing
application.
Here is the complete picture.
The programmer must first write the source code and then compile
it. Windows programmers have always compiled their programs directly into
machine code, but with .NET things have changed. The language compiler compiles the
program into an intermediate language called "MSIL" or simply "IL"
(much like Java byte code). The IL is fed to the CLR, and the CLR uses the IL
compiler to convert the IL to the host machine code.
.NET introduces the concept of "managed code" and
"unmanaged code". The CLR assumes the responsibility of allocating and
de-allocating memory. Any code that tries to bypass the CLR and attempts to
handle these functions itself is considered "unsafe", and by default the compiler
will not compile it. If the programmer insists on bypassing the CLR's memory
management functionality then he must specifically mark such code using the
"unsafe" and "fixed" keywords (see the C# Programmer's Guide for
details). Such code is loosely referred to here as "unmanaged" code, as opposed to
"managed code" that relies on the CLR for memory allocation and
de-allocation. (Strictly speaking, the term "unmanaged code" is usually reserved
for native code that runs outside the CLR altogether.)
The IL code thus produced has two major issues. First, it
does not take advantage of platform-specific features that could speed up
program execution (for example, if a platform has some complicated graphics
rendering algorithm implemented in hardware then a game would run much faster if
it exploited this feature; however, since IL cannot be platform specific it
cannot take advantage of such opportunities). The second issue is that IL cannot be
run directly on a machine, since it is an intermediate code and not machine
code. To address these issues the CLR uses an IL compiler: it uses JIT compilers
to compile the IL code into native code. In Java the byte
code was originally interpreted by a virtual machine (the JVM). This interpretation caused Java
applications to run extremely slowly. The introduction of JIT in the JVM improved the
execution speed. In the CLR Microsoft has eliminated the virtual machine step: the
IL code is compiled to native machine code and is not interpreted at all. For this
compilation the CLR offers two JIT compilers and an install-time option:
- Econo-JIT: This compiler has a very fast compilation time, but it
produces un-optimized code; thus the program may start quickly but will run
slowly. This compiler is suitable for running scripts.
- Standard-JIT: This compiler has a slower compilation
time, but it produces highly optimized code. Most of the time the CLR will use
this compiler to run your IL code.
- Install-time compilation: This technique allows the CLR
to compile your application into native code at the time of installation. So the
installation may take a few minutes more, but the code will run at speeds
close to those of a native C/C++ application.
Once your program has been compiled into host machine code, it
can begin execution. During execution the CLR provides security and memory
management services to your code (unless you have specifically used unmanaged
code).
Conclusion
It is clear from the above discussion that Microsoft has done
what it does best. It has observed the JRE/JVM for four years, and has then
come up with a more efficient and stable runtime environment that builds on
the strengths of the JRE/JVM and removes its shortcomings.
So what should you expect
when you start using the CLR? According to [11], you should expect your programs to
run faster than equivalent Java programs; but your program will still
run slower than an equivalent C/C++ program, or any other program that is
compiled into machine language ahead of time. That is a limitation that ALL interpreted
languages have, and it is the price you pay for platform independence.
The JVM is available for most platforms (hence your Java
program really is platform independent), while the CLR (at the time of writing of
this article) is only available for Microsoft Windows platforms (hence a .NET
program is not really platform independent; it only promises to be platform
independent). Microsoft has not unveiled any plans to develop the CLR for
other platforms, though it is inevitable that third parties will come up with
CLRs for non-Microsoft platforms.
Editor's Note: Since this article was written Microsoft has announced the
Rotor project which provides a shared-source implementation of the
CLR. There are also a few other projects around; one is the
Mono project which is trying to
bring the CLR to the Linux platform.
In the long run, the JVM and the CLR will move in lockstep.
Each will learn from the other, and the end result will be healthy
competition and a much better runtime environment for the end user.
References:
[1] An introduction to Intel P4 NetBurst Architecture
[2] An introduction to Compilers by ............
[3] Programming languages - implementation and design by .......
[4] A Report on Interpreted Programming languages by Xiaoli Zhang & Helen Wong
[5] Design, Implementation, and Evaluation of Optimizations in a Just-In-Time Compiler
[6] Optimizing Java Bytecodes, Michal Cierniak and Wei Li. In Concurrency: Software and Practice, 1997.
[7] Inlining of Virtual Methods, David Detlefs and Ole Agesen. In ECOOP 1999, http://www.di.fc.ul.pt/ecoop99
[8] Compiler, Interpreters and Byte Code by Alan Joch
[9] JVM and CLR by Jon Udell
[10] Just in time compilers by Matt Welsh
[11] .NET and J2EE application comparison
[12] An Introduction to .NET by Kashif Manzoor