Articles About Bird
Introduction
I'm developing this language because I don't find any of the existing languages
ideal for everything. C++ is known to be one of the fastest languages, but its
syntax (especially header files) and its lack of C#-like high-level features make
development slower, in my opinion. Debugging can also be hard in C++. C#, on the
other hand, is a managed language, which limits low-level programming and
application performance. 3D graphics is about 1.5-2 times slower in
C# than in C++.
I have been working on Bird in C# since March 2010. It's a strongly typed
native language. Its performance currently seems to be competitive with C++
compilers, and it's going to have features from high-level languages as well as
some new ones. There are many things that I haven't implemented yet, but it can
already be used for smaller programs. The syntax is similar to C# and C++, with
some modifications to make code shorter and more readable. I was planning to
make a C# parser too, but I have stopped working on it for now; I will start
working on a new one in the future. The libraries are similar to .NET, and the
basic functions are going to be implemented, so I think they won't be hard to
understand.
Requirements for Running a Program
The samples come with equivalent C++ code for comparing performance. To compile
the C++ versions, MinGW and Clang need to be installed and added to the PATH
variable, but this is optional. To use the Visual C++ compiler, the path to
"vcvarsall.bat" has to be set in the "Run - VC++.bat" files.
Creating Programs with Bird
The compiler can be run from the command line with "Bird.exe", which is in the
"Binaries" directory:
Bird.exe -x -nodefaultlib -lBirdCore -entry Namespace.Main Something.bird -out Something.exe
I've made .bat files for the samples, so using the command line for them
is not needed.
The -x option means that the compiler should run the output file after it has
been compiled. The input files can be Bird, C, C++ or object files.
Libraries can be specified with the -l option. Currently the BirdCore and
BlitzMax libraries are available, and both are included by default;
-nodefaultlib disables them. BlitzMax is another programming language; its
functions are needed for graphics because I haven't implemented them yet.
The output can also be an object or archive file, so that it can be used from
other languages. The output type is specified with the -format option. These
are its possible values:
app | Executable file |
arc | Archive file. It doesn't contain the libraries; they need to be linked to the exe. |
obj | Object file. It only contains the assembly of the .bird files; the other files and libraries are not included. |
Syntax
A Simple Function
using System
void Main()
    Console.Write "Enter a number: "
    var Number = Convert.ToInt32(Console.ReadLine())
    if Number == 0: Console.WriteLine "The number is zero"
    else if Number > 0: Console.WriteLine "The number is positive"
    else Console.WriteLine "The number is negative"
    for var i in 1 ... 9
        Console.WriteLine "{0} * {1} = {2}", i, Number, i * Number
    Console.WriteLine "End of the program"
Code blocks are indicated by the whitespace in front of lines; every line of a
scope has the same amount of leading whitespace. A colon can be used to put the
inner block on the same line as the command, because the compiler needs to know
where the preceding expression ends. If there's no expression, as in an else
statement without if, the colon is not needed.
Functions can be called without brackets if the returned value is not used. In
the for loop the var keyword stands for the type of i, which is the same as the
type of the initial value (1 -> int). The three dots mean that the first value
of i is 1 and that the range includes the value on the right side, so the last
value is 9.
I was thinking about allowing variables to be declared without a type (or the
var keyword), but it could lead to bugs if the name of a variable is
misspelled.
Literals
Number literals can have a different radix and type. $ means hexadecimal and
% means binary. Hexadecimal letters have to be uppercase to distinguish them
from the type notation, which is the lowercase short form of the type at the
end of the number:
$FFb
-$1Asb
%100
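For readers more familiar with C++, the three literals above can be compared with the following declarations. This is only a sketch: it assumes that the b and sb suffixes stand for byte and sbyte (unsigned and signed 8-bit integers), and the constant names are made up for illustration. C++ has no such suffixes, so the type is written in the declaration instead.

```cpp
#include <cstdint>

// Bird: $FFb   -> hexadecimal FF as an unsigned 8-bit value
// (assumed meaning of the "b" suffix)
constexpr std::uint8_t kByteValue = 0xFF;

// Bird: -$1Asb -> negative hexadecimal 1A as a signed 8-bit value
// (assumed meaning of the "sb" suffix)
constexpr std::int8_t kSByteValue = -0x1A;

// Bird: %100   -> binary 100 with the default integer type
constexpr int kBinaryValue = 0b100;
```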
Chained Comparison Operators
I think this could have been implemented in the C languages, because in some
cases it can be useful. Each sub-expression runs only once, so it's also faster
than writing two relations connected with and.
bool IsMouseOver(int x, y, w, h)
    return x <= MouseX() < x + w and y <= MouseY() < y + h
The relational operators in a chain can only point in one direction, which
makes them distinguishable from generic parameters.
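As a comparison, this is roughly what the function has to look like in C++, which has no chained comparisons. MouseX and MouseY here are hypothetical stand-ins that return fixed values and count their calls; the real functions are not part of this sketch.

```cpp
// Hypothetical stand-ins for the MouseX() and MouseY() functions used above.
// They count their calls so the single evaluation can be observed.
static int g_mouseXCalls = 0;
static int g_mouseYCalls = 0;

int MouseX() { ++g_mouseXCalls; return 15; }
int MouseY() { ++g_mouseYCalls; return 25; }

// In C++, "x <= MouseX() < x + w" would compare the bool result of the first
// test with x + w. To evaluate each call only once, the value has to be
// stored in a temporary first.
bool IsMouseOver(int x, int y, int w, int h) {
    const int mx = MouseX();
    const int my = MouseY();
    return x <= mx && mx < x + w && y <= my && my < y + h;
}
```

The temporaries do by hand what the chained operators promise: every sub-expression is evaluated a single time per call.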
Aliases
Aliases are similar to the using alias directive in C# and the typedef
keyword of C++, but in Bird aliases can be created for everything, even for
variables. The order of declaration doesn't matter, so it's possible to do
this:
alias int32 = int
alias int64 = long
alias varalias = variable
int64 variable
Tuples
Tuple Basics
Tuples are similar to structs, but they don't have a name. Unlike .NET tuples,
Bird tuples are value types. A tuple is a grouping of named or unnamed values
that can have different types. For example:
var a = (1, 2)
var b = (1, 2f)
var c = ("Something", 10)
In this case the members are referenced by their index (e.g. a_tuple.0), but
they can be given names:
alias vec2 = (float x, y)
const var v = (2, 3) to vec2
const var vx = v.x
const var vy = v.y
These variables can be declared as constants because the compiler
interprets (2, 3) as a single constant.
Tuples can also be used to swap the values of variables:
a, b = b, a
Tuples as Vectors
Vector operations are based on tuples. The SSE/SSE2 packed instructions
will be emitted for tuple operations rather than through vectorization. The
Cross function can be written as:
alias float3 = (float x, y, z)
float3 Cross(float3 a, b)
    return a.y * b.z - a.z * b.y,
           a.z * b.x - a.x * b.z,
           a.x * b.y - a.y * b.x
Vector functions will be defined in the Math class; some of them are already
implemented. Without the float3 type, this is how it can be written using
unnamed members:
float, float, float Cross((float, float, float) a, b)
    return a.1 * b.2 - a.2 * b.1,
           a.2 * b.0 - a.0 * b.2,
           a.0 * b.1 - a.1 * b.0
Tuple Extraction
It's possible to extract a tuple in a similar way to swapping variables:
float x, y, z
x, y, z = Cross(a, b)
Or, if var is used, it can be written in a single line:
(var x, var y, var z) = Cross(a, b)
The var has to be written before every variable so that the compiler can
decide which ones are existing variables. E.g. if (var x, y, z) = ... were
interpreted as three new variables, it wouldn't be possible to refer to an
existing y or z.
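For comparison, C++17 makes the same distinction between assigning to existing variables and declaring new ones. The sketch below is an analogy only, not how Bird implements tuples; Float3 and the two helper functions are hypothetical names introduced for illustration.

```cpp
#include <tuple>

using Float3 = std::tuple<float, float, float>;

// Cross product returning its three components as one tuple value.
Float3 Cross(const Float3& a, const Float3& b) {
    const auto [ax, ay, az] = a;
    const auto [bx, by, bz] = b;
    return {ay * bz - az * by,
            az * bx - ax * bz,
            ax * by - ay * bx};
}

// Extraction into existing variables, like "x, y, z = Cross(a, b)".
float SumWithExistingVariables(const Float3& a, const Float3& b) {
    float x, y, z;
    std::tie(x, y, z) = Cross(a, b);
    return x + y + z;
}

// Extraction into new variables, like "(var x, var y, var z) = Cross(a, b)".
float ZWithNewVariables(const Float3& a, const Float3& b) {
    auto [x, y, z] = Cross(a, b);
    return z;
}
```

Unlike Bird, C++ cannot mix the two forms in one statement: std::tie only takes existing variables and a structured binding only declares new ones.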
For Loops
for var i in 0 ... 9
for var i in 0 .. 10
Both loops mean the same. Two dots mean that i won't take the value on the
right; with three dots it will take that value too.
for var x, y in 0 .. Width, 0 .. Height
This is the same as two nested loops. x goes from 0 to Width-1, and y goes from
0 to Height-1. The loop over the y variable is the inner one. The break command
exits from both. It can also be written like this:
for var x, y in (0, 0) .. (Width, Height)
If only one number is specified, it will be the initial or the final
value of all variables. So this is the same as the previous one:
for var x, y in 0 .. (Width, Height)
If there are two points, it's possible to write a single for loop that runs
over all the points in the rectangle they span. In this case x goes from P1.0
to P2.0, and y goes from P1.1 to P2.1:
var P1 = (10, 12)
var P2 = (100, 110)
for var x, y in P1 ... P2
    // Something
The step keyword can be used to specify by how much the loop variables are
increased. It can be either a scalar or a tuple, with the same rules. The next
loop increases i by 1 and j by 2 in every cycle:
for var i, j in 1 .. 20 step (1, 2)
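To make the expansion concrete, here is a sketch of the nested loops that the multi-variable forms correspond to, written in C++. It assumes that a step tuple is applied per variable to the same nested loops described above; the function names are hypothetical.

```cpp
#include <utility>
#include <vector>

// Visits the pairs produced by "for var x, y in 0 .. (Width, Height)":
// the right-hand bound is excluded and the loop over y is the inner one.
std::vector<std::pair<int, int>> ExclusiveGrid(int width, int height) {
    std::vector<std::pair<int, int>> visited;
    for (int x = 0; x < width; ++x)
        for (int y = 0; y < height; ++y)
            visited.emplace_back(x, y);
    return visited;
}

// Counts the iterations of "for var i, j in 1 .. 20 step (1, 2)",
// assuming the step tuple is applied per variable to the nested loops.
int CountSteppedIterations() {
    int count = 0;
    for (int i = 1; i < 20; i += 1)
        for (int j = 1; j < 20; j += 2)
            ++count;
    return count;
}
```

Under that assumption i takes 19 values and j takes 10, so the body runs 190 times.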
Other Loops
The while and do-while loops are similar to those in the C languages:
var i = 1
while i < 100
    i *= 2
i = 1
do
    i *= 2
while i < 100
I created two new loops that make the code shorter. repeat does something as
many times as specified in its parameter, and cycle makes an infinite loop.
Structures
Structures can contain fields, methods, constructors, etc. If the type is not
specified, the new operator creates an object of the type that the expression
is converted to. In this program that is the return type. The original type is
the var type, which is always changed automatically to another type:
struct Rect
    public float X, Y, Width, Height
    public Rect(float X, Y, Width, Height)
        this.X = X
        this.Y = Y
        this.Width = Width
        this.Height = Height
    public Rect Copy()
        return new(X, Y, Width, Height)
    public Rect Copy_2()
        return new:
            X = this.X
            Y = this.Y
            Width = this.Width
            Height = this.Height
    public float GetValue(int i)
        switch i
            case 0: return X
            case 1: return Y
            case 2: return Width
            case 3: return Height
            default return 0
I would note that there is never a need for the break command at the end of a
case block. But I'm not sure that the switch statement is needed at all; I
never use it. if conditions are much simpler in my opinion, especially compared
with C#, where the case block must be left with some jump command.
Strings
The most important .NET functions have been implemented.
I haven't made a GC
yet, so objects will remain allocated until the application exits. It's not a
problem for now.
using System
void Main()
    Console.WriteLine "adfdfgh".PadRight(10) + "Something"
    Console.WriteLine "adfdh".PadRight(10) + "Something"
    Console.WriteLine
    Console.WriteLine "adfdfgh".Contains("fdf")
    Console.WriteLine "adfdfgh".Contains("fdfh")
    Console.WriteLine
    Console.WriteLine "adfdfgh".Replace('d', 'f')
    Console.WriteLine "adfdfgh".Replace("d", "ddd")
    Console.WriteLine "adfdfghléáőúó".ToUpper()
Arrays
Reference Typed Arrays
This is how a 1D reference array can be declared and initialized:
var Array1D_1 = new int[234]
var Array1D_2 = new[]: 1, 2, 3
var Array1D_3 = new[]:
    1
    2
    3
var Array1D_4 = new[]:
    1, 2
    3, 4
The compiler takes into account how many dimensions there are before
interpreting the initial values. The values can be separated with both
brackets and new lines. If it finds one dimension fewer than specified, the
new lines are dimension separators too. I'm not sure this is good; I may remove
it in the future because it's a bit ambiguous. But the same can also be written
using only brackets.
var Array2D_1 = new[,]: (1, 2), (3, 4)
var Array2D_2 = new[,]:
    1, 2
    3, 4
var Array2D_3 = new[,]:
    (1000, 1001, 1002, 1003, 1004, 1005
     1006, 1007, 1008, 1009, 1010, 1011)
    (2000, 2001, 2002, 2003, 2004, 2005
     2006, 2007, 2008, 2009, 2010, 2011)
Fixed Size Arrays
These are value types and are stored on the stack. Unlike reference arrays,
their type is marked with the size (e.g. int[10]). This is how they can be
created:
int[5] Arr1 = new
int[5] Arr2 = default
The default keyword is the same as in C#, except that specifying the type is
optional if it can be inferred. In this case it is the type of the destination
variable. The same thing happens with new; written out, it would be
new (int[5])(). For value types new means the same as default. All values in
both arrays are initialized to zero. Initial values can be specified like this:
var FixedArr1D = [0, 1, 2, 3]
var FixedArr2D = [(0, 1), (2, 3)]
byte[4] FixedArr1D_2 = [0, 1, 2, 3]
The FixedArr1D_2 array can be declared without an error, because the
compiler takes the type of the variable into account before evaluating the
initial value.
Fixed size arrays can be converted to reference types with an implicit
conversion:
double[] Arr = [0, 1, 2]
Func [0, 1, 2, 3]
void Func(double[] Arr)
long[], byte[] GetArrays()
    return [0, 1, 2], [2, 3, 4, 5]
Pointer and Length
The notation for this kind of array (or rather tuple) is T[*] (where T is an
arbitrary type), which is actually a short form of (T*, uint_ptr Length). It
can be useful for unsafe programming. I created it because I kept having to
write two variables for the same purpose. Both reference type and fixed size
arrays can be converted to it implicitly:
using System
void OutputFloats(float[*] Floats)
    for var i in 0 .. Floats.Length
        Console.WriteLine Floats[i]
void Main()
    OutputFloats [0, 1, 2]
    var Floats = Memory.Allocate(sizeof(float) * 3) to float*
    for var i in 0 .. 3: Floats[i] = i + 10.5f
    OutputFloats (Floats, 3)
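The idea can be sketched in C++ as a small struct holding a pointer and a length. This is only an analogy: FloatSlice and the helper functions are hypothetical names, and the conversions that Bird performs implicitly have to be written out as functions here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical C++ counterpart of Bird's float[*]: a pointer plus a length.
struct FloatSlice {
    const float* Pointer;
    std::size_t Length;
};

// Sums the elements; it stands in for OutputFloats so the result can be checked.
float SumFloats(FloatSlice floats) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < floats.Length; ++i)
        sum += floats.Pointer[i];
    return sum;
}

// Conversion from a fixed size array: the length comes from the type.
template <std::size_t N>
FloatSlice FromFixedArray(const float (&values)[N]) {
    return {values, N};
}

// Conversion from a heap-allocated array: the length is stored in the object.
FloatSlice FromVector(const std::vector<float>& values) {
    return {values.data(), values.size()};
}
```

A function that takes the pointer-and-length pair accepts both kinds of array without caring where the elements are stored, which is the point of the T[*] notation.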
Parameters with ref, out
With ref it's possible to use a parameter as both input and output. out can be
used for output only, but it makes sure that the variable gets a value:
using System
void OutputFunc(ref int x)
    Console.WriteLine x
    x++
void Func(out int x)
    x = 10
void Main()
    Func out var x
    OutputFunc ref x
    OutputFunc ref x
    OutputFunc ref x
A variable passed with ref must have a value before the function is called;
out parameters must be set to a value before leaving the function. These
checks can be bypassed with unsafe_ref.
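The closest C++ counterpart of both ref and out is a plain reference, which comes without either of the checks described above. The sketch below mirrors the sample; instead of printing, it collects the values in a vector (an addition made here so the result can be inspected), and RunSample is a hypothetical name.

```cpp
#include <vector>

// "ref int x": the caller's variable is read and then modified.
// The value that Bird would print is appended to "printed" instead.
void OutputFunc(int& x, std::vector<int>& printed) {
    printed.push_back(x);
    x++;
}

// "out int x": C++ has no out parameters, so a plain reference is used and
// nothing forces the function to assign a value to it.
void Func(int& x) {
    x = 10;
}

// Mirrors Main from the sample above.
std::vector<int> RunSample() {
    std::vector<int> printed;
    int x;  // not initialized here, like "out var x"
    Func(x);
    OutputFunc(x, printed);
    OutputFunc(x, printed);
    OutputFunc(x, printed);
    return printed;
}
```

Following the semantics described above, the collected values are 10, 11 and 12.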
Named and Optional Parameters
Only the parameters that don't have a default value have to be specified:
IntPtr Graphics(int Width, Height, Depth = 0, Hertz = 60, Flags = 0)
Graphics 800, 600
Graphics 800, 600, 32
With named parameters, the earlier parameters don't need to be specified:
Graphics Width: 800, Height: 600
Graphics 800, 600, Hertz: 75
Properties and Indexers
They are marked with a colon. Properties are handled like variables; when they
are used, the compiler calls the set and get methods. For indexers, parameters
can be specified too:
class Class
    int _Something
    public int Something:
        get return _Something
        set _Something = value
    public int AutomaticallyImplementedProperty:
        get
        set
    public int this[int Index]:
        get return Index * 2
    public int NamedIndexer[int Index]:
        get return this[Index]
void Main()
    Class Obj = new
    Console.WriteLine Obj[3]
    Console.WriteLine Obj.NamedIndexer[4]
Operator Functions
Operators can be defined for structures and classes that wouldn't allow
them by default:
class Class
    int _Something
    public static void operator ++(Class Obj)
        Obj._Something++
void Main()
    Class Obj = new
    Obj++
Getting the Address of an R Value
Sometimes a parameter has to be passed as a pointer. In Bird, the address of
constants and R values can be taken too; the value is automatically copied to
a variable first:
using System
int[,] CreateIntArray2D(uint_ptr Width, Height)
    var Obj = new Array(id_desc_ptr(int[,]), [Width, Height], 4)
    return reinterpret_cast<int[,]>(Obj)
int[] CreateIntArray1D()
    const uint_ptr Length = 16
    uint_ptr[*] Dimensions = (new: Pointer = &Length, Length = 1)
    var Obj = new Array(id_desc_ptr(int[]), Dimensions, 4)
    return reinterpret_cast<int[]>(Obj)
The type of the [Width, Height] expression is uint_ptr[2], so when it is cast
to uint_ptr* the compiler has to take its address. It creates a new variable,
assigns [Width, Height] to it, and takes the address of this variable. It does
the same with &Length in the second function.
reinterpret_cast basically does nothing; it just changes the type of an
expression node, like casting a pointer.
Reference Equality Operator
The === and !== operators can be used to compare the references of objects.
They do the same thing as Object.ReferenceEquals. == can also be used for
this, but it can be overridden with an operator function.
public bool StringReferenceEquals(string A, B)
    return A === B
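The difference between the two comparisons can be sketched in C++ with shared pointers, where identity means comparing addresses. This is an analogy, not Bird's implementation; it assumes, purely for illustration, an == that has been overloaded to compare contents, and both function names are hypothetical.

```cpp
#include <memory>
#include <string>

// Compares object identity, like the === operator:
// true only if both handles refer to the same object.
bool ReferenceEquals(const std::shared_ptr<std::string>& a,
                     const std::shared_ptr<std::string>& b) {
    return a.get() == b.get();
}

// Compares contents, like an == that was overloaded by an operator function
// (an assumption made for this illustration).
bool ValueEquals(const std::shared_ptr<std::string>& a,
                 const std::shared_ptr<std::string>& b) {
    return *a == *b;
}
```

Two separately created strings with the same text are equal by value but not by reference, which is why an identity operator that cannot be overloaded is useful.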
Higher Order Functions
The type of a function can be written with ->. The input parameters are on the
left side and the output parameters on the right side. The calling convention
and modifiers can also be specified, e.g. birdcall string, object -> int, float.
When there are multiple outputs, the return type becomes a tuple. In the future
I plan to allow all functions to have multiple outputs in a similar way.
using System
int GetInt(int x)
    return x + 1
int Test((int -> int) Func)
    return Func(2)
void Main()
    var Time = Environment.TickCount
    var Sum = 0
    for var i in 0 .. 100000000
        Sum += Test(GetInt)
    Time = Environment.TickCount - Time
    Console.WriteLine "Time: " + Time + " ms"
    Console.WriteLine "Sum: " + Sum
This little sample shows how it works. I wrote it in C# too, and these are the
performance results on my machine:
Compiler | Bird | C# |
Time | 719 ms | 2234 ms |
Actually, it is implemented very simply. Higher order functions are just
tuples of an object and a function pointer: (object Self, void* Pointer). The
Self member can be null if the function is static. It's possible to create a
static function pointer type with the static keyword: static int -> float.
When a nonstatic function is called, the Pointer member is converted to a
function pointer. If Self is not null, it is also added to the parameters.
This is how the Test function is expanded:
int Test((int -> int) Func)
    return if Func.Self == null: (Func.Pointer to (static int -> int))(2)
           else (Func.Pointer to (static object, int -> int))(Func.Self, 2)
To make it run faster, the parameter can be replaced with a static function
pointer. This way it runs in 542 ms.
int Test((static int -> int) Func)
    return Func(2)
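The (Self, Pointer) representation described above can be modelled in C++ as follows. This is a sketch under assumptions, not the compiler's actual output: IntToInt, Invoke, Counter and the Make helpers are hypothetical names, and an untyped function pointer is used instead of void* because C++ only guarantees round trips between function pointer types.

```cpp
// Hypothetical C++ model of the (object Self, void* Pointer) tuple.
struct IntToInt {
    void* Self;         // null for static functions
    void (*Pointer)();  // untyped function pointer, cast back before the call
};

// Calls the function the way the expanded Test function does:
// Self is passed as the first argument only when it is not null.
int Invoke(const IntToInt& func, int arg) {
    if (func.Self == nullptr) {
        auto fn = reinterpret_cast<int (*)(int)>(func.Pointer);
        return fn(arg);
    }
    auto fn = reinterpret_cast<int (*)(void*, int)>(func.Pointer);
    return fn(func.Self, arg);
}

// A static function and a method-like function that takes its object first.
int GetInt(int x) { return x + 1; }

struct Counter { int Base; };
int AddBase(void* self, int x) {
    return static_cast<Counter*>(self)->Base + x;
}

IntToInt MakeStatic(int (*fn)(int)) {
    return {nullptr, reinterpret_cast<void (*)()>(fn)};
}

IntToInt MakeBound(Counter* self, int (*fn)(void*, int)) {
    return {self, reinterpret_cast<void (*)()>(fn)};
}
```

The null check in Invoke is the extra work that a static function pointer type avoids, which fits the faster timing quoted above.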
History
- 1/1/2013: Higher order functions, stack alignment, and a lot of refactoring
- 10/11/2012: Scope resolution operator, changed casting operator: the to keyword has the same syntax as the is and as operators, and it doesn't allow ambiguous code
- 22/9/2012: Parameter arrays, better x86 performance
- 18/8/2012: Implemented stackalloc, pointer and length arrays, new and default without specifying the type, static constructors, checked, unchecked, generic parameters with <> (only at reinterpret_cast)
- 18/7/2012: Object type casting, boxing, unboxing, is, as and xor operators, low-level reflection, improved x86 code generation
- 16/6/2012: Exception handling, try-catch-finally, constants also can be declared inside a function
- 19/5/2012: Arrays, object initializations, address can be taken of r values, ref, out parameters, added Visual C++ compilation of samples
- 2/5/2012: Improved performance, identifier aliases instead of typedefs, strings, reference equality operator (===, !==), binary files can be linked into the assembly, changed the name from Anonymus to Bird