In this article I want to talk about a JavaScript library I created and published to npm and NuGet. The library, dubbed arrgh.js, brings proper .NET like collections and LINQ to JavaScript. If you know LINQ in C# you already know how arrgh.js works!
In this article we'll see what arrgh.js is, how it was created, how it's tested and documented and how it's published to npm and NuGet. I've added the code, as it was at the time of this writing, to this article. For the most recent version of the code you can check out the GitHub repository. You can also check out the full documentation.
I'm pretty sure we've all had to work with an array in JavaScript one time or another.
An array that has a forEach function, but only starting with IE9 (and, of course, my customer needed IE8 support).
An array that only recently added support for a contains function, but called it includes (I recently read they chose includes over contains because adding contains to the standard would break a popular framework, what the...).
An array that's also a queue and a stack, and sort of partially a list, it can be added to, but not removed from.
Removing an item is as tedious as searching for the index of the item you want to remove, splitting the array at that point, skipping an item, and putting the remaining parts back together again. Manually.
And can you remember if you need splice or slice?
All in all I found the array to be one big frustration (and that can actually be said for JavaScript as a whole). Needless to say I went looking for alternatives. Basically, what I wanted was C# like collections with LINQ support in JavaScript. Of course this had been done before, but the libraries I found didn't meet all of my requirements, did not work in older browsers, were not sufficiently documented, did not have lazy evaluation, missed collection types such as a Dictionary (HashMap) or did not implement them how I wanted to use them. The best I found was linq.js, but this one wanted to look so much like C# it has PascalCased everything, while JavaScript uses camelCasing (later I found I had downloaded an old version since the latest version does use camelCasing).
So I decided to build my own JavaScript collections and LINQ. Also because that's just really fun to do. I called it arrgh.js, a combination of array and argh!, that last one being my screams of utter frustration when working with JavaScript and the array in particular.
If you want to skip everything and get straight to work you can install it using npm or NuGet.
npm install arrgh.js
Install-Package arrgh.js
A lot of coding adventures start with an empty text file, as did mine. I've created a folder for the project, called arrgh.js, and then a folder for my source code, called src. Within that folder I created the arrgh.js file and started writing. I basically had two options, augment the JavaScript array class (which is considered bad practice and might break either the JavaScript array or my implementation in the future) or start from scratch and create my own collection object. I chose the latter.
My first implementation was just a wrapper around an array. Simple and naïve.
var Enumerable = function (arr) {
this.arr = arr;
};
Enumerable.prototype.forEach = function (callback) {
var i;
for (i = 0; i < this.arr.length; i += 1) {
callback(this.arr[i], i);
}
};
var e = new Enumerable(["Hello", "Enumerable"]);
e.forEach(function (s) {
console.log(s);
});
And then I implemented where and select in pretty much the same way.
Enumerable.prototype.where = function (predicate) {
var filtered = [];
this.forEach(function (e, i) {
if (predicate(e, i)) {
filtered.push(e);
}
});
return filtered;
};
var e = new Enumerable([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);
var evens = e.where(function (n) {
return n % 2 === 0;
});
Easy as that may be it wasn't going to cut it. Imagine, in a later stage, doing something like the following:
Enumerable.range(0, 1000000).where(isEven).take(10);
It would now go through a million elements, check if they're even and then take 10. If we wanted 10 we didn't need to go through more than 20 elements! So let's change this code so that elements are evaluated lazily, only when necessary. We're going to implement the Iterator Pattern. This allows us to go through the elements of a collection one by one, meaning that if we ask for the next 20 elements of a collection that is theoretically infinite we only have to compute the first 20 elements.
So, the Enumerable is going to implement a getIterator method (I wanted to call it getEnumerator, like in C#, but Enumerator is already reserved in JavaScript). getIterator will return an object that can return the next element of a collection. Of course that also means we'll have to rewrite the forEach method.
var ArrayIterator = function (arr) {
var currentIndex = -1;
this.moveNext = function () {
currentIndex += 1;
return currentIndex < arr.length;
}
this.current = function () {
return arr[currentIndex];
};
};
var Enumerable = function (arr) {
this.getIterator = function () {
return new ArrayIterator(arr);
}
};
Enumerable.prototype.forEach = function (callback) {
var iterator = this.getIterator();
var currentIndex = 0;
while (iterator.moveNext()) {
callback(iterator.current(), currentIndex);
currentIndex += 1;
}
};
var e = new Enumerable(["Hello", "Enumerable"]);
e.forEach(function (s) {
console.log(s);
});
The Iterator supports two functions, moveNext and current. This should be familiar as .NET's IEnumerator supports the same. Notice I didn't implement a reset method because .NET only implemented it for COM interoperability, which JavaScript does not do. The same goes for dispose in the generic IEnumerator<T>. The usage, funny enough, stays the same.
Now if we look at the forEach method we see that it first asks for the Iterator using getIterator. Then it moves through the collection by calling moveNext and current. When no more elements are to be had, moveNext returns false and forEach stops looping and returns to the caller. Now there are a few rules that every Iterator should take into account. First, moveNext can be called forever, but once it returns false each subsequent call should return false as well. Also, whenever moveNext returns false, current should return undefined.
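To make these rules concrete, here is a minimal sketch that drives an iterator by hand. It repeats the ArrayIterator from above so it runs on its own; the variable names it, steps, afterEnd and currentAfterEnd are only there for illustration.

```javascript
// Same ArrayIterator as above, repeated so the sketch is self-contained.
var ArrayIterator = function (arr) {
    var currentIndex = -1;
    this.moveNext = function () {
        currentIndex += 1;
        return currentIndex < arr.length;
    };
    this.current = function () {
        return arr[currentIndex];
    };
};

var it = new ArrayIterator(["a", "b"]);
var steps = [];
while (it.moveNext()) {
    steps.push(it.current());
}
// steps now holds "a" and "b".

// Rule 1: once moveNext has returned false it keeps returning false.
var afterEnd = [it.moveNext(), it.moveNext()];

// Rule 2: current returns undefined when there is no current element.
// For this iterator that falls out naturally, because indexing past
// the end of an array yields undefined.
var currentAfterEnd = it.current();
```

This array-backed iterator satisfies both rules without extra bookkeeping; iterators that compute their values, like the ones later in this article, have to reset their current value explicitly.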
There is just a small tweak I want to give to the forEach method. It should implement a sort of break functionality. You don't want to be forced to loop through the entire collection every time you call forEach. So having the callback return false or any falsy value (excluding undefined and null) will break the loop.
function isNull(obj) {
return obj === undefined || obj === null;
}
Enumerable.prototype.forEach = function (callback) {
var iterator = this.getIterator();
var cont;
var currentIndex = 0;
while ((isNull(cont) || cont) && iterator.moveNext()) {
cont = callback(iterator.current(), currentIndex);
currentIndex += 1;
}
};
var e = new Enumerable(["Hello", "Enumerable"]);
e.forEach(function (s) {
console.log(s);
return false;
});
As you can see it's now possible to break out of the loop by returning a falsy value. I decided to not include undefined and null because undefined is the default return value for any function and I don't want to force the user to always explicitly return true (or any truthy value). For simplicity, I chose to treat undefined and null as being the same value (that is, not a value). This is actually how forEach ended up in the published version of arrgh.js.
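A small sketch of these break rules, assuming a trimmed-down Enumerable over an array with the iterator inlined so the block runs on its own. The three result arrays (visitedAll, visitedUntilThree, visitedWithZero) are names made up for this example.

```javascript
function isNull(obj) {
    return obj === undefined || obj === null;
}
// Trimmed-down Enumerable: only wraps an array, iterator inlined.
var Enumerable = function (arr) {
    this.getIterator = function () {
        var index = -1;
        return {
            moveNext: function () {
                index += 1;
                return index < arr.length;
            },
            current: function () {
                return arr[index];
            }
        };
    };
};
// forEach exactly as shown above.
Enumerable.prototype.forEach = function (callback) {
    var iterator = this.getIterator();
    var cont;
    var currentIndex = 0;
    while ((isNull(cont) || cont) && iterator.moveNext()) {
        cont = callback(iterator.current(), currentIndex);
        currentIndex += 1;
    }
};

var e = new Enumerable([1, 2, 3, 4]);

// No return value (undefined): the loop visits every element.
var visitedAll = [];
e.forEach(function (n) {
    visitedAll.push(n);
});

// Returning false at 3 stops the loop after the third element.
var visitedUntilThree = [];
e.forEach(function (n) {
    visitedUntilThree.push(n);
    return n < 3;
});

// 0 is falsy but neither undefined nor null, so it breaks the loop too.
var visitedWithZero = [];
e.forEach(function (n) {
    visitedWithZero.push(n);
    return 0;
});
```

The last case is worth remembering: a callback that happens to return 0 or an empty string stops the iteration just like an explicit false.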
Using this design we have to completely rethink how a function such as where works. In the previous example it returned an array, which is a no go. If we return an Enumerable instead we can chain our functions. However, an Enumerable expects an array as input, which is also not applicable when we use it in where. The point about this whole Iterator thing is that we don't want to evaluate the result right away, instead we want to return an Iterator for a follow up function to use. That sounds awfully difficult, but I believe a code sample says more than a thousand words.
var isArray = function (obj) {
return Object.prototype.toString.call(obj) === "[object Array]";
};
var Enumerable = function (enumerable) {
var getIterator;
if (isArray(enumerable)) {
getIterator = function () {
return new ArrayIterator(enumerable);
};
} else if (typeof enumerable === "function") {
getIterator = enumerable;
} else {
throw new Error("Invalid input parameter.");
}
this.getIterator = getIterator;
};
var WhereIterator = function (source, predicate) {
var iterator = source.getIterator();
var index = -1;
var current;
this.moveNext = function () {
while (iterator.moveNext()) {
index += 1;
current = iterator.current();
if (predicate(current, index)) {
return true;
}
}
current = undefined;
return false;
};
this.current = function () {
return current;
};
};
Enumerable.prototype.where = function (predicate) {
var self = this;
return new Enumerable(function () {
return new WhereIterator(self, predicate);
});
};
Enumerable.prototype.filter = Enumerable.prototype.where;
var e = new Enumerable([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);
var evens = e.where(function (n) {
return n % 2 === 0;
});
First of all, the isArray function. It's utter madness, but this is the only 100% trustworthy way to tell if an object is an array in JavaScript. There is actually an npm package for just that one line of code, which is also utter madness. Newer browsers have this function implemented by default, but I wanted this library to be compatible with IE8. I also want it to be lightweight, meaning no dependencies on other libraries. So that's isArray.
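To see why the toString trick is needed, here is a short sketch. The names typeofArray, arrayLike and results are only for illustration.

```javascript
var isArray = function (obj) {
    return Object.prototype.toString.call(obj) === "[object Array]";
};

// typeof cannot tell an array from any other object.
var typeofArray = typeof [];

// An object that merely looks like an array should not pass the check.
var arrayLike = { 0: "a", length: 1 };

var results = [
    isArray([]),        // a real array
    isArray(arrayLike), // array-like object
    isArray("abc"),     // string
    isArray(null)       // null
];
// Only the first entry of results is true.
```

In environments that support ES5, the built-in Array.isArray gives the same answers for these inputs.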
Next, as you can see, we made the Enumerable accept a parameter that may be an array or a function (which is assumed to be a getIterator function). This allows us to create Enumerables with all sorts of Iterators. The function overload is used by the where function, which passes to it a function that creates a WhereIterator.
Now the WhereIterator, at first sight, looks like a hideous beast (although it's a puppy compared to some other Iterators we'll end up with). The where function is always called on an Enumerable, which will be the source to filter. We get the Iterator of the source and then simply move through it. Whenever an item satisfies the condition we return to the caller and indicate there are possibly more values. When the source has no more items we set the current to undefined and return to the caller indicating no more items are found. Again, the usage of where remains the same.
Last, but not least, we create an alias for where. I thought that would be nice as JavaScript, and other languages as well, use the name filter instead of where.
Because where now returns another Enumerable it becomes really hard to debug this code, after all, the only way of knowing what's in an Enumerable is by enumerating over it using forEach. So let's create another function real quick, toArray. With toArray we can simply convert an Enumerable to a regular JavaScript array and continue our business like usual.
Enumerable.prototype.toArray = function () {
var arr = [];
this.forEach(function (elem) {
arr.push(elem);
});
return arr;
};
var e = new Enumerable([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);
var evens = e.where(function (n) {
return n % 2 === 0;
}).toArray();
Another useful, and really easy, function, for making any collection read-only, is asEnumerable, which we can implement now that the Enumerable takes a function as input.
Enumerable.prototype.asEnumerable = function () {
return new Enumerable(this.getIterator);
};
Now if you test the previous code in the debugger you'll notice getIterator returns an ArrayIterator or a WhereIterator. In the final library there are about 20 Iterators. Wouldn't it be nice if getIterator always returned an Iterator? It allows us to check if an object is any Iterator. So at the very least we're going to need a base class for Iterator.
So, two choices. Either create a base class Iterator and then have ArrayIterator and WhereIterator inherit from it, or create a class Iterator and pass the moveNext and current functions to it, keeping the internals to their respective functions. I chose the latter. Let's see what that means for our code.
var Iterator = function (moveNext, current) {
this.moveNext = moveNext;
this.current = current;
};
var getArrayIterator = function (arr) {
var len = arr.length,
index = -1;
return new Iterator(function () {
if (arr.length !== len) {
throw new Error("Collection was modified, enumeration operation may not execute.");
}
index += 1;
return index < len;
}, function () {
return arr[index];
});
};
var Enumerable = function (enumerable) {
var getIterator;
if (isArray(enumerable)) {
getIterator = function () {
return getArrayIterator(enumerable);
};
} else if (typeof enumerable === "function") {
getIterator = enumerable;
} else {
throw new Error("Invalid input parameter.");
}
this.getIterator = getIterator;
};
Enumerable.prototype.where = function (predicate) {
var self = this;
return new Enumerable(function () {
var iterator = self.getIterator();
var index = -1;
var current;
return new Iterator(function () {
while (iterator.moveNext()) {
index += 1;
current = iterator.current();
if (predicate(current, index)) {
return true;
}
}
current = undefined;
return false;
}, function () {
return current;
});
});
};
As you can see the ArrayIterator and WhereIterator are gone and there is only a single Iterator constructor function left. This approach has pros and cons. The pro is, obviously, that there is only one Iterator class now that can change its behavior according to the constructing function. Another (arguably negligible) pro is that this syntax is a little shorter saving about 2 KB (on 20 KB) in the minified version. The cons are that our code now makes heavy use of closures (which isn't necessarily bad) and that the implementing classes are a little harder to read. I've tucked away the Iterator for arrays in a getArrayIterator function because this Iterator is used by the List class as well, as we'll see in a moment (for that reason I also added the check if length has not changed).
What we've seen so far are the basics of Enumerables and arrgh.js. All other methods, like all, distinct, any, orderBy and select are implemented using the same methodology as where. Simply return an Enumerable with a custom Iterator.
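As an illustration of that methodology, here is how a select could be written in the same style. This is a sketch following the where pattern, not necessarily the code in the published library, and the Enumerable below is simplified (no input validation, no modification check on the array).

```javascript
var Iterator = function (moveNext, current) {
    this.moveNext = moveNext;
    this.current = current;
};
// Simplified Enumerable: accepts a getIterator function or an array.
var Enumerable = function (enumerable) {
    if (typeof enumerable === "function") {
        this.getIterator = enumerable;
    } else {
        this.getIterator = function () {
            var index = -1;
            return new Iterator(function () {
                index += 1;
                return index < enumerable.length;
            }, function () {
                return enumerable[index];
            });
        };
    }
};
// Sketch of select: projects each element through selector, lazily.
Enumerable.prototype.select = function (selector) {
    var self = this;
    return new Enumerable(function () {
        var iterator = self.getIterator();
        var index = -1;
        var current;
        return new Iterator(function () {
            if (iterator.moveNext()) {
                index += 1;
                current = selector(iterator.current(), index);
                return true;
            }
            current = undefined;
            return false;
        }, function () {
            return current;
        });
    });
};

var calls = 0;
var squares = new Enumerable([1, 2, 3]).select(function (n) {
    calls += 1;
    return n * n;
});
// Nothing has been evaluated yet: the selector only runs while iterating.
var callsBeforeIterating = calls;

var it = squares.getIterator();
var result = [];
while (it.moveNext()) {
    result.push(it.current());
}
// result now holds 1, 4 and 9, and the selector ran three times.
```

The selector is called once per element inside moveNext, so calling current repeatedly does not re-run the projection.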
The currently released version of Enumerable is a little more complex, as it also accepts other Enumerables and strings as input and also multiple input parameters (like C# params). It's just a few more ifs though, so you shouldn't have a problem figuring that out.
By the way, here are the implementations of range and take, so you can test out that little code snippet Enumerable.range(0, 1000000).where(n => n % 2 === 0).take(10); and see that it really evaluates only the values 0 through 18.
Enumerable.range = function (start, count) {
if (!isNull(count)) {
if (count < 0) {
throw new Error("Count cannot be lower than 0.");
}
if (start + (count - 1) > Number.MAX_SAFE_INTEGER) {
throw new Error("Start and count can not exceed " + Number.MAX_SAFE_INTEGER + ".");
}
}
return new Enumerable(function () {
if (isNull(count)) {
var moved = false;
return new Iterator(function () {
if (!moved) {
moved = true;
} else {
start += 1;
}
return start <= Number.MAX_SAFE_INTEGER;
}, function () {
if (!moved || start > Number.MAX_SAFE_INTEGER) {
return undefined;
}
return start;
});
} else {
var index = -1;
return new Iterator(function () {
index += 1;
return index < count;
}, function () {
if (index === -1 || index >= count) {
return undefined;
}
return start + index;
});
}
});
};
Enumerable.prototype.take = function (count) {
var self = this;
return new Enumerable(function () {
var iterator = self.getIterator(),
index = -1;
return new Iterator(function () {
index += 1;
return index < count && iterator.moveNext();
}, function () {
if (index === -1 || index >= count) {
return undefined;
}
return iterator.current();
});
});
};
Iterating over Enumerable.range(0) will actually go on until Number.MAX_SAFE_INTEGER (which is injected in the released version because of browser support), which is 9007199254740991, and will probably crash your browser, unless you limit the result with any, some, take, takeWhile, first or firstOrDefault.
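The following sketch makes the laziness measurable. It reuses the where and take implementations from this article, but replaces range with a hypothetical, simplified naturals source that counts upwards without any upper bound and records how many values it has produced in pulled. Enumerable is reduced to the function overload to keep the block short.

```javascript
var Iterator = function (moveNext, current) {
    this.moveNext = moveNext;
    this.current = current;
};
var Enumerable = function (getIterator) {
    this.getIterator = getIterator;
};

var pulled = 0; // counts how many values the source has produced

// Hypothetical stand-in for Enumerable.range(0): 0, 1, 2, ... without end.
function naturals() {
    return new Enumerable(function () {
        var value;
        return new Iterator(function () {
            value = value === undefined ? 0 : value + 1;
            pulled += 1;
            return true;
        }, function () {
            return value;
        });
    });
}

// where and take as shown in the article.
Enumerable.prototype.where = function (predicate) {
    var self = this;
    return new Enumerable(function () {
        var iterator = self.getIterator();
        var index = -1;
        var current;
        return new Iterator(function () {
            while (iterator.moveNext()) {
                index += 1;
                current = iterator.current();
                if (predicate(current, index)) {
                    return true;
                }
            }
            current = undefined;
            return false;
        }, function () {
            return current;
        });
    });
};
Enumerable.prototype.take = function (count) {
    var self = this;
    return new Enumerable(function () {
        var iterator = self.getIterator();
        var index = -1;
        return new Iterator(function () {
            index += 1;
            return index < count && iterator.moveNext();
        }, function () {
            if (index === -1 || index >= count) {
                return undefined;
            }
            return iterator.current();
        });
    });
};
Enumerable.prototype.toArray = function () {
    var iterator = this.getIterator();
    var arr = [];
    while (iterator.moveNext()) {
        arr.push(iterator.current());
    }
    return arr;
};

var firstTenEvens = naturals().where(function (n) {
    return n % 2 === 0;
}).take(10).toArray();
// firstTenEvens holds 0, 2, 4, ... 18; pulled is 19 (the values 0 through 18).
```

Even though the source never ends, the chain terminates because take stops asking for more once it has handed out ten elements.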
With the Enumerable in place we can continue with our next class, the List. To create a List we're going to inherit from Enumerable. Again, there is a package for this, but it's so very small and I don't want any dependencies, so I simply created my own helper method (don't ask about inherit and Temp, it's just more JavaScript absurdity).
var Temp = function () {
return;
};
function inherit(inheritor, inherited) {
Temp.prototype = inherited.prototype;
inheritor.prototype = new Temp();
Temp.prototype = null;
inheritor.prototype.constructor = inheritor;
}
var List = function (arr) {
var self = this;
Enumerable.call(this, function () {
return getArrayIterator(self);
});
arr = arr || [];
if (isArray(arr)) {
var i;
for (i = 0; i < arr.length; i += 1) {
this[i] = arr[i];
}
} else {
throw new Error("Invalid input parameter.");
}
this.length = arr.length;
};
inherit(List, Enumerable);
var l = new List(["Hello", "List"]);
console.log(l[0]);
console.log(l[1]);
The List constructor takes an array as input parameter and adds the contents of the array to itself. This seemed like a good idea, as List and array are now interchangeable in read-only scenarios, but I ran into a lot of problems when implementing add and remove methods. For example, what to do when the user of List adds the next index manually? When adding an index on an array the length is adjusted accordingly, but that is not something we can do ourselves. Likewise, when a user changes the length on an array indexes are added or removed accordingly, which is also not something we can do ourselves. Ultimately, I decided to keep the length property as it is (with the risk of an unsuspecting user changing it and breaking the List) and ditch the array-like approach.
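The difference between a real array and an array-like object can be seen in a few lines. The plain object below stands in for the first List attempt; all variable names are made up for this sketch.

```javascript
// A real array keeps length and indexes in sync.
var arr = ["Hello", "List"];
arr[2] = "!";                 // writing the next index...
var arrLength = arr.length;   // ...bumps length to 3
arr.length = 1;               // shrinking length...
var arrSecond = arr[1];       // ...drops the elements: undefined

// A plain array-like object does neither.
var arrayLike = { 0: "Hello", 1: "List", length: 2 };
arrayLike[2] = "!";
var arrayLikeLength = arrayLike.length;  // still 2
arrayLike.length = 1;
var arrayLikeSecond = arrayLike[1];      // still "List"
```

With index access like this, length and contents can silently drift apart, which is the problem described above.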
The next approach was to encapsulate an array, like the C# List<T> class does as well. Unfortunately, JavaScript doesn't know private members, but the convention seems to be to prefix privates with an underscore. Personally, I prefer to create an object called _ (underscore) that contains all private members.
var List = function (arr) {
var self = this;
arr = arr ? arr.slice() : [];
Enumerable.call(this, function () {
return getArrayIterator(self._.arr);
});
this._ = {
arr: arr
};
this.length = arr.length;
};
inherit(List, Enumerable);
List.prototype.get = function (index) {
if (index < 0 || index >= this.length) {
throw new Error("Index was out of range. Must be non-negative and less than the size of the collection.");
}
return this._.arr[index];
};
var l = new List(["Hello", "List"]);
console.log(l.get(0));
console.log(l.get(1));
console.log(l.get(2));
Again, the List as it is currently published accepts more input parameters, such as other Enumerables and strings, but the basis remains an implicitly private array and an implicit read-only length property. Notice also how getArrayIterator is used.
So let's see how add and remove are implemented then.
List.prototype.add = function (item) {
this._.arr.push(item);
this.length += 1;
};
List.prototype.remove = function (item) {
var index = this.indexOf(item);
if (index >= 0) {
this._.arr.splice(index, 1);
this.length -= 1;
return true;
}
return false;
};
As you can see it's still array manipulation that makes you go argh!, but at least it's nicely encapsulated in a List class that has many useful functions and is consistent across browsers.
The indexOf function is inherited from Enumerable and I'm not going into it as it's just one of many. However, what's worth mentioning is that the List class actually overrides it. Since List knows the length of itself as well as the index of its contents, something Enumerable does not, we can optimize some functions, like indexOf, on List. I'm going to show you how that's done, but I'm going to show you using the count function instead.
Enumerable.prototype.count = function (predicate) {
var count = 0;
predicate = predicate || alwaysTrue;
this.forEach(function (elem) {
if (predicate(elem)) {
count += 1;
}
});
return count;
};
List.prototype.count = function (predicate) {
if (!predicate) {
return this.length;
} else {
return Enumerable.prototype.count.call(this, predicate);
}
};
When the predicate is not specified the List can simply return its length property while an Enumerable must first evaluate all of its elements.
In the released version of arrgh.js the List constructor allows for Enumerables to be passed as input so the Enumerable.toList function is really very easy. Of course the List will have to enumerate over the collection to fill up an array internally.
Enumerable.prototype.toList = function () {
return new List(this);
};
All in all the List class is not very complicated. Other functions it has are addRange, clear, insert, set and sort. Without a doubt sort is the most complex, but we'll get to that later (it reuses the orderBy functionality).
The Dictionary, unlike the List, is a pretty complex beast! It's also the one class that I haven't seen satisfactorily implemented in other LINQ libraries. If you know a bit about the JavaScript internals you'll know that every JavaScript object is actually implemented as a hashmap (which a Dictionary basically is). Believe me, I did some Googling to get this to work, but people on the internet usually refer to using the JavaScript object. Well, I've got a few objections to using an object as a Dictionary. First, the only keys it allows are strings. Second, it's not easily iterated over, you'll need to use a for...in loop and check for hasOwnProperty, and then JSLint will complain that you should actually use Object.keys (which is, of course, not supported in older browsers). Furthermore, objects miss a lot of nice functionality, like everything Enumerable gives us. An object is just not a collection.
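The first objection is easy to reproduce. In this sketch (variable names are only for illustration) two different objects are used as keys on a plain object, and both end up in the same slot because they are converted to the same string.

```javascript
var lookup = {};
var keyA = { id: 1 };
var keyB = { id: 2 };

lookup[keyA] = "first";
lookup[keyB] = "second"; // overwrites "first": both keys become "[object Object]"

// Iterating requires for...in plus a hasOwnProperty check.
var ownKeys = [];
var name;
for (name in lookup) {
    if (Object.prototype.hasOwnProperty.call(lookup, name)) {
        ownKeys.push(name);
    }
}
// ownKeys holds a single entry: "[object Object]"

var valueForKeyA = lookup[keyA]; // "second", not "first"

// Numbers are converted to strings as well.
lookup[1] = "number";
var sameSlot = lookup["1"]; // "number"
```

So an object used as a dictionary silently merges keys that only differ by identity or type.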
The real problem lies in implementing a proper Dictionary. As said, it's also called a hashmap, but where are we going to get hashes? In .NET every object has a GetHashCode function that's actually implemented in some COM object, probably directly against your hardware (I don't know if that's true, but I do know it's pretty impossible for us mere mortals to implement). JavaScript doesn't have all that so we'll have to implement it ourselves. As I said, impossible, so I looked at the next best option which is... Using an object as hashmap.
So here's the first problem, objects only use strings as keys, but in .NET we can use any object as key, not just strings. So here's the deal, we're going to use the toString implementation of objects, which can be overridden. However, since toString is often used for debugging purposes or for presentation on screen we'll allow an extra custom getHash method. And if even that is not enough we'll allow an equality comparer for getting hashes and comparing keys. The equality comparer also solves another problem: hash collisions.
For this purpose I'll show you a default equality comparer as well as the add function before anything else, as they're used to determine the hash of an object.
function isActualNaN (obj) {
return obj !== obj;
}
var defaultEqComparer = {
equals: function (x, y) {
return x === y || (isActualNaN(x) && isActualNaN(y));
},
getHash: function (obj) {
var hash;
if (obj === null) {
hash = "null";
} else if (obj === undefined) {
hash = "undefined";
} else if (isActualNaN(obj)) {
hash = "NaN";
} else {
hash = typeof obj.getHash === "function" ?
obj.getHash() :
typeof obj.toString === "function" ? obj.toString() : Object.prototype.toString.call(obj);
}
return hash;
}
};
var Dictionary = function (eqComparer) {
var self = this;
Enumerable.call(self, function () {
var iterator = self._.entries.getIterator();
return new Iterator(function () {
return iterator.moveNext();
}, function () {
var current = iterator.current();
if (current) {
return { key: current.key, value: current.value };
}
return undefined;
});
});
this.length = 0;
this._ = {
eqComparer: ensureEqComparer(eqComparer),
keys: {},
entries: new List()
};
};
inherit(Dictionary, Enumerable);
function dictionaryContainsKey (dictionary, hash, key) {
if (dictionary._.keys.hasOwnProperty(hash)) {
return dictionary._.keys[hash].contains(key, function (x, y) {
return dictionary._.eqComparer.equals(x.key, y);
});
}
return false;
}
Dictionary.prototype.add = function (key, value) {
var hash = this._.eqComparer.getHash(key);
if (dictionaryContainsKey(this, hash, key)) {
throw new Error("Key [" + key + "] is already present in the dictionary.");
}
if (!this._.keys[hash]) {
this._.keys[hash] = new List();
}
var pair = { key: key, value: value };
this._.keys[hash].add(pair);
this._.entries.add(pair);
this.length += 1;
};
Luckily it's not quite as difficult as it looks. The crux is really in the defaultEqComparer. I'd like to note that undefined, null and NaN are valid keys and that NaN is checked for equality with NaN (normally, NaN === NaN yields false). An equality comparer has a getHash function and an equals function. The getHash function gets the hashes of objects, which, in our case, is really just a string. When two objects produce the same hash the equals function is used to check if the objects are equal (which is not always the case). An example will clear things up.
var d = new Dictionary({
equals: function (x, y) {
return x === y;
},
getHash: function (obj) {
return obj.firstName;
}
});
d.add({
firstName: "Bill",
lastName: "Gates"
});
d.add({
firstName: "Bill",
lastName: "Clinton"
});
Since both objects have a firstName of Bill, which is used as a hash, there is a hash collision (both objects produce the same hash). However, since the equals function deems the objects not equal both objects are added to the Dictionary as a key (instead of throwing an error saying the key is already present).
In the add implementation you can see that the hash is calculated and when it's not present we add it to the keys object. The hash maps to a List object which is used to hold all values with that specific hash. The more elements have the same hash the slower the lookup of a key with that hash becomes. That's pretty important as hashmaps usually have an O(1) lookup time, but this is more like O(1-ish). The default hash for any JavaScript object is "[object Object]", so be sure to override toString, implement a getHash or use a custom equality comparer or your lookup time will be that of a List. The keys object is basically going to look as follows.
keys = {
"Bill": [ pairForBillGates, pairForBillClinton ],
"AnotherHash": [ pair ],
"[object Object]": [ pairsWhoseKeysUseTheDefaultToString ]
};
Now, for the Dictionary Iterator. You'll notice there is a List of entries in the Dictionary. This is used in the Iterator. Using the keys object we lose the ordering of our elements, so we keep all elements in entries as well. It makes iterating quite easy as we'll just have to iterate through the List. Notice that the key-value pair is copied during iteration. That's because the key-value pair to the client is read-only (of course clients can mess this up by altering _.entries directly). I should note .NET makes use of a linked list internally while our List uses an array internally. There are pros and cons to both, like speed of updating (linked list wins) and memory usage (array wins).
Here's the remove function, which removes a key-value pair using a given key. The getPairByKey function looks for the hash and then for the key in the mapped List. It does this using firstOrDefault, which returns the first instance of an item in the List or a default when the item is not found (note to self: could've been singleOrDefault instead as a key can't be added to a Dictionary twice).
function getPairByKey (dict, hash, key, whenNotExists) {
var elem;
if (!dict._.keys.hasOwnProperty(hash)) {
whenNotExists();
} else {
var def = {};
elem = dict._.keys[hash].firstOrDefault(function (kvp) {
return dict._.eqComparer.equals(kvp.key, key);
}, def);
if (elem === def) {
whenNotExists();
}
}
return elem;
}
Dictionary.prototype.remove = function (key) {
var hash = this._.eqComparer.getHash(key);
var notFound;
var pair;
pair = getPairByKey(this, hash, key, function () {
notFound = true;
});
if (notFound) {
return false;
}
var keys = this._.keys[hash];
keys.remove(pair);
this._.entries.remove(pair);
if (!keys.any()) {
delete this._.keys[hash];
}
this.length -= 1;
return true;
};
Here are the functions to check whether a key is present in the dictionary and to get the value of a specific key.
Dictionary.prototype.containsKey = function (key) {
var hash = this._.eqComparer.getHash(key);
return dictionaryContainsKey(this, hash, key);
};
Dictionary.prototype.get = function (key) {
var hash = this._.eqComparer.getHash(key);
return getPairByKey(this, hash, key, function () {
throw new Error("Key [" + key + "] was not found in the dictionary.");
}).value;
};
Here's another nice one, tryGet (TryGetValue in .NET). This function tries to get a value using the specified key. Normally, when you try to get an item using a key that does not exist an error is thrown. However, when using tryGet you don't get an error, you get a boolean indicating whether the key was found and if it was you get the value too. In .NET you get the value in an out parameter, but JavaScript does not have that concept. Instead I'm returning an object containing a success boolean and a value object. When success is true then the value holds the value for that key (which may be undefined), when success is false value will always be undefined. This function is actually the only one where I had to work around a .NET out parameter.
Dictionary.prototype.tryGet = function (key) {
var hash = this._.eqComparer.getHash(key),
notFound,
pair = getPairByKey(this, hash, key, function () {
notFound = true;
});
if (notFound) {
return {
success: false,
value: undefined
};
}
return {
success: true,
value: pair.value
};
};
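To show why the success flag matters, here is a stand-alone sketch of the same result shape. It does not use the Dictionary at all: tryGetFrom is a hypothetical helper over a plain object, introduced only to illustrate how a caller tells a stored undefined apart from a missing key.

```javascript
// Hypothetical helper, not part of arrgh.js: same result shape as tryGet.
function tryGetFrom(obj, key) {
    if (Object.prototype.hasOwnProperty.call(obj, key)) {
        return { success: true, value: obj[key] };
    }
    return { success: false, value: undefined };
}

var settings = { theme: "dark", nickname: undefined };

var found = tryGetFrom(settings, "theme");             // key exists, value "dark"
var foundUndefined = tryGetFrom(settings, "nickname"); // key exists, value undefined
var missing = tryGetFrom(settings, "language");        // key does not exist

// Both foundUndefined.value and missing.value are undefined;
// only the success flag distinguishes the two cases.
```

A caller would typically branch on success first and read value only when it is true.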
The usage of a Dictionary is, luckily, very easy.
var d = new Dictionary();
var billGates = {
firstName: "Bill",
lastName: "Gates"
};
var billClinton = {
firstName: "Bill",
lastName: "Clinton"
};
d.add(billGates, "Richest man in the world.");
d.add(billClinton, "Was president of the USA.");
console.log(d.containsKey(billGates));
console.log(d.get(billClinton));
d.remove(billClinton);
console.log(d.containsKey(billClinton));
To actually run this sample you'll need the full implementation of arrgh.js, not just the snippets I've shown so far.
We can now also implement Enumerable.toDictionary.
function identity(x) {
return x;
}
Enumerable.prototype.toDictionary = function (keySelector, elementSelector, eqComparer) {
if (typeof elementSelector !== "function") {
// The second argument is not a function, so treat it as the (optional) eqComparer.
eqComparer = elementSelector;
elementSelector = identity;
}
eqComparer = ensureEqComparer(eqComparer);
var d = new Dictionary(eqComparer);
this.forEach(function (elem) {
d.add(keySelector(elem), elementSelector(elem));
});
return d;
};
var names = new Enumerable(["John", "Annie", "Bill", "Sander"]);
var d = names.toDictionary(n => n[0]);
d = names.toDictionary(n => n[0], n => n.toUpperCase());
d = names.toDictionary(n => n, {
equals: function (x, y) {
return x === y;
},
getHash: function (obj) {
return obj[0];
}
});
d = names.toDictionary(n => n, n => n.toUpperCase(), {
equals: function (x, y) {
return x === y;
},
getHash: function (obj) {
return obj[0];
}
});
The toDictionary function has a few overloads, elementSelector and eqComparer are both optional. So if the second argument to toDictionary is a function it's elementSelector, if it's an object it's eqComparer.
You thought the Dictionary
was complicated? Then enter a world of pain, the world of the OrderedEnumerable
. Have you ever noticed how you can order a collection in .NET using someCollection.OrderBy(...).ThenBy(...).ThenByDescending(...).ToList()
? The OrderBy
returns an IEnumerable
, but not just any IEnumerable
, a special IOrderedEnumerable
which has the ThenBy
and ThenByDescending
extension methods. You won't see it on the outside in .NET, but the IOrderedEnumerable
actually keeps some internal variables like the collection you called OrderBy
or ThenBy
on and whether or not the sorting is ascending or descending. The tricky part is that, ultimately, you're going to enumerate the collection that ThenByDescending
returns, but it needs to know about its parent because ThenByDescending
needs to apply additional sorting and not overwrite the sorting of its parent. This is actually the only collection in LINQ that does not enumerate over its parent, but rather uses the parent to adjust its own enumeration.
First, I'm going to show you the methods on Enumerable
and OrderedEnumerable
as well as a comparer function. The comparer function compares two keys and returns a positive number when the first key is bigger than the second, a negative number when the first key is smaller than the second and a 0 when both keys are equal. In .NET I've found null
to be smaller than anything else, then NaN
(in case of double?
), then normal sorting like you'd expect. In my implementation I'm treating undefined
as smaller than null
. This is a fundamental difference with the JavaScript array sort
function that ignores undefined
and always places it at the end of the array (you can use a custom comparer, but undefined is still ignored). So, in my implementation, undefined
is not ignored and if you put in a custom comparer you can still move undefined
to the back of the collection.
function defaultCompare(x, y) {
if (isNull(x) || isNull(y)) {
var noVal = function (a, b, val) {
if (a === b) {
return 0;
}
if (a === val && b !== val) {
return -1;
}
if (a !== val && b === val) {
return 1;
}
};
var eq = noVal(x, y, undefined);
if (eq === undefined) {
return noVal(x, y, null);
}
return eq;
}
if (isActualNaN(x) && isActualNaN(y)) {
return 0;
}
if (isActualNaN(x)) {
return -1;
}
if (isActualNaN(y)) {
return 1;
}
if (x > y) {
return 1;
}
if (x < y) {
return -1;
}
return 0;
}
var OrderedEnumerable = function (source, keySelector, compare, descending) {
compare = compare || defaultCompare;
descending = descending ? -1 : 1;
};
inherit(OrderedEnumerable, Enumerable);
Enumerable.prototype.orderBy = function (keySelector, compare) {
return new OrderedEnumerable(this, keySelector, compare, false);
};
Enumerable.prototype.orderByDescending = function (keySelector, compare) {
return new OrderedEnumerable(this, keySelector, compare, true);
};
OrderedEnumerable.prototype.thenBy = function (keySelector, compare) {
return new OrderedEnumerable(this, keySelector, compare, false);
};
OrderedEnumerable.prototype.thenByDescending = function (keySelector, compare) {
return new OrderedEnumerable(this, keySelector, compare, true);
};
Using only this fairly easy code, save for the big bloated defaultCompare
function, we have everything we need to do the actual sorting. The prototype
functions are the only way for a user to get a reference to OrderedEnumerable
as the constructor is not exposed. The sorting uses a quicksort algorithm. This isn't an article about algorithms, but let me give you the basics. In a collection we pick a so-called pivot value, ideally the middle value. We then take two counters, one that starts at index 0 and one that starts at the last index of the collection (length - 1). Using the compare function, we move the low counter to the right for as long as its element is smaller than the pivot, and stop at the first element that is bigger than or equal to the pivot. We then move the high counter to the left for as long as its element is bigger than the pivot, and stop at the first element that is smaller than or equal to the pivot. If the counters have not passed each other we swap the two elements, advance both counters and repeat. Once the counters have crossed, everything to the left is smaller than or equal to the pivot and everything to the right is bigger than or equal to it, so we recursively do the same for the left part and for the right part. Furthermore, quicksort is an in-place algorithm, meaning it alters the current collection rather than creating a new one and keeping the input intact. Here's a slightly awkward visual representation of these steps.
2, 5, 3, 4, 1
^ p -
2, 5, 3, 4, 1
^ p -
2, 5, 3, 4, 1
- p ^
2, 1, 3, 4, 5
* *
2, 1, 3, 4, 5
p ^
1, 2, 3, 4, 5
*
1, 2, 3, 4, 5
p ^
1, 2, 3, 4, 5
sorted.
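For those who prefer code over arrows, here is a minimal sketch of a plain (unstable) quicksort that follows the steps described above. It's a textbook variant with a middle pivot and two counters, written for illustration only; the function name quicksort and the numeric comparer are my own choices here and it is not the code arrgh.js uses.

```javascript
// A plain in-place quicksort sketch: middle pivot, two counters, swap, recurse.
function quicksort(arr, startIndex, endIndex, compare) {
    var low = startIndex,
        high = endIndex,
        pivot = arr[Math.floor((low + high) / 2)],
        temp;
    while (low <= high) {
        // Skip elements on the left that are already smaller than the pivot.
        while (compare(arr[low], pivot) < 0) {
            low += 1;
        }
        // Skip elements on the right that are already bigger than the pivot.
        while (compare(arr[high], pivot) > 0) {
            high -= 1;
        }
        // Both counters point at an element on the wrong side: swap them.
        if (low <= high) {
            temp = arr[low];
            arr[low] = arr[high];
            arr[high] = temp;
            low += 1;
            high -= 1;
        }
    }
    // Recursively sort the part left and the part right of the crossing point.
    if (startIndex < high) {
        quicksort(arr, startIndex, high, compare);
    }
    if (low < endIndex) {
        quicksort(arr, low, endIndex, compare);
    }
}

var numbers = [2, 5, 3, 4, 1];
quicksort(numbers, 0, numbers.length - 1, function (x, y) {
    return x - y;
});
console.log(numbers); // [1, 2, 3, 4, 5], the input array itself was changed
```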
To make it even more complicated, the quicksort algorithm has one drawback, it's not stable. That means that if two elements are equal they may still be swapped losing their relative order in the collection. So if the input collection has Bill Clinton
and Bill Gates
, in that order, and we sort by first name, then the output collection may have switched their ordering to Bill Gates
and Bill Clinton
. In many scenarios this is not a problem, but .NET implements OrderBy
and ThenBy
with a stable quicksort. Luckily we can solve this relatively easily. Instead of comparing the actual elements of a collection we're going to sort a list of indexes. Each index maps to an element, and if the elements of two indexes are equal we compare the indexes instead. Here is the implementation of the stable quicksort.
function stableQuicksort(map, startIndex, endIndex, compare) {
var low = startIndex,
high = endIndex,
pindex = Math.floor((low + high) / 2),
pivot = map[pindex],
lindex,
hindex,
result,
temp;
while (low <= high) {
lindex = map[low];
result = compare(lindex, pivot);
while (result < 0 || (result === 0 && lindex < pivot)) {
low += 1;
lindex = map[low];
result = compare(lindex, pivot);
}
hindex = map[high];
result = compare(hindex, pivot);
while (result > 0 || (result === 0 && hindex > pivot)) {
high -= 1;
hindex = map[high];
result = compare(hindex, pivot);
}
if (low <= high) {
temp = map[low];
map[low] = map[high];
map[high] = temp;
low += 1;
high -= 1;
}
}
if (low < endIndex) {
stableQuicksort(map, low, endIndex, compare);
}
if (high > startIndex) {
stableQuicksort(map, startIndex, high, compare);
}
}
And now, the moment you've all been waiting for, the implementation of the OrderedEnumerable
.
var OrderedEnumerable = function (source, keySelector, compare, descending) {
var self = this;
var keys;
compare = compare || defaultCompare;
descending = descending ? -1 : 1;
self.getSource = function () {
if (source.getSource) {
return source.getSource();
}
return source;
};
self.computeKeys = function (elements, count) {
var arr = new Array(count);
var i;
for (i = 0; i < count; i += 1) {
arr[i] = keySelector(elements[i]);
}
keys = arr;
if (source.computeKeys) {
source.computeKeys(elements, count);
}
};
self.compareKeys = function (i, j) {
var result = 0;
if (source.compareKeys) {
result = source.compareKeys(i, j);
}
if (result === 0) {
result = compare(keys[i], keys[j]) * descending;
}
return result;
};
Enumerable.call(this, function () {
var sourceArr = self.getSource().toArray();
var count = sourceArr.length;
var map = new Array(count);
var index;
self.computeKeys(sourceArr, count);
for (index = 0; index < count; index += 1) {
map[index] = index;
}
stableQuicksort(map, 0, count - 1, self.compareKeys);
index = -1;
return new Iterator(function () {
index += 1;
return index < count;
}, function () {
return sourceArr[map[index]];
});
});
};
I'm going to admit right away that this cost me a couple of tries and a whole lot of time. So let's go through it step by step. Since we need to sort the entire source collection before we can enumerate, the getIterator
function first sorts everything and then returns a rather small Iterator
.
First, getIterator
uses getSource
to evaluate the collection that needs to be sorted (this could be the result of a where
, a select
, etc.) and converts it to an array
. The getSource
function returns the first source
that is not an OrderedEnumerable
(tested on the presence of a getSource
function). So someCollection.where(...).orderBy(...).thenBy(...).thenByDescending().getIterator()
will sort the result of the where
function.
Next, we're going to compute the keys, the values that need to be sorted. We do this only once (and always once, even if we never need them). So suppose we need to sort a collection of people by firstName
, then keys
is now an array
containing "John", "Bill", "Steve", etc.
Then we create the map
, that is the indexes we're going to sort. Remember we need the indexes to do a stable sort. We then pass the map, the entire range of the collection (0 to length - 1) and the compareKeys
function to the stableQuicksort
function which works its magic.
The compareKeys
function does the actual comparing and returns a positive integer, a negative integer or 0. The nice part is that if the source
contains a compareKeys
function it uses that function. Only when the source's compareKeys
returns 0 does the current function compare its keys. So in case of someCollection.orderBy(p => p.firstName).thenBy(p => p.lastName).toArray();
the lastName
of two elements is only compared when the firstName
of those elements is equal. Keep in mind we're comparing indexes, so we need to get the actual value from the keys
array.
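The chaining of comparers is easier to see in isolation. Here's a minimal sketch, assuming a simple string comparer instead of defaultCompare; the helper makeLevel is hypothetical and only mimics what compareKeys does for one sort level.

```javascript
var people = [
    { firstName: "Bill", lastName: "Gates" },
    { firstName: "Steve", lastName: "Jobs" },
    { firstName: "Bill", lastName: "Clinton" }
];

// Plain three-way string comparison, standing in for defaultCompare.
function compareStrings(x, y) {
    if (x > y) { return 1; }
    if (x < y) { return -1; }
    return 0;
}

// makeLevel is a hypothetical helper: it builds the index comparer for one
// sort level and only looks at its own keys when the parent reports a tie.
function makeLevel(parentCompare, keys, descending) {
    var direction = descending ? -1 : 1;
    return function (i, j) {
        var result = parentCompare ? parentCompare(i, j) : 0;
        if (result === 0) {
            result = compareStrings(keys[i], keys[j]) * direction;
        }
        return result;
    };
}

// The keys are computed once per level, just like computeKeys does.
var firstNames = people.map(function (p) { return p.firstName; });
var lastNames = people.map(function (p) { return p.lastName; });

var byFirst = makeLevel(null, firstNames, false);
var byFirstThenLast = makeLevel(byFirst, lastNames, false);

console.log(byFirst(0, 2));         // 0: both are called Bill
console.log(byFirstThenLast(0, 2)); // 1: Gates sorts after Clinton
console.log(byFirstThenLast(0, 1)); // -1: Bill sorts before Steve, last names are never compared
```

Note that the comparers receive indexes, not elements, which is exactly why the keys arrays are needed.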
The stableQuicksort
moves the indexes around based on the keys they map to. That means the map
is sorted, but the source
is not. So in the Iterator
, using the current index, we can get the index of the element in the sourceArr
using the index in the map
. Here's a little example.
var sourceArr = [2, 5, 3, 4, 1];
var sourceArrComparer = function (x, y) {
return defaultCompare(sourceArr[x], sourceArr[y]);
};
var map = [0, 1, 2, 3, 4];
stableQuicksort(map, 0, 4, sourceArrComparer);
console.log(map); // [4, 0, 2, 3, 1]: the indexes in sorted order
console.log(sourceArr[map[0]]); // 1, the smallest element
console.log(sourceArr[map[1]]); // 2
And with this stableQuicksort
function we can also write the sort
function on the List
(which is an in-place sort). The List
sort
allows you to sort an entire List
or only a part of a List
, but we've got that all covered. So you can study that code at your own leisure.
The Lookup
is not very complicated, but also not very pretty (I've decided to put it all in a big function so it has no additional prototype
functions). A lookup is basically a collection of collections where each collection has a key that groups them together. Internally it uses a Dictionary
(which is basically also implemented as a lookup, come to think of it).
var Lookup = function (source, keySelector, elementSelector, eqComparer) {
var d;
Enumerable.call(this, function () {
var iterator = d.getIterator();
return new Iterator(iterator.moveNext, function () {
var current = iterator.current();
if (isNull(current)) {
return current;
}
var group = current.value.asEnumerable();
group.key = current.key;
return group;
});
});
if (typeof elementSelector !== "function") {
eqComparer = elementSelector;
elementSelector = null;
}
elementSelector = elementSelector || identity;
d = new Dictionary(eqComparer);
source.forEach(function (elem) {
var key = keySelector(elem);
var element = elementSelector(elem);
if (d.containsKey(key)) {
d.get(key).add(element);
} else {
d.add(key, new List([element]));
}
});
this.length = d.length;
this.get = function (key) {
var group;
if (d.containsKey(key)) {
group = d.get(key).asEnumerable();
group.key = key;
} else {
group = new Enumerable();
group.key = key;
}
return group;
};
};
inherit(Lookup, Enumerable);
As you can see the source is iterated and each value with a certain key is added to a List
that is associated with that key. It's basically the keys object in a Dictionary
, except that the key in this case is explicitly not a hash. In the get
function we see that the List
with a specified key is returned as an Enumerable
(making it read-only) and the key
is added to the Enumerable
. When a key is not present an empty Enumerable
is returned with the specified key
attached to it. When enumerating we do pretty much the same.
As with the OrderedEnumerable
the only method for a user to get a Lookup
is by using the toLookup
function on an Enumerable
.
Enumerable.prototype.toLookup = function (keySelector, elementSelector, eqComparer) {
if (typeof arguments[1] === "function") {
elementSelector = arguments[1];
eqComparer = arguments[2];
} else {
eqComparer = arguments[1];
}
elementSelector = elementSelector || identity;
eqComparer = ensureEqComparer(eqComparer);
return new Lookup(this, keySelector, elementSelector, eqComparer);
};
var names = new Enumerable(["Bianca", "John", "Bill", "Annie", "Barney"]);
var l = names.toLookup(n => n[0]);
l = names.toLookup(n => n[0], n => n.toUpperCase());
l = names.toLookup(n => n, {
equals: function (x, y) {
return x === y;
},
getHash: function (obj) {
return obj[0];
}
});
l = names.toLookup(n => n, n => n.toUpperCase(), {
equals: function (x, y) {
return x[0] === y[0];
},
getHash: function (obj) {
return obj[0];
}
});
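If you want a feel for what a lookup holds without running arrgh.js, here's a minimal sketch in plain JavaScript. The function groupByKey is a hypothetical stand-in and much simpler than the real Lookup: it only supports string keys (it uses a plain object where arrgh.js uses a Dictionary with an equality comparer) and it returns arrays instead of Enumerables.

```javascript
// groupByKey is a hypothetical, simplified stand-in for toLookup.
function groupByKey(items, keySelector) {
    var groups = {};
    var order = [];
    var hasKey = function (key) {
        return Object.prototype.hasOwnProperty.call(groups, key);
    };
    items.forEach(function (item) {
        var key = keySelector(item);
        if (!hasKey(key)) {
            groups[key] = [];
            order.push(key); // remember the order in which keys first appear
        }
        groups[key].push(item);
    });
    return {
        keys: order,
        get: function (key) {
            // Like Lookup.get, an unknown key yields an empty collection.
            return hasKey(key) ? groups[key].slice() : [];
        }
    };
}

var names = ["Bianca", "John", "Bill", "Annie", "Barney"];
var lookup = groupByKey(names, function (n) { return n[0]; });

console.log(lookup.keys);     // ["B", "J", "A"]
console.log(lookup.get("B")); // ["Bianca", "Bill", "Barney"]
console.log(lookup.get("X")); // []
```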
Unfortunately, I can't show you all there is to arrgh.js in just one article. The Enumerable
class already has 55 functions of which most have at least one overload. However, I have shown you all classes in arrgh.js as well as some functions. The full documentation should help you get on your way. You might want to check a certain function in .NET, chances are arrgh.js works the same.
So a few remarks. In most cases I don't check for parameter types. Everything works as documented, but if you pass a string where an object is expected who knows what will happen. The upside to this approach is that I can omit over 100 type checks which really helps in keeping the code small and fast. The downside, of course, is that a function may produce an error or, worse, incorrect results, and you might never know (of course you do, you tested your code).
Also, I thought it might be useful to mention what arrgh.js exposes.
var arrgh = (function (undefined, MAX_SAFE_INTEGER) {
"use strict";
return {
Enumerable: Enumerable,
Dictionary: Dictionary,
Iterator: Iterator,
List: List
};
}(undefined, Number.MAX_SAFE_INTEGER || 9007199254740991));
The Number.MAX_SAFE_INTEGER
is necessary for the Enumerable.range
which has an upper limit of, you guessed it, Number.MAX_SAFE_INTEGER
(which is not supported in older browsers, hence the literal value).
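As a small aside, the following sketch shows why that particular number is the limit. It only relies on standard JavaScript number behavior; the variable name mirrors the parameter above.

```javascript
// Fall back to the literal when Number.MAX_SAFE_INTEGER does not exist.
var MAX_SAFE_INTEGER = Number.MAX_SAFE_INTEGER || 9007199254740991;

// The limit is 2^53 - 1.
console.log(MAX_SAFE_INTEGER === Math.pow(2, 53) - 1); // true

// Beyond the limit integers can no longer be told apart reliably.
console.log(MAX_SAFE_INTEGER + 1 === MAX_SAFE_INTEGER + 2); // true
```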
arrgh.js is a pretty well tested library if I dare say so myself. It currently counts 779 tests that ensure correct and consistent results in IE8, IE11, Firefox and Chrome (I'm on a Win7 machine, so no Edge or Safari, but I don't see why they shouldn't work). It would be silly to walk you through all those tests, but it's pretty useful to show you how they work and how to get them to work for you. For the remainder of this article we'll need Node.js and npm, so head over to their website, download it and install it. npm is the package manager for Node.js.
When you're done installing open up a command prompt and navigate to the arrgh.js root folder. Once there, create a package.json
, either manually (create a file called package.json
and put {}
in it) or by typing npm init
in the command prompt. Once you've got a package.json install Jasmine, the testing framework of choice, using npm install jasmine --save-dev
.
cd C:\arrgh.js
npm install jasmine --save-dev
npm will create a node_modules
folder and install everything that's necessary for Jasmine to run. Additionally, --save-dev
will create a developer dependency in your package.json. That means you don't have to check in your packages to source control, but you can easily restore all your dependencies using npm install
.
Now let's write some tests! Create a folder called test
in your root. We now need two things: tests, and a page that shows us the test results. Let's first create the page for the test results as it's pretty easy. In the test folder create an HTML file; I've called it index.browser.html (there is also an index.html file in my project, but it's for an automated Jenkins build environment, which I'll not cover in this article). Paste the following HTML into that file.
<!doctype html>
<html>
<head>
<title>arrgh.js tests</title>
<link rel="shortcut icon" type="image/png" href="../node_modules/jasmine-core/images/jasmine_favicon.png">
<link rel="stylesheet" href="../node_modules/jasmine-core/lib/jasmine-core/jasmine.css">
</head>
<body>
<script src="../node_modules/jasmine-core/lib/jasmine-core/jasmine.js"></script>
<script src="../node_modules/jasmine-core/lib/jasmine-core/jasmine-html.js"></script>
<script src="../node_modules/jasmine-core/lib/jasmine-core/boot.js"></script>
<script src="../src/arrgh.js"></script> <!-- Path to your arrgh.js file -->
<script src="spec/tests.js"></script>
</body>
</html>
For this example I'll use a single tests.js, but I have actually divided my tests into separate files. Ultimately tests.js is going to run them all.
Now, in the test folder, create a folder called spec
. This is where your actual tests are going to be. In that folder create the file tests.js. Writing a test is now easy as pie. Jasmine is a Behavior-Driven Development (BDD) framework, meaning we're going to describe what a test should do and then do it.
describe("arrgh.Enumerable", function () {
describe("toArray", function () {
it("should produce an array containing some elements", function () {
var e = new arrgh.Enumerable([1, 2, 3, 4, 5]);
expect(e.toArray()).toEqual([1, 2, 3, 4, 5]);
});
it("should produce an empty array", function () {
var e = new arrgh.Enumerable();
expect(e.toArray()).toEqual([]);
});
});
describe("count", function () {
it("should have the count of the initial array", function () {
var e = new arrgh.Enumerable([1, 2, 3, 4, 5]);
expect(e.count()).toBe(5);
});
});
});
describe
does exactly what it says, describe what's about to come. You can nest it as much as you want. Somewhere in your describes you're going to have an it
. Inside the it
you write your test and compare your expected value to your actual outcome, using expect
and toEqual
. toEqual
compares reference types such as objects and arrays for equality (without them having to have the same reference). toBe
compares value types such as integers and booleans. You could use toEqual
instead of toBe
, but not the other way around.
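The difference is the same one you run into in plain JavaScript. Here's a minimal sketch without Jasmine; the JSON.stringify comparison is only a crude stand-in for what toEqual does (Jasmine's real comparison is more thorough), and the strict equality check is roughly what toBe boils down to.

```javascript
var first = [1, 2, 3];
var second = [1, 2, 3];

// Identity check, roughly what toBe does: two distinct arrays are never identical.
var identical = first === second;

// Crude structural check, standing in for toEqual.
var sameContents = JSON.stringify(first) === JSON.stringify(second);

// Primitives such as numbers and booleans pass the identity check just fine.
var primitivesIdentical = 5 === 5;

console.log(identical);           // false
console.log(sameContents);        // true
console.log(primitivesIdentical); // true
```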
If you open the html page we created earlier you should now see the results of your tests. Try failing a test and see what happens.
As I said, I have separated my tests in different files. In tests.js I declare some global variables to be used in my other tests (actually, Jasmine has a better solution for this, but somehow it was flaky across browsers, so I went with the globals). The separation is based on the type of collection, iterators and some tests that group well together, such as all joins.
So, in a nutshell, here's tests.js.
var p0 = {
first: "Sander",
last: "Rossel",
age: 28,
hobbies: ["Programming", "Gaming", "Music"]
};
var p1 = {
first: "Bill",
last: "Murray",
age: 58,
hobbies: ["Hiking", "Travelling"]
};
(function () {
"use strict";
describe("arrgh.js tests", function () {
testEnumerable();
});
}());
And then in test-Enumerable.js.
var testEnumerable = function () {
"use strict";
describe("Enumerable", function () {
describe("contains", function () {
it("should return true when the collection contains the object", function () {
var e = new arrgh.Enumerable(people);
expect(e.contains(p3)).toBe(true);
});
});
});
};
I should mention that I haven't checked the actual ages and hobbies of the (famous) people in the test set.
The next thing we'll do is generate some documentation with JSDoc. You can install JSDoc using npm. Next to installing it in the current project we're also going to install it globally so we can use the CLI (Command Line Interface).
npm install jsdoc --save-dev
npm install jsdoc -g
JSDoc works with annotations in comments. You can simply put comments in any document, run it through JSDoc and end up with some nice documentation. The comment has to be a block comment that starts with /** as // comments are ignored. The annotations start with @. I've chosen to comment the functions directly in my source code. Don't worry about the size of the file as we'll remove all comments (except one) later when we minify the source.
We'll start with documenting the global namespace.
var arrgh = (function (undefined, MAX_SAFE_INTEGER) {
After that we can document types within that namespace.
var List = function (enumerable) {
@memberof
, @constructor
, and @extends
are pretty self-explanatory. The default format of @param
is @param {type} name - description
. For an optional parameter put the name between brackets, like [name]
. For a default value add =default, like [enumerable=[]]
(default is an empty array). When a function accepts a parameter of multiple types you can list them separated by a pipe, {(type1|type2)}
. An asterisk indicates any type is accepted.
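To show these annotations together, here's a minimal sketch of a documented function. The function repeat is hypothetical and has nothing to do with arrgh.js; it exists only so there is something to annotate. Multiple types are separated by a pipe, as in JSDoc's own documentation.

```javascript
/**
 * Repeats a value a number of times and joins the results.
 * (Hypothetical function, used only to illustrate the annotations.)
 *
 * @param {*} value - The value to repeat; any type is accepted.
 * @param {Number} [times=2] - Optional; how often to repeat the value.
 * @param {(String|Number)} [separator=","] - Optional; placed between the repeated values.
 * @returns {String} The repeated values as a single string.
 */
function repeat(value, times, separator) {
    var parts = [];
    var i;
    times = times === undefined ? 2 : times;
    separator = separator === undefined ? "," : separator;
    for (i = 0; i < times; i += 1) {
        parts.push(String(value));
    }
    return parts.join(separator);
}

console.log(repeat("a"));       // "a,a"
console.log(repeat("a", 3, 0)); // "a0a0a"
```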
Now that we have a constructor we can document functions for it.
List.prototype.get = function (index) {
It's also possible to document global types, like input functions, callbacks, or objects like the equality comparer.
var Lookup = function (source, keySelector, elementSelector, eqComparer) {
Let's generate some documentation. Open up the command prompt and simply type jsdoc myFile.js
. JSDoc should create an out folder with the generated documentation.
You can style the output and write it to specific folders using CLI parameters, but we're going to do that in a moment using Gulp.
Next up we're going to automate some stuff. Every time I save a file, source or test, I want to lint my JavaScript, run tests, generate documentation, minify code and do whatever else my heart desires. And if I can automate that I also want to be able to run it all once with a single command. Enter Gulp.
Gulp is a build automation tool. With Gulp we take some input, run it through some task and pass the output as input to the next task. Finally, we write the final output to some destination, like a file or the console.
We're going to install Gulp in our project and globally so we can make easy use of the CLI again.
npm install gulp --save-dev
npm install gulp -g
Next, we're going to create a file in the root folder of our project and call it gulpfile.js
. Gulp, on its own, doesn't do much. We're going to need a couple of plugins. Let's start by minifying our source file. To do this we'll first need the gulp-minify plugin.
npm install gulp-minify --save-dev
Now that we have Gulp and the minify plugin we can actually put some useful code in the gulpfile.
var gulp = require('gulp');
var minify = require('gulp-minify');
gulp.task('minify', function () {
return gulp.src('src/*.js')
.pipe(minify({
ext: {
src: '.debug.js',
min: '.js'
},
preserveComments: 'some'
}))
.pipe(gulp.dest('dist'));
});
gulp.task('default', function () {
gulp.start('minify');
});
This may look familiar if you've done some work in Node.js before. We start off by requiring gulp
and gulp-minify
. These two lines of code will load the packages into our runtime (from the node_modules folder). Next, we're creating a task in gulp
called minify
. We're getting the source using gulp.src
and piping it to the minify module. We can pass a config object to the minify module specifying the extensions for our source file and our minified file. The minified file is going to be called arrgh.js
and the source file arrgh.debug.js
. We're preserving some comments (being the license at the top). The result is piped again and written to the dist
folder.
In the default task we're running the minify task.
Open up the command prompt, navigate to your project folder and now simply run gulp
. This will run the default task. You can also run a specific task by specifying its name.
gulp
gulp minify
Either of those will run the gulpfile and minify our source.
For subsequent runs we want to delete all former build files and start fresh. We also want to lint our JavaScript. We'll need some more plugins.
npm install gulp-clean --save-dev
npm install jshint --save-dev
npm install gulp-jshint --save-dev
The gulpfile now looks as follows.
var gulp = require('gulp');
var minify = require('gulp-minify');
var clean = require('gulp-clean');
var jshint = require('gulp-jshint');
gulp.task('clean', function () {
return gulp.src([
'dist/'
], { read: false })
.pipe(clean());
})
.task('minify', ['clean'], function () {
return gulp.src('src/*.js')
.pipe(minify({
ext: {
src: '.debug.js',
min: '.js'
},
preserveComments: 'some'
}))
.pipe(gulp.dest('dist'));
})
.task('lint', function () {
return gulp.src('src/*.js')
.pipe(jshint('jshint.conf.json'))
.pipe(jshint.reporter('default'));
});
gulp.task('default', function () {
gulp.start(['minify', 'lint']);
});
The minify task is now depending upon the clean task. We can't minify until we've cleaned our old stuff. The default task will now run minify
(which will run clean
) and lint
. The lint task makes use of jshint which can take a json file as parameter. That's really cool as we can now configure jshint in an external file and keep our gulpfile clean. So create a jshint.conf.json
file in your project folder. You can find all kinds of configuration options in the jshint documentation. Here's my config file.
{
"bitwise": true,
"curly": true,
"eqeqeq": true,
"esversion": 3,
"forin": true,
"freeze": true,
"futurehostile": true,
"latedef": true,
"nocomma": true,
"nonbsp": true,
"nonew": true,
"notypeof": true,
"strict": true,
"undef": true,
"unused": true
}
Next is the JSDoc task. For arrgh.js I've also installed another template as I didn't like the default very much. The template is called jaguarjs.
npm install gulp-jsdoc3 --save-dev
npm install jaguarjs-jsdoc --save-dev
The gulp-jsdoc3 plugin also makes use of an external config file. It tells JSDoc where to write to, what template to use, and you can also configure your template if the template supports it. In the published version I've also got a README.md (that's also used by GitHub) which I've included in this config file and which is written to the index page. We don't have it here so I have omitted it.
{
"tags": {
"allowUnknownTags": true,
"dictionaries": ["jsdoc"]
},
"templates": {
"applicationName": "arrgh.js",
"meta": {
"title": "arrgh.js",
"description": "A lightweight JavaScript library that brings proper .NET-like collections and LINQ to the browser.",
"keyword": "JavaScript, LINQ, collections, Array"
}
},
"opts": {
"destination": "docs",
"private": true,
"template": "node_modules/jaguarjs-jsdoc"
}
}
The funny thing with the JSDoc plugin is that it's a bit of an anti-pattern in Gulp. It doesn't transform anything; it just takes input, writes the documentation as a side effect and passes the input on to the next step. No matter though, we can still create a task to generate our documentation.
var gulp = require('gulp');
var minify = require('gulp-minify');
var clean = require('gulp-clean');
var jshint = require('gulp-jshint');
var jsdoc = require('gulp-jsdoc3');
gulp.task('clean', function () {
return gulp.src([
'dist/',
'docs/'
], { read: false })
.pipe(clean());
})
.task('minify', ['clean'], function () {
return gulp.src('src/*.js')
.pipe(minify({
ext: {
src: '.debug.js',
min: '.js'
},
preserveComments: 'some'
}))
.pipe(gulp.dest('dist'));
})
.task('lint', function () {
return gulp.src('src/*.js')
.pipe(jshint('jshint.conf.json'))
.pipe(jshint.reporter('default'));
})
.task('jsdoc', ['clean'], function () {
return gulp.src('src/*.js')
.pipe(jsdoc(require('./jsdoc.conf.json')));
});
gulp.task('default', function () {
gulp.start(['minify', 'lint', 'jsdoc']);
});
Next, we want to do all of these tasks automatically when we make a change to arrgh.js. We can do this using gulp.watch
.
gulp.watch(['src/*.js', '*.conf.json'], ['minify', 'lint', 'jsdoc']);
gulp.task('default', function () {
gulp.start(['minify', 'lint', 'jsdoc']);
});
Now whenever something changes in our source or in a config file Gulp will run the minify
, lint
and jsdoc
tasks. This time, when you run gulp
, you'll notice it won't terminate like it did before. That's because it's now watching your files. To stop it, press Ctrl+C in the command prompt.
My gulpfile does not currently look like that (although it did at one point), but you should now know how Gulp works and how to create and run tasks.
We're missing just one thing, automated testing! Unfortunately, using Jasmine alone we can't automate anything. We'll need a test runner. There are a few out there, but I've chosen Karma. We'll start by installing Karma. We'll need a lot of plugins (again), so get ready to install Karma, the Jasmine plugin and some browser launchers (install the browser launchers that apply to you).
npm install karma --save-dev
npm install karma-jasmine --save-dev
npm install karma-chrome-launcher --save-dev
npm install karma-firefox-launcher --save-dev
npm install karma-ie-launcher --save-dev
Here's the Karma part in the gulpfile.
var karma = require('karma').Server;
.task('test', function (done) {
new karma({
configFile: __dirname + '/karma.conf.js',
}, function (err) {
if (err > 0) {
return done(err);
}
return done();
}).start();
});
Unfortunately, when err > 0
Karma (or Node.js) shows a really ugly and useless stack trace in the console. It can be helped using gulp-util
, but I'm not going into that in this article. Once again we have an external config file.
module.exports = function(config) {
config.set({
frameworks: ['jasmine'],
files: [
'src/*.js',
'test/spec/*.js'
],
reporters: ['progress'],
port: 9876,
autoWatch: true,
browsers: ['Chrome'],
singleRun: true
});
};
If you want to test every time one of the files changes set singleRun
to false
.
Now when you run gulp test from your console you should see a browser starting up, doing some Karma stuff, closing, and then have the result in your console.
Let's add some code coverage, for what use is a test if you don't know what it's covering?
npm install karma-coverage --save-dev
You'll only need to change your config file.
module.exports = function(config) {
config.set({
frameworks: ['jasmine'],
files: [
'src/*.js',
'test/spec/*.js'
],
preprocessors: {
'src/*.js': ['coverage']
},
reporters: ['progress', 'coverage'],
port: 9876,
autoWatch: true,
browsers: ['Chrome'],
singleRun: true,
coverageReporter: {
reporters: [
{ type : 'html', subdir: 'html' }
],
dir : 'test/coverage/',
check: {
global: {
statements: 95,
branches: 95,
functions: 95,
lines: 95
},
}
}
});
};
With this setup you'll get a nice and detailed HTML report in /test/coverage/html
. You can set some global thresholds too, so the task will fail whenever the minimum coverage is not met.
Not in the scope of this article, but good to know, you can install additional plugins for CI systems such as Jenkins and Travis. For example, Jenkins works with JUnit reports, so you could install karma-junit-reporter
. Jenkins also uses cobertura format for code coverage, which is already present in the karma-coverage
plugin, but should be configured (add another reporter to coverageReporter.reporters
).
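For the curious, here's a rough sketch of what such a CI-oriented karma.conf.js could look like. It assumes you've run npm install karma-junit-reporter --save-dev, and the output folders are assumptions of mine; I'm not using this exact file, so treat it as a starting point rather than a tested setup.

```javascript
// karma.conf.js (sketch): the earlier config with CI-friendly reporters added.
module.exports = function (config) {
    config.set({
        frameworks: ['jasmine'],
        files: [
            'src/*.js',
            'test/spec/*.js'
        ],
        preprocessors: {
            'src/*.js': ['coverage']
        },
        // 'junit' is provided by the karma-junit-reporter plugin.
        reporters: ['progress', 'coverage', 'junit'],
        browsers: ['Chrome'],
        singleRun: true,
        junitReporter: {
            outputDir: 'test/junit/' // assumed output folder
        },
        coverageReporter: {
            dir: 'test/coverage/',
            reporters: [
                { type: 'html', subdir: 'html' },
                // Extra reporter in the format Jenkins understands.
                { type: 'cobertura', subdir: 'cobertura' } // assumed subfolder
            ]
        }
    });
};
```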
So now we have everything linted, tested, minified and documented (don't forget to also test your minified file). Time to release this baby!
Now that we have everything in place let's see if we can publish arrgh.js to npm and NuGet. Obviously, it's my package, so you can't actually publish it. The name is already taken and it's mine. You could come up with a new name and publish it anyway, but that would be kind of lame on your part.
First we'll check out how to publish to npm. You'll need an account on npmjs.com. You can sign up on the website or create a user in the console by typing npm adduser
(although I've never tried it). Once you've created an account you can login to npm using your console.
npm login
The console will now ask for your username and password. Now that you're logged in you can, theoretically, simply publish by using npm publish
.
cd myProject
npm publish
npm then looks at your package.json
file and uses it to create a page for your package. We haven't discussed the package.json yet, only that you needed it to save your developer dependencies. The package.json file contains information about your project, like the name, a description, the author, license, dependencies, etc. Only the name and version are mandatory; everything else, including the description, is optional. Here is the most important part of my package.json file.
{
"name": "arrgh",
"version": "0.9.2",
"description": "A lightweight JavaScript library that brings proper .NET-like collections and LINQ to the browser.",
"main": "arrgh.js",
"files": [
"arrgh.js",
"arrgh.debug.js"
],
"scripts": {
"prepublish": "gulp && xcopy .\\dist\\arrgh.js .\\arrgh.js* && xcopy .\\dist\\arrgh.debug.js .\\arrgh.debug.js*"
}
}
The combination of name and version has to be unique for every publish you do. You can't publish any version of arrgh
, because the name arrgh belongs to me, and since my current published version is 0.9.2 I can't publish that version again either.
By default npm publishes your entire folder, including subfolders, but you can use the files field to list the specific files you want to publish. The files will be published as they are, including their current path. So if you publish folder/subfolder/my-script.js
the published package will also contain folder/subfolder/my-script.js
. You can also create a .npmignore
file, which works the same as a .gitignore
file (npm honors the patterns in .gitignore too, but only when there is no .npmignore; if both exist, .npmignore wins). Some files, like the package.json, are always published no matter your settings. You can find the details in the npm developers documentation.
Here's a fun story. Personally, I hate it when I install a package that I just want to use and get a ton of files that I'd never use, like a gulpfile. Why would I need a gulpfile as the consumer of a Node.js package!? So I just wanted to publish the mandatory package.json, readme.md and the contents of the dist
folder, but not the folder itself. So I put dist/arrgh.js
and dist/arrgh.debug.js
in the files
field in the package.json and the dist
folder got distributed. Under version 1.0.0. No problem, I thought. I'll just unpublish it and publish it again, but correctly this time. So I unpublished it, thinking everything would be gone (even I can't see it), but it's actually still there and now I'll never be able to publish 1.0.0 again. So I went with 0.9.0 as a test instead...
Of course I'm here so you don't have to make that mistake. You can check what would be published, without actually publishing anything, by using npm pack
. This will create a tarball file with the would-be published files. Unfortunately, it's not possible to grab the files from dist
and put them in the root, so I used the package.json scripts
object to copy the contents of dist to the root folder before publishing. I'm also running Gulp, just in case. If the Gulp build or tests fail npm won't publish. You can also specify other scripts, like postpublish
, preinstall
, install
, etc.
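Since xcopy is a Windows command, the copy step could also be done with a few lines of Node.js. The following is only a sketch of that idea, not what arrgh.js uses: the script name copy-dist.js and the helper copyToRoot are made up for this example, and fs.copyFileSync requires a reasonably recent version of Node.js.

```javascript
// copy-dist.js (hypothetical file name): copies build output to another folder.
var fs = require('fs');
var path = require('path');

// Copies every file in fileNames from sourceDir to targetDir
// and returns the paths of the copies.
function copyToRoot(sourceDir, targetDir, fileNames) {
    return fileNames.map(function (fileName) {
        var target = path.join(targetDir, fileName);
        fs.copyFileSync(path.join(sourceDir, fileName), target);
        return target;
    });
}

// Example call for a layout like the one in this article:
// copyToRoot('dist', '.', ['arrgh.js', 'arrgh.debug.js']);
```

With the example call enabled at the bottom of the script, the prepublish script could then be something like "gulp && node copy-dist.js".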
When updating you can increment your version manually or by running npm version major, npm version minor or npm version patch on the command line. npm uses semver (semantic versioning) internally.
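To make the three release types concrete, here is a small sketch of what each one does to a plain x.y.z version number. The function bumpVersion is a made-up helper for illustration, not part of npm, and it ignores pre-release tags such as 1.0.0-beta.

```javascript
// Illustrative helper (not an npm API): bumps a plain "major.minor.patch" version.
function bumpVersion(version, release) {
    var parts = version.split('.').map(Number);
    if (release === 'major') {
        return (parts[0] + 1) + '.0.0';                          // breaking changes
    }
    if (release === 'minor') {
        return parts[0] + '.' + (parts[1] + 1) + '.0';           // new features
    }
    if (release === 'patch') {
        return parts[0] + '.' + parts[1] + '.' + (parts[2] + 1); // bug fixes
    }
    throw new Error('Unknown release type: ' + release);
}

bumpVersion('0.9.2', 'patch'); // "0.9.3"
bumpVersion('0.9.2', 'minor'); // "0.10.0"
bumpVersion('0.9.2', 'major'); // "1.0.0"
```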
And that's how I got my very own package on npm!
Besides npm I wanted to release my package to NuGet, because I usually code in C#. Before you start you'll need NuGet's equivalent of the package.json, which is called the nuspec
. Like the package.json the nuspec file contains information about your package. It's an XML file with fields such as id, version, title and description. It's not all that big, so here's my complete nuspec file for version 0.9.2.
<?xml version="1.0"?>
<package>
<metadata>
<id>arrgh.js</id>
<version>0.9.2</version>
<title>arrgh.js</title>
<description>A lightweight JavaScript library that brings proper .NET-like collections and LINQ to the browser.</description>
<authors>Sander Rossel</authors>
<owners>Sander Rossel</owners>
<language>JavaScript</language>
<licenseUrl>https://spdx.org/licenses/MIT</licenseUrl>
<projectUrl>https://sanderrossel.github.io/arrgh.js/</projectUrl>
<requireLicenseAcceptance>false</requireLicenseAcceptance>
<releaseNotes>Fixed Dictionary implementation.</releaseNotes>
<copyright>Copyright 2016</copyright>
<tags>JavaScript LINQ collections array List HashMap Dictionary</tags>
</metadata>
<files>
<file src="dist\*.js" target="content\Scripts" />
</files>
</package>
Notice how I can very awesomely tell NuGet what files to include and where to put them!? Nice! The target has to start with lib
, content
or tools
, so I chose content
. You should read the entire Nuspec documentation for more information.
To publish a package to NuGet you'd probably use Visual Studio, but I haven't used VS at all for this project, so why start now? Instead, we can download the Windows x86 command-line tool over at NuGet downloads. I've actually also included it in the GitHub repository. To build a package simply open a command prompt and use nuget pack
.
cd folder_of_project_including_nuget_cli
nuget pack
This will create a nupkg
file that you can use for private repositories or upload to NuGet manually. To create a private repository in Visual Studio, head over to your NuGet settings, create a new repository and point it at a folder on your computer; NuGet will then automatically pick up nupkg files in that folder. Of course, you'll need an account at NuGet too.
It is possible to push packages directly from the command line, but I haven't actually tried it. You can find how to do it in the documentation. Uploading manually is so easy, however, that I haven't even bothered. Of course, pushing from the command line is nice (and necessary) if you have a CI server like Jenkins, TFS or Travis.
And that's how I got my very own package on NuGet!
Support for AMD/RequireJS and CommonJS/Node.js was added in release 1.1.0. You can find these changes on GitHub.
Everything works nicely in the browser now, but this is JavaScript and the web, so there is never just one way to do anything. The same goes for loading JavaScript files. In the browser you can simply reference a script in your HTML and it will work (your package is exposed as a global variable), but that is just one way to expose your script. Another method of exposing your script is through AMD (Asynchronous Module Definition), supported by RequireJS.
require(['node_modules/arrgh/arrgh.js'], function (arrgh) {
});
This is easy as you do not have to reference all your scripts in the correct order in your HTML file.
Yet another method, used by Node.js, the back-end JavaScript solution, is CommonJS. CommonJS allows you to require files when needed and has the same benefits as RequireJS, but with even easier syntax.
var arrgh = require('arrgh');
Unfortunately, supporting either one of those methods requires us to change our JavaScript and will prevent us from "simply" loading the script in our HTML. Luckily, we can support all three.
var arrgh = function () { ... };
// AMD/RequireJS:
define(arrgh);
// CommonJS/Node.js:
module.exports = arrgh();
// Plain script tag in the browser:
window.arrgh = arrgh();
So basically it comes down to checking which loader is present: if RequireJS is loaded we use define
, if CommonJS is loaded we use module.exports
and otherwise we attach arrgh
to the window
object.
This is fairly easy code that is also easily found on Google (and there is probably already an npm package for it as well).
(function (name, definition) {
"use strict";
if (typeof module !== "undefined") {
module.exports = definition();
}
else if (typeof define === "function" && typeof define.amd === "object") {
define(definition);
}
else {
window[name] = definition();
}
}("arrgh", function () { ... }));
That code makes sure we use the correct method of loading scripts. Please note that these methods are mutually exclusive and that CommonJS takes precedence over AMD and AMD takes precedence over simple loading.
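The precedence is easier to see when the environment is passed in as a parameter. The following sketch uses the same checks as the wrapper, but everything else is made up for the demonstration: the expose function, the env objects and the fake define are not part of arrgh.js or RequireJS.

```javascript
// Same checks as the wrapper, but the environment is passed in as `env`
// so every branch can be tried. `expose` and `env` are hypothetical names.
function expose(name, definition, env) {
    "use strict";
    if (typeof env.module !== "undefined") {
        env.module.exports = definition();
        return "CommonJS";
    }
    if (typeof env.define === "function" && typeof env.define.amd === "object") {
        env.define(definition);
        return "AMD";
    }
    env.window[name] = definition();
    return "global";
}

// A stand-in for the library body.
function definition() {
    return { hello: function () { return "hello from demo"; } };
}

// A fake AMD loader: an AMD define function carries an `amd` object.
function fakeDefine(factory) {
    fakeDefine.registered = factory();
}
fakeDefine.amd = {};

var nodeLike = { module: { exports: {} }, define: fakeDefine, window: {} };
var amdOnly = { define: fakeDefine, window: {} };
var plainBrowser = { window: {} };

var results = [
    expose("demo", definition, nodeLike),     // "CommonJS", even though define exists too
    expose("demo", definition, amdOnly),      // "AMD"
    expose("demo", definition, plainBrowser)  // "global"
];
```

Only one branch runs per environment, so a page that has both a module object and an AMD loader gets the CommonJS export and nothing is attached to the window.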
Of course we still need to test our stuff. I've added one simple AMD test that I executed manually.
describe("arrgh.js tests", function () {
it("should load using require.js", function (done) {
require(['../src/arrgh.js'], function (arrgh) {
expect(arrgh.Enumerable).toBeDefined(); // not.toBeNull() would also pass for undefined
done();
});
});
});
The Node.js tests are a little more complex since we are going to run on a completely new platform. I've copied the tests.js file and created a file called node-spec.js (the naming is important for the jasmine-node library we're going to use). The file is exactly the same as the file it was copied from, save for the definition of arrgh.js and the loading of the other scripts.
var arrgh = require('../../src/arrgh.js');
var fs = require('fs');
eval(fs.readFileSync('./test/spec/test-Enumerable_overridden.js','utf-8'));
eval(fs.readFileSync('./test/spec/test-Iterators.js','utf-8'));
It isn't my best code, but if it looks stupid and it works then it ain't stupid. We can now install jasmine-node and gulp-jasmine-node using npm. In our gulpfile we can add the following task.
var jasmineNode = require('gulp-jasmine-node');
[...]
.task('test-node', ['test'], function () {
return gulp.src('test/spec/node-spec.js')
.pipe(jasmineNode({ reporter: [ new jasmine.TerminalReporter({ color: true }) ] }));
})
Unfortunately, our tests run a little differently in Node.js and [NaN]
does not equal [NaN]
anymore, so tests including NaN
had to be slightly changed.
it("should add NaN to the list", function () {
var l = new arrgh.List();
l.add(NaN);
var arr = l.toArray();
expect(arr.length).toBe(1);
expect(arr[0]).toBeNaN();
});
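The reason NaN needs special care in the test above is that NaN is the only JavaScript value that is not equal to itself, so any comparison that boils down to === will treat two NaN values as different. A minimal sketch of a NaN-aware comparison (sameValue and sameElements are made-up helpers, not part of arrgh.js or Jasmine):

```javascript
// NaN is never equal to itself, which is exactly how it can be detected.
function sameValue(a, b) {
    return a === b || (a !== a && b !== b);
}

// Compares two arrays element by element, treating NaN as equal to NaN.
function sameElements(left, right) {
    if (left.length !== right.length) {
        return false;
    }
    for (var i = 0; i < left.length; i++) {
        if (!sameValue(left[i], right[i])) {
            return false;
        }
    }
    return true;
}

var strictlyEqual = (NaN === NaN);              // false
var elementsMatch = sameElements([NaN], [NaN]); // true
```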
One more thing, to make require('arrgh')
work in Node.js we need our arrgh.js file to be named index.js. So, in our package.json we need to add an extra xcopy command "xcopy .\\dist\\arrgh.js .\\index.js*
" and we also need to add index.js to the files
field in package.json.
After some testing, changing the docs, and playing around we're ready to publish! I've published this change as version 1.1.0, since 0.9 was actually already version 1.0... Anyway, enjoy these new additions to arrgh.js!
So, that was my journey to create my very own JavaScript LINQ library and publish it to npm and NuGet. I have learned a lot along the way and I hope I've passed some of that knowledge on to you, the reader. I'm currently using arrgh
for a project at work and it has made some things really easy and concise.
If you decide to use arrgh
as well please let me know, I'd love to hear your feedback! Any tips, improvements, bugs or missing features can be reported here or on GitHub issues.
Happy coding!