Arrgh.js - Bringing LINQ to JavaScript

Sander Rossel

14 May 2017, CPOL, 43 min read

Creating a lightweight JavaScript library that brings proper .NET-like collections and LINQ to JavaScript.

Download arrgh.js-master.zip - 1.5 MB

Introduction
The code
Testing with Jasmine
Documentation using JSDoc
Automation with Gulp
- Karma
Publishing arrgh.js
- npm
- NuGet
Adding support for AMD/Require.js and CommonJS/Node.js
Conclusion

Introduction

In this article I want to talk about a JavaScript library I created and published to npm and NuGet. The library, dubbed arrgh.js, brings proper .NET-like collections and LINQ to JavaScript. If you know LINQ in C# you already know how arrgh.js works!

In this article we'll see what arrgh.js is, how it was created, how it's tested and documented and how it's published to npm and NuGet. I've added the code, as it was at the time of this writing, to this article. For the most recent version of the code you can check out the GitHub repository. You can also check out the full documentation.

I'm pretty sure we've all had to work with an array in JavaScript one time or another.
An array that has a forEach function, but only starting IE9 (and, of course, my customer needed IE8 support).
An array that only recently added support for a contains function, but called it includes (I recently read they chose includes over contains because adding contains to the standard would break a popular framework, what the...).
An array that's also a queue and a stack, and sort of partially a list, it can be added to, but not removed from.
Removing an item is as tedious as searching for the index of the item you want to remove, splitting the array at that point, skipping an item, and putting the remaining parts back together again. Manually.
And can you remember if you need splice or slice?

All in all I found the array to be one big frustration (and that can actually be said for JavaScript as a whole). Needless to say I went looking for alternatives. Basically, what I wanted was C#-like collections with LINQ support in JavaScript. Of course this had been done before, but the libraries I found didn't meet all of my requirements, did not work in older browsers, were not sufficiently documented, did not have lazy evaluation, missed collection types such as a Dictionary (HashMap) or did not implement them how I wanted to use them. The best I found was linq.js, but that one wanted to look so much like C# that it PascalCased everything, while JavaScript uses camelCasing (later I found I had downloaded an old version; the latest version does use camelCasing).

So I decided to build my own JavaScript collections and LINQ. Also because that's just really fun to do. I called it arrgh.js, a combination of array and argh!, that last one being my screams of utter frustration when working with JavaScript and the array in particular.

If you want to skip everything and get straight to work you can install it using npm or NuGet.

npm install arrgh.js

Install-Package arrgh.js

The code

A lot of coding adventures start with an empty text file, as did mine. I've created a folder for the project, called arrgh.js, and then a folder for my source code, called src. Within that folder I created the arrgh.js file and started writing. I basically had two options, augment the JavaScript array class (which is considered bad practice and might break either the JavaScript array or my implementation in the future) or start from scratch and create my own collection object. I chose the latter.

arrgh.Enumerable

My first implementation was just a wrapper around an array. Simple and naïve.

JavaScript

var Enumerable = function (arr) {
    this.arr = arr;
};

Enumerable.prototype.forEach = function (callback) {
    var i;
    for (i = 0; i < this.arr.length; i += 1) {
        callback(this.arr[i], i);
    }
};

// Usage.
var e = new Enumerable(["Hello", "Enumerable"]);
e.forEach(function (s) {
    console.log(s);
});

And then I implemented where and select in pretty much the same way.

JavaScript

Enumerable.prototype.where = function (predicate) {
    var filtered = [];
    this.forEach(function (e, i) {
        if (predicate(e, i)) {
            filtered.push(e);
        }
    });
    return filtered;
};

// Usage.
var e = new Enumerable([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);
var evens = e.where(function (n) {
    return n % 2 === 0;
});

Easy as that may be it wasn't going to cut it. Imagine, in a later stage, doing something like the following:

JavaScript

Enumerable.range(0, 1000000).where(isEven).take(10);

It would now go through a million elements, check if they're even and then take 10. To get 10 even numbers we don't need to look at more than the first 20 elements! So let's change this code so that elements are evaluated lazily, only when necessary. We're going to implement the Iterator Pattern. This allows us to go through the elements of a collection one by one, meaning that if we ask for the next 20 elements of a collection that is theoretically infinite we only have to compute those 20 elements.

So, the Enumerable is going to implement a getIterator method (I wanted to call it getEnumerator, like in C#, but Enumerator is already reserved in JavaScript). getIterator will return an object that can return the next element of a collection. Of course that also means we'll have to rewrite the forEach method.

JavaScript

var ArrayIterator = function (arr) {
    var currentIndex = -1;
    this.moveNext = function () {
        currentIndex += 1;
        // Return whether more elements are available.
        return currentIndex < arr.length;
    };
    this.current = function () {
        return arr[currentIndex];
    };
};

var Enumerable = function (arr) {
    this.getIterator = function () {
        return new ArrayIterator(arr);
    };
};

Enumerable.prototype.forEach = function (callback) {
    var iterator = this.getIterator();
    var currentIndex = 0;
    while (iterator.moveNext()) {
        callback(iterator.current(), currentIndex);
        currentIndex += 1;
    }
};

// Usage.
var e = new Enumerable(["Hello", "Enumerable"]);
e.forEach(function (s) {
    console.log(s);
});

The Iterator supports two functions, moveNext and current. This should be familiar as .NET's IEnumerator supports the same. Notice I didn't implement a reset method because .NET only implemented it for COM interoperability, which JavaScript does not do. The same goes for dispose in the generic IEnumerator<T>. The usage, funnily enough, stays the same.

Now if we look at the forEach method we see that it first asks for the Iterator using getIterator. Then it moves through the collection by calling moveNext and current. When no more elements are to be had, moveNext returns false and forEach stops looping and returns to the caller. Now there are a few rules that every Iterator should take into account. First, moveNext can be called forever, but once it returns false each subsequent call should return false as well. Also, whenever moveNext returns false, current should return undefined.
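To make those rules concrete, here is a small sketch that drives the ArrayIterator by hand instead of through forEach. The ArrayIterator is copied from above so the snippet runs on its own; the variable names seen and afterEnd are only there for illustration.

```javascript
// Copy of the ArrayIterator from above, so this snippet runs on its own.
var ArrayIterator = function (arr) {
    var currentIndex = -1;
    this.moveNext = function () {
        currentIndex += 1;
        // Return whether more elements are available.
        return currentIndex < arr.length;
    };
    this.current = function () {
        return arr[currentIndex];
    };
};

var iterator = new ArrayIterator(["a", "b"]);
var seen = [];

// This is the loop that forEach runs for us.
while (iterator.moveNext()) {
    seen.push(iterator.current());
}

// Past the end: moveNext keeps returning false and current is undefined.
var afterEnd = [iterator.moveNext(), iterator.moveNext(), iterator.current()];

console.log(seen);     // ["a", "b"]
console.log(afterEnd); // [false, false, undefined]
```

For an array-backed Iterator the "current is undefined" rule comes for free, because reading an index outside the array yields undefined.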

There is just a small tweak I want to give to the forEach method. It should implement a sort of break functionality. You don't want to be forced to loop through the entire collection every time you call forEach. So having the callback return false or any other falsy value (excluding undefined and null) will break the loop.

JavaScript

function isNull(obj) {
    return obj === undefined || obj === null;
}

Enumerable.prototype.forEach = function (callback) {
    var iterator = this.getIterator();
    var cont;
    var currentIndex = 0;
    while ((isNull(cont) || cont) && iterator.moveNext()) {
        cont = callback(iterator.current(), currentIndex);
        currentIndex += 1;
    }
};

// Usage.
var e = new Enumerable(["Hello", "Enumerable"]);
e.forEach(function (s) {
    console.log(s);
    // Break after the first element.
    // "Enumerable" will never log.
    return false;
});

As you can see it's now possible to break out of the loop by returning a falsy value. I decided to not include undefined and null because undefined is the default return value for any function and I don't want to force the user to always explicitly return true (or any truthy value). For simplicity, I chose to treat undefined and null as being the same value (that is, not a value). This is actually how forEach ended up in the published version of arrgh.js.

Using this design we have to completely rethink how a function such as where works. In the previous example it returned an array, which is a no go. If we return an Enumerable instead we can chain our functions. However, an Enumerable expects an array as input, which is also not applicable when we use it in where. The point about this whole Iterator thing is that we don't want to evaluate the result right away, instead we want to return an Iterator for a follow up function to use. That sounds awfully difficult, but I believe a code sample says more than a thousand words.

JavaScript

var isArray = function (obj) {
    return Object.prototype.toString.call(obj) === "[object Array]";
};

var Enumerable = function (enumerable) {
    var getIterator;
    if (isArray(enumerable)) {
        getIterator = function () {
            return new ArrayIterator(enumerable);
        };
    } else if (typeof enumerable === "function") {
        getIterator = enumerable;
    } else {
        throw new Error("Invalid input parameter.");
    }
    this.getIterator = getIterator;
};

var WhereIterator = function (source, predicate) {
    var iterator = source.getIterator();
    var index = -1;
    var current;
    this.moveNext = function () {
        while (iterator.moveNext()) {
            index += 1;
            current = iterator.current();
            if (predicate(current, index)) {
                return true;
            }
        }
        current = undefined;
        return false;
    };
    this.current = function () {
        return current;
    };
};

Enumerable.prototype.where = function (predicate) {
    var self = this;
    return new Enumerable(function () {
        return new WhereIterator(self, predicate);
    });
};

// Create an alias.
Enumerable.prototype.filter = Enumerable.prototype.where;

// Usage.
var e = new Enumerable([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);
var evens = e.where(function (n) {
    return n % 2 === 0;
});

First of all, the isArray function. It's utter madness, but this is the only 100% trustworthy way to tell if an object is an array in JavaScript. There is actually an npm package for just that one line of code, which is also utter madness. Newer browsers have this built in as Array.isArray, but I wanted this library to be compatible with IE8. I also want it to be lightweight, meaning no dependencies on other libraries. So that's isArray.

Next, as you can see, we made the Enumerable accept a parameter that may be an array or a function (which is assumed to be a getIterator function). This allows us to create Enumerables with all sorts of Iterators. The function overload is used by the where function, which passes to it a function that creates a WhereIterator.

Now the WhereIterator, at first sight, looks like a hideous beast (although it's a puppy compared to some other Iterators we'll end up with). The where function is always called on an Enumerable, which will be the source to filter. We get the Iterator of the source and then simply move through it. Whenever an item satisfies the condition we return to the caller and indicate there are possibly more values. When the source has no more items we set the current to undefined and return to the caller indicating no more items are found. Again, the usage of where remains the same.

Last, but not least, we create an alias for where. I thought that would be nice as JavaScript, and other languages as well, use the name filter instead of where.
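The practical effect of this design is that calling where does no work at all; the predicate only runs once somebody starts pulling elements. The following sketch shows that with a call counter. It uses condensed copies of the pieces above (the WhereIterator is inlined as a plain object with moveNext and current), and the names calls, callsBeforeIterating and callsAfterFirst are just for illustration.

```javascript
// Condensed copies of the pieces above so the snippet runs on its own.
function ArrayIterator(arr) {
    var i = -1;
    this.moveNext = function () { i += 1; return i < arr.length; };
    this.current = function () { return arr[i]; };
}

function Enumerable(source) {
    // Accept either a getIterator function or an array (no validation here).
    this.getIterator = typeof source === "function" ? source : function () {
        return new ArrayIterator(source);
    };
}

Enumerable.prototype.where = function (predicate) {
    var self = this;
    return new Enumerable(function () {
        var iterator = self.getIterator(), index = -1, current;
        return {
            moveNext: function () {
                while (iterator.moveNext()) {
                    index += 1;
                    current = iterator.current();
                    if (predicate(current, index)) { return true; }
                }
                current = undefined;
                return false;
            },
            current: function () { return current; }
        };
    });
};

var calls = 0;
var evens = new Enumerable([1, 2, 3, 4]).where(function (n) {
    calls += 1;
    return n % 2 === 0;
});
var callsBeforeIterating = calls; // 0: nothing has been evaluated yet

var it = evens.getIterator();
it.moveNext();
var firstEven = it.current();     // 2
var callsAfterFirst = calls;      // 2: only the values 1 and 2 were tested

console.log(callsBeforeIterating, firstEven, callsAfterFirst);
```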

Because where now returns another Enumerable it becomes really hard to debug this code, after all, the only way of knowing what's in an Enumerable is by enumerating over it using forEach. So let's create another function real quick, toArray. With toArray we can simply convert an Enumerable to a regular JavaScript array and continue our business like usual.

JavaScript

Enumerable.prototype.toArray = function () {
    var arr = [];
    this.forEach(function (elem) {
        arr.push(elem);
    });
    return arr;
};

// Usage.
var e = new Enumerable([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);
var evens = e.where(function (n) {
    return n % 2 === 0;
}).toArray();

Another useful, and really easy, function, for making any collection read-only, is asEnumerable, which we can implement now that the Enumerable takes a function as input.

JavaScript

Enumerable.prototype.asEnumerable = function () {
    return new Enumerable(this.getIterator);
};

arrgh.Iterator

Now if you test the previous code in the debugger you'll notice getIterator returns an ArrayIterator or a WhereIterator. In the final library there are about 20 Iterators. Wouldn't it be nice if getIterator always returned an Iterator? It allows us to check if an object is any Iterator. So at the very least we're going to need a base class for Iterator.

So, two choices. Either create a base class Iterator and then inherit ArrayIterator and WhereIterator, or create a class Iterator and pass the moveNext and current functions to it, keeping the internals to their respective functions. I chose the latter. Let's see what that means for our code.

JavaScript

var Iterator = function (moveNext, current) {
    this.moveNext = moveNext;
    this.current = current;
};

var getArrayIterator = function (arr) {
    var len = arr.length,
    index = -1;
    return new Iterator(function () {
        if (arr.length !== len) {
            throw new Error("Collection was modified, enumeration operation may not execute.");
        }
        index += 1;
        return index < len;
    }, function () {
        return arr[index];
    });
};

var Enumerable = function (enumerable) {
    var getIterator;
    if (isArray(enumerable)) {
        getIterator = function () {
            return getArrayIterator(enumerable);
        };
    } else if (typeof enumerable === "function") {
        getIterator = enumerable;
    } else {
        throw new Error("Invalid input parameter.");
    }
    this.getIterator = getIterator;
};

Enumerable.prototype.where = function (predicate) {
    var self = this;
    return new Enumerable(function () {
        var iterator = self.getIterator();
        var index = -1;
        var current;
        return new Iterator(function () {
            while (iterator.moveNext()) {
                index += 1;
                current = iterator.current();
                if (predicate(current, index)) {
                    return true;
                }
            }
            current = undefined;
            return false;
        }, function () {
            return current;
        });
    });
};

As you can see the ArrayIterator and WhereIterator are gone and there is only a single Iterator constructor function left. This approach has pros and cons. The pro is, obviously, that there is only one Iterator class now that can change its behavior according to the constructing function. Another (arguably negligible) pro is that this syntax is a little shorter saving about 2 KB (on 20 KB) in the minified version. The cons are that our code now makes heavy use of closures (which isn't necessarily bad) and that the implementing classes are a little harder to read. I've tucked away the Iterator for arrays in a getArrayIterator function because this Iterator is used by the List class as well, as we'll see in a moment (for that reason I also added the check if length has not changed).

What we've seen so far are the basics of Enumerables and arrgh.js. All other methods, like all, distinct, any, orderBy and select are implemented using the same methodology as where. Simply return an Enumerable with a custom Iterator.
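As an illustration of that methodology, here is how a select could look when written in the same style as where. This is a sketch, not the code from the released library: the Enumerable here only accepts a getIterator function, and fromArray and collect are hypothetical helpers added to keep the snippet self-contained.

```javascript
var Iterator = function (moveNext, current) {
    this.moveNext = moveNext;
    this.current = current;
};

// Simplified Enumerable: only accepts a getIterator function.
var Enumerable = function (getIterator) {
    this.getIterator = getIterator;
};

// Hypothetical helper: wraps an array in an Enumerable.
function fromArray(arr) {
    return new Enumerable(function () {
        var index = -1;
        return new Iterator(function () {
            index += 1;
            return index < arr.length;
        }, function () {
            return arr[index];
        });
    });
}

// Sketch of select: project each source element when it is requested.
Enumerable.prototype.select = function (selector) {
    var self = this;
    return new Enumerable(function () {
        var iterator = self.getIterator();
        var index = -1;
        var current;
        return new Iterator(function () {
            if (iterator.moveNext()) {
                index += 1;
                current = selector(iterator.current(), index);
                return true;
            }
            current = undefined;
            return false;
        }, function () {
            return current;
        });
    });
};

// Hypothetical helper: pulls every element into an array.
function collect(enumerable) {
    var iterator = enumerable.getIterator(), result = [];
    while (iterator.moveNext()) {
        result.push(iterator.current());
    }
    return result;
}

var labels = collect(fromArray(["a", "b", "c"]).select(function (s, i) {
    return i + ":" + s.toUpperCase();
}));
console.log(labels); // ["0:A", "1:B", "2:C"]
```

Just like where, this select keeps the rule that current returns undefined once moveNext has returned false.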

The currently released version of Enumerable is a little more complex, as it also accepts other Enumerables and strings as input and also multiple input parameters (like C# params). It's just a few more if's though, so you shouldn't have a problem figuring that out.

By the way, here are the implementations of range and take, so you can test out that little code snippet Enumerable.range(0, 1000000).where(n => n % 2 === 0).take(10); and see that it really just evaluates 19 values (0 through 18).

JavaScript

Enumerable.range = function (start, count) {
    if (!isNull(count)) {
        if (count < 0) {
            throw new Error("Count cannot be lower than 0.");
        }
        if (start + (count - 1) > Number.MAX_SAFE_INTEGER) {
            throw new Error("Start and count can not exceed " + Number.MAX_SAFE_INTEGER + ".");
        }
    }
    return new Enumerable(function () {
        if (isNull(count)) {
            var moved = false;
            return new Iterator(function () {
                if (!moved) {
                    moved = true;
                } else {
                    start += 1;
                }
                return start <= Number.MAX_SAFE_INTEGER;
            }, function () {
                if (!moved || start > Number.MAX_SAFE_INTEGER) {
                    return undefined;
                }
                return start;
            });
        } else {
            var index = -1;
            return new Iterator(function () {
                index += 1;
                return index < count;
            }, function () {
                if (index === -1 || index >= count) {
                    return undefined;
                }
                return start + index;
            });
        }
    });
};

Enumerable.prototype.take = function (count) {
    var self = this;
    return new Enumerable(function () {
        var iterator = self.getIterator(),
        index = -1;
        return new Iterator(function () {
            index += 1;
            return index < count && iterator.moveNext();
        }, function () {
            if (index === -1 || index >= count) {
                return undefined;
            }
            return iterator.current();
        });
    });
};

Iterating over Enumerable.range(0) will actually go on until Number.MAX_SAFE_INTEGER (which is injected in the released version because of browser support), which is 9007199254740991, and will probably crash your browser unless you limit the result with any, some, take, takeWhile, first or firstOrDefault.
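To see the lazy evaluation at work, the sketch below counts how often the predicate is actually called for the range, where and take chain. It uses condensed copies of Iterator, where and take from above; the Enumerable only accepts a getIterator function, range is simplified (count is required and nothing is validated), and toArray pulls from the Iterator directly instead of going through forEach.

```javascript
var Iterator = function (moveNext, current) {
    this.moveNext = moveNext;
    this.current = current;
};
var Enumerable = function (getIterator) {
    this.getIterator = getIterator;
};

// Simplified range: count is required and arguments are not validated.
Enumerable.range = function (start, count) {
    return new Enumerable(function () {
        var index = -1;
        return new Iterator(function () {
            index += 1;
            return index < count;
        }, function () {
            return (index === -1 || index >= count) ? undefined : start + index;
        });
    });
};

Enumerable.prototype.where = function (predicate) {
    var self = this;
    return new Enumerable(function () {
        var iterator = self.getIterator(), index = -1, current;
        return new Iterator(function () {
            while (iterator.moveNext()) {
                index += 1;
                current = iterator.current();
                if (predicate(current, index)) {
                    return true;
                }
            }
            current = undefined;
            return false;
        }, function () {
            return current;
        });
    });
};

Enumerable.prototype.take = function (count) {
    var self = this;
    return new Enumerable(function () {
        var iterator = self.getIterator(), index = -1;
        return new Iterator(function () {
            index += 1;
            // Short-circuits: the source is not touched once count is reached.
            return index < count && iterator.moveNext();
        }, function () {
            return (index === -1 || index >= count) ? undefined : iterator.current();
        });
    });
};

Enumerable.prototype.toArray = function () {
    var iterator = this.getIterator(), arr = [];
    while (iterator.moveNext()) {
        arr.push(iterator.current());
    }
    return arr;
};

var tested = 0;
var firstTenEvens = Enumerable.range(0, 1000000).where(function (n) {
    tested += 1; // count how many source values are actually inspected
    return n % 2 === 0;
}).take(10).toArray();

console.log(firstTenEvens.join(", ")); // the even numbers 0 through 18
console.log(tested);                   // 19, not 1000000
```

The predicate sees the values 0 through 18 and nothing more, because take stops calling moveNext on its source as soon as it has handed out ten elements.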

arrgh.List

With the Enumerable in place we can continue with our next class, the List. To create a List we're going to inherit from Enumerable. Again, there is a package for this, but it's so very small and I don't want any dependencies, so I simply created my own helper method (don't ask about inherit and Temp, it's just more JavaScript absurdity).

JavaScript

var Temp = function () {
    // This will shut up JSLint :-)
    // Minify will remove 'return' so no precious bytes are lost.
    return;
};

function inherit(inheritor, inherited) {
    Temp.prototype = inherited.prototype;
    inheritor.prototype = new Temp();
    Temp.prototype = null;
    inheritor.prototype.constructor = inheritor;
}

var List = function (arr) {
    var self = this;

    Enumerable.call(this, function () {
        return getArrayIterator(self);
    });

    arr = arr || [];
    if (isArray(arr)) {
        var i;
        for (i = 0; i < arr.length; i += 1) {
            this[i] = arr[i];
        }
    } else {
        throw new Error("Invalid input parameter.");
    }
    this.length = arr.length;
};
inherit(List, Enumerable);

// Usage.
var l = new List(["Hello", "List"]);
console.log(l[0]);
console.log(l[1]);

The List constructor takes an array as input parameter and adds the contents of the array to itself. This seemed like a good idea, as List and array are now interchangeable in read-only scenarios, but I ran into a lot of problems when implementing add and remove methods. For example, what to do when the user of List adds the next index manually? When adding an index on an array the length is adjusted accordingly, but that is not something we can do ourselves. Likewise, when a user changes the length on an array indexes are added or removed accordingly, which is also not something we can do ourselves. Ultimately, I decided to keep the length property as it is (with the risk of an unsuspecting user changing it and breaking the List) and ditch the array-like approach.
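The difference between a real array and an array-like object is easy to demonstrate. In this sketch, fake is a plain object that only looks like an array (a hypothetical stand-in for the first List attempt): nothing keeps its length and its indexes in sync.

```javascript
// A real array keeps length and indexes in sync.
var arr = ["Hello"];
arr[1] = "List";
var arrLengthAfterAdd = arr.length;   // 2

// An array-like object does not.
var fake = { 0: "Hello", length: 1 };
fake[1] = "List";
var fakeLengthAfterAdd = fake.length; // still 1

// Shrinking length removes elements from an array...
arr.length = 0;
var arrFirst = arr[0];                // undefined

// ...but on the array-like object it is just another property.
fake.length = 0;
var fakeFirst = fake[0];              // still "Hello"

console.log(arrLengthAfterAdd, fakeLengthAfterAdd, arrFirst, fakeFirst);
```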

The next approach was to encapsulate an array, like the C# List<T> class does as well. Unfortunately, JavaScript doesn't know private members, but the convention seems to be to prefix privates with an underscore. Personally, I prefer to create an object called _ (underscore) that contains all private members.

JavaScript

var List = function (arr) {
    var self = this;
    // Copy the original array so
    // manipulating the original array
    // will not affect the List in any way
    arr = arr ? arr.slice() : [];

    Enumerable.call(this, function () {
        return getArrayIterator(self._.arr);
    });

    this._ = {
        arr: arr
    };
    this.length = arr.length;
};
inherit(List, Enumerable);

List.prototype.get = function (index) {
    if (index < 0 || index >= this.length) {
        throw new Error("Index was out of range. Must be non-negative and less than the size of the collection.");
    }
    return this._.arr[index];
};

// Usage.
var l = new List(["Hello", "List"]);
console.log(l.get(0));
console.log(l.get(1));
// Throw error.
console.log(l.get(2));

Again, the List as it is currently published accepts more input parameters, such as other Enumerables and strings, but the basis remains an implicitly private array and an implicit read-only length property. Notice also how getArrayIterator is used.

So let's see how add and remove are implemented then.

JavaScript

List.prototype.add = function (item) {
    this._.arr.push(item);
    this.length += 1;
};

List.prototype.remove = function (item) {
    // indexOf inherited from Enumerable.
    var index = this.indexOf(item);

    if (index >= 0) {
        this._.arr.splice(index, 1);
        this.length -= 1;
        return true;
    }
    return false;
};

As you can see it's still array manipulation that makes you go argh!, but at least it's nicely encapsulated in a List class that has many useful functions and is consistent across browsers.

The indexOf function is inherited from Enumerable and I'm not going into it as it's just one of many. However, what's worth mentioning is that the List class actually overrides it. Since List knows the length of itself as well as the index of its contents, something Enumerable does not, we can optimize some functions, like indexOf, on List. I'm going to show you how that's done, but I'm going to show you using the count function instead.

JavaScript

Enumerable.prototype.count = function (predicate) {
    var count = 0;
    predicate = predicate || alwaysTrue;

    this.forEach(function (elem) {
        if (predicate(elem)) {
            count += 1;
        }
    });
    return count;
};

List.prototype.count = function (predicate) {
    if (!predicate) {
        return this.length;
    } else {
        return Enumerable.prototype.count.call(this, predicate);
    }
};

When the predicate is not specified the List can simply return its length property while an Enumerable must first evaluate all of its elements.

In the released version of arrgh.js the List constructor allows for Enumerables to be passed as input so the Enumerable.toList function is really very easy. Of course the List will have to enumerate over the collection to fill up an array internally.

JavaScript

Enumerable.prototype.toList = function () {
    return new List(this);
};

All in all the List class is not very complicated. Other functions it has are addRange, clear, insert, set and sort. Without a doubt sort is the most complex, but we'll get to that later (it reuses the orderBy functionality).

arrgh.Dictionary

The Dictionary, unlike the List, is a pretty complex beast! It's also the one class that I haven't seen satisfactorily implemented in other LINQ libraries. If you know a bit about the JavaScript internals you'll know that every JavaScript object is actually implemented as a hashmap (which a Dictionary basically is). Believe me, I did some Googling to get this to work, but people on the internet usually refer to using the JavaScript object. Well, I've got a few objections to using an object as a Dictionary. First, the only keys it allows are strings. Second, it's not easily iterated over, you'll need to use a for loop and check for hasOwnProperty, and then JSLint will complain that you should actually use Object.keys (which is, of course, not supported in older browsers). Furthermore, objects miss a lot of nice functionality, like everything Enumerable gives us. An object is just not a collection.
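The string-keys objection is worth seeing in action. In the sketch below two different objects are used as keys on a plain object; both are converted to the same string, so the second assignment silently overwrites the first. The names lookup, gates and jobs are made up for the example.

```javascript
var lookup = {};
var gates = { firstName: "Bill", lastName: "Gates" };
var jobs = { firstName: "Steve", lastName: "Jobs" };

// Both keys are converted with toString, giving "[object Object]" twice.
lookup[gates] = "Microsoft";
lookup[jobs] = "Apple"; // overwrites the previous entry

// The tedious way of iterating that the text mentions.
var keys = [];
var key;
for (key in lookup) {
    if (Object.prototype.hasOwnProperty.call(lookup, key)) {
        keys.push(key);
    }
}

console.log(keys);          // ["[object Object]"]
console.log(lookup[gates]); // "Apple"
```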

The real problem lies in implementing a proper Dictionary. As said, it's also called a hashmap, but where are we going to get hashes? In .NET every object has a GetHashCode function that's actually implemented in some COM object, probably directly against your hardware (I don't know if that's true, but I do know it's pretty impossible for us mere mortals to implement). JavaScript doesn't have all that so we'll have to implement it ourselves. As I said, impossible, so I looked at the next best option which is... Using an object as hashmap.

So here's the first problem, objects only use strings as keys, but in .NET we can use any object as key, not just strings. So here's the deal, we're going to use the toString implementation of objects, which can be overridden. However, since toString is often used for debugging purposes or for presentation on screen we'll allow an extra custom getHash method. And if even that is not enough we'll allow an equality comparer for getting hashes and comparing keys. The equality comparer also solves another problem: hash collisions.

For this purpose I'll show you a default equality comparer as well as the add function before anything else, as they're used to determine the hash of an object.

JavaScript

function isActualNaN (obj) {
    return obj !== obj;
}

var defaultEqComparer = {
    equals: function (x, y) {
        return x === y || (isActualNaN(x) && isActualNaN(y)); // NaN edge case.
    },
    getHash: function (obj) {
        var hash;
        if (obj === null) {
            hash = "null";
        } else if (obj === undefined) {
            hash = "undefined";
        } else if (isActualNaN(obj)) {
            hash = "NaN";
        } else {
            hash = typeof obj.getHash === "function" ?
            obj.getHash() :
            typeof obj.toString === "function" ? obj.toString() : Object.prototype.toString.call(obj);
        }
        return hash;
    }
};

var Dictionary = function (eqComparer) {
    var self = this;

    Enumerable.call(self, function () {
        var iterator = self._.entries.getIterator();
        return new Iterator(function () {
            return iterator.moveNext();
        }, function () {
            var current = iterator.current();
            if (current) {
                return { key: current.key, value: current.value };
            }
            return undefined;
        });
    });

    this.length = 0;
    this._ = {
        eqComparer: ensureEqComparer(eqComparer),
        keys: {},
        entries: new List()
    };
};
inherit(Dictionary, Enumerable);

function dictionaryContainsKey (dictionary, hash, key) {
    if (dictionary._.keys.hasOwnProperty(hash)) {
        return dictionary._.keys[hash].contains(key, function (x, y) {
            return dictionary._.eqComparer.equals(x.key, y);
        });
    }
    return false;
}

Dictionary.prototype.add = function (key, value) {
    var hash = this._.eqComparer.getHash(key);
    if (dictionaryContainsKey(this, hash, key)) {
        throw new Error("Key [" + key + "] is already present in the dictionary.");
    }

    if (!this._.keys[hash]) {
        this._.keys[hash] = new List();
    }
    var pair = { key: key, value: value };
    this._.keys[hash].add(pair);
    this._.entries.add(pair);

    this.length += 1;
};

Luckily it's not quite as difficult as it looks. The crux is really in the defaultEqComparer. I'd like to note that undefined, null and NaN are valid keys and that NaN is checked for equality with NaN (normally, NaN === NaN yields false). An equality comparer has a getHash function and an equals function. The getHash function gets the hashes of objects, which, in our case, is really just a string. When two objects produce the same hash the equals function is used to check if the objects are equal (which is not always the case). An example will clear things up.

JavaScript

var d = new Dictionary({
    equals: function (x, y) {
        return x === y;
    },
    getHash: function (obj) {
        return obj.firstName;
    }
});

d.add({
    firstName: "Bill",
    lastName: "Gates"
});
d.add({
    firstName: "Bill",
    lastName: "Clinton"
});

Since both objects have a firstName of Bill, which is used as a hash, there is a hash collision (both objects produce the same hash). However, since the equals function deems the objects not equal, both objects are added to the Dictionary as a key (instead of throwing an error saying the key is already present).
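The flip side of this is a rule a custom comparer has to respect: keys that equals considers equal must also produce the same hash, because the Dictionary only compares keys within the List for one hash. The sketch below checks that for a hypothetical case-insensitive string comparer and shows a broken variant next to it; neither comparer is part of arrgh.js.

```javascript
// Hypothetical comparer that treats "Bill" and "BILL" as the same key.
var caseInsensitiveComparer = {
    equals: function (x, y) {
        return String(x).toLowerCase() === String(y).toLowerCase();
    },
    getHash: function (obj) {
        return String(obj).toLowerCase();
    }
};

// Broken variant: equal keys end up with different hashes,
// so a lookup would search the wrong List and never call equals.
var brokenComparer = {
    equals: caseInsensitiveComparer.equals,
    getHash: function (obj) {
        return String(obj);
    }
};

function hashesAgree(comparer, x, y) {
    return comparer.getHash(x) === comparer.getHash(y);
}

var keysEqual = caseInsensitiveComparer.equals("Bill", "BILL");      // true
var goodHashes = hashesAgree(caseInsensitiveComparer, "Bill", "BILL"); // true
var badHashes = hashesAgree(brokenComparer, "Bill", "BILL");           // false

console.log(keysEqual, goodHashes, badHashes);
```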

In the add implementation you can see that the hash is calculated and when it's not present we add it to the keys object. The hash maps to a List object which is used to hold all values with that specific hash. The more elements have the same hash the slower the lookup of a key with that hash becomes. That's pretty important as hashmaps usually have an O(1) lookup time, but this is more like O(1-ish). The default hash for any JavaScript object is "[object Object]", so be sure to override toString, implement a getHash or use a custom equality comparer or your lookup time will be that of a List. The keys object is basically going to look as follows.

JavaScript

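// Illustrative shape only: each hash really maps to a List of key-value pairs,
// and the names on the right are placeholders.
keys = {
    "Bill": [
        { key: billGates, value: undefined },
        { key: billClinton, value: undefined }
    ],
    "AnotherHash": [ pair ],
    "[object Object]": [ pairsWithDefaultToString ]
};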
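If you want to poke at that structure without the rest of the library, the following sketch rebuilds just the bucket logic with plain arrays. It is a stripped-down stand-in for Dictionary.prototype.add, not the real thing: buckets plays the role of the keys object, and there is no entries List or length bookkeeping.

```javascript
var comparer = {
    equals: function (x, y) {
        return x === y;
    },
    getHash: function (obj) {
        return obj.firstName;
    }
};

var buckets = {};

function add(key, value) {
    var hash = comparer.getHash(key);
    var bucket, i;
    if (!Object.prototype.hasOwnProperty.call(buckets, hash)) {
        buckets[hash] = [];
    }
    bucket = buckets[hash];
    // Only keys with the same hash are compared using equals.
    for (i = 0; i < bucket.length; i += 1) {
        if (comparer.equals(bucket[i].key, key)) {
            throw new Error("Key is already present.");
        }
    }
    bucket.push({ key: key, value: value });
}

var gates = { firstName: "Bill", lastName: "Gates" };
var clinton = { firstName: "Bill", lastName: "Clinton" };

add(gates, "Microsoft");
add(clinton, "White House"); // same hash, different key: allowed

var duplicateRejected = false;
try {
    add(gates, "again");     // same hash and same key: rejected
} catch (e) {
    duplicateRejected = true;
}

console.log(buckets.Bill.length); // 2
console.log(duplicateRejected);   // true
```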

Now, for the Dictionary Iterator. You'll notice there is a List of entries in the Dictionary. This is used in the Iterator. Using the keys object we lose the ordering of our elements, so we keep all elements in entries as well. It makes iterating quite easy as we'll just have to iterate through the List. Notice that the key-value pair is copied during iteration. That's because the key-value pair is read-only to the client (of course clients can mess this up by altering _.entries directly). I should note that .NET makes use of a linked list internally while our List uses an array internally. There are pros and cons to both, like speed of updating (linked list wins) and memory usage (array wins).

Here's the remove function, which removes a key-value pair using a given key. The getPairByKey looks for the hash and then for the key in the mapped List. It does this using firstOrDefault, which returns the first instance of an item in the List or a default when the item is not found (note to self: could've been singleOrDefault instead as a key can't be added to a Dictionary twice).

JavaScript

function getPairByKey (dict, hash, key, whenNotExists) {
    var elem;
    if (!dict._.keys.hasOwnProperty(hash)) {
        whenNotExists();
    } else {
        var def = {};
        elem = dict._.keys[hash].firstOrDefault(function (kvp) {
            return dict._.eqComparer.equals(kvp.key, key);
        }, def);
        if (elem === def) {
            whenNotExists();
        }
    }
    return elem;
}

Dictionary.prototype.remove = function (key) {
    var hash = this._.eqComparer.getHash(key);
    var notFound;
    var pair;

    pair = getPairByKey(this, hash, key, function () {
        notFound = true;
    });
    if (notFound) {
        return false;
    }

    var keys = this._.keys[hash];
    keys.remove(pair);
    this._.entries.remove(pair);
    if (!keys.any()) {
        delete this._.keys[hash];
    }
    this.length -= 1;
    return true;
};

Here are the functions to check whether a key is present in the dictionary and to get the value of a specific key.

JavaScript

Dictionary.prototype.containsKey = function (key) {
    var hash = this._.eqComparer.getHash(key);
    return dictionaryContainsKey(this, hash, key);
};

Dictionary.prototype.get = function (key) {
    var hash = this._.eqComparer.getHash(key);
    return getPairByKey(this, hash, key, function () {
        throw new Error("Key [" + key + "] was not found in the dictionary.");
    }).value;
};

Here's another nice one, tryGet (TryGetValue in .NET). This function tries to get a value using the specified key. Normally, when you try to get an item using a key that does not exist an error is thrown. However, when using tryGet you don't get an error, you get a boolean indicating whether the key was found and if it was you get the value too. In .NET you get the value in an out parameter, but JavaScript does not have that concept. Instead I'm returning an object containing a success boolean and a value object. When success is true then the value holds the value for that key (which may be undefined), when success is false value will always be undefined. This function is actually the only one where I had to work around a .NET out parameter.

JavaScript

Dictionary.prototype.tryGet = function (key) {
    var hash = this._.eqComparer.getHash(key),
    notFound,
    pair = getPairByKey(this, hash, key, function () {
        notFound = true;
    });
    if (notFound) {
        return {
            success: false,
            value: undefined
        };
    }
    return {
        success: true,
        value: pair.value
    };
};
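
The result object is easiest to appreciate with a small stand-alone sketch. The tryGet below is a simplified stand-in that works on a plain object rather than on an arrgh.js Dictionary (so it's an assumption for illustration, not the library code), but it returns the same success/value shape.

```javascript
// Simplified stand-in for tryGet on a plain object used as a map.
function tryGet(map, key) {
    if (!Object.prototype.hasOwnProperty.call(map, key)) {
        return { success: false, value: undefined };
    }
    return { success: true, value: map[key] };
}

var settings = { theme: "dark", nickname: undefined };

var found = tryGet(settings, "theme");      // key exists, value is "dark"
var empty = tryGet(settings, "nickname");   // key exists, value is undefined
var missing = tryGet(settings, "language"); // key does not exist
```

The empty and missing cases both have an undefined value; only the success flag tells them apart, which is exactly why the flag is returned alongside the value.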

The usage of a Dictionary is, luckily, very easy.

var d = new Dictionary();

var billGates = {
    firstName: "Bill",
    lastName: "Gates"
};
var billClinton = {
    firstName: "Bill",
    lastName: "Clinton"
};

d.add(billGates, "Richest man in the world.");
d.add(billClinton, "Was president of the USA.");

console.log(d.containsKey(billGates));
// Logs "true"

console.log(d.get(billClinton));
// Logs "Was president of the USA."

d.remove(billClinton);
console.log(d.containsKey(billClinton));
// Logs "false"

To actually run this sample you'll need the full implementation of arrgh.js, not just the snippets I've shown so far.

We can now also implement Enumerable.toDictionary.

JavaScript

function identity(x) {
    return x;
}

Enumerable.prototype.toDictionary = function (keySelector, elementSelector, eqComparer) {
    if (typeof arguments[1] === "function") {
        elementSelector = arguments[1];
        eqComparer = arguments[2];
    } else {
        eqComparer = arguments[1];
    }
    elementSelector = elementSelector || identity;
    eqComparer = ensureEqComparer(eqComparer);

    var d = new Dictionary(eqComparer);
    this.forEach(function (elem) {
        d.add(keySelector(elem), elementSelector(elem));
    });
    return d;
};

// Usage.
var names = new Enumerable(["John", "Annie", "Bill", "Sander"]);

// Names by first letter (throws if first letter is not unique).
var d = names.toDictionary(n => n[0]);

// Names as uppercase by first letter.
d = names.toDictionary(n => n[0], n => n.toUpperCase());

// Names as keys, using the first letter as the hash.
d = names.toDictionary(n => n, {
    equals: function (x, y) {
        return x === y;
    },
    getHash: function (obj) {
        return obj[0];
    }
});

// Names as keys (hashed by first letter) and uppercased as value.
d = names.toDictionary(n => n, n => n.toUpperCase(), {
    equals: function (x, y) {
        return x === y;
    },
    getHash: function (obj) {
        return obj[0];
    }
});

The toDictionary function has a few overloads, as elementSelector and eqComparer are both optional. So if the second argument to toDictionary is a function it's the elementSelector, if it's an object it's the eqComparer.
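
If you want to see the overload resolution in isolation, here's a small sketch. resolveOverload is a hypothetical helper that only mirrors the argument shuffling; it isn't part of arrgh.js.

```javascript
// Hypothetical helper that mirrors how the second and third
// arguments of toDictionary are interpreted.
function resolveOverload(second, third) {
    var elementSelector;
    var eqComparer;
    if (typeof second === "function") {
        elementSelector = second;
        eqComparer = third;
    } else {
        eqComparer = second;
    }
    return {
        hasElementSelector: typeof elementSelector === "function",
        hasEqComparer: typeof eqComparer === "object" && eqComparer !== null
    };
}

var upper = function (n) { return n.toUpperCase(); };
var comparer = {
    equals: function (x, y) { return x === y; },
    getHash: function (obj) { return obj[0]; }
};

var none = resolveOverload();                // neither
var onlySelector = resolveOverload(upper);   // elementSelector only
var onlyComparer = resolveOverload(comparer); // eqComparer only
var both = resolveOverload(upper, comparer); // both
```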

arrgh.OrderedEnumerable

You thought the Dictionary was complicated? Then enter a world of pain, the world of the OrderedEnumerable. Have you ever noticed how you can order a collection in .NET using someCollection.OrderBy(...).ThenBy(...).ThenByDescending(...).ToList()? The OrderBy returns an IEnumerable, but not just any IEnumerable, a special IOrderedEnumerable which has the ThenBy and ThenByDescending extension methods. You won't see it on the outside in .NET, but the IOrderedEnumerable actually keeps some internal variables like the collection you called OrderBy or ThenBy on and whether the sorting is ascending or descending. The tricky part is that, ultimately, you're going to enumerate the collection that ThenByDescending returns, but it needs to know about its parent because ThenByDescending needs to apply additional sorting and not overwrite the sorting of its parent. This is actually the only collection in LINQ that does not enumerate over its parent, but rather uses the parent to adjust its own enumeration.

First, I'm going to show you the methods on Enumerable and OrderedEnumerable as well as a comparer function. The comparer function compares two keys and returns a positive number when the first key is bigger than the second, a negative number when the first key is smaller than the second and 0 when both keys are equal. In .NET I've found null to be smaller than anything else, then NaN (in case of double?), then normal sorting like you'd expect. In my implementation I'm treating undefined as smaller than null. This is a fundamental difference with the JavaScript array sort function, which ignores undefined and always places it at the end of the array (you can use a custom comparer, but undefined is still ignored). So, in my implementation, undefined is not ignored and if you put in a custom comparer you can still move undefined to the back of the collection.

JavaScript

function defaultCompare(x, y) {
    if (isNull(x) || isNull(y)) {
        // Treat undefined as smaller than null
        // and both as smaller than anything else.
        var noVal = function (a, b, val) {
            if (a === b) {
                return 0;
            }
            if (a === val && b !== val) {
                return -1;
            }
            if (a !== val && b === val) {
                return 1;
            }
        };
        var eq = noVal(x, y, undefined);

        if (eq === undefined) {
            return noVal(x, y, null);
        }
        return eq;
    }

    // Treat NaN as smaller than anything else
    // except undefined and null.
    if (isActualNaN(x) && isActualNaN(y)) {
        return 0;
    }
    if (isActualNaN(x)) {
        return -1;
    }
    if (isActualNaN(y)) {
        return 1;
    }

    if (x > y) {
        return 1;
    }
    if (x < y) {
        return -1;
    }
    return 0;
}

var OrderedEnumerable = function (source, keySelector, compare, descending) {
    compare = compare || defaultCompare;
    descending = descending ? -1 : 1;
    // ...
};
inherit(OrderedEnumerable, Enumerable);

Enumerable.prototype.orderBy = function (keySelector, compare) {
    return new OrderedEnumerable(this, keySelector, compare, false);
};

Enumerable.prototype.orderByDescending = function (keySelector, compare) {
    return new OrderedEnumerable(this, keySelector, compare, true);
};

OrderedEnumerable.prototype.thenBy = function (keySelector, compare) {
    return new OrderedEnumerable(this, keySelector, compare, false);
};

OrderedEnumerable.prototype.thenByDescending = function (keySelector, compare) {
    return new OrderedEnumerable(this, keySelector, compare, true);
};
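
To see the difference with the native sort, here's a stand-alone sketch. The rank and compareLikeArrgh functions are hypothetical simplifications of defaultCompare (not the library code), and the values are wrapped in objects only so that the native sort actually passes them to the comparer.

```javascript
// Native sort: undefined always ends up last and never reaches the comparer.
var nativeSorted = [3, undefined, 1].sort(function (a, b) {
    return a - b;
});
// nativeSorted is [1, 3, undefined]

// Hypothetical helper: undefined < null < NaN < everything else.
function rank(x) {
    if (x === undefined) { return 0; }
    if (x === null) { return 1; }
    if (typeof x === "number" && x !== x) { return 2; } // NaN
    return 3;
}

// Simplified comparer following the ordering described above.
function compareLikeArrgh(x, y) {
    var rx = rank(x);
    var ry = rank(y);
    if (rx !== ry) { return rx < ry ? -1 : 1; }
    if (rx < 3) { return 0; }
    return x > y ? 1 : (x < y ? -1 : 0);
}

// Wrap the values so undefined is not skipped by the native sort.
var wrapped = [3, null, undefined, NaN, 1].map(function (v) {
    return { v: v };
});
wrapped.sort(function (a, b) {
    return compareLikeArrgh(a.v, b.v);
});
var ordered = wrapped.map(function (w) {
    return w.v;
});
// ordered is [undefined, null, NaN, 1, 3]
```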

Using only this fairly easy code, save for the big bloated defaultCompare function, we have everything we need to do the actual sorting. The prototype functions are the only way for a user to get a reference to an OrderedEnumerable as the constructor is not exposed. The sorting uses a quicksort algorithm. This isn't an article about algorithms, but let me give you the basics. In a collection we take a so-called pivot value, ideally the middle value. We now take two counters, one that starts at 0 and one that starts at the last index of the collection (length - 1). We compare the element at the first counter to the pivot using the compare function. If the element is smaller than the pivot we move on to the next element and do the same; if it is bigger than or equal to the pivot we stay on this index and move to the second loop, which starts at length - 1. The second loop does the same in the other direction: it moves towards the start as long as the element is bigger than the pivot and stops at the first element that is smaller than or equal to the pivot. After that we swap the two values and continue. Once the two counters have passed each other we recursively do the same for the values on the left side and the values on the right side. Furthermore, quicksort is an in-place algorithm, meaning it alters the current collection rather than creating a new one and keeping the input intact. Here's a slightly awkward visual representation of these steps.

2, 5, 3, 4, 1
^     p     -

2, 5, 3, 4, 1
   ^  p     -

2, 5, 3, 4, 1
   -  p     ^

2, 1, 3, 4, 5
   *        *

2, 1, 3, 4, 5
p  ^

1, 2, 3, 4, 5
*

1, 2, 3, 4, 5
         p  ^

1, 2, 3, 4, 5

sorted.
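
The steps above can be sketched in a few lines of plain JavaScript. This is a bare-bones, unstable quicksort on numbers that I've written purely for illustration; it is not the function arrgh.js uses.

```javascript
// Minimal in-place quicksort on an array of numbers (illustration only).
function quicksort(arr, start, end) {
    var low = start;
    var high = end;
    var pivot = arr[Math.floor((start + end) / 2)];
    var temp;

    while (low <= high) {
        // Move right while the element is smaller than the pivot.
        while (arr[low] < pivot) {
            low += 1;
        }
        // Move left while the element is bigger than the pivot.
        while (arr[high] > pivot) {
            high -= 1;
        }
        // Swap the two elements and move both counters on.
        if (low <= high) {
            temp = arr[low];
            arr[low] = arr[high];
            arr[high] = temp;
            low += 1;
            high -= 1;
        }
    }

    // Recursively sort the left and right parts.
    if (start < high) {
        quicksort(arr, start, high);
    }
    if (low < end) {
        quicksort(arr, low, end);
    }
}

var numbers = [2, 5, 3, 4, 1];
quicksort(numbers, 0, numbers.length - 1);
// numbers is now [1, 2, 3, 4, 5]
```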

To make it even more complicated, the quicksort algorithm has one drawback: it's not stable. That means that if two elements are equal they may still be swapped, losing their relative order in the collection. So if the input collection has Bill Clinton and Bill Gates, in that order, and we sort by first name, then the output collection may have switched their ordering to Bill Gates and Bill Clinton. In many scenarios this is not a problem, but .NET implements OrderBy and ThenBy with a stable quicksort. Luckily we can solve this relatively easily. Instead of comparing the actual elements of a collection we're going to order a list of indexes. Each index maps to an element, and if the elements of two indexes are equal we compare the indexes instead. Here is the implementation of the stable quicksort.

function stableQuicksort(map, startIndex, endIndex, compare) {
    var low = startIndex,
    high = endIndex,
    pindex = Math.floor((low + high) / 2),
    pivot = map[pindex],
    lindex,
    hindex,
    result,
    temp;

    while (low <= high) {
        lindex = map[low];
        result = compare(lindex, pivot);
        // First loop, going from start to pivot.
        while (result < 0 || (result === 0 && lindex < pivot)) {
            low += 1;
            lindex = map[low];
            result = compare(lindex, pivot);
        }

        hindex = map[high];
        result = compare(hindex, pivot);
        // Second loop, going from end to pivot.
        while (result > 0 || (result === 0 && hindex > pivot)) {
            high -= 1;
            hindex = map[high];
            result = compare(hindex, pivot);
        }

        // Swap elements.
        if (low <= high) {
            temp = map[low];
            map[low] = map[high];
            map[high] = temp;
            low += 1;
            high -= 1;
        }
    }

    // Recursively sort collection left and right of the pivot.
    if (low < endIndex) {
        stableQuicksort(map, low, endIndex, compare);
    }
    if (high > startIndex) {
        stableQuicksort(map, startIndex, high, compare);
    }
}
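
The index trick doesn't depend on quicksort. In the following sketch I use the native array sort as a stand-in for stableQuicksort (an assumption made to keep the example short) and fall back to the index whenever two first names are equal.

```javascript
var people = [
    { first: "Bill", last: "Clinton" },
    { first: "Anna", last: "Smith" },
    { first: "Bill", last: "Gates" }
];

// The map holds the indexes 0, 1, 2 and is the thing we actually sort.
var map = [];
var i;
for (i = 0; i < people.length; i += 1) {
    map.push(i);
}

map.sort(function (x, y) {
    var a = people[x].first;
    var b = people[y].first;
    var result = a < b ? -1 : (a > b ? 1 : 0);
    // Equal keys: compare the indexes to keep the original order.
    return result !== 0 ? result : x - y;
});
// map is [1, 0, 2]: Anna Smith, Bill Clinton, Bill Gates
```

Because two indexes are never equal, the comparer never returns 0 for two different elements, so the outcome no longer depends on whether the underlying sort is stable.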

And now, the moment you've all been waiting for, the implementation of the OrderedEnumerable.

JavaScript

var OrderedEnumerable = function (source, keySelector, compare, descending) {
    var self = this;
    var keys;
    compare = compare || defaultCompare;
    descending = descending ? -1 : 1;

    self.getSource = function () {
        if (source.getSource) {
            return source.getSource();
        }
        return source;
    };

    self.computeKeys = function (elements, count) {
        var arr = new Array(count);
        var i;
        for (i = 0; i < count; i += 1) {
            arr[i] = keySelector(elements[i]);
        }
        keys = arr;
        if (source.computeKeys) {
            source.computeKeys(elements, count);
        }
    };
    self.compareKeys = function (i, j) {
        var result = 0;
        if (source.compareKeys) {
            result = source.compareKeys(i, j);
        }
        if (result === 0) {
            result = compare(keys[i], keys[j]) * descending;
        }
        return result;
    };
    Enumerable.call(this, function () {
        var sourceArr = self.getSource().toArray();
        var count = sourceArr.length;
        var map = new Array(count);
        var index;
        self.computeKeys(sourceArr, count);
        for (index = 0; index < count; index += 1) {
            map[index] = index;
        }
        stableQuicksort(map, 0, count - 1, self.compareKeys);
        index = -1;
        return new Iterator(function () {
            index += 1;
            return index < count;
        }, function () {
            return sourceArr[map[index]];
        });
    });
};

I'm going to admit right away that this cost me a couple of tries and a whole lot of time. So let's go through it step by step. Since we need to sort the entire source collection before we can enumerate, the getIterator function first sorts everything and then returns a rather small Iterator.

First, getIterator uses getSource to evaluate the collection that needs to be sorted (this could be the result of a where, a select, etc.) and converts it to an array. The getSource function returns the first source that is not an OrderedEnumerable (tested on the presence of a getSource function). So someCollection.where(...).orderBy(...).thenBy(...).thenByDescending().getIterator() will sort the result of the where function.

Next, we're going to compute the keys, the values that need to be sorted. We do this only once (and always once, even if we never need them). So suppose we need to sort a collection of people by firstName, then keys is now an array containing "John", "Bill", "Steve", etc.

Then we create the map, that is the indexes we're going to sort. Remember we need the indexes to do a stable sort. We then pass the map, the entire range of the collection (0 to length - 1) and the compareKeys function to the stableQuicksort function which works its magic.

The compareKeys function does the actual comparing and returns a positive integer, a negative integer or 0. The nice part is that if the source contains a compareKeys function it uses that function. Only when the source's compareKeys returns 0 does the current function compare its keys. So in case of someCollection.orderBy(p => p.firstName).thenBy(p => p.lastName).toArray(); the lastName of two elements is only compared when the firstName of those elements is equal. Keep in mind we're comparing indexes, so we need to get the actual value from the keys array.
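
The chaining is perhaps easier to follow without the indexes and the keys array. Here's a simplified sketch in which thenCompare (a hypothetical helper, not part of arrgh.js) wraps a parent comparer and only compares its own key when the parent reports 0.

```javascript
// Hypothetical helper: the parent comparer goes first,
// our own key is only compared when the parent returns 0.
function thenCompare(parent, keySelector, descending) {
    var direction = descending ? -1 : 1;
    return function (x, y) {
        var result = parent ? parent(x, y) : 0;
        var kx, ky;
        if (result === 0) {
            kx = keySelector(x);
            ky = keySelector(y);
            result = (kx > ky ? 1 : (kx < ky ? -1 : 0)) * direction;
        }
        return result;
    };
}

var people = [
    { first: "John", last: "Smith" },
    { first: "Bill", last: "Gates" },
    { first: "Bill", last: "Clinton" }
];

var byFirst = thenCompare(null, function (p) { return p.first; }, false);
var byFirstThenLast = thenCompare(byFirst, function (p) { return p.last; }, false);
var byFirstThenLastDesc = thenCompare(byFirst, function (p) { return p.last; }, true);

function fullName(p) {
    return p.first + " " + p.last;
}

var ascending = people.slice().sort(byFirstThenLast).map(fullName);
// Bill Clinton, Bill Gates, John Smith
var descending = people.slice().sort(byFirstThenLastDesc).map(fullName);
// Bill Gates, Bill Clinton, John Smith
```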

The stableQuicksort moves the indexes around based on the keys they map to. That means the map is sorted, but the source is not. So in the Iterator, using the current index, we look up the index of the element in the map and use that to get the element from sourceArr. Here's a little example.

var sourceArr = [2, 5, 3, 4, 1];
var sourceArrComparer = function (x, y) {
    return defaultCompare(sourceArr[x], sourceArr[y]);
};
var map = [0, 1, 2, 3, 4];
stableQuicksort(map, 0, 4, sourceArrComparer);
console.log(map); // [4, 0, 2, 3, 1]
console.log(sourceArr[map[0]]); // 1
console.log(sourceArr[map[1]]); // 2
// etc.

And with this stableQuicksort function we can also write the sort function on the List (which is an in-place sort). The List sort allows you to sort an entire List or only a part of a List, but we've got that all covered. So you can study that code at your own leisure.

arrgh.Lookup

The Lookup is not very complicated, but also not very pretty (I've decided to put it all in a big function so it has no additional prototype functions). A lookup is basically a collection of collections where each collection has a key that groups them together. Internally it uses a Dictionary (which is basically also implemented as a lookup, come to think of it).

JavaScript

var Lookup = function (source, keySelector, elementSelector, eqComparer) {
    var d;
    Enumerable.call(this, function () {
        var iterator = d.getIterator();
        return new Iterator(iterator.moveNext, function () {
            var current = iterator.current();
            if (isNull(current)) {
                return current;
            }
            var group = current.value.asEnumerable();
            group.key = current.key;
            return group;
        });
    });

    // The elementSelector is optional,
    // putting the eqComparer at argument[2].
    if (typeof elementSelector !== "function") {
        eqComparer = elementSelector;
        elementSelector = null;
    }
    elementSelector = elementSelector || identity;

    d = new Dictionary(eqComparer);
    source.forEach(function (elem) {
        var key = keySelector(elem);
        var element = elementSelector(elem);
        if (d.containsKey(key)) {
            d.get(key).add(element);
        } else {
            d.add(key, new List([element]));
        }
    });

    this.length = d.length;
    this.get = function (key) {
        var group;
        if (d.containsKey(key)) {
            group = d.get(key).asEnumerable();
            group.key = key;
        } else {
            group = new Enumerable();
            group.key = key;
        }
        return group;
    };
};
inherit(Lookup, Enumerable);

As you can see the source is iterated and each value with a certain key is added to a List that is associated with that key. It's basically the keys object in a Dictionary, except that the key in this case is explicitly not a hash. In the get function we see that the List with a specified key is returned as an Enumerable (making it read-only) and the key is added to the Enumerable. When a key is not present an empty Enumerable is returned with the specified key attached to it. When enumerating we do pretty much the same.
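
The grouping idea itself fits in a few lines. The sketch below is a simplified stand-in that uses a plain object and arrays instead of a Dictionary and Lists (toSimpleLookup is a hypothetical name), but it shows the same behavior: groups carry their key and an unknown key yields an empty group.

```javascript
// Simplified stand-in for Lookup: groups are stored in a plain object.
function toSimpleLookup(source, keySelector) {
    var groups = {};
    var i;
    var key;
    for (i = 0; i < source.length; i += 1) {
        key = keySelector(source[i]);
        if (!Object.prototype.hasOwnProperty.call(groups, key)) {
            groups[key] = [];
        }
        groups[key].push(source[i]);
    }
    return {
        get: function (key) {
            // Return a copy so callers can't alter the stored group,
            // or an empty group when the key is unknown.
            var exists = Object.prototype.hasOwnProperty.call(groups, key);
            var group = exists ? groups[key].slice() : [];
            group.key = key;
            return group;
        }
    };
}

var lookup = toSimpleLookup(["Bianca", "John", "Bill", "Annie", "Barney"], function (n) {
    return n.charAt(0);
});

var bNames = lookup.get("B"); // Bianca, Bill, Barney with key "B"
var xNames = lookup.get("X"); // empty group with key "X"
```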

As with the OrderedEnumerable the only method for a user to get a Lookup is by using the toLookup function on an Enumerable.

JavaScript

Enumerable.prototype.toLookup = function (keySelector, elementSelector, eqComparer) {
    if (typeof arguments[1] === "function") {
        elementSelector = arguments[1];
        eqComparer = arguments[2];
    } else {
        eqComparer = arguments[1];
    }
    elementSelector = elementSelector || identity;
    eqComparer = ensureEqComparer(eqComparer);

    return new Lookup(this, keySelector, elementSelector, eqComparer);
};

// Usage.
var names = new Enumerable(["Bianca", "John", "Bill", "Annie", "Barney"]);

// Names grouped by first letter.
var l = names.toLookup(n => n[0]);

// Names as uppercase by first letter.
l = names.toLookup(n => n[0], n => n.toUpperCase());

// Names as keys, using the first letter as the hash.
l = names.toLookup(n => n, {
    equals: function (x, y) {
        return x === y;
    },
    getHash: function (obj) {
        return obj[0];
    }
});

// Names grouped by first letter through the comparer and uppercased as value.
l = names.toLookup(n => n, n => n.toUpperCase(), {
    equals: function (x, y) {
        return x[0] === y[0];
    },
    getHash: function (obj) {
        return obj[0];
    }
});

arrgh

Unfortunately, I can't show you all there is to arrgh.js in just one article. The Enumerable class already has 55 functions of which most have at least one overload. However, I have shown you all classes in arrgh.js as well as some functions. The full documentation should help you get on your way. You might want to check a certain function in .NET, chances are arrgh.js works the same.

So a few remarks. In most cases I don't check for parameter types. Everything works as documented, but if you pass a string where an object is expected who knows what will happen. The upside to this approach is that I can omit over 100 type checks which really helps in keeping the code small and fast. The downside, of course, is that a function may produce an error or, worse, incorrect results, and you might never know (of course you do, you tested your code).

Also, I thought it might be useful to mention what arrgh.js exposes.

JavaScript

var arrgh = (function (undefined, MAX_SAFE_INTEGER) {
    "use strict";

    // Lots of code here...

    return {
        Enumerable: Enumerable,
        Dictionary: Dictionary,
        Iterator: Iterator,
        List: List
    };
}(undefined, Number.MAX_SAFE_INTEGER || 9007199254740991));

The Number.MAX_SAFE_INTEGER is necessary for the Enumerable.range which has an upper limit of, you guessed it, Number.MAX_SAFE_INTEGER (which is not supported in older browsers, hence the literal value).
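
In case you're wondering what's so special about that number, here's a quick sketch. Above it, JavaScript numbers can no longer represent every integer exactly.

```javascript
// Same fallback as in the snippet above.
var MAX_SAFE_INTEGER = Number.MAX_SAFE_INTEGER || 9007199254740991;

// The literal is 2^53 - 1.
var isTwoToThe53MinusOne = MAX_SAFE_INTEGER === Math.pow(2, 53) - 1;

// Beyond the limit two different additions give the same result.
var precisionLost = MAX_SAFE_INTEGER + 1 === MAX_SAFE_INTEGER + 2;
```

Both variables are true, which is why counting past this number in a range would silently go wrong.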

Testing with Jasmine

arrgh.js is a pretty well tested library if I dare say so myself. It currently counts 779 tests that ensure correct and consistent results in IE8, IE11, Firefox and Chrome (I'm on a Win7 machine, so no Edge or Safari, but I don't see why they shouldn't work). It would be silly to walk you through all those tests, but it's pretty useful to show you how they work and how to get them to work for you. For the remainder of this article we'll need Node.js and npm, so head over to the Node.js website, download the installer and run it. npm, the package manager for Node.js, is installed along with it.

When you're done installing open up a command prompt and navigate to the arrgh.js root folder. Once there, create a package.json, either manually (create a file called package.json and put {} in it) or by typing npm init in the command prompt. Once you've got a package.json install Jasmine, the testing framework of choice, using npm install jasmine --save-dev.

cd C:\arrgh.js
npm install jasmine --save-dev

npm will create a node_modules folder and install everything that's necessary for Jasmine to run. Additionally, --save-dev will create a developer dependency in your package.json. That means you don't have to check in your packages to source control, but you can easily restore all your dependencies using npm install.

Now let's write some tests! Create a folder called test in your root. We now need two things: tests, and a page that shows us the test results. Let's first create the page for the test results as it's pretty easy. In the test folder create an html file, I've called it index.browser.html (there is also an index.html file in my project, but it's for an automated Jenkins build environment, which I'll not cover in this article). In the html file paste the following HTML.

<!doctype html>
<html>
<head>
  <title>arrgh.js tests</title>
  <link rel="shortcut icon" type="image/png" href="../node_modules/jasmine-core/images/jasmine_favicon.png">
  <link rel="stylesheet" href="../node_modules/jasmine-core/lib/jasmine-core/jasmine.css">
</head>
<body>
  <script src="../node_modules/jasmine-core/lib/jasmine-core/jasmine.js"></script>
  <script src="../node_modules/jasmine-core/lib/jasmine-core/jasmine-html.js"></script>
  <script src="../node_modules/jasmine-core/lib/jasmine-core/boot.js"></script>

  <script src="../src/arrgh.js"></script> <!-- Path to your arrgh.js file -->

  <script src="spec/tests.js"></script>
</body>
</html>

For this example I'll use a single tests.js, but I have actually divided my tests into separate files. Ultimately tests.js is going to run them all.

Now, in the test folder, create a folder called spec. This is where your actual tests are going to be. In that folder create the file tests.js. Writing a test is now easy as pie. Jasmine is a Behavior-Driven Development (BDD) framework, meaning we're going to describe what a test should do and then do it.

describe("arrgh.Enumerable", function () {
    describe("toArray", function () {
        it("should produce an array containing some elements", function () {
            var e = new arrgh.Enumerable([1, 2, 3, 4, 5]);
            expect(e.toArray()).toEqual([1, 2, 3, 4, 5]);
        });

        it("should produce an empty array", function () {
            var e = new arrgh.Enumerable();
            expect(e.toArray()).toEqual([]);
        });
    });

    describe("count", function () {
        it("should have the count of the initial array", function () {
            var e = new arrgh.Enumerable([1, 2, 3, 4, 5]);
            expect(e.count()).toBe(5);
        });
    });
});

describe does exactly what it says: it describes what's about to come. You can nest it as much as you want. Somewhere in your describes you're going to have an it. Inside the it you write your test and compare your actual outcome to your expected value, using expect and a matcher such as toEqual or toBe. toEqual compares reference types such as objects and arrays by their contents (without them having to be the same reference). toBe checks for identity, so it's meant for value types such as integers and booleans. You could use toEqual instead of toBe, but not the other way around.
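
The difference between the two matchers comes down to identity versus contents. The following sketch shows that in plain JavaScript, without Jasmine; deepEqual is a hypothetical and heavily simplified comparison that I wrote for illustration, it is not how Jasmine implements toEqual.

```javascript
// Hypothetical, heavily simplified deep comparison
// for arrays and plain objects (illustration only).
function deepEqual(a, b) {
    var key;
    if (a === b) {
        return true;
    }
    if (typeof a !== "object" || typeof b !== "object" || a === null || b === null) {
        return false;
    }
    if (Object.keys(a).length !== Object.keys(b).length) {
        return false;
    }
    for (key in a) {
        if (Object.prototype.hasOwnProperty.call(a, key)) {
            if (!Object.prototype.hasOwnProperty.call(b, key) || !deepEqual(a[key], b[key])) {
                return false;
            }
        }
    }
    return true;
}

var first = [1, 2, 3];
var second = [1, 2, 3];

var sameReference = first === second;        // false, roughly what toBe checks
var sameContents = deepEqual(first, second); // true, roughly what toEqual checks
var sameNumber = 5 === 5;                    // true, value types are fine with toBe
```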

If you open the html page we created earlier you should now see the results of your tests. Try failing a test and see what happens.

As I said, I have separated my tests in different files. In tests.js I declare some global variables to be used in my other tests (actually, Jasmine has a better solution for this, but somehow it was flaky across browsers, so I went with the globals). The separation is based on the type of collection, iterators and some tests that group well together, such as all joins.

So, in a nutshell, here's tests.js.

JavaScript

var p0 = {
    first: "Sander",
    last: "Rossel",
    age: 28,
    hobbies: ["Programming", "Gaming", "Music"]
};

var p1 = {
    first: "Bill",
    last: "Murray",
    age: 58,
    hobbies: ["Hiking", "Travelling"]
};

// Other globals...

(function () {
    "use strict";
    describe("arrgh.js tests", function () {
        testEnumerable();
        // Other tests...
    });
}());

And then in test-Enumerable.js.

var testEnumerable = function () {
    "use strict";

    describe("Enumerable", function () {
        // A whole lot of other tests...

        describe("contains", function () {
            it("should return true when the collection contains the object", function () {
                var e = new arrgh.Enumerable(people);
                expect(e.contains(p3)).toBe(true);
            });
            // A whole lot of other tests...
        });
        // Even more tests...
    });
};

I should mention that I haven't checked the actual ages and hobbies of the (famous) people in the test set.

Documentation using JSDoc

The next thing we'll do is generate some documentation with JSDoc. You can install JSDoc using npm. Next to installing it in the current project we're also going to install it globally so we can use the CLI (Command Line Interface).

npm install jsdoc --save-dev
npm install jsdoc -g

JSDoc works with annotations in comments. You can simply put comments in any document, run it through JSDoc and end up with some nice documentation. The comment has to be a block comment that starts with /** (so /** ... */), as // comments are ignored. The annotations start with @. I've chosen to comment the functions directly in my source code. Don't worry about the size of the file as we'll remove all comments (except one) later when we minify the source.

We'll start with documenting the global namespace.

/**
 * Contains all collection classes used by arrgh.js.
 * @namespace arrgh
 */
var arrgh = (function (undefined, MAX_SAFE_INTEGER) {

After that we can document types within that namespace.

JavaScript

/**
 * Represents a list of objects that can be accessed by index. Provides methods to manipulate the list.
 * @memberof arrgh
 * @constructor
 * @extends arrgh.Enumerable
 * @param {(Array|String|arrgh.Enumerable|params)} [enumerable=[]] - An array, string or enumerable whose elements are copied to the new list.
 */
var List = function (enumerable) {

@memberof, @constructor, and @extends are pretty self-explanatory. The default format of @param is @param {type} name - description. For an optional parameter put the name between brackets, like [name]. For a default value add =default, like [enumerable=[]] (default is an empty array). When a function accepts a parameter of multiple types you can simply list them like in the example, {(type1|type2|etc.)}. An asterisk indicates any type is accepted.
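
Here's a small stand-alone function that combines those pieces: a parameter of multiple types, an optional parameter with a default and a return type. The repeat function is a made-up example and not part of arrgh.js.

```javascript
/**
 * Repeats a value a number of times (made-up example, not part of arrgh.js).
 * @param {(String|Number)} value - The value to repeat.
 * @param {Number} [count=2] - How many times the value is repeated.
 * @returns {Array} - An array containing the value count times.
 */
function repeat(value, count) {
    var result = [];
    var i;
    // Apply the documented default when count is omitted.
    count = count === undefined ? 2 : count;
    for (i = 0; i < count; i += 1) {
        result.push(value);
    }
    return result;
}

var twice = repeat("a");   // ["a", "a"]
var thrice = repeat(1, 3); // [1, 1, 1]
```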

Now that we have a constructor we can document functions for it.

JavaScript

/**
 * Gets the item at the specified index.
 * @function get
 * @memberof arrgh.List
 * @instance
 * @param {Number} index - The index at which the item should be retrieved.
 * @returns {*} - Returns the item at the specified index.
 * @throws Throws an error when the index is smaller than zero, or equal to or greater than the length of the collection.
 */
List.prototype.get = function (index) {

It's also possible to document global types, like input functions, callbacks, or objects like the equality comparer.

JavaScript

/**
 * A function that tests if two elements are equal.
 * @callback equals
 * @param {*} x - The element to test for equality.
 * @param {*} y - The element to test on.
 * @returns {Boolean} - Return whether the elements are equal.
 */

/**
 * Returns a hash code for the specified object.
 * @callback getHash
 * @param {*} obj - The object for which a hash code is to be returned.
 * @returns {String} - A hash code for the specified object.
 */

/**
 * Defines methods to support the comparison of objects for equality.
 * @name equalityComparer
 * @type {Object}
 * @property {equals} [equals=(===)] - A function that tests if two elements are equal.
 * @property {getHash} [getHash=getHash() || toString()] - A function that computes an element's hash code.
 */

/**
 * Represents a collection of keys each mapped to one or more values.
 * @memberof arrgh
 * @private
 * // ...
 * @param {equalityComparer} [eqComparer=(===)] - An object that tests if two keys are equal.
 */
var Lookup = function (source, keySelector, elementSelector, eqComparer) {

Let's generate some documentation. Open up the command prompt and simply type jsdoc myFile.js. JSDoc should create an out folder with the generated documentation. The documentation should now look something like this.

You can style the output and write it to specific folders using CLI parameters, but we're going to do that in a moment using Gulp.

Automation with Gulp

Next up we're going to automate some stuff. Every time I save a file, source or test, I want to lint my JavaScript, run tests, generate documentation, minify code and whatever else my heart desires. And if I can automate that I also want to be able to run it all at once with a single command. Enter Gulp.

Gulp is a build automation tool. With Gulp we take some input, run it through some task and pass the output as input to the next task. Finally, we write the final output to some destination, like a file or the console.

We're going to install Gulp in our project and globally so we can make easy use of the CLI again.

npm install gulp --save-dev
npm install gulp -g

Next, we're going to create a file in the root folder of our project and call it gulpfile.js. Gulp, on its own, doesn't do much. We're going to need a couple of plugins. Let's start by minifying our source file. To do this we'll first need the gulp-minify plugin.

npm install gulp-minify --save-dev

Now that we have Gulp and the minify plugin we can actually put some useful code in the gulpfile.

JavaScript

var gulp = require('gulp');
var minify = require('gulp-minify');

gulp.task('minify', function () {
    return gulp.src('src/*.js')
    .pipe(minify({
        ext: {
            src: '.debug.js',
            min: '.js'
        },
        preserveComments: 'some'
    }))
    .pipe(gulp.dest('dist'));
});

gulp.task('default', function () {
    gulp.start('minify');
});

This may look familiar if you've done some work in Node.js before. We start off by requiring gulp and gulp-minify. These two lines of code will load the packages into our runtime (from the node_modules folder). Next, we're creating a task in gulp called minify. We're getting the source using gulp.src and pipe it to the minify module. We can pass a config object to the minify module specifying the extensions for our source file and our minified file. The minified file is going to be called arrgh.js and the source file arrgh.debug.js. We're preserving some comments (being the license at the top). The result is piped again and written to the dist folder.

In the default task we're running the minify task.

Open up the command prompt, navigate to your project folder and now simply run gulp. This will run the default task. You can also run a specific task by specifying its name.

gulp
gulp minify

Either of those will run the gulpfile and minify our source.

For subsequent runs we want to delete all former build files and start fresh. We also want to lint our JavaScript. We'll need some more plugins.

npm install gulp-clean --save-dev
npm install jshint --save-dev
npm install gulp-jshint --save-dev

The gulpfile now looks as follows.

JavaScript

var gulp = require('gulp');
var minify = require('gulp-minify');
var clean = require('gulp-clean');
var jshint = require('gulp-jshint');

gulp.task('clean', function () {
    return gulp.src([
        'dist/'
    ], { read: false })
    .pipe(clean());
})
.task('minify', ['clean'], function () {
    return gulp.src('src/*.js')
    .pipe(minify({
        ext: {
            src: '.debug.js',
            min: '.js'
        },
        preserveComments: 'some'
    }))
    .pipe(gulp.dest('dist'));
})
.task('lint', function () {
    return gulp.src('src/*.js')
    .pipe(jshint('jshint.conf.json'))
    .pipe(jshint.reporter('default'));
});

gulp.task('default', function () {
    gulp.start(['minify', 'lint']);
});

The minify task now depends on the clean task. We can't minify until we've cleaned our old stuff. The default task will now run minify (which will run clean) and lint. The lint task makes use of jshint which can take a json file as parameter. That's really cool as we can now configure jshint in an external file and keep our gulpfile clean. So create a jshint.conf.json file in your project folder. You can find all kinds of configuration options in the jshint documentation. Here's my config file.

JavaScript

{
    "bitwise": true,
    "curly": true,
    "eqeqeq": true,
    "esversion": 3,
    "forin": true,
    "freeze": true,
    "futurehostile": true,
    "latedef": true,
    "nocomma": true,
    "nonbsp": true,
    "nonew": true,
    "notypeof": true,
    "strict": true,
    "undef": true,
    "unused": true
}
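To give a feel for what a few of these options ask for, here is a small function written with them in mind. It is a sketch for illustration (countOwnKeys is a made-up example, not code from arrgh.js), and the comments state my understanding of each option.

```javascript
function countOwnKeys(obj) {
    "use strict"; // "strict": true asks for a function-level "use strict" directive
    var count = 0;
    var key;
    for (key in obj) {
        // "forin": true asks for for-in loops to filter out inherited properties
        if (Object.prototype.hasOwnProperty.call(obj, key)) {
            count += 1;
        }
    }
    // "eqeqeq": true asks for === instead of ==,
    // "curly": true asks for braces even around a single statement
    if (count === 0) {
        return 0;
    }
    return count;
}

var ownKeys = countOwnKeys({ a: 1, b: 2 });
// ownKeys is 2
```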

Next is the JSDoc task. For arrgh.js I've also installed another template as I didn't like the default very much. The template is called jaguarjs.

npm install gulp-jsdoc3 --save-dev
npm install jaguarjs-jsdoc --save-dev

The gulp-jsdoc3 plugin also makes use of an external config file. It tells JSDoc where to write to, what template to use, and you can also configure your template if the template supports it. In the published version I've also got a README.md (that's also used by GitHub) which I've included in this config file and is written to the index page. We don't have it here so I have omitted it.

JavaScript

{
    "tags": {
        "allowUnknownTags": true,
        "dictionaries": ["jsdoc"]
    },
    "templates": {
        "applicationName": "arrgh.js",
        "meta": {
            "title": "arrgh.js",
            "description": "A lightweight JavaScript library that brings proper .NET-like collections and LINQ to the browser.",
            "keyword": "JavaScript, LINQ, collections, Array"
        }
    },
    "opts": {
        "destination": "docs",
        "private": true,
        "template": "node_modules/jaguarjs-jsdoc"
    }
}

The funny thing with the JSDoc plugin is that it's a bit of an anti-pattern in Gulp. It doesn't transform the files it receives, it just takes input, writes documentation as a side effect and passes the input on unchanged to the next job. No matter though, we can still create a task to generate our documentation.

JavaScript

var gulp = require('gulp');
var minify = require('gulp-minify');
var clean = require('gulp-clean');
var jshint = require('gulp-jshint');
var jsdoc = require('gulp-jsdoc3');

gulp.task('clean', function () {
    return gulp.src([
        'dist/',
        'docs/'
    ], { read: false })
    .pipe(clean());
})
.task('minify', ['clean'], function () {
    return gulp.src('src/*.js')
    .pipe(minify({
        ext: {
            src: '.debug.js',
            min: '.js'
        },
        preserveComments: 'some'
    }))
    .pipe(gulp.dest('dist'));
})
.task('lint', function () {
    return gulp.src('src/*.js')
    .pipe(jshint('jshint.conf.json'))
    .pipe(jshint.reporter('default'));
})
.task('jsdoc', ['clean'], function () {
    return gulp.src('src/*.js')
    .pipe(jsdoc(require('./jsdoc.conf.json')));
});

gulp.task('default', function () {
    gulp.start(['minify', 'lint', 'jsdoc']);
});
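Both minify and jsdoc list clean as a dependency, while the default task starts all three tasks. The toy runner below sketches the ordering this implies: dependencies run first, and a shared dependency runs only once per start. It is a simplified model under that assumption, not Gulp's implementation (the real thing also runs independent tasks concurrently), and task, start and log are names made up for this sketch.

```javascript
// Toy task runner (hypothetical, not Gulp's implementation).
var tasks = {};
var log = [];

function task(name, deps, fn) {
    tasks[name] = { deps: deps, fn: fn };
}

// Runs the named tasks; every task runs after its dependencies
// and at most once per call to start.
function start(names) {
    var done = {};
    function run(name) {
        if (done[name]) {
            return;
        }
        done[name] = true;
        tasks[name].deps.forEach(run);
        tasks[name].fn();
    }
    names.forEach(run);
}

task('clean', [], function () { log.push('clean'); });
task('minify', ['clean'], function () { log.push('minify'); });
task('lint', [], function () { log.push('lint'); });
task('jsdoc', ['clean'], function () { log.push('jsdoc'); });

start(['minify', 'lint', 'jsdoc']);
// log is now ['clean', 'minify', 'lint', 'jsdoc']
```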

Next, we want to do all of these tasks automatically when we make a change to arrgh.js. We can do this using gulp.watch.

gulp.watch(['src/*.js', '*.conf.json'], ['minify', 'lint', 'jsdoc']);

gulp.task('default', function () {
    gulp.start(['minify', 'lint', 'jsdoc']);
});

Now whenever something changes in our source or in a config file Gulp will run the minify, lint and jsdoc tasks. This time, when you run gulp, you'll notice it won't terminate like it always did. That's because it's now watching your files. To stop it, press Ctrl+C in the command prompt.

My gulpfile does not currently look like that (although it did at one point), but you should now know how Gulp works and how to create and run tasks.

Karma

We're missing just one thing, automated testing! Unfortunately, using Jasmine alone we can't automate anything. We'll need a test runner. There are a few out there, but I've chosen Karma. We'll start by installing Karma. We'll need a lot of plugins (again), so get ready to install Karma, the Jasmine plugin and some browser launchers (install the browser launchers that apply to you).

npm install karma --save-dev
npm install karma-jasmine --save-dev
npm install karma-chrome-launcher --save-dev
npm install karma-firefox-launcher --save-dev
npm install karma-ie-launcher --save-dev

Here's the Karma part in the gulpfile.

var karma = require('karma').Server;
// Code...
.task('test', function (done) {
    new karma({
        configFile: __dirname + '/karma.conf.js',
    }, function (err) {
        if (err > 0) {
            return done(err);
        }
        return done();
    }).start();
});
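The callback in the test task receives Karma's exit code, a number, rather than an Error object. If you would rather hand a real Error to done, a small conversion could look like the sketch below. The exitCodeToError helper is hypothetical and not part of Karma or Gulp.

```javascript
// Hypothetical helper: turns a numeric exit code into the argument for done.
function exitCodeToError(exitCode) {
    if (exitCode > 0) {
        return new Error('Karma exited with code ' + exitCode);
    }
    // Zero means success, so there is nothing to report.
    return undefined;
}

var failure = exitCodeToError(1);
// failure.message is 'Karma exited with code 1'
```

Inside the callback this would be used as done(exitCodeToError(err)).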

Unfortunately, when err > 0 Karma (or Node.js) shows a really ugly and useless stack trace in the console. It can be helped using gulp-util, but I'm not going into that in this article. Once again we have an external config file.

module.exports = function(config) {
    config.set({
        frameworks: ['jasmine'],
        files: [
            'src/*.js',
            'test/spec/*.js'
        ],
        reporters: ['progress'],
        port: 9876,
        autoWatch: true,
        browsers: ['Chrome'], // And/or Firefox or IE...
        singleRun: true
    });
};

If you want to test every time one of the files changes set singleRun to false.

Now when you run gulp test from your console you should see a browser starting up, doing some Karma stuff, closing, and then have the result in your console.

Let's add some code coverage, for what use is a test if you don't know what it's covering.

npm install karma-coverage --save-dev

You'll only need to change your config file.

JavaScript

module.exports = function(config) {
    config.set({
        frameworks: ['jasmine'],
        files: [
            'src/*.js',
            'test/spec/*.js'
        ],
        preprocessors: {
            'src/*.js': ['coverage']
        },
        reporters: ['progress', 'coverage'],
        port: 9876,
        autoWatch: true,
        browsers: ['Chrome'],
        singleRun: true,
        coverageReporter: {
            reporters: [
                { type : 'html', subdir: 'html' }
            ],
            dir : 'test/coverage/',
            check: {
                global: {
                    statements: 95,
                    branches: 95,
                    functions: 95,
                    lines: 95
                },
            }
        }
    });
};

With this setup you'll get a nice and detailed HTML report in /test/coverage/html. You can set some global thresholds too, so the task will fail whenever the minimum coverage is not met.
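Conceptually, a global threshold check compares each measured percentage against its minimum. The sketch below illustrates that idea only: failedThresholds is a hypothetical helper, not karma-coverage's own code, and the summary numbers are made up.

```javascript
// Hypothetical helper, not karma-coverage's own code.
// Returns the names of the metrics that fall below their minimum.
function failedThresholds(summary, thresholds) {
    var failed = [];
    var key;
    for (key in thresholds) {
        if (Object.prototype.hasOwnProperty.call(thresholds, key) &&
                summary[key] < thresholds[key]) {
            failed.push(key);
        }
    }
    return failed;
}

var thresholds = { statements: 95, branches: 95, functions: 95, lines: 95 };
// Made-up coverage percentages, for illustration only.
var summary = { statements: 97.2, branches: 91.5, functions: 100, lines: 96.8 };
var failed = failedThresholds(summary, thresholds);
// failed is ['branches'], so a run with these numbers would not meet the thresholds
```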

Not in the scope of this article, but good to know, you can install additional plugins for CI systems such as Jenkins and Travis. For example, Jenkins works with JUnit reports, so you could install karma-junit-reporter. Jenkins also uses cobertura format for code coverage, which is already present in the karma-coverage plugin, but should be configured (add another reporter to coverageReporter.reporters).
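As a sketch of what adding another reporter means, the coverageReporter section could get a second entry next to the HTML one. The subdir name cobertura is an arbitrary choice for this example, and the thresholds are left out for brevity.

```javascript
// Sketch of the coverageReporter section with a second reporter added.
// The subdir names are arbitrary choices.
var coverageReporter = {
    reporters: [
        { type: 'html', subdir: 'html' },
        { type: 'cobertura', subdir: 'cobertura' }
    ],
    dir: 'test/coverage/'
};
```

In karma.conf.js this object would take the place of the existing coverageReporter value inside config.set.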

So now we have everything linted, tested, minified and documented (don't forget to also test your minified file). Time to release this baby!

Publishing arrgh.js

Now that we have everything in place let's see if we can publish arrgh.js to npm and NuGet. Obviously, it's my package, so you can't actually publish it. The name is already taken and it's mine. You could come up with a new name and publish it anyway, but that would be kind of lame on your part.

npm

First we'll check out how to publish to npm. For this you'll need an account on npmjs.com. You can sign up on the website or create a user in the console by typing npm adduser (although I've never tried it). Once you've created an account you can log in to npm using your console.

npm login

The console will now ask for your username and password. Now that you're logged in you can, theoretically, simply publish by using npm publish.

cd myProject
npm publish

npm then looks at your package.json file and uses it to create a page for your package. We haven't discussed the package.json yet, only that you needed it to save your developer dependencies. The package.json file contains information about your project, like the name, a description, the author, license, dependencies, etc. As far as I know only name and version are mandatory for publishing; everything else, including the description, is optional. Here is the most important part of my package.json file.

JavaScript

{
  "name": "arrgh",
  "version": "0.9.2",
  "description": "A lightweight JavaScript library that brings proper .NET-like collections and LINQ to the browser.",
  "main": "arrgh.js",
  "files": [
    "arrgh.js",
    "arrgh.debug.js"
  ],
  "scripts": {
    "prepublish": "gulp && xcopy .\\dist\\arrgh.js .\\arrgh.js* && xcopy .\\dist\\arrgh.debug.js .\\arrgh.debug.js*"
  }
  //...
}

The combination of name and version needs to be unique for every publish you do. So you can't publish any version of arrgh, because the name arrgh belongs to me. My current published version is 0.9.2 so I can't publish that again.

By default npm will just publish your entire folder and sub folders, but you can use the files field to specify which files to publish. The files will be published as they are, including their current path. So if you publish folder/subfolder/my-script.js the published package will also contain folder/subfolder/my-script.js. You can also create an .npmignore file, which works the same as a .gitignore file. If there is no .npmignore, npm falls back to the patterns in .gitignore; if both exist, .npmignore overrides .gitignore. Some files, like the package.json, will be released no matter your settings. You can find such stuff in the npm developers documentation.

Here's a fun story. Personally, I hate it when I install a package that I just want to use and I get a ton of files that I'd never use, like a gulpfile, because why would I need a gulpfile as the consumer of a Node.js package!? So I just wanted to publish the mandatory package.json, readme.md and the contents of the dist folder, but not the folder itself. So I put dist/arrgh.js and dist/arrgh.debug.js in the files field in the package.json and the dist folder got distributed. Under version 1.0.0. No problem, I thought. I'll just unpublish it and publish it again, but correctly this time. So I unpublished it, thinking everything would be gone (even I can't see it), but it's actually still there and now I'll never be able to publish 1.0.0 again. So I went with 0.9.0 as a test instead...

Of course I'm here so you don't have to make that mistake. You can check out what would be published if you actually did by using npm pack. This will create a tarball file with the would-be published files. Unfortunately, it's not possible to grab the files from dist and put them in the root, so I used the package.json scripts object to copy the contents of dist to the root folder before publishing. I'm also running Gulp, just in case. If the Gulp build or tests fail npm won't publish. You can also specify other scripts, like postpublish, preinstall, install, etc.

When updating you can increment your version manually or by running npm version major/minor/patch on the command line. npm uses semver internally.
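What major, minor and patch mean for a version number can be sketched as follows. The bump helper is hypothetical and only illustrates the semantic versioning rule; it ignores prerelease and build suffixes and is not npm's code.

```javascript
// Hypothetical helper that mirrors what major/minor/patch bumps mean in
// semantic versioning; it ignores prerelease and build suffixes.
function bump(version, part) {
    var numbers = version.split('.').map(Number);
    if (part === 'major') {
        // A major bump resets minor and patch to zero.
        return (numbers[0] + 1) + '.0.0';
    }
    if (part === 'minor') {
        // A minor bump resets patch to zero.
        return numbers[0] + '.' + (numbers[1] + 1) + '.0';
    }
    if (part === 'patch') {
        return numbers[0] + '.' + numbers[1] + '.' + (numbers[2] + 1);
    }
    throw new Error('Unknown part: ' + part);
}

var nextPatch = bump('0.9.2', 'patch'); // '0.9.3'
var nextMinor = bump('0.9.2', 'minor'); // '0.10.0'
var nextMajor = bump('0.9.2', 'major'); // '1.0.0'
```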

And that's how I got my very own package on npm!

NuGet

Next to npm I wanted to release my package to NuGet, because I usually code in C#. Before you start you'll need NuGet's equivalent of the package.json, which is called the nuspec. Like the package.json the nuspec file contains information about your package. It's an XML file with fields such as id, version, title and description. It's not all that big, so here's my complete nuspec file for version 0.9.2.

XML

<?xml version="1.0"?>
<package>
    <metadata>
        <id>arrgh.js</id>
        <version>0.9.2</version>
        <title>arrgh.js</title>
        <description>A lightweight JavaScript library that brings proper .NET-like collections and LINQ to the browser.</description>
        <authors>Sander Rossel</authors>
        <owners>Sander Rossel</owners>
        <language>JavaScript</language>
        <licenseUrl>https://spdx.org/licenses/MIT</licenseUrl>
        <projectUrl>https://sanderrossel.github.io/arrgh.js/</projectUrl>
        <requireLicenseAcceptance>false</requireLicenseAcceptance>
        <releaseNotes>Fixed Dictionary implementation.</releaseNotes>
        <copyright>Copyright 2016</copyright>
        <tags>JavaScript LINQ collections array List HashMap Dictionary</tags>
    </metadata>
    <files>
        <file src="dist\*.js" target="content\Scripts" />
    </files>
</package>

Notice how I can very awesomely tell NuGet what files to include and where to put them!? Nice! The target has to start with lib, content or tools, so I chose content. You should read the entire Nuspec documentation for more information.

To publish a package to NuGet you'd probably use Visual Studio, but I haven't used VS at all for this project, so why start now? Instead, we can download the Windows x86 Commandline tool over at NuGet downloads. I've actually also included it in GitHub. To build a package simply open the command prompt and use nuget pack.

cd folder_of_project_including_nuget_cli
nuget pack

This will create a nupkg file that you can use for private repositories (you can create them in Visual Studio, just head over to your NuGet settings, create a new repository and target some folder on your computer, now NuGet will automatically pick up nupkg files in that folder) or you can upload them to NuGet manually. Of course, you'll need an account at NuGet too.

It is possible to push packages directly from the command line, but I haven't actually tried it. You can find how to do it in the documentation. Uploading manually is so easy, however, that I haven't even bothered. Of course, pushing from the command line is nice (and necessary) if you have a CI server like Jenkins, TFS or Travis.

And that's how I got my very own package on NuGet!

Adding support for AMD/RequireJS and CommonJS/Node.js

Support for AMD/RequireJS and CommonJS/Node.js was added in release 1.1.0. You can find these changes on GitHub.

Everything works nice in the browser now, but this is JavaScript and the web, there is never one way to do anything. The same goes for loading JavaScript files. In the browser you can simply reference a script in your HTML and it will work (your package is exposed as a global variable), but that is just one way to expose your script. Another method of exposing your script is through AMD (Asynchronous Module Definition), supported by RequireJS.

JavaScript

require(['node_modules/arrgh/arrgh.js'], function (arrgh) {
	// Use arrgh here...
});

This is easy as you do not have to reference all your scripts in the correct order in your HTML file.

Yet another method, used by Node.js, the back-end JavaScript solution, is CommonJS. CommonJS allows you to require files when needed and has the same benefits as RequireJS, but with even easier syntax.

JavaScript

var arrgh = require('arrgh');
// Use arrgh here...

Unfortunately, supporting either one of those methods requires us to change our JavaScript and will prevent us from "simply" loading the script in our HTML. Luckily, we can support all three.

JavaScript

var arrgh = function () { ... };

// Support for AMD...
define(arrgh);

// Support for CommonJS...
module.exports = arrgh();

// Support for "simple" referencing...
window.arrgh = arrgh();

So basically, what it comes down to, is to check if RequireJS is loaded, if it is we use define, if CommonJS is loaded we use module.exports and else we attach arrgh to the window object.

This is fairly easy code that is also easily found on Google (and there is probably already an npm package for it as well).

JavaScript

(function (name, definition) {
    "use strict";
    if (typeof module !== "undefined") {
        module.exports = definition();
    }
    else if (typeof define === "function" && typeof define.amd === "object") {
        define(definition);
    }
    else {
        window[name] = definition();
    }
}("arrgh", function () { ... }));

That code makes sure we use the correct method of loading scripts. Please note that these methods are mutually exclusive and that CommonJS takes precedence over AMD and AMD takes precedence over simple loading.
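To see the precedence in action, the same decision logic can be written as a function that receives its environment explicitly, so every branch can be exercised. This is a sketch for illustration: the env parameter, the expose function and the fake objects are assumptions made here, while the real wrapper reads the actual globals.

```javascript
// Same decision logic as the wrapper, but the environment is passed in
// explicitly. The env parameter is an assumption made for this sketch.
function expose(name, definition, env) {
    "use strict";
    if (typeof env.module !== "undefined") {
        env.module.exports = definition();
        return "commonjs";
    }
    if (typeof env.define === "function" && typeof env.define.amd === "object") {
        env.define(definition);
        return "amd";
    }
    env.window[name] = definition();
    return "global";
}

function definition() {
    return { version: "demo" };
}

// Both CommonJS and AMD are present: CommonJS wins.
var fakeDefine = function () {};
fakeDefine.amd = {};
var both = { module: { exports: {} }, define: fakeDefine, window: {} };
var chosen = expose("arrgh", definition, both);
// chosen is "commonjs" and both.window.arrgh stays undefined
```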

Of course we still need to test our stuff. I've added one simple AMD test that I executed manually.

JavaScript

describe("arrgh.js tests", function () {
    it("should load using require.js", function (done) {
        require(['../src/arrgh.js'], function (arrgh) {
            expect(arrgh.Enumerable).toBeDefined();
            done();
        });
    });
});

The Node.js tests are a little more complex since we are going to run on a completely new platform. I've copied the tests.js file and created a file called node-spec.js (the naming is important for the jasmine-node library we're going to use). The file is exactly the same as the file it was copied from save for the definition of arrgh.js and the loading of the other scripts.

JavaScript

var arrgh = require('../../src/arrgh.js');
/* jshint ignore:start */
var fs = require('fs');
eval(fs.readFileSync('./test/spec/test-Enumerable_overridden.js','utf-8'));
eval(fs.readFileSync('./test/spec/test-Iterators.js','utf-8'));
// [...]
/* jshint ignore:end */

It isn't my best code, but if it looks stupid and it works then it ain't stupid. We can now install jasmine-node and gulp-jasmine-node using npm. In our gulpfile we can add the following task.

JavaScript

var jasmineNode = require('gulp-jasmine-node');

[...]

.task('test-node', ['test'], function () {
    return gulp.src('test/spec/node-spec.js')
    .pipe(jasmineNode({ reporter: [ new jasmine.TerminalReporter({ color: true }) ] }));
})

Unfortunately, our tests run a little differently in Node.js and [NaN] does not equal [NaN] anymore, so tests including NaN had to be slightly changed.

JavaScript

it("should add NaN to the list", function () {
    var l = new arrgh.List();
    l.add(NaN);
    var arr = l.toArray();
    expect(arr.length).toBe(1);
    expect(arr[0]).toBeNaN();
    // This doesn't work anymore.
    //expect(l.toArray()).toEqual([NaN]);
});
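The root of the trouble is that NaN is the only JavaScript value that is not strictly equal to itself, so any comparison that relies on === will see two NaN values as different. The sketch below shows this along with one way to compare arrays that may contain NaN. The helpers sameValue and arraysMatch are hypothetical and are not part of Jasmine or arrgh.js.

```javascript
// NaN is the only JavaScript value that is not strictly equal to itself.
var nanEqualsItself = (NaN === NaN); // false

// Hypothetical helper: strict equality that also treats NaN as equal to NaN.
function sameValue(a, b) {
    if (a !== a && b !== b) {
        return true; // both are NaN
    }
    return a === b;
}

// Hypothetical helper: element-wise comparison of two arrays using sameValue.
function arraysMatch(left, right) {
    var i;
    if (left.length !== right.length) {
        return false;
    }
    for (i = 0; i < left.length; i++) {
        if (!sameValue(left[i], right[i])) {
            return false;
        }
    }
    return true;
}

var nanArraysMatch = arraysMatch([NaN], [NaN]); // true
```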

One more thing, to make require('arrgh') work in Node.js we need our arrgh.js file to be named index.js. So, in our package.json we need to add an extra xcopy command "xcopy .\\dist\\arrgh.js .\\index.js*" and we also need to add index.js to the files field in package.json.

After some testing, changing the docs, and playing around we're ready to publish! I've published this change to 1.1.0 since 0.9 was actually already version 1.0... Anyway, enjoy these new additions to arrgh.js!

Conclusion

So, that was my journey to create my very own JavaScript LINQ library and publish it to npm and NuGet. I have learned a lot along the way and I hope I've passed some of that knowledge on to you, the reader. I'm currently using arrgh for a project at work and it made some stuff really very easy and concise.

If you decide to use arrgh as well please let me know, I'd love to hear your feedback! Any tips, improvements, bugs or missing features can be reported here or on GitHub issues.

Happy coding!

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
