Why?
Long-running calculations in JavaScript are generally a bad idea. This is because JavaScript is a single-threaded environment; anything we do takes place on the UI thread, and when a script is running, the UI is unresponsive. To prevent this from happening all the time, browsers have implemented various warning messages that allow the user to stop execution after a certain threshold.
Exhibit A, Google Chrome:
In the last couple of years, browser capabilities have increased dramatically. JavaScript execution engines are faster and the good browsers (you know who you are...) can do DOM updates really quickly. Browser-based applications have spread from the domain of plug-ins (Flash, Applets) to native JavaScript implementations, e.g., Google Docs (and try the Chrome Experiments). This change in direction has been recognised in the addition of the Web Workers API to the HTML5 specification. Web Workers is a method of starting a new JavaScript thread, and is implemented in the current versions of Chrome, Firefox, and Opera. Unfortunately, there aren't any plans yet to implement it in Internet Explorer.
There are some examples out there of Web Workers in action - try the Julia Map which speeds up calculation by using them.
When I write parallel code in .NET, I have the Task Parallel Library to help me. It provides a lot of help when doing things like fork and join, and allows me to spawn a new Task really easily. The Web Workers API isn't so nice: it requires that we write our parallel JavaScript code in another file, send a message to it, and listen for responses. So in this article, I will investigate a way in which we can make the Web Workers API behave a lot more nicely.
Introduction to HTML5 Web Workers
There are plenty of online tutorials (here is a good one) but I only want to look at some specific aspects, namely features, construction, and performance. This will then form a foundation of the "Deferred" solution later on.
Features
Web Workers are created by writing a block of code in a separate .js file. This piece of code is then executed in an entirely separate context – it has no access to the window object, it can’t see the DOM, and it receives input and sends output via messaging. The messages are serialized, so the input and output is always copied – meaning we can’t pass any object references into Workers. Although initially this seems like a serious downside, it can also be viewed as a great bonus – it forces thread safety.
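To make the "always copied" point concrete, here is a minimal sketch that imitates the copy outside of any Worker. It assumes the JSON-style serialization described in this article (browsers may copy messages by a different mechanism), and copyLikeMessage is a hypothetical helper name, not part of the Web Workers API.

```javascript
// Hypothetical helper: imitate the copy a message goes through,
// assuming a JSON-style round trip.
function copyLikeMessage(message) {
    return JSON.parse(JSON.stringify(message));
}

var original = { range: { from: 1, to: 100 } };
var received = copyLikeMessage(original);

// Mutating the copy leaves the sender's object untouched.
received.range.to = 999;
console.log(original.range.to);     // 100
console.log(received === original); // false

// Functions do not survive a JSON round trip: the property is dropped.
var withFunction = copyLikeMessage({ action: function () { return 1; } });
console.log("action" in withFunction); // false
```

The last line hints at why a function cannot simply be posted to a Worker as part of a JSON-encoded message.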
To implement a Worker, we have to create the Worker code in a new file. It needs to conform to a specific "interface":
- onmessage: implement this function to receive messages from the UI thread
- onconnect: implement this function in a Shared Worker, to receive notification when multiple UI threads (i.e., from multiple windows) connect to the same Worker instance
- postMessage: call this function to send a message back to the UI thread
Since a Worker doesn't have access to the window object, you can't use all the window functions you are used to (self is the global object in a Web Worker). However, you can still use these:
- setTimeout
- setInterval
- XMLHttpRequest
Construction
Here is a simple implementation of a prime-number calculating Worker, primes.js:
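self.onmessage = function (event) {
    // Start at 2: the trial-division loop below would otherwise report 1 as prime.
    for (var n = Math.max(2, event.data.from); n <= event.data.to; n += 1) {
        var found = false; // becomes true once a divisor of n is found
        for (var i = 2; i <= Math.sqrt(n); i += 1) {
            if (n % i == 0) {
                found = true;
                break;
            }
        }
        if (!found) {
            postMessage(n); // one message per prime
        }
    }
};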
Here, we have implemented a function called onmessage, and that function calls postMessage with its results. To make use of this Worker, we have the following piece of code in our page:
var worker = new Worker('primes.js');
worker.onmessage = function (event) {
console.log(event.data);
};
worker.postMessage({from:1, to:100});
This constructs a new Worker object using our Worker definition file. Each time it receives a message from the Worker, it is output to the console.
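The prime test itself does not depend on the Worker plumbing, so it can be checked on its own. The following is a sketch of the same trial-division idea as a plain function; primesInRange is a hypothetical name, and it collects results into an array instead of posting one message per prime.

```javascript
// Hypothetical standalone version of the Worker's loop.
// Trial division up to sqrt(n); numbers below 2 are skipped.
function primesInRange(from, to) {
    var primes = [];
    for (var n = Math.max(2, from); n <= to; n += 1) {
        var isPrime = true;
        for (var i = 2; i * i <= n; i += 1) {
            if (n % i === 0) {
                isPrime = false;
                break;
            }
        }
        if (isPrime) {
            primes.push(n);
        }
    }
    return primes;
}

console.log(primesInRange(1, 20)); // [2, 3, 5, 7, 11, 13, 17, 19]
```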
Performance: Worker Thread vs. UI Thread
To run the following tests, I updated the Worker above by adding timestamp measurements at the start and end of the onmessage function. They are then passed out through the result object at the end. This allowed me to get the exact time when the function started and finished execution, enabling measurement of the time taken to send a message to the Worker, the time for it to execute, and the time for it to send a message back to the UI thread.
var time = new Date().getTime();
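As an illustration of that measurement, here is a small sketch that wraps a function call with start and end timestamps and returns them alongside the result. The names timed and report are hypothetical; the Worker used for the article's tests is not shown in full, so this only approximates the approach.

```javascript
// Hypothetical helper: run fn(arg) and return the result together with
// the timestamps taken immediately before and after the call.
function timed(fn, arg) {
    var started = new Date().getTime();
    var result = fn(arg);
    var finished = new Date().getTime();
    return {
        result: result,
        started: started,
        finished: finished,
        elapsed: finished - started
    };
}

var report = timed(function (n) {
    var sum = 0;
    for (var i = 1; i <= n; i += 1) {
        sum += i;
    }
    return sum;
}, 100000);

console.log(report.result); // 5000050000
console.log("elapsed ms: " + report.elapsed);
```

If the page also records a timestamp just before calling postMessage, the send time is the Worker's started value minus that timestamp, and the receive time is the arrival time on the page minus finished.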
I also ran the same algorithm without the use of any Workers. The parameters in both cases were from 1 to 100000. Everything was repeated in Chrome, Firefox, Opera, and IE.
In Chrome, the Worker execution time is a little longer than on the UI thread, and the setup time is bigger than in the other browsers. Since the setup time is constant, it will become less significant as the Worker does more work, or is reused.
In Opera, execution also takes a little longer in the Worker and, as with Chrome, the setup time is a significant factor.
In Firefox, the Worker is more than twice as fast! I don't know why this is. My only guess is that the UI thread is busy doing other things. The setup time is minimal. Firefox seems to like Workers, but in saying that, it's still slower than Chrome and Opera.
In IE...well, it doesn't implement Workers, and the UI thread takes a long time. In IE9, we'll see better JavaScript performance but we won't see Web Workers.
Performance: Multiple Workers vs. Single Worker
In all of the tests above, core 1 of my dual-core CPU shot to 100% usage while core 2 remained idle. That's a bit of a waste, and that's where the benefits of Web Workers should be seen.
So let's repeat the tests above, using two Workers instead of one. IE is left out this time for obvious reasons. All timing is in milliseconds.
Browser | Construction | Avg. message sending | Avg. execution | Avg. message receiving | Total time (load to completion) |
---|---|---|---|---|---|
Chrome 9 | 1 | 175 | 92 | 7 | 290 |
Opera 11 | 200 | 50 | 99 | 50 | 202 |
Firefox 3.6 | 1 | 32 | 525 | 5 | 614 |
Consistently, we see that two Workers are only slightly faster overall than one, but that is entirely due to the overhead involved in creating each Worker: the execution itself ran about twice as fast.
But there is definitely something strange going on with Opera: the time taken to construct the Workers is almost equal to the total time required. This means the UI thread is busy whilst the Workers are running, and the UI thread won't get to see any benefits as is the case with Chrome and Firefox. However, this probably only applies to short-running Workers with large overhead.
Sending/Receiving Large Messages
Workers communicate with the UI thread via messaging, and those messages are copied. If we pass an object to a Worker, it's serialised to JSON, and this serialisation and copying process is going to require effort. Let's measure exactly how much effort. I've removed the work from the Worker and I simply pass it an object, and it pings that object back. We take a timestamp within the Worker so we know exactly when it's run. This is the Worker code:
self.onmessage = function (event) {
    postMessage({ input: event.data, received: new Date().getTime() });
};
And this is how we consume it:
var worker = new Worker('ping_worker.js'),
    startT = new Date().getTime();
worker.onmessage = function (event) {
    console.log("Time to send message to worker: " +
        (event.data.received - startT));
    console.log("Time to receive message from worker: " +
        (new Date().getTime() - event.data.received));
};
worker.postMessage({});
For each browser, I ran the above code both with and without a large (100KB) object in the postMessage argument. This let me find the time delta which indicates the time lag induced by passing the object. Again, all times are in milliseconds.
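The article does not show how the large object was built, so the following is only one way it could be done: a sketch that creates a payload whose JSON form is roughly 100KB and computes a delta from two timings. makePayload and lag are hypothetical names, and the sample timings are Chrome 9's figures from the table below.

```javascript
// Hypothetical helper: an object holding a string of `bytes` characters.
function makePayload(bytes) {
    return { data: new Array(bytes + 1).join("x") };
}

// Hypothetical helper: extra time attributable to the payload.
function lag(withPayload, withoutPayload) {
    return withPayload - withoutPayload;
}

var large = makePayload(100 * 1024);
// 102400 characters of data plus 11 characters of JSON wrapping.
console.log(JSON.stringify(large).length); // 102411

// Chrome 9: send 135 ms vs 112 ms, receive 34 ms vs 9 ms.
console.log(lag(135, 112)); // 23
console.log(lag(34, 9));    // 25
```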
Browser | Send empty | Receive empty | Send large | Receive large | Send large delta | Receive large delta |
---|---|---|---|---|---|---|
Chrome 9 | 112 | 9 | 135 | 34 | 23 | 25 |
Opera 11 | 1 | 0 | 8 | 4 | 7 | 7 |
Firefox 3.6 | 27 | 3 | 34 | 38 | 7 | 4 |
I think we can safely conclude that serialisation/deserialisation and message passing doesn't take a significant amount of time, especially compared to the overhead of constructing the Worker.
jQuery Deferred
A Deferred object represents an asynchronous activity, and relays the success or failure of it, along with results, to any registered callbacks. It used to be the case that if you were performing an asynchronous action and wanted to make a callback at the end, you would allow the consumer to pass in a callback function. Now, you just return a Deferred object to the consumer, and call its resolve function when you want any listeners to be notified. Take this example of the jQuery 1.4 ajax function, before it used Deferred:
$.ajax({
    url: "w.php",
    success: function(result){
    }
});
And in jQuery 1.5, that changes to the following, where "success" is no longer a simple callback, but rather a function on the Deferred object created by the $.ajax request:
$.ajax("w.php").success(function(result){
})
Note that, just to confuse matters, the $.ajax request returns a specialised Deferred object which gives us the success, error, and complete callback hooks for ease of use – the standard Deferred methods are implemented internally. So it's probably not the best example. Here's a lovely example where a Deferred object is created to represent the completion of an animation:
function fadeIn(selector){
    return $.Deferred(function(dfd) {
        $(selector).fadeIn( 1000, dfd.resolve );
    });
}
And to consume this, we can call fadeIn and attach a completion handler to the Deferred result:
fadeIn(".elem").then(function(){
    alert("Fade is complete!");
});
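To show the resolve/callback relationship without any library, here is a toy Deferred. It is an illustration only, not jQuery's implementation; makeDeferred and its internals are hypothetical, and it supports just resolve, reject, done, and fail.

```javascript
// Toy Deferred for illustration only; jQuery's real Deferred does much more.
function makeDeferred() {
    var state = "pending", value, doneCallbacks = [], failCallbacks = [];

    // Build resolve/reject: settle once, then notify the matching callbacks.
    function settle(newState, callbacks) {
        return function (result) {
            if (state !== "pending") { return; }
            state = newState;
            value = result;
            for (var i = 0; i < callbacks.length; i += 1) {
                callbacks[i](value);
            }
        };
    }

    // Build done/fail: call immediately if already settled that way,
    // otherwise queue the callback while still pending.
    function register(wantedState, callbacks) {
        return function (callback) {
            if (state === wantedState) {
                callback(value);
            } else if (state === "pending") {
                callbacks.push(callback);
            }
            return api; // allow chaining
        };
    }

    var api = {
        resolve: settle("resolved", doneCallbacks),
        reject: settle("rejected", failCallbacks),
        done: register("resolved", doneCallbacks),
        fail: register("rejected", failCallbacks)
    };
    return api;
}

var log = [];
var dfd = makeDeferred();
dfd.done(function (v) { log.push("done: " + v); })
   .fail(function (e) { log.push("fail: " + e); });
dfd.resolve("faded in");
dfd.resolve("ignored"); // a Deferred settles only once
dfd.done(function (v) { log.push("late: " + v); });
console.log(log); // ["done: faded in", "late: faded in"]
```

The producer holds resolve and reject; the consumer only needs done and fail, which is roughly the split that a promise object provides in jQuery.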
In fact, any action can be represented as a Deferred object, which would be really useful because we could then chain time-consuming actions together in a simple way. For example:
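// Sequential: start the second fade only after the first has finished.
fadeIn(".elem1").then(function(){
    alert("elem1 complete!");
    fadeIn(".elem2").then(function(){
        alert("elem2 also complete!");
    });
});
// Parallel: start both fades together and wait for both with $.when.
$.when(fadeIn(".elem1"), fadeIn(".elem2")).then(function(){
    alert("Fading both elems is complete!");
});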
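The join in the last snippet can be pictured as a countdown over the pending operations. The sketch below is a guess at the general idea, not jQuery's code: makeDeferred is a stripped-down stand-in and whenAll is a hypothetical name. It assumes at least one input and that callbacks are attached before anything settles.

```javascript
// Minimal stand-in for a Deferred (illustration only, not jQuery's code).
function makeDeferred() {
    var settled = false, doneCallbacks = [], failCallbacks = [];
    function fire(callbacks) {
        return function (value) {
            if (settled) { return; }
            settled = true;
            callbacks.forEach(function (cb) { cb(value); });
        };
    }
    return {
        resolve: fire(doneCallbacks),
        reject: fire(failCallbacks),
        done: function (cb) { doneCallbacks.push(cb); return this; },
        fail: function (cb) { failCallbacks.push(cb); return this; }
    };
}

// Hypothetical join: resolves with every result, in input order, once all
// inputs have resolved; rejects as soon as any input rejects.
function whenAll(deferreds) {
    var joined = makeDeferred(), results = [], remaining = deferreds.length;
    deferreds.forEach(function (dfd, index) {
        dfd.done(function (value) {
            results[index] = value;
            remaining -= 1;
            if (remaining === 0) {
                joined.resolve(results);
            }
        }).fail(joined.reject);
    });
    return joined;
}

var first = makeDeferred(), second = makeDeferred(), output = null;
whenAll([first, second]).done(function (results) { output = results; });
second.resolve("elem2"); // nothing happens yet: first is still pending
first.resolve("elem1");  // now the join fires
console.log(output);     // ["elem1", "elem2"]
```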
.NET Tasks vs. jQuery Deferred
The .NET Task Parallel Library is a great advancement in parallel programming for the .NET Framework. It lets us easily run an anonymous method in another thread without any worries about the actual thread creation. A Task object wraps up a piece of parallel code, and provides a notification of when it's complete. We can use the Task.WaitAll or TaskFactory.ContinueWhenAll functions to do something after a collection of Tasks are all complete, or use Task.WaitAny or TaskFactory.ContinueWhenAny to wait until one is complete. The ContinueWith method schedules code to be run after a single task is complete.
Sounding familiar? jQuery Deferred is a very similar concept to the Task class, as they both are used to represent long-running operations. Let's look at a direct comparison:
TPL | Deferred | Description |
---|---|---|
new Task(action) | $.Deferred(function) | Creates a new Task or Deferred from a function. |
ContinueWith(action) | then(function), done(function) | Creates a new Task or Deferred from a function, to be run when the current Task or Deferred is complete. |
WaitAll | … | Blocks the current thread until all tasks are complete. Bad idea in JavaScript since you'd be blocking the UI thread! |
WaitAny | … | Blocks the current thread until any task is complete. |
TaskFactory.ContinueWhenAll | $.when(deferreds) | Creates a new Task or Deferred which is run when the supplied collection of Task/Deferred objects is complete. |
TaskFactory.ContinueWhenAny | … | Creates a new Task which is run when any of the supplied collection of Task objects is complete. |
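The operations which Deferred doesn't have, WaitAll and WaitAny, are missing for a reason: they block the calling thread, and JavaScript offers no way to pause the UI thread until an asynchronous operation completes (nor would we want one, since the page would freeze). Completion is always signalled through callbacks instead.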
Wrapping Web Workers in Deferred
Now that we understand Web Workers and jQuery Deferred, let's look at combining the two so we can achieve a .NET Task-like programming environment. Let's first define a simple Web Worker object and put it in the file test_worker.js:
self.onmessage = function (event) {
    var result = "the result of lots of work";
    postMessage(result);
};
And to consume a Worker using Deferred, we have the following helper function:
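$.work = function(args) {
    var def = $.Deferred(function(dfd) {
        var worker;
        if (window.Worker) {
            worker = new Worker(args.file);
            // Resolve with the first message the Worker posts back.
            worker.onmessage = function(event) {
                dfd.resolve(event.data);
            };
            // Reject if the Worker raises an error.
            worker.onerror = function(event) {
                dfd.reject(event);
            };
            worker.postMessage(args.args);
        } else {
            // No Worker support: nothing runs here, so the Deferred never settles.
        }
    });
    return def.promise();
};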
Finally, all that remains is to make a call into the $.work function to start the Worker!
$.work({file: 'test_worker.js', args: { anArg: "hello!" }}).then(function(data) {
    console.log(data);
}).fail(function(data){
    console.log(data);
});
Beautiful! Now let's see an example of how Deferred makes life a lot easier. Let's assume we've already completed the trivial task of writing a Worker "primes.js" that calculates the prime numbers between a pair of values. Our task is to consume that Worker and calculate the primes between 1 and 1 million. We can split that into two Workers as follows:
var worker1 = $.work({file: 'primes.js', args: { from: 1, to: 500000 }});
var worker2 = $.work({file: 'primes.js', args: { from: 500001, to:1000000 }});
$.when(worker1, worker2).done(function(result1, result2){
});
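The two ranges above were written by hand. For a different range or more Workers, the split can be computed; the following sketch uses a hypothetical splitRange helper that is not part of the article's code.

```javascript
// Hypothetical helper: split the inclusive range [from, to] into at most
// `parts` contiguous chunks whose sizes differ by no more than one.
function splitRange(from, to, parts) {
    var total = to - from + 1;
    var chunks = [];
    var start = from;
    for (var i = 0; i < parts; i += 1) {
        // Spread any remainder over the first chunks.
        var size = Math.floor(total / parts) + (i < total % parts ? 1 : 0);
        if (size > 0) {
            chunks.push({ from: start, to: start + size - 1 });
            start += size;
        }
    }
    return chunks;
}

console.log(splitRange(1, 1000000, 2));
// [ { from: 1, to: 500000 }, { from: 500001, to: 1000000 } ]
console.log(splitRange(1, 10, 3));
// [ { from: 1, to: 4 }, { from: 5, to: 7 }, { from: 8, to: 10 } ]
```

Equal-sized ranges do not guarantee equal work: with trial division, larger numbers take longer to test, so the upper half will usually finish later than the lower half.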
So far we've combined Deferred with Web Workers, but there are some issues yet to solve:
- We have to put our Worker code into a separate file - not very nice
- It doesn't work in browsers that don’t have Web Workers
Creating a Generic Worker
Wouldn’t it be nice if you didn’t even have to write the Worker file? To achieve this, we must overcome the fact that Web Workers need to be constructed with a file name containing the Worker definition, as opposed to the function to be run. To get around this problem, we just create a Web Worker that takes, as a message, a function definition and arguments, all encoded as a JSON-string.
This technique adds some code to our mission: to convert a function to and from a string requires a little bit of effort. To convert a function to a string, we simply do this:
var funcStr = func.toString();
But the reverse – getting the function back from a string – is more difficult. We could try using eval:
var funcStr = func.toString();
eval("var func = " + funcStr);
However, if you try that, you will find that the performance of running the function in Chrome is abysmal: the function doesn't get precompiled, and the net result is that execution time is more than 10x slower. Another alternative is constructing the function using the new Function syntax. In the following table, I compare the performance of each (all in milliseconds – lower is better):
Browser | Native | Eval | new Function |
---|---|---|---|
Chrome 9 | 207 | 2955 | 204 |
IE 8 | 4078 | 4890 | 4047 |
Opera 11 | 240 | 1080 | 240 |
Firefox 3.6 | 341 | 342 | 336 |
In all cases, constructing the function using new Function gives the same performance as a native JavaScript function, so we'll use that instead of eval. In the following code, we see how to convert a string-encoded function to a real function using the Function constructor. It just involves manipulating the function's string to get the function body and the name of the function's argument, and then passing those to the Function constructor. Combining this with the "generic" Worker, we can write this Worker file (worker.js):
self.addEventListener('message', function (event) {
    var action = self.getFunc(event.data.action);
    self.postMessage(action(event.data.args));
}, false);
self.getFunc = function (funcStr) {
    var argName = funcStr.substring(funcStr.indexOf("(") + 1, funcStr.indexOf(")"));
    funcStr = funcStr.substring(funcStr.indexOf("{") + 1, funcStr.lastIndexOf("}"));
    return new Function(argName, funcStr);
};
Note that in the above Worker, we attach to the message event using the standard addEventListener syntax. That is much nicer than the old school method of adding a function to the onmessage property, and allows us to attach multiple listeners if needed.
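The string round trip performed by getFunc can be tried without a Worker. This sketch repeats the same substring logic in a standalone function (rebuild is a hypothetical name) and pushes a function and its arguments through JSON, as the generic Worker's message would be.

```javascript
// Rebuild a single-argument function from its source text,
// using the same substring approach as getFunc above.
function rebuild(funcStr) {
    var argName = funcStr.substring(funcStr.indexOf("(") + 1, funcStr.indexOf(")"));
    var body = funcStr.substring(funcStr.indexOf("{") + 1, funcStr.lastIndexOf("}"));
    return new Function(argName, body);
}

var sumRange = function (args) {
    var total = 0;
    for (var n = args.from; n <= args.to; n += 1) {
        total += n;
    }
    return total;
};

// Encode the function source and its arguments, then decode and run them.
var message = JSON.stringify({ action: sumRange.toString(), args: { from: 1, to: 10 } });
var received = JSON.parse(message);
console.log(rebuild(received.action)(received.args)); // 55
```

The parsing is deliberately naive: it assumes a classic function expression with a plain parameter list. Source text where the first parenthesis or brace is not the parameter list or body delimiter (an arrow function without braces, for example) would not survive it.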
To consume this Web Worker, we must serialise the function to be run and its arguments so they can be passed in a message. Our $.work function can do that for us. We'll also add one other detail: make it cross-browser compatible by synchronously executing the action when there is no Worker definition.
$.work = function(action, args) {
    var def = $.Deferred(function(dfd) {
        if (window.Worker) {
            var worker = new Worker('worker.js');
            worker.addEventListener('message', function(event) {
                dfd.resolve(event.data);
            }, false);
            worker.addEventListener('error', function(event) {
                dfd.reject(event);
            }, false);
            // Functions cannot be posted directly, so send the source text.
            worker.postMessage({
                action: action.toString(),
                args: args
            });
        } else {
            // No Worker support: run on the UI thread, deferred via setTimeout.
            setTimeout(function(){
                try {
                    var result = action(args);
                    dfd.resolve(result);
                } catch(e) {
                    dfd.reject(e);
                }
            }, 0);
        }
    });
    return def.promise();
};
To define the code, you can write any function that takes a single parameter:
var findPrimes = function (args) {
    var divisor, isPrime, result = [],
        current = args.from;
    while (current < args.to) {
        divisor = parseInt(current / 2, 10);
        isPrime = true;
        while (divisor > 1) {
            if (current % divisor === 0) {
                isPrime = false;
                divisor = 0;
            } else {
                divisor -= 1;
            }
        }
        if (isPrime) {
            result.push(current);
        }
        current += 1;
    }
    return result;
};
And running it then becomes this succinct beauty:
$.work(findPrimes, { from:2, to:50000 }).then(function(data) {
    alert('all done');
}).fail(function(data){
    alert('oops');
});
Performance
To try out the performance, I’m going to do the following three tests in the usual four browsers:
- Run the findPrimes function in the UI thread (no workers involved)
- Run the findPrimes function in a Web Worker (keeping the UI thread free)
- Run the findPrimes function in two Web Workers (splitting the calculation into two equal parts)
Browser | UI thread | One worker | Two workers | Observations |
---|---|---|---|---|
Chrome 9 | 4915 | 4992 | 3268 | CPU at 50% with 1 worker, 100% with 2 |
Firefox 3.6 | 7868 | 7862 | 5289 | CPU at 50% with 1 worker, 100% with 2 |
Opera 11 | 5754 | 5780 | 5676 | CPU at 50% in both cases (but UI thread is free) |
IE 8 | 108689 | same | same | CPU at 50% in all cases (UI thread is always used) |
In the above tests, we can see that execution takes roughly the same time in a Web Worker as in the UI thread. In Chrome and Firefox, we see that executing two Web Workers concurrently gives a nice performance improvement by taking advantage of multiple CPU cores on the user's machine. These are very positive results, especially considering the overhead in constructing and messaging the Web Workers.
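To put a number on that improvement, the speed-up from one Worker to two can be computed directly from the table. The snippet below is only arithmetic on the published timings; the variable and function names are hypothetical.

```javascript
// Timings in milliseconds, copied from the table above.
var timings = {
    "Chrome 9":    { oneWorker: 4992, twoWorkers: 3268 },
    "Firefox 3.6": { oneWorker: 7862, twoWorkers: 5289 },
    "Opera 11":    { oneWorker: 5780, twoWorkers: 5676 }
};

// Speed-up factor: how many times faster two Workers are than one.
function speedup(t) {
    return t.oneWorker / t.twoWorkers;
}

Object.keys(timings).forEach(function (browser) {
    console.log(browser + ": " + speedup(timings[browser]).toFixed(2) + "x");
});
// Chrome 9: 1.53x
// Firefox 3.6: 1.49x
// Opera 11: 1.02x
```

A factor of about 1.5 on a dual-core machine, rather than 2, is consistent with the construction and messaging overhead measured earlier and with the two halves of the range not costing the same.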
Chrome
This technique of wrapping Web Workers works really well in Google Chrome, even though Chrome has the largest overhead in constructing a Web Worker object. As you would expect, Chrome makes use of multiple cores by running the Web Workers in separate threads, and we can achieve a good speed-up in performance on multi-core machines vs. single-core machines.
Firefox
Firefox also has great performance. There is a good speed-up on multi-core machines, and additionally, Firefox has a low overhead in constructing Web Worker objects.
Opera
Although Opera does support Web Workers, it doesn’t seem to run them in their own threads – in the table above, we can see that the performance when running multiple workers is no better than running a single worker, when on a multi-core machine. I noted that the CPU usage maxed out at 50% on my dual-core machine even though I was running multiple workers. I’m sure Opera will resolve this in the future though, and using Web Workers still frees up the UI thread and makes the browser responsive during long-running calculations.
Internet Explorer
In IE, since we are executing exclusively in the UI thread, long-running calculations will result in the message: "A script on this page is causing Internet Explorer to run slowly". This will occur if your Worker function executes more than 5 million statements. By way of workaround, I can only suggest the following:
- Split the Worker into multiple smaller Workers to ensure the 5 million-statement limit is not reached. You should endeavor to provide user feedback regularly to let the user know that the application has not crashed.
- If you have control over the client’s Registry (e.g., a company internal application), then the limit can be changed, although that is a bad idea because the browser will be unresponsive for a long time.
- Offer an alternative version of your application to IE users, which is not as computationally intensive. Inform the users that they can use the full version in another browser.
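The first workaround, splitting the work and reporting progress between pieces, might look something like the sketch below. It is not part of the plug-in: runInChunks and its parameters are hypothetical, and the scheduler is injectable only so that the flow can be demonstrated synchronously. In a page you would leave it out and let it default to setTimeout.

```javascript
// Hypothetical helper: process `items` in slices, yielding between slices
// so the browser can repaint and the statement counter is not exhausted.
function runInChunks(items, chunkSize, processItem, onProgress, onDone, schedule) {
    var index = 0, results = [];
    schedule = schedule || function (fn) { setTimeout(fn, 0); };

    function nextChunk() {
        var end = Math.min(index + chunkSize, items.length);
        for (; index < end; index += 1) {
            results.push(processItem(items[index]));
        }
        onProgress(index, items.length); // user feedback after every slice
        if (index < items.length) {
            schedule(nextChunk);
        } else {
            onDone(results);
        }
    }
    nextChunk();
}

var progress = [];
var squares = null;
runInChunks([1, 2, 3, 4, 5], 2,
    function (n) { return n * n; },
    function (done, total) { progress.push(done + "/" + total); },
    function (results) { squares = results; },
    function (fn) { fn(); }); // synchronous scheduler, for demonstration only

console.log(progress); // ["2/5", "4/5", "5/5"]
console.log(squares);  // [1, 4, 9, 16, 25]
```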
jQuery Plug-in
This solution is all wrapped up in a nice jQuery plug-in. The remainder of this article shows its usage. To use this plug-in optimally, you should isolate functions in your code that meet the following criteria:
- The function must be "static" – it cannot access any closure variables, only variables that are passed to it through its arguments. You can use setTimeout, setInterval, XMLHttpRequest, and construct Web Workers – but no other globals are available.
- The function takes longer than 100ms to run. This ensures that the benefits of running it in a background worker are greater than the overhead of constructing the worker.
- If you want to support IE, the function should execute fewer than 5 million statements. Otherwise you should split the work into multiple parts, implant calls to setTimeout into it, or offer an alternate application to IE users. Of course, if you are optimising an existing application, your code won't run any slower in Internet Explorer than it does already.
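The "static" requirement follows from the way the function travels as source text: only the text is sent, not the scope it was defined in. The sketch below demonstrates this outside a Worker; roundTrip, makeScaler, and selfContained are hypothetical names, and roundTrip mirrors the substring logic of getFunc.

```javascript
// Turn a function into text and back, as the generic Worker does.
function roundTrip(func) {
    var src = func.toString();
    var argName = src.substring(src.indexOf("(") + 1, src.indexOf(")"));
    var body = src.substring(src.indexOf("{") + 1, src.lastIndexOf("}"));
    return new Function(argName, body);
}

// Relies on a closure variable: the rebuilt copy cannot see `factor`.
function makeScaler() {
    var factor = 3;
    return function (args) { return args.value * factor; };
}

// Takes everything it needs through its argument: safe to send.
var selfContained = function (args) { return args.value * args.factor; };

console.log(roundTrip(selfContained)({ value: 2, factor: 3 })); // 6

var outcome;
try {
    outcome = roundTrip(makeScaler())({ value: 2 });
} catch (e) {
    outcome = e.name;
}
console.log(outcome); // "ReferenceError"
```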
Basic Use
Call the $.work function to run a function in another thread. This returns a Deferred object which you can use like any other jQuery Deferred object.
Let's assume you've got a long-running function "doStuff" in your application:
function doStuff(arg1, arg2) {
}
var result = doStuff("a", "b");
This can be parallelised by re-jigging the function to take a single parameter, and adding a callback to the 'done' helper function:
function doStuff(args) {
}
$.work(doStuff, {a:1, b:100}).done(function(result){
});
Handling Errors
The done function above only gets called when the function executes without any exceptions. To handle exceptions, use the then and fail helper functions:
function doStuff(args) {
}
$.work(doStuff, {a:1, b:100}).then(function(result){
}).fail(function(event){
});
Multiple Threads (Fork and Join)
You can run multiple Workers and easily join the results using the $.when Deferred helper function:
function doStuff(args) {
}
var work1 = $.work(doStuff, {a:1, b:50});
var work2 = $.work(doStuff, {a:51, b:100});
$.when(work1, work2).then(function(result1, result2){
}).fail(function(event){
});
Conclusion
As browser-based apps continue to become more complex, and CPUs gain more cores, there will be a natural need to offload work into separate threads. HTML5 Web Workers will likely form a big part of this, and I think that combining them with jQuery Deferred objects can make it easier for us developers to write simple, readable parallel code without adding much overhead.