Introduction
Data-Driven Documents (d3js.org) is a JavaScript library that offers an interesting approach to visualization for the web. D3 enables direct inspection and manipulation of the standard document object model (DOM). With D3, we can selectively bind input data to document elements and define dynamic transforms that both generate and modify the content of the DOM.
D3’s primary authors, Michael Bostock, Vadim Ogievetsky and Jeffrey Heer, have created a library that uses representational transparency to improve expressiveness and provide natural integration with developer tools. It differs from other visualization libraries in that it focuses on direct inspection and manipulation of the document object model.
The functionality of D3 can broadly be separated into four categories:
- Loading data
- Binding data to elements
- Transforming elements by inspecting the bound data, and updating the bound properties of the elements
- Transitioning elements in response to user interaction
Transforming elements based on bound data is the most important category. D3 provides the structure for performing these transformations, while we define how they are carried out, which leaves us in full control of the final output.
D3 is not a traditional visualization framework. Instead of introducing a new graphical grammar, D3’s authors chose to solve a different, smaller problem: the manipulation of documents based on data.
Basically, D3 can be thought of as a visualization “kernel” rather than a framework, somewhat resembling other document transformers such as jQuery, CSS and XSLT. These tools share the concept of a selection: a set of elements matching the given criteria is selected, and a set of operations is then applied to transform the selected elements. JavaScript-based selections provide flexibility on top of CSS, since styles can be altered dynamically in response to user interaction and changes to the data.
The designers of D3 chose to adopt the W3C Selectors API for selection: a mini-language consisting of predicates that filter elements by tag (“tag”), class (“.class”), unique identifier (“#id”), attribute (“[name=value]”), containment (“parent child”), adjacency (“before ~ after”), and other facets. Since predicates can be intersected (“.a.b”) and unioned (“.a, .b”), we have a rich and concise method for selection.
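The way predicates combine can be illustrated without a browser. The following is a minimal sketch in plain JavaScript, assuming plain objects stand in for DOM elements; the helpers byTag, byClass, and, and or are hypothetical names made up for this illustration and are not part of D3 or the Selectors API.

```javascript
// Predicates over plain objects standing in for DOM elements (illustration only).
function byTag(t) { return function (el) { return el.tag === t; }; }
function byClass(c) { return function (el) { return el.classes.indexOf(c) !== -1; }; }
function and(p, q) { return function (el) { return p(el) && q(el); }; } // like ".a.b"
function or(p, q) { return function (el) { return p(el) || q(el); }; }  // like ".a, .b"

var els = [
  { tag: "rect", classes: ["a", "b"] },
  { tag: "rect", classes: ["a"] },
  { tag: "circle", classes: ["b"] }
];

var rectA = els.filter(and(byTag("rect"), byClass("a")));  // like "rect.a"
var both = els.filter(and(byClass("a"), byClass("b")));    // like ".a.b"
var either = els.filter(or(byClass("a"), byClass("b")));   // like ".a, .b"

console.log(rectA.length, both.length, either.length); // 2 1 3
```

In a real page the browser evaluates the selector string for us; the sketch only shows why intersection narrows a selection while union widens it.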
D3 operates on sets of elements queried from the current document. Data joins bind data to elements, enabling functional operations that depend on data, and produce enter and exit sub-selections that enable creation and destruction of elements in correspondence with data. Operations are, by default, applied instantaneously, while animated transitions interpolate attributes and styles smoothly over time. The library facilitates event handlers that respond to user input and enable interaction. D3 also provides a number of helper modules that simplify common visualization tasks, such as layout and scales.
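The idea behind enter and exit sub-selections can be sketched in plain JavaScript. This is a conceptual sketch only, assuming elements and data are matched by a key; joinByKey, the element objects and the sample records are hypothetical and do not reflect how D3 implements its data join internally.

```javascript
// Conceptual sketch of a data join; joinByKey is a hypothetical helper, not a D3 API.
function joinByKey(elements, data, key) {
  var has = Object.prototype.hasOwnProperty;
  var existing = {};
  elements.forEach(function (el) { existing[key(el.datum)] = el; });

  var result = { enter: [], update: [], exit: [] };
  var seen = {};
  data.forEach(function (d) {
    var k = key(d);
    seen[k] = true;
    if (has.call(existing, k)) {
      existing[k].datum = d;           // element already exists: rebind the new datum
      result.update.push(existing[k]);
    } else {
      result.enter.push(d);            // datum without an element: one must be created
    }
  });
  elements.forEach(function (el) {
    // element whose datum is no longer present: candidate for removal
    if (!has.call(seen, key(el.datum))) result.exit.push(el);
  });
  return result;
}

var elements = [
  { tag: "rect", datum: { id: "a", value: 1 } },
  { tag: "rect", datum: { id: "b", value: 2 } }
];
var data = [{ id: "b", value: 20 }, { id: "c", value: 30 }];
var joined = joinByKey(elements, data, function (d) { return d.id; });

console.log(joined.enter.length, joined.update.length, joined.exit.length); // 1 1 1
```

Here "c" enters, "b" is updated with its new value, and "a" exits, which is the correspondence between elements and data that the paragraph above describes.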
A Tiny Example
Let’s start by creating a small interactive visualization based on a simple particle system:
<head runat="server">
<title>An introduction to d3.js</title>
<style type="text/css">
body
{
background-color:#222;
}
rect
{
fill: steelblue;
stroke-width: 2.5px;
}
</style>
<script src="http://d3js.org/d3.v3.min.js"></script>
</head>
The script tag loads the minified version 3 of D3 from its canonical location: http://d3js.org/d3.v3.min.js.
The body of our page consists of a single script tag. The code creates an svg element and attaches the particle function to the mousemove event of the element.
<body>
    <script type="text/javascript">
        var w = 600,
            h = 400,
            z = d3.scale.category20b(), // ordinal color scale
            i = 0;

        // Create the svg element and spawn a particle on every mouse move.
        var svg = d3.select("body").append("svg:svg")
            .attr("width", w)
            .attr("height", h)
            .on("mousemove", particle);

        function particle()
        {
            // Mouse position relative to the svg element, as [x, y].
            var m = d3.mouse(this);

            // Start as a small rounded rect centered on the mouse position...
            svg.append("svg:rect")
                .attr("x", m[0] - 10)
                .attr("y", m[1] - 10)
                .attr("height", 20)
                .attr("width", 20)
                .attr("rx", 10)
                .attr("ry", 10)
                .style("stroke", z(++i))
                .style("stroke-opacity", 1)
                .style("opacity", 0.7)
                // ...then grow and fade out over 5 seconds before being removed.
                .transition()
                .duration(5000)
                .ease(Math.sqrt)
                .attr("x", m[0] - 100)
                .attr("y", m[1] - 100)
                .attr("height", 200)
                .attr("width", 200)
                .style("stroke-opacity", 1e-6)
                .style("opacity", 1e-6)
                .remove();
        }
    </script>
</body>
d3.mouse returns the x and y coordinates of the current d3.event, relative to the specified container. The coordinates are returned as a two-element array [x, y].
Next we use D3 to append a rect element to the svg, centered at the current mouse position, and attach a transition that lasts for 5 seconds and removes the rect element on completion.
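What the transition computes for a single attribute can be sketched in plain JavaScript. This is a simplified illustration, assuming the eased time is fed into a linear interpolator; the local functions interpolateNumber and widthAt are written here for the sketch, and the real transition machinery (timers, per-element scheduling) is left out.

```javascript
// Sketch of the per-frame computation for one attribute; not D3's scheduling code.
function interpolateNumber(a, b) {
  return function (t) { return a + (b - a) * t; };
}

var ease = Math.sqrt;                    // the easing function passed to .ease() above
var duration = 5000;                     // milliseconds, as in .duration(5000)
var width = interpolateNumber(20, 200);  // start and end values of the width attribute

function widthAt(elapsedMs) {
  // Normalize elapsed time to [0, 1], ease it, then interpolate.
  var t = Math.min(1, Math.max(0, elapsedMs / duration));
  return width(ease(t));
}

console.log(widthAt(0), widthAt(1250), widthAt(5000)); // 20 110 200
```

Because the square root rises quickly near zero, the rect has already grown to half of its final change after a quarter of the duration, which gives the particles their initial burst.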
The end result is a rather pleasing visualization that looks like a cross between northern lights and a kaleidoscope.
Basics
Well, that was the teaser, and now it’s time to take a closer look at D3 basics. First we’ll transform this beautiful svg:
<svg id="visual" width="220" height="220">
<rect width="100" height="100" rx="15" ry="15" x="40" y="40" />
<rect width="100" height="100" rx="15" ry="15" x="60" y="60" />
<rect width="100" height="100" rx="15" ry="15" x="80" y="40" />
<rect width="100" height="100" rx="15" ry="15" x="100" y="25" />
<rect width="100" height="100" rx="15" ry="15" x="120" y="50" />
</svg>
into a set of steelblue rectangles scattered at random positions, which is performed by the following piece of code:
<script type="text/javascript">
    var visual = d3.select("#visual");

    // attr() returns strings; the multiplications below coerce them to numbers.
    var w = visual.attr("width");
    var h = visual.attr("height");

    var rectangles = visual.selectAll("rect");

    // Color every rect and move it to a random position inside the svg.
    rectangles.style("fill", "steelblue")
        .attr("x", function ()
        {
            return Math.random() * w;
        })
        .attr("y", function ()
        {
            return Math.random() * h;
        });
</script>
D3 implements two top-level methods for selecting elements: select and selectAll. The methods accept selector strings; select selects only the first matching element, while selectAll selects all matching elements in document traversal order. Here, we use the select method to create a selection that contains one element: the svg element with id="visual".
D3 selections are arrays of arrays of elements, and D3 binds additional methods to the array so that we can perform operations on the selected elements, like setting an attribute on all of them. Selections are grouped in this way to preserve the hierarchical structure of subselections, which we will get back to later. Usually we can ignore this detail, but it is why the element of a single-element selection is found at [0][0] and not [0].
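The grouped shape can be modeled with plain arrays. The sketch below assumes simple objects as stand-ins for DOM elements, and eachNode is a hypothetical helper that mimics how an operation is applied to every element in every group; it is not D3 code.

```javascript
// Plain-array model of grouped selections; node objects stand in for DOM elements.
var svgNode = { tagName: "svg", id: "visual" };
var rectNodes = [{ tagName: "rect" }, { tagName: "rect" }, { tagName: "rect" }];

var single = [[svgNode]];  // shape of d3.select("#visual"): one group, one element
var all = [rectNodes];     // shape of visual.selectAll("rect"): one group, many elements

// Apply an operation to every element of every group.
function eachNode(selection, fn) {
  selection.forEach(function (group) { group.forEach(fn); });
  return selection;
}

eachNode(all, function (node) { node.fill = "steelblue"; });

console.log(single[0][0].id, all[0].length, all[0][2].fill); // visual 3 steelblue
```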
A Simple Line Chart
It’s time to do something a bit more interesting, like drawing a line chart using data retrieved from a CSV file that resides on the server. That’s right; there is no law that says we have to use JSON or some verbose XML format just because we’re doing JavaScript.
The data for the following line chart was retrieved from the Norwegian Petroleum Directorate’s FactPages.
The above chart says a bit about why oil trades at $111 a barrel – while there is still a significant amount of oil left, it’s no longer as accessible as it used to be; and retrieving it has become more costly and time consuming. It’s not an infinite resource. Ah, well, back to the code:
<script>
var margin = { top: 20, right: 20, bottom: 30, left: 50 },
width = 600 - margin.left - margin.right,
height = 400 - margin.top - margin.bottom;
First, we define a margin object with properties for the four sides of our chart, then we set the width and height that we later will use as the dimensions for our chart.
var parseDate = d3.time.format("%Y").parse;
d3.time.format creates a time formatter using the given specifier %Y, which says that this formatter will handle four-digit years. We then assign the parse function of the formatter to parseDate. We will later use this parse function to convert the year strings in the CSV data retrieved from the server into Date values.
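To make the effect of the parse function concrete, here is a rough plain JavaScript stand-in. It is a sketch under the assumption that a four-digit year should become a Date for January 1st of that year in local time; parseYear is a hypothetical helper, and it returns null for input that does not match.

```javascript
// Hypothetical stand-in for a "%Y" parser, for illustration only.
function parseYear(text) {
  if (!/^[1-9]\d{3}$/.test(text)) return null; // accept exactly four digits, no leading zero
  return new Date(+text, 0, 1);                // January 1st of that year, local time
}

var parsed = parseYear("1971");
console.log(parsed.getFullYear(), parsed.getMonth(), parseYear("71")); // 1971 0 null
```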
Next we need to set up the scales for our chart. Scales are functions that map values from an input domain to values in an output range.
var x = d3.time.scale()
.range([0, width]);
d3.time.scale implements a scale whose input domain consists of dates and whose output range is numeric, here the horizontal positions from 0 to width. The time scale also provides suitable ticks based on time intervals, greatly simplifying the work required to generate axes for nearly any time-based domain.
var y = d3.scale.linear()
.range([height,0]);
d3.scale.linear is the most common scale; it maps a continuous input domain to a continuous output range. The mapping is linear since the output range value y can be expressed as a linear function of the input value x: y = mx + b.
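The formula y = mx + b translates directly into code. The following is a minimal sketch, not D3's implementation: linearScale is a hypothetical helper, and the domain values (0 to 100, and the years 1971 and 2011) are made up for the illustration. The pixel sizes follow from the margins defined earlier.

```javascript
// Minimal linear scale sketch; linearScale is a hypothetical helper.
function linearScale(domain, range) {
  var m = (range[1] - range[0]) / (domain[1] - domain[0]); // slope
  var b = range[0] - m * domain[0];                        // intercept
  return function (x) { return m * x + b; };
}

var height = 350;                           // 400 - 20 - 30, from the chart margins
var y = linearScale([0, 100], [height, 0]); // inverted range: larger values sit higher up
console.log(y(0), y(50), y(100));           // 350 175 0

// A time scale works the same way once dates are converted to milliseconds.
var t0 = new Date(1971, 0, 1).getTime();
var t1 = new Date(2011, 0, 1).getTime();
var xTime = linearScale([t0, t1], [0, 530]); // 530 = 600 - 50 - 20
console.log(xTime(t0), xTime(t1));
```

The inverted range [height, 0] is what lets the chart grow upwards even though SVG's y coordinates grow downwards.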
It’s now quite easy to set up the axes for our chart:
var xAxis = d3.svg.axis()
.scale(x)
.orient("bottom");
var yAxis = d3.svg.axis()
.scale(y)
.orient("left");
d3.svg.axis creates an axis component which we configure to use the scales we just created.
d3.svg.line creates a line generator with default x- and y-accessor functions, which we replace with our own. The generator produces SVG path data for an open piecewise linear curve, which is what we want for a line chart.
var line = d3.svg.line()
.x(function (d) { return x(d.year); })
.y(function (d) { return y(d.oil); });
Each accessor function is invoked for every element in the data array passed to the line generator, which allows us to map year and oil to x and y respectively.
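The role of the accessors becomes clearer with a small hand-written generator. This is a sketch only, assuming straight segments and no missing values; linePath is a hypothetical helper and the three data points are invented for the illustration.

```javascript
// Sketch of building SVG path data for a polyline; linePath is a hypothetical helper.
function linePath(points, xOf, yOf) {
  return points.map(function (p, i) {
    // "M" moves to the first point, "L" draws a line to each following point.
    return (i === 0 ? "M" : "L") + xOf(p) + "," + yOf(p);
  }).join("");
}

var points = [{ year: 0, oil: 10 }, { year: 50, oil: 40 }, { year: 100, oil: 20 }];
var pathData = linePath(
  points,
  function (p) { return p.year; },       // x accessor
  function (p) { return 100 - p.oil; }   // y accessor, flipped so larger values are higher
);

console.log(pathData); // M0,90L50,60L100,80
```

The resulting string is what ends up in the d attribute of an SVG path element.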
Now we need to create the SVG element for the chart and add an SVG G (group) element to the SVG which will contain the final chart.
var svg = d3.select("body").append("svg")
.attr("width", width + margin.left + margin.right)
.attr("height", height + margin.top + margin.bottom)
.append("g")
.attr("transform", "translate(" + margin.left + "," + margin.top + ")");
It’s now time to load our data from the server, and D3 provides built-in support for loading and parsing comma-separated values (CSV). CSV is more space-efficient than JSON, and this can improve loading times for large datasets.
d3.csv("YearlyOilProduction.csv", function (error, data)
{
d3.csv issues an HTTP GET request for the comma-separated values (CSV) file at the specified URL. The request is processed asynchronously, and when the CSV data is available, the specified callback is invoked with any error as the first argument and the parsed rows as the second.
data.forEach(function (d) {
d.year = parseDate(d.year);
d.oil = +d.oil;
});
Once we have our data, we need to ensure that it’s in the expected format by using the parser we created earlier to coerce the year into a Date value, and d.oil = +d.oil to coerce oil into a number.
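The coercion step is needed because every field of a parsed CSV row starts out as a string. The sketch below shows this with a naive parser written for the illustration; parseCsv is a hypothetical helper without support for quoting or escaping, and the sample values are made up.

```javascript
// Naive CSV parser sketch (no quoting or escaping); parseCsv is a hypothetical helper.
function parseCsv(text) {
  var lines = text.trim().split(/\r?\n/);
  var header = lines[0].split(",");
  return lines.slice(1).map(function (line) {
    var cells = line.split(","), row = {};
    header.forEach(function (name, i) { row[name] = cells[i]; });
    return row;
  });
}

var rows = parseCsv("year,oil\n1971,0.36\n1972,1.93\n"); // made-up sample values
var typeBefore = typeof rows[0].oil;                     // "string"

rows.forEach(function (d) {
  d.year = new Date(+d.year, 0, 1); // stand-in for parseDate(d.year)
  d.oil = +d.oil;                   // unary plus coerces the string to a number
});

console.log(typeBefore, typeof rows[0].oil, rows[1].oil); // string number 1.93
```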
With our data in the expected format, we are ready to set the input domain for the scales; d3.extent returns the minimum and maximum value of the passed data.
x.domain(d3.extent(data, function (d) { return d.year; }));
y.domain(d3.extent(data, function (d) { return d.oil; }));
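Computing an extent amounts to a single pass over the data. The following sketch is an assumption-laden stand-in rather than D3's code: the local extent function does not handle missing values, and the sample records are invented.

```javascript
// Sketch of an extent helper: returns [min, max] of the accessed values.
function extent(values, accessor) {
  var min, max;
  values.forEach(function (v) {
    var x = accessor ? accessor(v) : v;
    if (min === undefined || x < min) min = x;
    if (max === undefined || x > max) max = x;
  });
  return [min, max];
}

var sample = [{ oil: 12 }, { oil: 3 }, { oil: 27 }, { oil: 8 }]; // made-up values
var oilExtent = extent(sample, function (d) { return d.oil; });

console.log(oilExtent); // [ 3, 27 ]
```

The same comparison works for Date values, since the less-than and greater-than operators compare dates by their timestamps.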
Now we are ready to actually render the chart: we append a g element for each axis, and specify a translate transform for the x-axis to move it to the bottom of our chart. Remember that SVG coordinates are relative to the upper left corner, with the positive y-axis pointing downwards.
svg.append("g")
.attr("class", "x axis")
.attr("transform", "translate(0," + height + ")")
.call(xAxis);
svg.append("g")
.attr("class", "y axis")
.call(yAxis)
.append("text")
.attr("transform", "rotate(-90)")
.attr("y", 6)
.attr("dy", ".71em")
.style("text-anchor", "end")
.text("million Sm3");
.call(xAxis) causes the passed axis object to render itself into the new g element.
Finally, we create an SVG path element that uses the line generator we created earlier to render the line segments for the data.
svg.append("path")
.datum(data)
.attr("class", "line")
.attr("d", line);
});
</script>
Serving Data Efficiently Using a Synchronous HTTP Handler
Visualizations often require a vast amount of data, and providing the data in an efficient manner is usually important.
One efficient way to do this is to implement an ASP.NET HTTP handler. Given how easy this is, it’s surprising that it’s not done more often:
using System.Collections.Generic;
using System.Web;

public class DataRequestHandler : IHttpHandler
{
    public bool IsReusable
    {
        get
        {
            return true;
        }
    }

    public void ProcessRequest(HttpContext context)
    {
        var response = context.Response;
        response.ContentType = "text/csv";

        // The header row comes first, followed by one CSV line per record.
        response.Output.WriteLine(ProductionData.CsvHeader);
        List<ProductionData> productionDataList = Global.Data;
        foreach (ProductionData element in productionDataList)
        {
            response.Output.WriteLine(element.ToString());
        }
        response.End();
    }
}
That’s it – with the above code, we’ve implemented a complete ASP.NET HTTP handler which serves up CSV formatted data.
IsReusable returns true because the handler is stateless, which means that it doesn’t contain fields that may change between invocations of ProcessRequest, and that allows ASP.NET to reuse an instance of the handler.
The implementation of ProductionData.CsvHeader is simple:
public static string CsvHeader
{
    get
    {
        return "timeStamp,oil,gas,ngl,condensate,oe,water";
    }
}
The same can be said about ProductionData.ToString():
public override string ToString()
{
    string result = timeStamp.Year.ToString() + "-" + timeStamp.Month.ToString() + "," +
        oil.ToString(CultureInfo.InvariantCulture) + "," +
        gas.ToString(CultureInfo.InvariantCulture) + "," +
        ngl.ToString(CultureInfo.InvariantCulture) + "," +
        condensate.ToString(CultureInfo.InvariantCulture) + "," +
        oe.ToString(CultureInfo.InvariantCulture) + "," +
        water.ToString(CultureInfo.InvariantCulture);
    return result;
}
Real-world industrial database servers often need to serve up hundreds of thousands of records, and doing so using XML or JSON adds considerable overhead on the server. While a binary format would be even more compact, CSV is a good compromise between performance and flexibility, and it’s a reasonable way to provide data to HTML5 applications.
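The size difference between the two text formats is easy to check. The sketch below serializes the same synthetic rows both ways and compares the lengths; the field names and values are invented for the illustration, and actual savings depend on the data.

```javascript
// Compare the serialized size of identical rows as JSON and as CSV (synthetic data).
var rows = [];
for (var year = 2000; year < 2010; year++) {
  rows.push({ year: year, oil: 100 + year % 7, gas: 50 + year % 5 });
}

var json = JSON.stringify(rows);

// CSV states the field names once in the header instead of once per row.
var csv = ["year,oil,gas"].concat(rows.map(function (r) {
  return [r.year, r.oil, r.gas].join(",");
})).join("\n");

console.log("JSON characters:", json.length, "CSV characters:", csv.length);
```

JSON repeats every property name in every record, so the gap widens as the number of rows and columns grows.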
History