Part 1 of the series Analysing Data in Real Time #1, #2
Introduction
This article is based on conversations and consultations I have had with McLaren Electronic Systems, Formula 1 Management, my own experience with Microsoft StreamInsight and publicly available information.
When watching F1, have you ever wondered how all that in-car data gets to your TV screen or how teams change strategies and make decisions as the race evolves? If so, then read on...
Chasing Hamilton, Rosberg is using more fuel whilst the gap between them is updated constantly
Background
Increasingly, televised sports broadcasts are accompanied by a plethora of statistical data. Whether it's tennis with serve speeds or football with run distance and pass accuracy, timely delivery of such data is becoming an important part of the viewer's experience. Consisting of around 80,000 components, the average Formula 1 car contains up to 300 sensors broadcasting telemetry data through the ECU and out over the radio waves. Over the course of one race, this can amount to 1.5 billion data points which, even when compressed, represent several gigabytes.
That's one hell of a lot of moving data to slice and dice!
If you're an F1 team, how you manage and interpret this data can be the difference between strategic decisions that win you the race and blunders that cost you millions in sponsorship. For viewers, knowing that your favourite driver is catching the car in front (whose high fuel consumption is forcing him to slow down to complete the race) only adds to the excitement and understanding.
Will Hamilton exit the pit lane to a clear track or be stuck in traffic causing him to lose precious time?
Formula 1 uses a system called the Advanced Telemetry Linked Acquisition System (ATLAS), developed by McLaren Electronic Systems, to relay information from the car to trackside antennas and then on to the teams on the pit wall and the television broadcasters.
ATLAS System components
Clients connect to the Ethernet backbone and receive information over TCP and UDP. But this is raw, unstructured data, and receiving it is akin to looking down the mouth of a hosepipe as someone else turns on the tap.
RPM, speed, gear selection, throttle percentage, DRS and KERS are just some of the data sources transmitted and interpreted.
So what techniques can be used to extract meaning and derive insight from this flood of information? Indeed, faced with a similar situation, what can you use to help provide value to your customers?
Introducing Microsoft StreamInsight
This season, the broadcasters of Formula 1 have begun introducing a new system using Microsoft StreamInsight that filters raw telemetry data and derives useful statistics from it. For example, a typical fuel sensor only reports the current level once a second, but from successive readings, together with known volumes and densities, you can calculate the flow rate or fuel usage.
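As a rough sketch of that kind of derivation, the snippet below turns once-per-second fuel level readings into a flow rate by differencing consecutive samples. The sample values, the kilogram unit and the `FlowRates` helper are illustrative assumptions and say nothing about how the broadcasters' system is actually built.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical fuel readings: (seconds since start, fuel remaining in kg).
var readings = new List<(double Time, double FuelKg)>
{
    (0.0, 100.00),
    (1.0, 99.97),
    (2.0, 99.94),
    (3.0, 99.90),
};

// Flow rate between consecutive samples = fuel used / time elapsed.
static IEnumerable<double> FlowRates(IReadOnlyList<(double Time, double FuelKg)> samples) =>
    samples.Zip(samples.Skip(1), (prev, next) =>
        (prev.FuelKg - next.FuelKg) / (next.Time - prev.Time));

var rates = FlowRates(readings).ToList();

foreach (var rate in rates)
    Console.WriteLine($"{rate:F3} kg/s");
```

Four readings give three rates, one per gap between samples. A real system would also have to cope with sensor noise and missing readings, which this sketch ignores.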
StreamInsight belongs to a class of applications known as Complex Event Processing (CEP) and is Microsoft's offering for real-time data stream processing. Currently at version 2.3, it is, in my experience, among the easiest CEP systems to configure and use.
Architecture of a StreamInsight application
C# developers already comfortable with concepts such as LINQ and Observables will readily be able to transfer their skills. Additionally, so long as you have a Microsoft SQL Server license, StreamInsight is free to use and deploy.
As StreamInsight is free, Microsoft has no incentive to sell it, which is one possible reason why it is known to so few. But by customising an off-the-shelf CEP system, we can take advantage of data, and patterns in that data, that would otherwise be ignored.
Go faster using fewer resources
CEP Concepts
All CEP systems share some common concepts, all of which should be familiar if you have ever worked with a database. Here's what you need to know to adapt your thinking from querying static data to filtering moving data.
Event Streams
Any sequence of data, whether it is static or moving, can be thought of as a stream. Like a FileStream (which is just a sequence of bytes), the speed of a car can be represented as a sequence of numbers that grows as time progresses. Timestamp each data point and you have a time series of complex events, also known as an Event Stream.
Queries, Filters, Sources and Sinks
With languages such as SQL, we can query a database source table and return a subset of that data to ourselves (the sink). For example:
SELECT * FROM Speedometer WHERE lap = 10
Likewise, with an event stream, we can filter a source stream in real time using LINQ and pipe the results into a new stream:
var sinkEventStream = from datapoint in speedometerEventStream
                      where datapoint.lap == 10   // == compares; a single = would be an assignment
                      select datapoint;
In this example, we've created a sink called sinkEventStream. Every time a new event enters the source (speedometerEventStream), it is added to the sink if its lap property is 10.
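To experiment with the idea without a StreamInsight server, the same filter can be run over an ordinary in-memory sequence with LINQ to Objects. This is only a stand-in: the tuple shape and sample values below are made up, and a real event stream would be unbounded and pushed to the query rather than pulled from an array.

```csharp
using System;
using System.Linq;

// Hypothetical timestamped speed readings: (time in seconds, lap, speed in km/h).
var speedometerEvents = new[]
{
    (Time: 812.0, Lap: 9,  SpeedKph: 287.0),
    (Time: 813.0, Lap: 10, SpeedKph: 291.5),
    (Time: 814.0, Lap: 10, SpeedKph: 302.0),
    (Time: 905.0, Lap: 11, SpeedKph: 298.0),
};

// Same shape as the stream query: keep only the events recorded on lap 10.
var lapTenEvents = (from datapoint in speedometerEvents
                    where datapoint.Lap == 10
                    select datapoint).ToList();

foreach (var e in lapTenEvents)
    Console.WriteLine($"t={e.Time}s lap={e.Lap} speed={e.SpeedKph} km/h");
```

Only the two lap 10 readings reach the result. The query text is almost identical to the streaming version; what changes is the source it runs against.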
Joins, Windows and Partitions
A window is essentially a moving view on static data (SQL) or a fixed-length view of moving data (CEP). In CEP, windows can either be sliding (i.e. they move continuously) or tumbling (i.e. they advance in discrete, non-overlapping steps).
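The difference between the two window shapes is easier to see on a small, finite sample. The sketch below uses plain LINQ rather than StreamInsight's own window operators, and it counts readings rather than measuring time, purely to keep it short. The `Tumbling` and `Sliding` helpers and the speed values are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical speed readings, one per second.
var speeds = new List<double> { 280, 290, 300, 310, 320, 330 };

// Tumbling: consecutive, non-overlapping windows of `size` readings.
static List<double> Tumbling(IReadOnlyList<double> data, int size) =>
    Enumerable.Range(0, data.Count / size)
              .Select(i => data.Skip(i * size).Take(size).Average())
              .ToList();

// Sliding: a window of `size` readings that advances one reading at a time.
static List<double> Sliding(IReadOnlyList<double> data, int size) =>
    Enumerable.Range(0, Math.Max(0, data.Count - size + 1))
              .Select(i => data.Skip(i).Take(size).Average())
              .ToList();

var tumblingAverages = Tumbling(speeds, 3); // two windows: 290, 320
var slidingAverages = Sliding(speeds, 3);   // four windows: 290, 300, 310, 320

Console.WriteLine(string.Join(", ", tumblingAverages));
Console.WriteLine(string.Join(", ", slidingAverages));
```

Each reading falls into exactly one tumbling window but into up to three sliding windows, which is why the sliding output changes more smoothly.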
As an example, suppose we have two sensors writing data to the following SQL tables:
FuelLevel
LapTime
You can join and partition the data with SQL window functions to keep track of the average fuel level for each lap:
SELECT l.Lap, f.Fuel,
       AVG(f.Fuel) OVER (PARTITION BY l.Lap) AS "Average Fuel This Lap"
FROM FuelLevel f
JOIN LapTime l
  ON f.Time = l.Time
Similarly, with StreamInsight we can join streams, group them and then partition them over time or by arbitrary values:
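CepStream<Fuel> FuelLevel;   // stream of fuel readings
CepStream<Lap> LapTime;      // stream of lap markers

// Pair each fuel reading with the lap it was taken on.
var joined = from f in FuelLevel
             join l in LapTime
             on f.Time equals l.Time
             select new LapAndFuel() { FuelLevel = f.FuelLevel, Lap = l.Lap };

// Group by lap, then average the fuel level over a snapshot window.
// Older StreamInsight versions need SnapshotWindow(SnapshotWindowOutputPolicy.Clip).
var AverageFuelStream = from win in joined
                        group win by win.Lap into eachGroup
                        from window in eachGroup.SnapshotWindow()
                        select new { Average = window.Avg(e => e.FuelLevel), Lap = eachGroup.Key };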
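For comparison, here is a minimal in-memory version of the same join and per-lap average using LINQ to Objects. It is not StreamInsight code: there are no windows, the sequences are finite, and the sample readings are invented for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical sensor samples keyed by a shared timestamp (seconds).
var fuelLevel = new[]
{
    (Time: 1, Fuel: 98.0),
    (Time: 2, Fuel: 97.0),
    (Time: 3, Fuel: 96.0),
    (Time: 4, Fuel: 94.0),
};
var lapTime = new[]
{
    (Time: 1, Lap: 1),
    (Time: 2, Lap: 1),
    (Time: 3, Lap: 2),
    (Time: 4, Lap: 2),
};

// Pair each fuel reading with the lap recorded at the same timestamp.
var joined = from f in fuelLevel
             join l in lapTime on f.Time equals l.Time
             select (l.Lap, f.Fuel);

// Group by lap and average the fuel level within each group.
var averageFuelPerLap = (from j in joined
                         group j by j.Lap into lapGroup
                         orderby lapGroup.Key
                         select (Lap: lapGroup.Key, Average: lapGroup.Average(x => x.Fuel)))
                        .ToList();

foreach (var row in averageFuelPerLap)
    Console.WriteLine($"Lap {row.Lap}: average fuel {row.Average}");
```

The in-memory version produces one answer once all the data is available, whereas the streaming query keeps updating the per-lap average as new events arrive.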
So essentially, StreamInsight gives us much the same functionality as you would expect from SQL Server, with the added advantage that everything is happening in real time. Mix this in with the ability to easily call out to other subsystems and you have the beginnings of a system that can monitor data and make intelligent decisions.
Leadership enabled by speed AND intelligence?
Future Articles
This was just a brief introduction to CEP and MS StreamInsight, but in future articles I plan to demonstrate some practical examples as well as describe the various tools and deployment scenarios you might use in real life.