Introduction
It’s Spatial Data Analysis! Let me clarify. Spatial Data describes the position, shape and orientation of objects in space. And when we describe the position and shape of objects on the Earth – it’s called Geospatial data. It’s high time when software giants are moving beyond Relational Data and it’s the same time when Microsoft is up with SQL Server 2008, which gives developers access to two new spatial data types: Geometry and Geography. (If you have SQL Server 2008 installed, check now – you will find these data types are there. Ah! you have not seen that. I know you are busy seeing some other good things). So, welcome to the world of Spatial Data.
Background
SQL Server 2005 supported so many data types: int
, char
, varchar(50)
. But then the idea to precisely locate on map/globe increased the interest towards Spatial Data Analysis. And that's one of the reasons why Microsoft added two new data types in SQL Server 2008: geography and geometry for the Spatial Data Analysis.
Business Logic Involved
There is a lot of Business Logic involved with it. Let's consider two of them:\
- How nice would it be if we could visualize how the sales trends are in a particular region (on a map)? Then we can decide whether to open a new store or not looking at the proximity of customers and competitors. Also, if we need to grow our business (open new stores), we can make (and of course visualize) networks of our Shoppe to provide the most efficient coverage of an area.
- Have you heard of Global Positioning System (GPS). You might have! Consider the scenario when you have to send a Vehicle/Device/Resources at a particular place in an area. That might be for business purposes or to send Relief items in a Flood Situation. Pointing out locations on map will surely help!
So, what we are talking about is giving information on map rather in Tabular/Chart format.
About Geometry and Geography Data Types
Every variable, parameter and column in SQL Server table is defined as being of a particular datatype
. Everyone is aware of some common SQL Server Datatypes like int, char, float, etc.
SQL Server 2008 introduced two new – geometry (to store planar vector spatial data) and geography (to store geodetic vector spatial data)
Geometry
The geometry spatial data type is used to represent information in a uniform 2 – dimensional plane, much like what we did with Graph papers in schools. That’s why we can represent the position of any point using a single pair of Cartesian Co-ordinates (x,y).
However, the geometry data type is ideally suited for storing projected co-ordinates (x,y). In this case, the process of projection (remember Engineering Drawing) has already mapped the angular geographic coordinates onto a flat plane, onto which the methods of geometric data type can be applied.
Now, you would usually start with an existing diagram or a map, and define your spatial data in terms of that base map.
Geography
The earth is not flat, neither is it spherical. Isn’t it?
So, that’s why we have geographic data type. The most important feature of geographic datatype is that it stores geodetic spatial data, which takes account of the curved surface of the earth.
For example, if we were to define a line that connects two points on the earth’s surface in the geography data type, the line would curve to follow the earth’s surface.
When using a geographic coordinate system, the coordinates of the point are expressed using angles of Latitude (how far North (or South) of the Equator a point is) and Longitude (how far East (or West) of a Prime Meridian a point is) about geometry and geography data types.
Similarities and Differences
Similarities
- They can both represent spatial information using a range of geometries –
Points
, LineStrings
and Polygons
. - Internally in SQL Server 2008, both data types store spatial data as a
stream
of binary data in the same format.
- They both implement many of the same standard spatial methods (I'll let you know about this later):
Differences
Property | geometry datatype | geography datatype |
Shape of Earth | Flat | Round (ellipsoidal) |
Coordinate System | Projected (or natural planar) | Geographic |
Coordinate Values | Cartesian (x and y) | Latitude and Longitude |
Unit of Measurement | Same as coordinate values | As per spatial reference identifier (SRID) |
Size limitation | None | No object may occupy more than one hemisphere. (It means that any variable of geography datatype can’t store an object (area) that exceeds one hemisphere of earth!) |
SRIDs
Every time (read that again – “Every time”) we state the latitude or longitude, or x and y coordinates, that describe the position of a point in a geometry, we must also state the associated spatial reference system in which those coordinate systems were obtained. Without this extra information, a coordinate tuple is just a set of numbers.
However, will it not be an overhead to write out the full details of datum, the prime meridian and the unit of measurement each time we write down a set of coordinates. Fortunately (read that again – “Fortunately”) , various authorities allocate easily memorable unique integer reference numbers that represent all of the necessary parameters of a spatial reference system. These reference numbers are called Spatial Reference Identifiers (SRIDs).
In case of geography data type, every time we store an item of data, we must supply the appropriate SRID
. The supplied SRID
then correlates with one of the supported spatial reference systems in sys.spatial_reference_systems
table. However, for all general purposes, we can use the spatial reference system EPSG:4326
with SRID = 4326
.
In case of geometry data type, it makes no difference what spatial reference system the coordinates of each point were obtained from, as long as they are obtained from the same system. This is because SRID
is required in a projected coordinate system to initially determine the coordinates that uniquely identify the position on the earth. Once this is obtained, all other operations can be performed using basic geometrical methods. For example: Once we have determined that Point A is (0,0) and Point B is (30,40), the distance between them is always 50 units, irrespective of what spatial reference system was used to obtain these coordinates.
We will be generally using SRID 0
here. However, in case you need a special projection system for your map, use that particular SRID
. For all general purposes, we will use SRID = 0 with geometry data type.
SQL Query to Get You Started
Let's get started with writing up a query in SQL Server Management Studio 2008:
DECLARE @a Geometry
SET @a = Geometry::STGeomFromText(‘LINESTRING(0 0, 10 10, 20 0)’,
0)
SELECT @a
Here we declare a variable a of geometry data type. We set it as a Linestring
passing from (0,0) ; (10,10); (20,0)
. (The SRID
is 0
.)
When we execute it, we can see a new tab in SQL Server Management Studio called – Spatial results!
Let us now execute a Query to know the distance of Place A (40.20 degree N , 2.22 degree East on earth) and Place B (50.20 degree N , 22.22 degree East on earth) DECLARE @a geography = geography :: Point (40.20 , 2.22 , 4326) :
DECLARE @a geography = geography :: Point (40.20 , 2.22 , 4326)
DECLARE @b geography = geography :: Point (50.20 , 22.22 , 4326)
SELECT @a.STDistance(@b)
Here 4326
is the SRID
.
Points of Interest
There are so many things that are coming up in the series. Spatial Data can be used in so many projects using Integration of .NET with Google/Bing Maps to show your data accurately. There is a lot of fun involved when you can precisely locate your location, your structure and visualize things.
History