Whether you're trying to find the best route to go from Place A to Place B or trying to analyze neighborhoods' access to supermarkets to identify food deserts, framing the problem in the form of a network, using streets as your edges, is often the simplest and most accessible means of solving it.
Before you can start building your street network, however, you need data. Google Maps is probably the first choice that comes to most people's minds when thinking of geographic information, but its API is unfortunately not free. Thankfully, there's a free, open-source alternative with much of the same data: the OpenStreetMap (OSM) project.
Having found a data source, though, it can still be a pain to navigate OSM's data model of nodes, ways, and relations. It's even harder, and a significant time sink, to familiarise yourself with the multiple APIs (it has three) and download methods and their associated query languages. Thankfully, the excellent OSMnx project by the University of Southern California's Professor Geoff Boeing can do all the heavy lifting for you.
Introduction to OSMnx
OSMnx is an open-source Python library that allows you to download OSM data with simple queries, such as a place name or a bounding box. Not only can it fetch this data, but far more importantly, it also performs a variety of pre-processing on the raw data from OSM and formats it into a form that is readily converted into a NetworkX MultiDiGraph. For those not familiar with it, NetworkX is the premier graph analysis and visualization library in Python. As such, it is readily equipped to handle many tasks, such as finding the shortest path between two points and more.
Additionally, OSMnx also has first-class support for conversion between a NetworkX graph and a GeoPandas GeoDataFrame, an extension of the Pandas DataFrame we all know and love with added support for working with geospatial data in a tabular format. In doing so, OSMnx also allows for quick and easy visualization of the OSM street networks using the GeoPandas mapping tools, with additional helper functions for added customization of the resulting maps.
The rest of this tutorial will go through examples of achieving some of these tasks using OSMnx, with code samples. Let's get started by first downloading some data.
Downloading Street Networks Using OSMnx
You can pull down street networks via OSMnx using several options, such as the name of a place, a bounding box, a distance from some latitude-longitude point, or a radius around an address. For example, let's try to get the street networks of San Francisco and Oakland, California.
import osmnx as ox
cities = [
"San Francisco, California",
{
'city': 'Oakland',
'state': 'California',
'country': 'USA'
}
]
sf_oakland = ox.graph_from_place(cities, network_type='drive', simplify=True)
Note here that we can specify the place as either a string (in the case of San Francisco) or as a dictionary specifying the region more precisely (as we did for Oakland). The function accepts either a single place or, as in the example above, a list of places. This is what the graph we just downloaded looks like:
Alongside a string naming a place, we can also download a graph for all the streets within a certain distance from a given address. Let's try that next, by pulling all the roads within 2 km from 10 Downing Street, the official residence of the Prime Minister of the U.K.
downing = ox.graph_from_address("10 Downing Street, London, UK", dist=2000)
This is what the graph looks like:
We can similarly download data using multiple types of parameters. Please see the documentation for the osmnx.graph module for more details on all the possible options.
type(downing)
networkx.classes.multidigraph.MultiDiGraph
Also, note that the graph is stored as a NetworkX MultiDiGraph, so you could quite literally move on to the network analysis portion of your project after calling a single function. That's a remarkable degree of convenience!
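To make the "move straight on to network analysis" point concrete, here is a minimal routing sketch. It assumes a small hand-built MultiDiGraph as a stand-in for the downloaded network, so it runs without any download; the node labels A to D and the edge lengths are made up for illustration. OSMnx stores each edge's length in meters under the `length` attribute, which is what the sketch uses as the weight.

```python
import networkx as nx

# Toy stand-in for a downloaded street network (hypothetical nodes and lengths).
# Real OSMnx graphs are also MultiDiGraphs whose edges carry "length" in meters.
G = nx.MultiDiGraph()
G.add_edge("A", "B", length=100)
G.add_edge("B", "D", length=100)
G.add_edge("A", "C", length=50)
G.add_edge("C", "D", length=300)

# Weighted shortest path: minimize total edge length rather than hop count.
route = nx.shortest_path(G, source="A", target="D", weight="length")
route_length = nx.shortest_path_length(G, source="A", target="D", weight="length")

print(route)         # ['A', 'B', 'D']
print(route_length)  # 200
```

With a real graph such as `downing`, the source and target would be OSM node IDs. If you start from coordinates instead, you would first snap them to the nearest graph nodes, for example with `ox.distance.nearest_nodes`.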
Plotting Street Networks Using OSMnx
Thus far, I've just presented you with the plots of the graphs, without explaining how they’re made. Let's now see how we can plot graphs using OSMnx. Thankfully, this is as simple as a single function call:
fig, ax = ox.plot_graph(downing, node_size=2, node_color='r', edge_color='w', edge_linewidth=0.2)
Most of the arguments to the ox.plot_graph function are, thankfully, self-explanatory. Let's go over them quickly:
- node_size lets you specify the size of the nodes in the graph
- node_color lets you specify the color of the nodes in the graph
- edge_color lets you specify the color of the edges in the graph
- edge_linewidth lets you specify the width of the lines representing the edges in the graph
It's important to note a few things here:
- a 'node' above refers to an intersection of two or more streets, or to a dead end.
- an 'edge' above refers to a street segment connecting two nodes.
- OSMnx uses GeoPandas' plotting functions under the hood, which are themselves based on Matplotlib. As such, you can use many of the Matplotlib operations you're used to with the resulting images. For example, fig.savefig("./my_network.png") would save the above visualization as a PNG in the current directory.
- Since we're using GeoPandas' plotting tools, you can see that the edges (roads) are represented with their proper geospatial shapes, instead of the straight lines we'd see if we had used NetworkX for visualizing the graph.
Conversion to GeoDataFrames
One of the best things about OSMnx is that it fits incredibly well into the existing Python GIS ecosystem and doesn't try to reinvent the wheel. Need to do graph analysis? The street network is converted to a NetworkX graph by default and ready for use with that library.
Need to deal with the road network in a tabular format for more extensive batch operations? GeoPandas' GeoDataFrame comes to the rescue and is a first-class citizen when using OSMnx. Converting the street network to a tabular format takes a single line of code with OSMnx, which splits the graph into two data frames, one containing the nodes and one containing the edges.
nodes, edges = ox.graph_to_gdfs(downing)
nodes.head()
| osmid | y | x | street_count | highway | ref | geometry |
| 101990 | 51.520573 | -0.147701 | 4 | NaN | NaN | POINT (-0.14770 51.52057) |
| 101992 | 51.521395 | -0.149695 | 4 | NaN | NaN | POINT (-0.14970 51.52139) |
| 101993 | 51.520316 | -0.149225 | 4 | NaN | NaN | POINT (-0.14923 51.52032) |
| 101995 | 51.519974 | -0.151786 | 3 | NaN | NaN | POINT (-0.15179 51.51997) |
| 101997 | 51.519093 | -0.148652 | 4 | NaN | NaN | POINT (-0.14865 51.51909) |
Dealing with this data in a tabular format can be incredibly convenient for large data cleaning operations. In the nodes data frame above, the x and y columns hold each node's longitude and latitude, while the geometry column holds the corresponding Shapely geometry object (here, a point) that GeoPandas uses for mapping and spatial operations.
edges.iloc[:, :4].head()
| u | v | key | osmid | oneway | name | highway |
| 101990 | 330803151 | 0 | 17944925 | True | Weymouth Street | tertiary |
| 101990 | 101998 | 0 | 532636970 | True | Harley Street | tertiary |
| 101992 | 1667118178 | 0 | 615056689 | False | Devonshire Street | unclassified |
| 101993 | 1685938630 | 0 | 17944925 | True | Weymouth Street | tertiary |
| 101993 | 101992 | 0 | 4254943 | True | Upper Wimpole Street | tertiary |
The u and v columns represent the source and target nodes of the edge, respectively. Since this is a MultiDiGraph, we can have parallel edges with the same (u, v) values, which are disambiguated via the key. The rest of the columns contain various other pieces of information about the edges, such as the street's name, the type of road it is, and more. The edges data frame also has a geometry column that contains the relevant GIS information about the edges, but it's truncated here due to the very wide nature of the table.
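The role of the key is easiest to see on a tiny graph. The sketch below uses plain NetworkX and borrows two node IDs from the table above, but the two parallel edges, their names, and their lengths are hypothetical and not taken from OSM.

```python
import networkx as nx

G = nx.MultiDiGraph()

# Two distinct (hypothetical) roads joining the same pair of nodes in the
# same direction. add_edge returns the key assigned to each new edge.
k0 = G.add_edge(101990, 330803151, name="direct road", length=120.0)
k1 = G.add_edge(101990, 330803151, name="looping road", length=310.0)

# Each edge is uniquely identified by the triple (u, v, key).
edge_ids = list(G.edges(keys=True))
print(edge_ids)  # [(101990, 330803151, 0), (101990, 330803151, 1)]

# get_edge_data returns a dict mapping key -> attribute dict, so we can
# pick out the shorter of the parallel edges.
parallel = G.get_edge_data(101990, 330803151)
shorter = min(parallel.values(), key=lambda attrs: attrs["length"])
print(shorter["name"])  # direct road
```

This is the same (u, v, key) triple that becomes the three-level index of the edges data frame.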
You can then do your choice of cleaning, filtering, grouping, and other operations using the Pandas syntax we know and love (and sometimes hate), and then convert the network back into a MultiDiGraph with a single line of code again:
downing = ox.graph_from_gdfs(nodes, edges)
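As a rough illustration of the kind of filtering and grouping meant here, the sketch below rebuilds the five edge rows shown earlier as a plain Pandas DataFrame (the name `edges_sample` is made up for this example) and applies two ordinary Pandas operations. It deliberately omits the geometry column, so it is not a GeoDataFrame and could not be passed back to OSMnx as-is. With the real edges frame, the same expressions apply, though the highway column may hold lists for some rows, which would need handling first.

```python
import pandas as pd

# The five edge rows shown above, indexed by (u, v, key) like the real frame.
index = pd.MultiIndex.from_tuples(
    [
        (101990, 330803151, 0),
        (101990, 101998, 0),
        (101992, 1667118178, 0),
        (101993, 1685938630, 0),
        (101993, 101992, 0),
    ],
    names=["u", "v", "key"],
)
edges_sample = pd.DataFrame(
    {
        "osmid": [17944925, 532636970, 615056689, 17944925, 4254943],
        "oneway": [True, True, False, True, True],
        "name": [
            "Weymouth Street",
            "Harley Street",
            "Devonshire Street",
            "Weymouth Street",
            "Upper Wimpole Street",
        ],
        "highway": ["tertiary", "tertiary", "unclassified", "tertiary", "tertiary"],
    },
    index=index,
)

# Filter: keep only tertiary roads. The (u, v, key) index is preserved.
tertiary = edges_sample[edges_sample["highway"] == "tertiary"]

# Group: count how many edge segments each named street contributes.
segments_per_street = tertiary.groupby("name").size()

print(len(tertiary))                            # 4
print(segments_per_street["Weymouth Street"])   # 2
```

Because filtering keeps the index intact, a filtered version of the real edges frame still has the (u, v, key) structure needed when converting back to a graph.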
Now that we've visualized the street network and performed any number of arbitrary data manipulations on the network, we can start analyzing the network to solve our original problem (such as routing or identifying food deserts). As mentioned before, OSMnx is an exceptionally well-positioned library that doesn't try to do what other packages already can.
As such, given the native support for NetworkX graphs, you'd most likely move further analysis to the NetworkX library and its excellent array of utilities specifically designed for that purpose. However, OSMnx does come with a few built-in functions for analyzing the network. Let's look at a quick example of using the basic_stats function to examine the average number of streets connected to a single node in this network.
ox.basic_stats(downing)['streets_per_node_avg']
2.8493639608309693
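To get a feel for what this number summarizes, the sketch below recomputes the same kind of average by hand from the street_count values of the five nodes shown in the nodes table earlier. It assumes the statistic is the plain mean of each node's street count, and since it covers only five nodes, the result naturally differs from the full-network figure above.

```python
from collections import Counter
from statistics import mean

# street_count for the five nodes in the nodes table (node ID -> count).
street_counts = {101990: 4, 101992: 4, 101993: 4, 101995: 3, 101997: 4}

# Assumed definition: the average is the plain mean of the per-node counts.
counts = list(street_counts.values())
avg = mean(counts)

# Share of nodes having each street count, e.g. 80% are 4-way here.
proportions = {k: n / len(counts) for k, n in Counter(counts).items()}

print(avg)          # 3.8
print(proportions)  # {4: 0.8, 3: 0.2}
```

The full-network average is lower than this sample's, which suggests the network contains many nodes with fewer connecting streets, such as dead ends, than these five happen to have.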
Conclusion
OSMnx can significantly simplify the entire ETL process for a street network analysis problem by downloading, cleaning, visualizing, and converting street data into a NetworkX graph ready for analysis. It does this with just a few lines of code, potentially saving you hours, if not days, of setup time. It also fits incredibly well into the existing Python ecosystem with excellent support for GeoPandas for data manipulation and NetworkX for network analysis, making street network analysis in Python as accessible and frictionless as possible.
This article originally appeared as a guest submission on ContentLab.