Finding the Closest Latitude and Longitude in an OpenStreetMap CSV File

2 Dec 2013, CPOL, 2 min read
Closest latitude and longitude to a referenced latitude and longitude.

Introduction

In this tip, we will learn how to find the node in an OpenStreetMap CSV file whose latitude and longitude are closest to a reference point. This is useful because OpenStreetMap databases are not complete, so a GPS position often has to be matched to the nearest existing node to get an accurate location.

Background 

Before starting, we need a copy of an OSM database. Note that the larger the database, the longer it takes to process; the world database takes the longest.

  1. Download an OpenStreetMap database. OpenStreetMap databases are located at Download Databases. Make sure a database with the extension of .PBF is downloaded.
  2. Next, a conversion tool is needed to take that .PBF database and convert it to a .CSV file. Navigate to OSMConvert and download OSMConvert for Windows or Linux.
  3. Now that both files are downloaded, you have a database file and a converter application. The next step is creating a batch script to convert your database into a .CSV database.
  4. Create a new batch file with the following code:

    BAT
    @echo off
    cls
    osmconvert.exe connecticut.pbf --all-to-nodes --csv="@id @lat @lon highway amenity shop name" -o=convert.csv

    As you can see, I used the Connecticut database. Replace "connecticut.pbf" with the database you downloaded.

  5. After running this batch script, you will get an output file named convert.csv. Here is a small snippet of what your .CSV file should look like:

    32658277    41.6970505    -72.7321095                
    60642404    41.3847854    -73.5581884                
    60642405    41.3848481    -73.5576026                
    60642406    41.3850997    -73.5558833                
    60642407    41.3852108    -73.5553076                
    60642408    41.3854561    -73.5541931                
    60642409    41.3855941    -73.5536402                
    60642410    41.3859120    -73.5525427                
    60642411    41.3860840    -73.5520105                
    60642412    41.3862672    -73.5514768                
    60642413    41.3864555    -73.5509534                
    60642415    41.3875006    -73.5480470                
    60642416    41.3876687    -73.5475638                
    60642418    41.3881994    -73.5459707                
    60642419    41.3885572    -73.5447182    motorway_junction

    Toward the end of the file, the rows carry more tag data, such as the highway type and the street name:

    1000000017180069    41.9893800    -72.6480980    primary            North Main Street
    1000000017180070    42.0450340    -73.4087070    service            
    1000000017180071    41.9395970    -72.6294340    primary            North Main Street
    1000000017180072    41.6089105    -72.8778610    primary            North Main Street
    1000000017180073    41.6758988    -72.9467286    secondary            North Main Street
    1000000017180074    42.0418960    -73.4147590    service            
    1000000017180075    41.7792640    -73.2191550    service            
    1000000017180076    41.9591465    -72.7224673    secondary            North Main Street
    1000000017180077    42.0349050    -73.4026730    service            
    1000000017180078    41.9325970    -72.6177340    secondary            North Main Street
    1000000017180079    41.6110220    -73.4773270    residential            
    1000000017180080    41.6106810    -73.4769270    residential            
    1000000017180081    41.6406441    -72.4741366    primary            North Main Street
    1000000017180082    42.0442280    -73.3870410    service            
    1000000017180083    41.5672993    -73.2173759    residential            
    1000000017180084    41.7951592    -72.5235614    primary            North Main Street
    1000000017180085    41.6649840    -73.2698040    residential            
    1000000017180086    41.7793959    -72.5473544    residential            Ferndale Drive

    If your output looks like this, you have successfully converted your database into a workable .CSV file.

Using the code

If you don't like math, I suggest skipping down to "Implementing the Algorithm and Putting it all Together".

The main algorithm

The main algorithm is simple. You have a reference latitude and longitude and a set of candidate latitudes and longitudes. Compute the distance from the reference point to each candidate; the candidate with the smallest distance is the nearest location.

Here is the algorithm in outline:

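C#
using System.Globalization;
using System.IO;

// Reference point to match against
double rLat = 41.47295453;
double rLon = -73.34532628;

// Shortest distance found so far; start at the largest possible value
double distance = double.MaxValue;

// Loop through the file
foreach (string line in File.ReadLines("convert.csv"))
{
    // Columns are tab-separated: @id @lat @lon ...
    string[] fields = line.Split('\t');
    double sLat = double.Parse(fields[1], CultureInfo.InvariantCulture);
    double sLon = double.Parse(fields[2], CultureInfo.InvariantCulture);

    // Get distance and compare (GetDistance is the distance function described in the next section)
    double _distance = GetDistance(rLat, rLon, sLat, sLon);

    if (_distance < distance)
    {
        distance = _distance;
    }
}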

Calculating the distance

Calculating the distance between the reference point and each candidate point requires some trigonometry.

To calculate the distance, the haversine formula is used. It gives the great-circle distance between two points on a sphere:

a = sin²(Δφ/2) + cos(φ₁)·cos(φ₂)·sin²(Δλ/2)
c = 2·atan2(√a, √(1−a))
d = R·c

φ is latitude and λ is longitude (both in radians), Δφ = φ₂ − φ₁, Δλ = λ₂ − λ₁, and R is the Earth's radius. The distance d comes out in the same unit as R.

The Java implementation of this formula is as follows:

Java
// Mean Earth radius in kilometers (assumed value; use 3958.8 for miles)
private static final double EarthRadius = 6371.0;

// Haversine distance between two "lat,lon" strings given in decimal degrees
private static double CalculateDistance(String LatLon1, String LatLon2) {
    String[] p1 = LatLon1.split(",");
    String[] p2 = LatLon2.split(",");
    double lat1 = Double.parseDouble(p1[0]);
    double lon1 = Double.parseDouble(p1[1]);
    double lat2 = Double.parseDouble(p2[0]);
    double lon2 = Double.parseDouble(p2[1]);
    // The trigonometric functions expect radians, not degrees
    double latDistance = Math.toRadians(lat2 - lat1);
    double lonDistance = Math.toRadians(lon2 - lon1);
    double a = Math.sin(latDistance / 2) * Math.sin(latDistance / 2)
            + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
            * Math.sin(lonDistance / 2) * Math.sin(lonDistance / 2);
    double c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
    return EarthRadius * c;
}
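
As a quick sanity check on the formula, the following sketch applies it directly to numeric coordinates instead of "lat,lon" strings. The class name `HaversineCheck`, the method name `haversineKm`, and the radius of 6371 km (a commonly used mean Earth radius) are assumptions made for this illustration, not part of the original code.

```java
public class HaversineCheck {
    // Assumed mean Earth radius in kilometers; results are in the same unit.
    public static final double EARTH_RADIUS_KM = 6371.0;

    // Haversine great-circle distance between two points given in decimal degrees.
    public static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        double c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
        return EARTH_RADIUS_KM * c;
    }

    public static void main(String[] args) {
        // One degree of longitude along the equator: R * pi / 180, about 111.19 km.
        System.out.println(haversineKm(0.0, 0.0, 0.0, 1.0));
        // Identical points are zero distance apart.
        System.out.println(haversineKm(41.6970505, -72.7321095, 41.6970505, -72.7321095));
    }
}
```

Passing doubles avoids re-parsing the reference string on every row, which may matter when the file has millions of lines.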

Reading the file

Reading the .CSV file is the easiest part. Read it line by line rather than loading it all at once: large extracts can exceed the available memory, and the algorithm only needs one line at a time.

Java
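import java.io.BufferedReader;
import java.io.FileReader;

// try-with-resources closes the reader even if an exception is thrown
try (BufferedReader br = new BufferedReader(new FileReader("connecticut.csv"))) {
    String line;
    while ((line = br.readLine()) != null) {
        // Algorithm will go inside of here
    }
}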

Replace connecticut.csv with the path to your .CSV database file (convert.csv if you used the batch script above).
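
Many rows in the sample output have only the first three columns filled, so it can help to parse each line defensively. The sketch below is one way to do that. The class `NodeLine` and its `parse` method are hypothetical helpers introduced here, and the sketch assumes the tab-separated `@id @lat @lon ...` layout produced by the batch script.

```java
public class NodeLine {
    public final String id;
    public final double lat;
    public final double lon;

    public NodeLine(String id, double lat, double lon) {
        this.id = id;
        this.lat = lat;
        this.lon = lon;
    }

    // Returns the parsed node, or null when the line lacks a numeric lat/lon.
    public static NodeLine parse(String line) {
        // The -1 limit keeps trailing empty columns instead of dropping them.
        String[] fields = line.split("\t", -1);
        if (fields.length < 3) {
            return null;
        }
        try {
            return new NodeLine(fields[0],
                    Double.parseDouble(fields[1]),
                    Double.parseDouble(fields[2]));
        } catch (NumberFormatException e) {
            return null; // e.g. a header row or malformed coordinates
        }
    }

    public static void main(String[] args) {
        NodeLine n = parse("60642419\t41.3885572\t-73.5447182\tmotorway_junction\t\t\t");
        System.out.println(n.id + " " + n.lat + " " + n.lon);
        System.out.println(parse("not a node line")); // prints null
    }
}
```

Returning null for bad rows lets the reading loop skip them instead of stopping on the first malformed line.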

Implementing the Algorithm and Putting it all Together

The implementation of the algorithm is very simple.

Java
double srefDistance = Double.MAX_VALUE; // shortest distance found so far
String idsShortestLatLon = "";          // ID of the closest node so far

String StartLatitudeLongitude = "41.47295453,-73.34532628";

// Locate the nearest point in the database
BufferedReader br = new BufferedReader(
        new FileReader("connecticut.csv"));
String line;
while ((line = br.readLine()) != null) {
    // Columns: @id @lat @lon highway amenity shop name
    String[] fields = line.split("\t");
    String id = fields[0];
    double lat = Double.parseDouble(fields[1]);
    double lon = Double.parseDouble(fields[2]);

    String sLatLon2 = lat + "," + lon;
    double distance = CalculateDistance(StartLatitudeLongitude,
            sLatLon2);

    // Keep this node if it is closer than the best one so far
    if (distance < srefDistance) {
        System.out.println(distance);
        srefDistance = distance;
        idsShortestLatLon = id;
    }
}
br.close();

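And that's it! When the loop finishes, idsShortestLatLon holds the ID of the closest node and srefDistance holds the distance to it.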
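
For readers who want a single file to experiment with, here is a minimal self-contained sketch of the same search. The names `NearestNode`, `findNearest`, and `haversineKm`, and the 6371 km radius, are assumptions made for this example. It reads three rows from the sample output held in memory so that it runs without a database file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class NearestNode {
    // Assumed mean Earth radius in kilometers.
    public static final double EARTH_RADIUS_KM = 6371.0;

    // Haversine distance between two points given in decimal degrees.
    public static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return EARTH_RADIUS_KM * 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
    }

    // Scans tab-separated "@id @lat @lon ..." lines one at a time and returns
    // the ID of the closest node, or null if no line could be parsed.
    public static String findNearest(Iterable<String> lines, double refLat, double refLon) {
        String bestId = null;
        double best = Double.MAX_VALUE;
        for (String line : lines) {
            String[] f = line.split("\t");
            if (f.length < 3) {
                continue; // not enough columns
            }
            double d;
            try {
                d = haversineKm(refLat, refLon,
                        Double.parseDouble(f[1]), Double.parseDouble(f[2]));
            } catch (NumberFormatException e) {
                continue; // skip rows without numeric coordinates
            }
            if (d < best) {
                best = d;
                bestId = f[0];
            }
        }
        return bestId;
    }

    public static void main(String[] args) throws IOException {
        // Three rows from the sample output, kept in memory for the demo.
        String sample = "32658277\t41.6970505\t-72.7321095\n"
                + "60642404\t41.3847854\t-73.5581884\n"
                + "60642419\t41.3885572\t-73.5447182\tmotorway_junction\n";
        try (BufferedReader br = new BufferedReader(new StringReader(sample))) {
            // br.lines() streams the input lazily, one line at a time.
            System.out.println(findNearest(br.lines()::iterator, 41.47295453, -73.34532628));
        }
    }
}
```

With these three rows and the reference point used in this tip, the sketch selects node 60642419. To scan a real file, wrap a FileReader around your .CSV path in place of the StringReader.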

Points of interest 

  • The database file is separated by tabs. In Java, a tab is written as "\t".
  • The code above returns the ID of the closest latitude and longitude found in the database. The ID identifies a row of this form:

       ID          LAT           LON
    32658277    41.6970505    -72.7321095

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)