Introduction
There are various Web Services that give you a corresponding place name if you feed them a postal-code. GeoNames is such a service.
As an extra service, GeoNames provides the databases under the Creative Commons 3.0 license. First, you download a country-specific zip file from their recent database dumps and extract that.
Using the Code
If you're just interested in importing the data into your database, then run the demo. Enter a valid connection string, point the file dialog to your extracted download, and wait for the progress bar to fill up.
The source code is relatively simple, as this is a straightforward import-job. The import logic is nicely tucked away in the file "FormMain.ImportProcedure.cs". This file holds the backgroundworker-logic, which is based on a foreach
-loop:
private void backgroundWorker1_DoWork
(object sender, System.ComponentModel.DoWorkEventArgs e)
{
[...]
string[] lines = File.ReadAllLines(args.SourceFileName, Encoding.UTF8);
foreach (string line in lines)
{
string[] items = line.Split('\t');
IDbCommand sqlInsertItem = db.NewCommand(sqlInsertIntoImportTable);
sqlInsertItem.Parameters.Add(db.NewParam("countrycode", items[0]));
[...]
sqlInsertItem.Parameters.Add(db.NewParam("longitude", items[9]));
sqlInsertItem.Parameters.Add(db.NewParam("accuracy", items[10]));
sqlInsertItem.ExecuteNonQuery();
Supporting Different Database Drivers
This version supports multiple database-formats, and has been tried with SQL Server, Microsoft Access, SQL CE, SQLite, Oracle XE and Microsoft Excel.
The import-logic doesn't need to know the specifics of the database-driver used, and is using an interface. The interfaces get replaced by the equivalent driver that the enduser has selected, replacing the IConnection
with an object created at runtime. The creation of the appropriate objects is done by the internal DataInterface
-class, by calling the static New
method.
This method then switch
-es to the driver that the user has selected in the dropdownlist. If the user wants a SqlClient
, we return a SqlConnection
. If the user wants an OracleClient
connection, we return a connection that facilitates Oracle.
switch (dbType)
{
case DataProvider.ODBC:
{
result._con = new System.Data.Odbc.OdbcConnection(connectionString);
break;
}
case DataProvider.OleDB:
{
result._con = new System.Data.OleDb.OleDbConnection(connectionString);
break;
}
case DataProvider.OracleClient:
{
result._con =
new System.Data.OracleClient.OracleConnection(connectionString);
break;
}
case DataProvider.SqlClient:
{
result._con = new System.Data.SqlClient.SqlConnection(connectionString);
[...]
Each of these databases can be accessed by opening a connection and posting IDbCommands
that contain a SQL statement. Most databases can handle ANSI SQL commands. Take a look at Asher Barak's article "Server Indifferent SQL" for more information on the ANSI-standards.
ConnectionStrings
Some sample connections that have been used while testing this code and their testresults:
SQL Server Express 2005 (Provider: SqlClient)
Server=.\SQLEXPRESS;Database=master;Trusted_Connection=True;
16377 records in 1 minute, 4 seconds - averaging 0.0039 seconds per record.
Microsoft Access (Provider: OleDB)
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\database1.mdb;User Id=admin;Password=;
16377 records 54 seconds, averaging 0.0033 seconds per record.
SQL CE (Provider: SqlCe)
Data Source=C:\MyDatabase1.sdf;
16377 records in 38 seconds, averaging 0.023 seconds per record.
SQLite (Provider: SQLite)
Data Source=C:\Database.sqlite;Version=3;
16377 records in 22 minutes and 31 seconds, averaging 0.0825 seconds per record.
Oracle XE 10g (Provider: OracleClient)
Data Source=XE;User Id=Anonymous;Password=Password;
16377 records in 48 seconds, averaging 0.0030 seconds per record.
Microsoft Excel 2007 (Provider: OleDb)
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=c:\Map1.xlsx;
Extended Properties="Excel 12.0 Xml;HDR=YES";
16377 records in 1 minute and 3 seconds, averaging 0.0039 seconds per record.
How Long is This Going to Take?
There's a ProgressBar
to give visual feedback on the progress that we make. This progressbar will fill up slow or fast, depending on the speed of our machine. The value of our feedback declines if the progressbar fills up too slow - since the user isn't getting a real feedback - he's just waiting for another bar to appear in the progressbar, without any clue when this next bar might arrive.
One option is to make the bar longer, effectively giving a more fine-grained progressindication. Another option is to throw in an "estimated waiting time", like the one that you see when copying files with Windows Explorer. This will work best when the operation consists of blocks of work that take an equal amount of time. We need to track at what time the operation started. The next thing we want to know is how long we've been working on the current operation, so we subtract the startMoment
from the current date/time:
DateTime now = DateTime.Now;
TimeSpan span = now.Subtract(startMoment);
Now that we know how long we've been processing, we can calculate how long a single item took on average:
double timePerItem = (span.TotalSeconds / progressBar1.Value);
Once you know how long the processing of the average item takes (being a single step in the progressBar
), we can calculate how much time we're (probably) going to need to perform the rest of the operation. You can see the result when you hover the mousecursor over the progressBar
.
Notes
- SQLite was quite a bit slower than expected. There must be a more efficient way to talk to a SQLite database.
- Oracle doesn't like the
@
-sign in the parameter-declaration. The runtime haunted me with the error "ORA-01036: illegal variable name/number", until the @-character in the SQL-statement was replaced with a ":"-character.
History
- 21st January, 2009 - Initial version
- 8th February, 2009 - Decoupled database-dependencies and moved the importlogic to a backgroundworker