Introduction
There are many people out there who will happily take your money for historical stock data. There are free alternatives: the Yahoo API and the Google trade API. But what if you don't want to learn how to use them?
Simple: Nasdaq publishes a list of all traded symbols, and the Yahoo Finance API offers an interface to download the historical data for each of them.
If you just need the raw historical data for analysis with Excel or your own software, feel free to download the software. It will download to C:\Datenbank
Background
In the financial sector a company is not listed by its name but by its symbol. Microsoft Corp., for example, is MSFT. Every tradable company has a unique symbol (a short name of up to 5 letters).
The basis for this project is one link:
http://www.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=nasdaq&render=download
This link returns a list of the symbols traded on the given exchange (here exchange=nasdaq), with some extra information for each. We can extract this information into a list of type header:
struct header
{
    public string symbol;
    public string name;
    public string sector;
    public string industry;
    public string infolink;
}
These names are self-explanatory; infolink is a link to the company website. What now?
Yahoo offers a single link that downloads all raw historical data for a symbol:
http://ichart.finance.yahoo.com/table.csv?s=MSFT
Replace MSFT with the symbol you want.
Very handy.
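Building the link for an arbitrary symbol could look like the following sketch. The class and method names (YahooLinks, BuildHistoryUrl) are made up for illustration; only the base address comes from the link above.

```csharp
using System;

static class YahooLinks
{
    // Base address taken from the link shown above.
    const string BaseUrl = "http://ichart.finance.yahoo.com/table.csv?s=";

    // Hypothetical helper: returns the download link for one symbol.
    public static string BuildHistoryUrl(string symbol)
    {
        if (string.IsNullOrWhiteSpace(symbol))
            throw new ArgumentException("Symbol must not be empty.", nameof(symbol));

        // Escape the symbol so characters such as '^' or '/' cannot break the query string.
        return BaseUrl + Uri.EscapeDataString(symbol.Trim().ToUpperInvariant());
    }
}
```

Calling YahooLinks.BuildHistoryUrl("msft") yields the MSFT link shown above.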
The Target
Our target is an app that downloads all stock data to a folder on the hard disk called Database. It will contain one subfolder per symbol with all the stock data available for it and some extra information:
Example symbol: Database\AEPI
info.txt:
AEPI
AEP Industries Inc.
Specialty Chemicals
Capital Goods
http:
AEPI.txt
Date,Open,High,Low,Close,Volume,OpenInt
19860131,13.00,13.25,13.00,13.00,301200,5.71
19860203,13.50,13.75,13.50,13.50,462700,5.93
19860204,13.50,13.88,13.50,13.50,173800,5.93
19860205,13.50,13.75,13.50,13.50,104800,5.93
19860206,13.50,13.75,13.50,13.50,122500,5.93
...
Beware: Everything together takes about 900 MB of disk space. The app is a standalone downloader, so to read from one of these files in our own code we would use something like this:
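using System;
using System.Globalization;
using System.IO;
string[] words = File.ReadAllLines("Path");   // words[0] is the column header line
string[] subs = words[n].Split(',');          // n >= 1 selects one data line
DateTime temp = DateTime.ParseExact(subs[0], "yyyyMMdd", CultureInfo.InvariantCulture);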
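Wrapped into a reusable function, reading one data line might look like the sketch below. PriceBar and PriceFile.ParseLine are hypothetical names that are not part of the downloader; the column order is assumed to match the sample file above, and the seventh column is ignored here.

```csharp
using System;
using System.Globalization;

// Hypothetical record for one trading day.
struct PriceBar
{
    public DateTime Date;
    public decimal Open, High, Low, Close;
    public long Volume;
}

static class PriceFile
{
    // Parses one data line such as "19860131,13.00,13.25,13.00,13.00,301200,5.71".
    // The header line ("Date,Open,...") must be skipped by the caller.
    public static PriceBar ParseLine(string line)
    {
        string[] f = line.Split(',');
        if (f.Length < 6)
            throw new FormatException("Expected at least 6 columns: " + line);

        // Invariant culture: '.' is the decimal separator regardless of OS settings.
        CultureInfo inv = CultureInfo.InvariantCulture;
        return new PriceBar
        {
            Date = DateTime.ParseExact(f[0], "yyyyMMdd", inv),
            Open = decimal.Parse(f[1], inv),
            High = decimal.Parse(f[2], inv),
            Low = decimal.Parse(f[3], inv),
            Close = decimal.Parse(f[4], inv),
            Volume = long.Parse(f[5], inv)
        };
    }
}
```

Parsing with the invariant culture matters on systems where the decimal separator is a comma; without it, "13.25" may be rejected or misread.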
The first step would be to download all symbols:
// Raise the limit on simultaneous HTTP connections so that many downloads can run at once.
System.Net.ServicePointManager.DefaultConnectionLimit = 100;

// Downloads the NASDAQ and NYSE symbol lists and returns them as one string.
string get_symbols()
{
    string contents = "";
    contents += GetWebPageContent("http://www.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=nasdaq&render=download");
    contents += GetWebPageContent("http://www.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=nyse&render=download");
    return contents;
}
GetWebPageContent downloads a web page as a string. (For the exact implementation, download the source project.)
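For readers who do not want to open the project, a function with the same name could be sketched as follows. This is not the implementation from the source project: it assumes WebClient is an acceptable way to fetch the page and that a failed download should simply yield an empty string.

```csharp
using System.Net;
using System.Text;

static class Web
{
    // Sketch only: downloads the resource at 'url' and returns it as text.
    public static string GetWebPageContent(string url)
    {
        using (var client = new WebClient())
        {
            client.Encoding = Encoding.UTF8;
            try
            {
                return client.DownloadString(url);
            }
            catch (WebException)
            {
                // Assumption: callers treat an empty string as "no data for this symbol".
                return "";
            }
        }
    }
}
```

Swallowing the exception keeps a bulk run going when a single symbol has no data, at the cost of hiding the reason for the failure.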
Now we have the symbol list as text, which we can split into lines and convert into a list of our header struct.
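The conversion could be sketched as below. It rests on assumptions that are not confirmed by the article: that the downloaded list is a CSV with double-quoted fields, and that its header row uses the column names "Symbol", "Name", "Sector", "Industry" and "Summary Quote". SymbolList, SplitCsvLine and Parse are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

struct header
{
    public string symbol, name, sector, industry, infolink;
}

static class SymbolList
{
    // Splits one CSV line; commas inside double-quoted fields do not split.
    public static List<string> SplitCsvLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;
        foreach (char c in line)
        {
            if (c == '"') inQuotes = !inQuotes;   // quotes toggle the state and are dropped
            else if (c == ',' && !inQuotes)
            {
                fields.Add(current.ToString().Trim());
                current.Clear();
            }
            else current.Append(c);
        }
        fields.Add(current.ToString().Trim());
        return fields;
    }

    // Converts the downloaded text into headers, skipping header rows and duplicates.
    public static List<header> Parse(string contents)
    {
        var result = new List<header>();
        var seen = new HashSet<string>();
        Dictionary<string, int> col = null;       // column name -> index (assumed names)
        string[] lines = contents.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string raw in lines)
        {
            List<string> f = SplitCsvLine(raw);
            if (f[0] == "Symbol")                 // header row, once per downloaded list
            {
                col = new Dictionary<string, int>();
                for (int i = 0; i < f.Count; i++) col[f[i]] = i;
                continue;
            }
            if (col == null || f[0].Length == 0 || !seen.Add(f[0])) continue;
            result.Add(new header
            {
                symbol = f[0],
                name = Get(f, col, "Name"),
                sector = Get(f, col, "Sector"),
                industry = Get(f, col, "Industry"),
                infolink = Get(f, col, "Summary Quote")
            });
        }
        return result;
    }

    // Returns the named column of a row, or "" if the column is missing.
    static string Get(List<string> f, Dictionary<string, int> col, string key)
    {
        int i;
        return col.TryGetValue(key, out i) && i < f.Count ? f[i] : "";
    }
}
```

Looking columns up by name rather than by fixed position keeps the parser working if extra columns appear between the ones we need.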
To receive all data for a single symbol, all we have to do is run the following lines for each header:
string symbol = header.symbol;   // header: the current entry of the list
contents = GetWebPageContent("http://ichart.finance.yahoo.com/table.csv?s=" + symbol.ToUpperInvariant());
For better performance we run many downloads in parallel:
List<header> headers = new List<header>();   // filled with the parsed symbol list
Parallel.For(0, length, new ParallelOptions { MaxDegreeOfParallelism = 100 }, i =>
{
    // k and start come from the surrounding code in the source project.
    Thread todo = new Thread(unused => load_symbol(headers[i].symbol, k, length, start));
    Thread todo2 = new Thread(unused2 => write_infos(headers[i]));
    todo.IsBackground = true;
    todo2.IsBackground = true;
    todo.Start();
    todo2.Start();
    // Wait for both threads so that at most 100 symbols are processed at once.
    todo.Join();
    todo2.Join();
});
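Parallel.For already runs its iterations on worker threads, so a variant without the extra Thread objects is possible. The sketch below is such an alternative, not the code from the source project: BulkDownload.Run is a hypothetical helper that takes the per-symbol work as a delegate and collects the symbols whose work threw an exception.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

static class BulkDownload
{
    // Runs 'work' once per symbol, with at most 'maxParallel' running at the same time.
    // Returns the symbols for which 'work' threw an exception.
    public static List<string> Run(IList<string> symbols, Action<string> work, int maxParallel)
    {
        var failed = new ConcurrentBag<string>();   // thread-safe collection
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(symbols, options, symbol =>
        {
            try
            {
                work(symbol);
            }
            catch (Exception)
            {
                // One failing symbol must not abort the whole run.
                failed.Add(symbol);
            }
        });

        return new List<string>(failed);
    }
}
```

A call such as BulkDownload.Run(symbols, s => load_symbol(s, k, length, start), 100) would then replace the loop above, assuming load_symbol is accessible at that point.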
The two functions write their results to the right folder and files, and with that the core of the app is finished.
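As an illustration of the file layout described under The Target, writing the info file for one symbol could be sketched like this. Storage.WriteInfo is a hypothetical stand-in for write_infos, whose real implementation is in the source project; the order of the five lines simply follows the struct's field order, which is an assumption.

```csharp
using System.IO;

struct header
{
    public string symbol, name, sector, industry, infolink;
}

static class Storage
{
    // Creates <root>\<symbol>\info.txt with one field per line and returns its path.
    public static string WriteInfo(string root, header h)
    {
        string dir = Path.Combine(root, h.symbol);
        Directory.CreateDirectory(dir);            // no-op if the folder already exists

        string path = Path.Combine(dir, "info.txt");
        File.WriteAllLines(path, new[] { h.symbol, h.name, h.sector, h.industry, h.infolink });
        return path;
    }
}
```

Path.Combine is used instead of string concatenation so that the folder separator is always correct.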
What remains is using this data, with Excel or with our own software. The target was a set of easy-to-use files on the hard drive, and we achieved that. Individual files may contain every single day back to 1960.
Notice: If you use the closing prices, use the "Adj Close" column, as it shows the price adjusted for corporate actions such as stock splits and dividends.
Have fun.
Things Left Undone
- Command line arguments for downloading without the form
- Installer and background service that automatically updates the database every day
- Download single symbols and automatically look up the symbol for a company name
- Proxy support
If you have implemented or really need any of these features, or have some ideas, please leave a comment.