Introduction
I am always amazed by the space-related images that NASA shares daily (see http://antwrp.gsfc.nasa.gov/apod/astropix.html). I visit that site frequently and almost always download the images to my computer for later offline viewing (we still don't have an always-on world, or at least not a cost-effective one). I also often wanted to download several images at once (all the pictures of a month, for example), which was a tedious clicking process.
Hence, I developed my own automated software for that!
The software and lessons shared here consist of the following:
- Perform HTTP requests to the APOD website (for a selected date)
- Retrieve and process the HTML code
- Identify and retrieve the JPG images referenced in the HTML (currently only JPG)
- Store each JPG image in the same directory as the executable, named <year><month><day>.jpg
I also tried to develop a user-friendly window, although at present it only allows date selection.
Background
No specific background is required, other than basic knowledge of C# (or any other OO language and principles) and Microsoft Visual Studio.
Note that the code was written quickly, so it is light on documentation and polish.
Using the Code
The program is simple. The user selects a specific date, and three commands are then available:
- Get image
- Get all images from the selected month (dated on or before the selected date)
- Exit
The second command is really an extension of the first, as I will explain later. The first one does most of the complicated work.
Before we start, we need to understand how pages in APOD are organized, which turns out to be simple and intuitive.
The APOD base URL is:
- http://antwrp.gsfc.nasa.gov/
The people who run the site are both smart and kind, for they provide an extensive archive of pictures (current and past), totally free, which can be accessed as:
- http://antwrp.gsfc.nasa.gov/apod/ap<2-digit year><2-digit month><2-digit day>.html
For example, the image for year 2009, month 09 and day 16 is accessed through the following URL:
- http://antwrp.gsfc.nasa.gov/apod/ap090916.html
So, with a rule to build the page address for any given day, the rest is a matter of putting C#, .NET and HTTP to work.
Let's Start!
The action starts when the user presses the 'GET !' button. The click event calls the following code:
    private void buttonGET_Click(object sender, EventArgs e)
    {
        GetAPODImage(dateTimePicker.Value);
    }
The GetAPODImage method does the main work.
First, we generate the URL for the selected date, following the rule explained above:
string url = GenerateAPOD_URL("http://antwrp.gsfc.nasa.gov/apod/ap",date);
This method builds the URL as follows:
    private string GenerateAPOD_URL(string rootURL, System.DateTime date)
    {
        return rootURL
            + Get2DigitYear(date)
            + Get2DigitNumberAsString(date.Month)
            + Get2DigitNumberAsString(date.Day)
            + ".html";
    }
Note that Get2DigitNumberAsString returns a two-digit string. For example, if the selected day is 1, it returns '01'.
Next, we retrieve the HTML page:
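The two padding helpers are not listed in the article, so here is a minimal sketch of how they might look. The names TwoDigits, TwoDigitYear and BuildUrl are hypothetical stand-ins for Get2DigitNumberAsString, Get2DigitYear and GenerateAPOD_URL; the real implementations in WeAreUtils may differ.

```csharp
using System;
using System.Globalization;

public static class ApodUrlSketch
{
    // Hypothetical stand-in for Get2DigitNumberAsString:
    // pads a number with a leading zero, so 1 becomes "01".
    public static string TwoDigits(int value)
    {
        return value.ToString("00", CultureInfo.InvariantCulture);
    }

    // Hypothetical stand-in for Get2DigitYear:
    // keeps only the last two digits of the year, so 2009 becomes "09".
    public static string TwoDigitYear(DateTime date)
    {
        return TwoDigits(date.Year % 100);
    }

    // Same concatenation rule as GenerateAPOD_URL in the article.
    public static string BuildUrl(string rootUrl, DateTime date)
    {
        return rootUrl
            + TwoDigitYear(date)
            + TwoDigits(date.Month)
            + TwoDigits(date.Day)
            + ".html";
    }
}
```

For instance, calling BuildUrl with the root "http://antwrp.gsfc.nasa.gov/apod/ap" and the date 16 September 2009 yields the ap090916.html address shown earlier.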
GetHTTP(url,5000,ref resultHTML);
The GetHTTP method fetches the HTML page from 'url' and stores it in 'resultHTML'. Note that we pass '5000' (i.e., 5 seconds) as the timeout for fetching the page. This is very important, because it prevents the program from waiting a long time for a page that does not respond.
The GetHTTP code is as follows:
    // Timeout covers getting the response; ReadWriteTimeout covers
    // reading the response stream. Both are in milliseconds.
    HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
    req.Timeout = timeout;
    req.ReadWriteTimeout = timeout;
    WebResponse resp = req.GetResponse();
    Stream resStream = resp.GetResponseStream();
    int count = 0;
    byte[] buf = new byte[8192];
    resultHTML = "";
    // Read the response in chunks until the stream is exhausted.
    do
    {
        count = resStream.Read(buf, 0, buf.Length);
        if (count != 0)
        {
            resultHTML += Encoding.ASCII.GetString(buf, 0, count);
        }
    }
    while (count > 0);
    resp.Close();
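As a possible variation on the listing above, the same fetch can be sketched with 'using' blocks and a StreamReader, so that the response is released even when an exception is thrown while reading. This is only a sketch: FetchHtml and ReadAllText are hypothetical names, and it returns the page instead of filling a ref parameter.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class HttpSketch
{
    // Reads a whole stream as text; kept separate so it can be
    // tried without a network connection (e.g. on a MemoryStream).
    public static string ReadAllText(Stream stream, Encoding encoding)
    {
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    // Hypothetical alternative to GetHTTP: same timeouts, same ASCII
    // decoding, but the response is disposed by the using blocks.
    public static string FetchHtml(string url, int timeoutMs)
    {
        var req = (HttpWebRequest)WebRequest.Create(url);
        req.Timeout = timeoutMs;
        req.ReadWriteTimeout = timeoutMs;
        using (WebResponse resp = req.GetResponse())
        using (Stream body = resp.GetResponseStream())
        {
            return ReadAllText(body, Encoding.ASCII);
        }
    }
}
```

The behavior is intended to match the original loop; the main difference is that StreamReader handles the chunked reading and string building internally.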
Straightforward. Now, if no exception was raised, we must parse 'resultHTML' and look for JPGs (the nice pictures we wish to download):
GetHTTPFilesByTypeSuffix(resultHTML, ".JPG", ref list);
I will not go into excessive detail on this function. In short, it looks for 'href=' and fetches the referenced file name, which is only considered valid if it ends in '.JPG'.
For example, if the variable 'resultHTML' contains "<a href="image/0909/tarantula_gleason.jpg">", then image/0909/tarantula_gleason.jpg will be added to the list (variable 'list').
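Since the body of GetHTTPFilesByTypeSuffix is not shown, the following is a minimal sketch of one way to do the same job with a regular expression. FindLinksBySuffix is a hypothetical name, and the use of Regex and the case-insensitive suffix comparison are assumptions, not necessarily what the library does.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class HrefSketch
{
    // Returns every href target in 'html' whose name ends with 'suffix'
    // (compared case-insensitively, so ".JPG" also matches ".jpg").
    public static List<string> FindLinksBySuffix(string html, string suffix)
    {
        var result = new List<string>();
        // Matches href="..." or href='...' and captures the quoted value.
        var pattern = new Regex("href\\s*=\\s*[\"']([^\"']+)[\"']",
                                RegexOptions.IgnoreCase);
        foreach (Match m in pattern.Matches(html))
        {
            string target = m.Groups[1].Value;
            if (target.EndsWith(suffix, StringComparison.OrdinalIgnoreCase))
            {
                result.Add(target);
            }
        }
        return result;
    }
}
```

Note that this sketch only looks at href attributes, as described above, so an image referenced solely through an img src attribute would not be picked up.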
Now that we have the image file names, we download the files and store them on disk:
    try
    {
        ....
        foreach (string file in list)
        {
            // Build the absolute image URL and the local file name.
            string source = "http://antwrp.gsfc.nasa.gov/apod/" + file;
            string dest = GetYYYYMMDD(date) + ".JPG";
            WebClient Client = new WebClient();
            Client.DownloadFile(source, dest);
            pictureBox.Image = Image.FromFile(dest);
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine("An exception occurred: " + ex.Message);
        pictureBox.Image = Properties.Resources.g_close;
    }
Note that:
- After the picture is stored, I show it in pictureBox (in the main window)
- If an error occurs, pictureBox shows an error image
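The source and destination names used in the download loop can be sketched as two small helpers. Both names are hypothetical: DestinationFileName stands in for GetYYYYMMDD plus the '.JPG' suffix, and SourceUrl resolves the relative file name against the APOD base address with the Uri class instead of string concatenation.

```csharp
using System;
using System.Globalization;

public static class ApodDownloadSketch
{
    // Base address that relative image paths are resolved against.
    private static readonly Uri ApodBase =
        new Uri("http://antwrp.gsfc.nasa.gov/apod/");

    // Hypothetical stand-in for GetYYYYMMDD(date) + ".JPG".
    public static string DestinationFileName(DateTime date)
    {
        return date.ToString("yyyyMMdd", CultureInfo.InvariantCulture) + ".JPG";
    }

    // Turns a relative path such as "image/0909/picture.jpg"
    // into an absolute URL on the APOD site.
    public static string SourceUrl(string relativeFile)
    {
        return new Uri(ApodBase, relativeFile).AbsoluteUri;
    }
}
```

Because the destination name depends only on the date, a page that links to more than one JPG would have each download written to the same file name.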
Finally, how do we fetch all images for the selected month (dated on or before the selected date)? Very simple:
    private void buttonGETMONTH_Click(object sender, EventArgs e)
    {
        DateTime date = dateTimePicker.Value;
        for (int i = 1; date.Month == dateTimePicker.Value.Month; i++)
        {
            GetAPODImage(date);
            date = dateTimePicker.Value.AddDays(-i);
        }
    }
The trick is to initialize the temporary variable 'date' to the selected date and then step back one day at a time until the month changes.
That's it! I hope you enjoyed the article.
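To make the iteration easier to follow, here is a small sketch that only computes the list of dates the month command would visit, without any UI or downloading. DatesBackToMonthStart is a hypothetical helper that is not part of the original program.

```csharp
using System;
using System.Collections.Generic;

public static class MonthSketch
{
    // Lists the selected date and every earlier day of the same month,
    // newest first, stopping as soon as the month changes.
    public static List<DateTime> DatesBackToMonthStart(DateTime selected)
    {
        var dates = new List<DateTime>();
        DateTime d = selected.Date;
        while (d.Month == selected.Month)
        {
            dates.Add(d);
            d = d.AddDays(-1);
        }
        return dates;
    }
}
```

For a selected date of 16 September 2009, the list runs from 16 September back to 1 September, so the day of the month equals the number of pages requested.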
Final Remarks
You may download the Win32 binary and run it directly (you need the .NET Framework 3.5). If you open the solution, you'll notice a dependency on WeAreUtils (which is also included). This library contains several of the utilities referred to in the article. However, I decided to present a simplified version here to make it easier to understand.
If you liked the software, you can find it at SourceForge (http://sourceforge.net/projects/weareapod/). Join the project!
History
- 16th September, 2009: Initial post