We have a website that taxpayers use to file their annual taxes. Hundreds of thousands of users hit it within a short period during peak season (Feb – Mar), and it sees moderate use the rest of the year. During the peak, it almost explodes.
I have been working on improving the performance of this site. Until now, there have been no stats whatsoever that would let us say with confidence how many users were served successfully or, when the site failed, where the bottlenecks were. We relied on information stored in the database to “guess” these stats. I am adding a bunch of tools and (Java) programs to collect more stats as well.
I am currently load testing a test version of this site using Apache JMeter. This tool is great and comes with a suite of reports, and I have added tons of stats on the server side (an old Sybase EA Server), but I needed something independent of these two. So, I started looking at the HTTP access logs. EA Server includes a lightweight Apache server at its core, so the format of the log file is almost identical to that of an Apache server access log. (On EA Server the file is called JaguarHttpRequest.log.)
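To give a feel for what such a log holds, here is a minimal Python sketch that parses lines in the Apache Common Log Format and counts requests per status code and per minute. It assumes the log follows that format exactly; JaguarHttpRequest.log is only "almost identical", so the pattern may need adjusting. The sample lines are made up for illustration (the first is the example from the Apache documentation), and the names parse_line and summarize are my own.

```python
import re
from collections import Counter
from datetime import datetime

# Common Log Format: host ident user [time] "request" status bytes
# Assumption: the EA Server log matches this layout; adjust if it does not.
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Return a dict for one log line, or None if the line does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A "-" in the size field means no body was sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


def summarize(lines):
    """Count requests by HTTP status and by minute, skipping unparsable lines."""
    by_status = Counter()
    by_minute = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        by_status[entry["status"]] += 1
        by_minute[entry["time"].strftime("%Y-%m-%d %H:%M")] += 1
    return by_status, by_minute


# Illustrative sample lines, not taken from a real server.
SAMPLE = [
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326',
    '127.0.0.1 - - [10/Oct/2000:13:55:59 -0700] "GET /missing.html HTTP/1.0" 404 -',
    '127.0.0.1 - - [10/Oct/2000:13:56:02 -0700] "GET /index.html HTTP/1.0" 200 1043',
    'not a log line',
]

by_status, by_minute = summarize(SAMPLE)
print(dict(by_status))
print(dict(by_minute))
```

Even a small script like this answers the basic questions (how many requests succeeded, how many per minute), but a dedicated analyzer does far more with much less effort.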
As usual, I looked around for tools that would help with analyzing these logs (when the computer can do the work, why do it manually?!). I started by searching for an Apache log parser. I did find a few, but nothing great. (I found one PHP program, though, which is kind of interesting from a programming point of view.) Eventually, I landed on Web Analytics Software, and our good old Wiki (here) came to the rescue. I tried a few of the packages listed there (Webalizer and AWStats look interesting).
Finally, I tried Analog. Bingo!! This is what I had been looking for. Analog is a very simple tool to install and use. (Installation is just unzipping the zip file.) Using the tool is as simple as double-clicking the analog.exe file or running it from the command line.
Out of the box, this produces a (sample) report named Report.html. After reading more on the Analog website, I was able to customize the config file, analog.cfg. Once I had tweaked this file, I was parsing the log files within minutes. I was even able to append the date to the report file name, which let me run Analog in a loop and capture the stats every few minutes.
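The loop itself can be sketched in a few lines of Python. This is not the setup I used (I changed the output name through analog.cfg); it is a simplified alternative that runs analog.exe with its existing configuration and then copies Report.html to a timestamped name. The function names, the five-minute interval, and the file name pattern are all assumptions made for illustration.

```python
import shutil
import subprocess
import time
from datetime import datetime
from pathlib import Path


def report_name(now, stem="Report", suffix=".html"):
    """Build a timestamped file name such as Report_20090301_1405.html."""
    return f"{stem}_{now:%Y%m%d_%H%M}{suffix}"


def snapshot(analog_dir, now=None, run=subprocess.run):
    """Run Analog once and keep a timestamped copy of its report.

    Assumes analog.exe lives in analog_dir and writes Report.html there,
    as it does with the default configuration.
    """
    analog_dir = Path(analog_dir)
    now = now or datetime.now()
    run([str(analog_dir / "analog.exe")], cwd=analog_dir, check=True)
    target = analog_dir / report_name(now)
    shutil.copyfile(analog_dir / "Report.html", target)
    return target


def capture_loop(analog_dir, interval_seconds=300, rounds=12):
    """Take a snapshot every interval_seconds, rounds times in total."""
    for _ in range(rounds):
        snapshot(analog_dir)
        time.sleep(interval_seconds)


# Example (hypothetical install path):
# capture_loop(r"C:\analog", interval_seconds=300, rounds=12)
```

Run alongside a JMeter test, this leaves one report per interval, so you can see how the numbers change as the load ramps up.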
Apart from the Summary page, the report also has several graphs showing hourly, daily, weekly, and monthly stats, among others.
If you are looking to analyse usage patterns on your web site, this is a great tool. Check it out!