CP Posts Analyzer in action - posts view
CP Posts Analyzer in action - categories view
Introduction
CP Posts Analyzer is a Windows Phone 7 application that will analyze the last
200 posts made by any CP member (across all forums, articles, surveys etc.). You
will need a developer license to be able to deploy and run this on your WP7
device, but if you don't have one, you can still run this off the emulator. I've
tested the app on the emulator as well as on a Samsung Focus.
Warning
Since the application scrapes HTML from the CP website, it can stop
functioning if the layout of the scraped pages changes. Here's hoping that
does not happen so frequently that the app becomes tedious to maintain.
This application uses the excellent and highly recommended HTML Agility Pack
for parsing HTML.
Using the app
I've included screenshots below that you can look at, but the app is quite
simple to use. When you run the app, you get a panorama view and the first page
prompts you to enter the member id (for example, my member id is 20248). You can get
your member id from your profile page. Since the app does not require you to be
logged in, you can run this on any member id you want to. The app will maintain
a history of the last 10 member ids that you ran it on, so it will save you some
typing (which is not pleasant on such small devices with virtual keypads). The
recent member ids are stored using isolated storage and are thus persisted until
you uninstall the app.
When you run a fetch, the other three pages in the panorama view come into
play. The first of these gives you a category-based split-up of your last 200
posts. The Lounge falls under the Page category, but most other forums,
including technical and some non-technical ones like the Indian forum, fall under
Forum. Any posts you make on articles are classified, unsurprisingly, under
Article, and then you have the Survey and Member
categories (self-explanatory). There may be others that I missed or did not
encounter; those will show up as Unknown for now.
The next page is the per-forum split-up of your posts, so you can see how
many posts you made in a specific forum. Since Chris chose to expose only the
last 200 messages you posted, that's the limit for this app too. So the app will
basically analyze your most recent posting activity. While debugging/testing the
app I found that most of mine were in the Lounge until the India-SA ODI game
happened, and then my posts in the Indian forum dominated like crazy. I ran it
on Chris and found that he spends most of his time answering questions in the
Site Bugs / Suggestions forum (he does perform a mostly thankless job I guess).
The last page in the panorama view lists all 200 posts to give you a
quick summary of what you posted recently: just the thread titles, not the
messages.
The app has support for tombstoning, so you can switch to another app,
come back, and find your state restored. Well, I reckon that's about it.
If you can think of something useful that I can add to the app, keeping in
mind the limitation that all my input data comes from HTML scraping, I'll
be happy to try and make that change for you (provided I have the time). Check
out a few more screenshots, and then there's a brief section on the technical
implementation details.
More screenshots
CP Posts Analyzer in the application list
CP Posts Analyzer in action - running a fetch
CP Posts Analyzer in action - categories view for Roger
CP Posts Analyzer in action - forums view
CP Posts Analyzer in action - tile icon
Implementation details
The app was written in Silverlight for WP7, and the project attempts to follow a
basic MVVM pattern. To help with data binding, I use the following types to
represent the returned data.
namespace CPPostsAnalyzerWP7.Models
{
    public class PostInfo
    {
        public string ThreadName { get; set; }
        public string DisplayName { get; set; }
        public string TimeString { get; set; }
        public string ForumName { get; set; }
        public ForumType ForumType { get; set; }
    }
}

namespace CPPostsAnalyzerWP7.Models
{
    public class PostSummaryInfo
    {
        public string ForumName { get; set; }
        public int Count { get; set; }
        public ForumType ForumType { get; set; }
    }
}

namespace CPPostsAnalyzerWP7.Models
{
    public enum ForumType
    {
        Unknown = 0,
        Forum = 1,
        Page = 2,
        Article = 3,
        Survey = 4,
        Member = 5
    }
}
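To make the relationship between these types a little more concrete, here is a minimal sketch of how a list of PostInfo objects could be rolled up into PostSummaryInfo objects with LINQ. This is not how the app does it (the fetcher keeps running counts in dictionaries while it parses); the PostSummarizer class and its method names are hypothetical, and the types are trimmed-down copies so that the snippet stands alone.

```csharp
using System.Collections.Generic;
using System.Linq;

public enum ForumType { Unknown = 0, Forum = 1, Page = 2, Article = 3, Survey = 4, Member = 5 }

// Trimmed-down copies of the model types, so the sketch is self-contained.
public class PostInfo
{
    public string ForumName { get; set; }
    public ForumType ForumType { get; set; }
}

public class PostSummaryInfo
{
    public string ForumName { get; set; }
    public int Count { get; set; }
    public ForumType ForumType { get; set; }
}

// Hypothetical helper, not part of the app.
public static class PostSummarizer
{
    // One summary per forum name, largest count first.
    public static List<PostSummaryInfo> ByForum(IEnumerable<PostInfo> posts)
    {
        return posts
            .GroupBy(p => p.ForumName ?? "(Untitled)")
            .Select(g => new PostSummaryInfo
            {
                ForumName = g.Key,
                Count = g.Count(),
                ForumType = ForumType.Unknown // mirrors what the fetcher stores for per-forum rows
            })
            .OrderByDescending(s => s.Count)
            .ToList();
    }

    // One summary per category (Forum, Page, Article, ...), largest count first.
    public static List<PostSummaryInfo> ByCategory(IEnumerable<PostInfo> posts)
    {
        return posts
            .GroupBy(p => p.ForumType)
            .Select(g => new PostSummaryInfo
            {
                ForumName = string.Empty,
                Count = g.Count(),
                ForumType = g.Key
            })
            .OrderByDescending(s => s.Count)
            .ToList();
    }
}
```

Calling PostSummarizer.ByForum(posts) on the fetched posts would give the data behind the forums view, and ByCategory(posts) the data behind the categories view.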
There is a PostsFetcher class that handles the HTML fetch and parsing, and it
uses the excellent HTML Agility Pack library.
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

namespace CPPostsAnalyzerWP7.Models
{
    public class PostsFetcher
    {
        private string memberId;

        public PostsFetcher(string memberId)
        {
            this.memberId = memberId.Trim();
        }

        // Raised for every post parsed out of a page.
        public event EventHandler<PostInfoEventArgs> PostFetched;

        // Raised when the fetch finishes; Error is set if something went wrong.
        public event EventHandler<FetchCompletedEventArgs> FetchCompleted;

        private void FirePostFetched(PostInfoEventArgs e)
        {
            var handler = this.PostFetched;
            if (handler != null)
            {
                handler(this, e);
            }
        }

        private void FireFetchCompleted(FetchCompletedEventArgs e)
        {
            var handler = this.FetchCompleted;
            if (handler != null)
            {
                handler(this, e);
            }
        }

        // Pages 1 through maxPage are requested one after another.
        private int nextPage = 1;
        private const int maxPage = 4;

        public void Fetch()
        {
            // The member id must be numeric before any request is made.
            int temp;
            if (Int32.TryParse(this.memberId, out temp))
            {
                LoadNextPageAsync();
            }
            else
            {
                FireFetchCompleted(new FetchCompletedEventArgs()
                    { Error = new ArgumentException("Invalid memberId.") });
            }
        }

        private void LoadNextPageAsync()
        {
            HtmlWeb.LoadAsync(String.Format(
                "http://www.codeproject.com/script/Forums/Messages.aspx?fmid={0}&fid=0&pgnum={1}",
                this.memberId, nextPage++), HtmlLoaded);
        }
        private void HtmlLoaded(object sender, HtmlDocumentLoadCompleted e)
        {
            if (e.Error != null)
            {
                FireFetchCompleted(new FetchCompletedEventArgs() { Error = e.Error });
                return;
            }

            try
            {
                ParseHtml(e);
            }
            catch (Exception ex)
            {
                FireFetchCompleted(new FetchCompletedEventArgs() { Error = ex });
                // Stop here; otherwise the fetch would carry on and report completion again.
                return;
            }

            if (nextPage > maxPage)
            {
                // All pages are in: convert the running counts into summaries.
                var args = new FetchCompletedEventArgs();
                foreach (var item in forumTypeCountMap)
                {
                    args.ForumTypeSummaries.Add(new PostSummaryInfo()
                        { ForumType = item.Key, Count = item.Value, ForumName = String.Empty });
                }
                foreach (var item in forumPostsCountMap)
                {
                    args.PostSummaries.Add(new PostSummaryInfo()
                        { ForumType = ForumType.Unknown, Count = item.Value, ForumName = item.Key });
                }
                FireFetchCompleted(args);
            }
            else
            {
                LoadNextPageAsync();
            }
        }
        // Running totals, keyed by category and by forum name.
        private Dictionary<ForumType, int> forumTypeCountMap = new Dictionary<ForumType, int>();
        private Dictionary<string, int> forumPostsCountMap = new Dictionary<string, int>();

        private void ParseHtml(HtmlDocumentLoadCompleted e)
        {
            // The posts table has no id, so it is located by its cellspacing attribute.
            var tableNode = e.Document.DocumentNode.DescendantNodes().Where(
                n => n.Name.ToLower() == "table"
                    && n.Attributes["cellspacing"] != null
                    && n.Attributes["cellspacing"].Value == "4").FirstOrDefault();
            if (tableNode == null)
                return;

            var trNodes = tableNode.Descendants("tr");
            foreach (var trNode in trNodes)
            {
                // The first link in the row carries the thread title.
                var aNode = trNode.Descendants("a").FirstOrDefault();
                if (aNode == null)
                    continue;

                PostInfo postInfo = new PostInfo();
                postInfo.ThreadName = aNode.InnerText.Trim();

                var divNodes = trNode.Descendants("div").Where(
                    n => n.Attributes["class"] != null
                        && n.Attributes["class"].Value == "small-text subdue");
                if (divNodes.Count() == 2)
                {
                    var divNodesArray = divNodes.ToArray();

                    // First div: "by <display name> at <time>".
                    string nameAndTime = divNodesArray[0].InnerText.Trim();
                    int byPos = nameAndTime.IndexOf("by");
                    int atPos = nameAndTime.LastIndexOf("at");
                    if (byPos == -1 || atPos == -1)
                        continue;
                    postInfo.DisplayName = nameAndTime.Substring(byPos + 2, atPos - byPos - 2).Trim();
                    postInfo.TimeString = nameAndTime.Substring(atPos + 2).Trim();

                    // Second div: an optional forum name line, then a line with "(<category>)".
                    string[] forumLines = divNodesArray[1].InnerText.Trim().Split(
                        '\r', '\n').Where(s => !String.IsNullOrEmpty(s.Trim('\r', '\n'))).ToArray();
                    if (forumLines.Length < 1 || forumLines.Length > 2)
                        continue;
                    string forumNameInput = forumLines.Length == 1 ? "(Untitled)" : forumLines[0];
                    string forumTypeInput = forumLines.Length == 1 ? forumLines[0] : forumLines[1];
                    postInfo.ForumName = forumNameInput.Trim();

                    int leftBracketPost = forumTypeInput.IndexOf('(');
                    int rightBracketPost = forumTypeInput.IndexOf(')');
                    if (leftBracketPost == -1 || rightBracketPost == -1)
                        continue;

                    ForumType forumType = ForumType.Unknown;
                    try
                    {
                        string enumLine = forumTypeInput.Substring(
                            leftBracketPost + 1, rightBracketPost - leftBracketPost - 1).Trim();
                        forumType = (ForumType)Enum.Parse(typeof(ForumType), enumLine.Trim(), true);
                    }
                    catch (ArgumentException)
                    {
                        // Unrecognized category name: leave it as Unknown.
                    }
                    postInfo.ForumType = forumType;
                }
                else
                {
                    // Skip rows without the two expected divs. ForumName would still be
                    // null here, and Dictionary.ContainsKey(null) throws.
                    continue;
                }

                if (!forumTypeCountMap.ContainsKey(postInfo.ForumType))
                {
                    forumTypeCountMap[postInfo.ForumType] = 1;
                }
                else
                {
                    forumTypeCountMap[postInfo.ForumType]++;
                }

                if (!forumPostsCountMap.ContainsKey(postInfo.ForumName))
                {
                    forumPostsCountMap[postInfo.ForumName] = 1;
                }
                else
                {
                    forumPostsCountMap[postInfo.ForumName]++;
                }

                FirePostFetched(new PostInfoEventArgs() { PostInfo = postInfo });
            }
        }
    }
}
Just basic HTML parsing there. As you can see, I had to make some high-risk
assumptions (since some of the tables did not have identifiable ids associated
with them). Oh, and the reason I use Enum.Parse inside a try/catch is that WP7's
mscorlib does not have Enum.TryParse!
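If the missing TryParse bothers you, the try/catch can be tucked away in a small helper. The following is only a sketch: ForumTypeParser is a hypothetical class that is not part of the app, and it assumes the category text looks like the "(Page)" or "(Forum)" suffix handled by the parsing code above.

```csharp
using System;

public enum ForumType { Unknown = 0, Forum = 1, Page = 2, Article = 3, Survey = 4, Member = 5 }

// Hypothetical helper, not part of the app.
public static class ForumTypeParser
{
    // Extracts the text between the first '(' and the next ')' and maps it,
    // ignoring case, to a ForumType. Returns false (and Unknown) on any failure.
    public static bool TryParse(string line, out ForumType result)
    {
        result = ForumType.Unknown;
        if (string.IsNullOrEmpty(line))
            return false;

        int left = line.IndexOf('(');
        if (left == -1)
            return false;
        int right = line.IndexOf(')', left + 1);
        if (right == -1)
            return false;

        string name = line.Substring(left + 1, right - left - 1).Trim();
        try
        {
            var parsed = (ForumType)Enum.Parse(typeof(ForumType), name, true);
            // Enum.Parse also accepts numeric strings such as "42", so make sure
            // the value is one of the named members.
            if (!Enum.IsDefined(typeof(ForumType), parsed))
                return false;
            result = parsed;
            return true;
        }
        catch (ArgumentException)
        {
            return false; // empty or unrecognized name
        }
        catch (OverflowException)
        {
            return false; // numeric text too large for the enum's underlying type
        }
    }
}
```

With this in place, a call such as ForumTypeParser.TryParse("The Lounge (Page)", out forumType) reads much like the TryParse pattern found elsewhere in the framework, while the calling code stays free of exception handling.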
Tombstoning
As I mentioned earlier, the app supports tombstoning, and one of the
things I had to do was to ensure that all the types I wanted to restore were
fully compatible with serialization (which basically means they had to have
public properties, among other things). Here's the code that handles
tombstoning.
protected override void OnNavigatedFrom(System.Windows.Navigation.NavigationEventArgs e)
{
    App.MainViewModel.SaveState();
    base.OnNavigatedFrom(e);
}

protected override void OnNavigatedTo(System.Windows.Navigation.NavigationEventArgs e)
{
    App.MainViewModel.RetrieveState();
    base.OnNavigatedTo(e);
}
And here are the implementations in the view model.
public void SaveState()
{
    var appService = PhoneApplicationService.Current;
    appService.State.Clear();
    appService.State["MemberId"] = this.MemberId;
    appService.State["CanFetch"] = this.CanFetch;
    if (this.CanFetch)
    {
        appService.State["MemberName"] = this.MemberName;
        appService.State["Results"] = this.Results;
        appService.State["ForumTypeSummaries"] = this.ForumTypeSummaries;
        appService.State["PostSummaries"] = this.PostSummaries;
    }
}

public void RetrieveState()
{
    var appService = PhoneApplicationService.Current;
    if (appService.State.ContainsKey("MemberId"))
    {
        this.MemberId = (string)appService.State["MemberId"];
    }
    if (appService.State.ContainsKey("CanFetch"))
    {
        this.CanFetch = (bool)appService.State["CanFetch"];
    }
    if (!this.CanFetch)
    {
        this.CanFetch = true;
        return;
    }
    if (appService.State.ContainsKey("MemberName"))
    {
        this.MemberName = (string)appService.State["MemberName"];
    }
    if (appService.State.ContainsKey("Results"))
    {
        this.Results.Clear();
        foreach (PostInfo item in (IEnumerable<PostInfo>)appService.State["Results"])
        {
            this.Results.Add(item);
        }
    }
    if (appService.State.ContainsKey("ForumTypeSummaries"))
    {
        this.ForumTypeSummaries.Clear();
        foreach (PostSummaryInfo item in
            (IEnumerable<PostSummaryInfo>)appService.State["ForumTypeSummaries"])
        {
            this.ForumTypeSummaries.Add(item);
        }
    }
    if (appService.State.ContainsKey("PostSummaries"))
    {
        this.PostSummaries.Clear();
        foreach (PostSummaryInfo item in
            (IEnumerable<PostSummaryInfo>)appService.State["PostSummaries"])
        {
            this.PostSummaries.Add(item);
        }
    }
}
I'll just mention in passing that I made some unsuccessful attempts to
serialize the entire view model and it was doomed from the very beginning. I
eventually gave up and decided to serialize what I specifically wanted. I guess
I could have got it to work but it would have been too much hassle for no return
at all, except maybe some artificially boosted self-esteem which I don't care
for anyway!
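One way to check whether a type is serialization-friendly before dropping it into the State dictionary is a quick round trip through a serializer. The sketch below rests on an assumption: that the phone persists State values with data contract serialization, which is my understanding but is not something the code above shows. RoundTrip is a hypothetical helper, and PostSummaryInfo is a trimmed-down copy of the model type so that the snippet stands alone.

```csharp
using System.IO;
using System.Runtime.Serialization;

// Trimmed-down copy of the model type: public class, public parameterless
// constructor (implicit), and public read/write properties.
public class PostSummaryInfo
{
    public string ForumName { get; set; }
    public int Count { get; set; }
}

// Hypothetical helper, not part of the app.
public static class RoundTrip
{
    // Serializes the value to memory and reads it back. If the type is not
    // serializable, WriteObject or ReadObject throws instead of returning.
    public static T Clone<T>(T value)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, value);
            stream.Position = 0;
            return (T)serializer.ReadObject(stream);
        }
    }
}
```

If RoundTrip.Clone(summary) hands back an object with the same ForumName and Count, the type should be safe to tombstone under the assumption above; a type that cannot make the trip throws right away, which is easier to diagnose than a failure during deactivation.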
Isolated Storage
The list of recent member ids is stored via isolated storage, so your 10 most
recent searches will be remembered.
public MainViewModel()
{
    IsolatedStorageFile store = IsolatedStorageFile.GetUserStoreForApplication();
    if (!store.FileExists(configPath))
        return;
    using (StreamReader reader = new StreamReader(
        new IsolatedStorageFileStream(configPath, FileMode.Open, store)))
    {
        var lines = reader.ReadToEnd().Trim().Split('\r', '\n').Where(
            s => !String.IsNullOrEmpty(s.Trim('\r', '\n'))).ToList();
        lines.ForEach(line => recentMemberIds.Add(line));
    }
}

private void SaveConfig()
{
    IsolatedStorageFile store = IsolatedStorageFile.GetUserStoreForApplication();
    string baseDirectory = System.IO.Path.GetDirectoryName(configPath);
    if (!store.DirectoryExists(baseDirectory))
    {
        store.CreateDirectory(baseDirectory);
    }
    using (StreamWriter writer = new StreamWriter(
        new IsolatedStorageFileStream(configPath, FileMode.Create, store)))
    {
        foreach (var entry in recentMemberIds)
        {
            writer.WriteLine(entry);
        }
    }
}
The built-in support makes it incredibly easy to use isolated storage.
Initially, I did consider saving posts and accumulating them so that I could
analyze more than 200 posts, but that assumes the user will run the app frequently
enough that there won't be missing posts (which is very hard to enforce,
in fact practically impossible). So I gave up and decided 200 is good enough. If
Chris ever increases that limit to 1000, I'll update the app then.
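The code above loads and saves the recent member ids but does not show how the list is kept to 10 entries. Here is a minimal sketch of one way to do that; RecentIds and Remember are hypothetical names, and the most-recent-first ordering is my assumption rather than something taken from the app.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of the app.
public static class RecentIds
{
    public const int MaxEntries = 10;

    // Moves (or adds) the id to the front of the list and trims the list
    // so that it never holds more than MaxEntries items.
    public static void Remember(IList<string> recent, string memberId)
    {
        if (memberId == null || memberId.Trim().Length == 0)
            return;

        string id = memberId.Trim();
        recent.Remove(id);     // no-op if the id is not in the list yet
        recent.Insert(0, id);  // most recent entry first

        while (recent.Count > MaxEntries)
        {
            recent.RemoveAt(recent.Count - 1); // drop the oldest entry
        }
    }
}
```

Calling RecentIds.Remember(recentMemberIds, memberId) after each successful fetch, followed by SaveConfig(), would keep the stored history short and free of duplicates.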
Conclusion
Well, that's it. The app has been tested reasonably thoroughly, but there may be
issues, especially on other phone models. The error handling is kinda silent, so
if the app encounters an error it won't break down, but it won't tell you either. You
could try running a fetch again, or close and re-run the app (although I
have never had to do that so far).
Feel free to submit feedback, criticism, and suggestions as usual.
History
- January 23rd 2011 - Article first published.