In this article we look at: Using RealNews, the folder structure for RealNews, RealNews's integrated web server, WebClient and https, and running the app as a single instance.
Introduction
RealNews is a WinForms RSS reader similar in features to FeedReader, which has been out of support for 10 years and is showing its age on the Windows 10 operating system. As a replacement for FeedReader, I tried many other RSS readers such as RSSOwl (which is also unsupported), but none of them felt right, so I decided to write my own.
This is a personal-need project that I have released to the public for anyone who has a similar need and might find it useful. It uses the following libraries:
The RealNews source code is also on GitHub at https://github.com/mgholam/RealNews.
Features
RealNews has the following features:
- Super fast performance
- No database used or required
- Tiny 310KB single EXE file
- Fully offline mode to view feed items with associated images
- Download images in a daily time window
- Download only images below a size limit
- Update every X minutes globally or based on each individual feed setting
- Manual update for individual feed or globally
- Editable style.css for the feed view window
- HTML sanitizer for the feed item content to strip out JavaScript and potentially risky tags
- Read feed definitions from an OPML file
- Light and dark theme
Limitations
Currently (this may change in the future), there are the following limitations:
- No writing of OPML files
- No drag-and-drop moving of feeds into folders
Using RealNews
You can get started with RealNews by importing an existing OPML file or by adding feeds individually from the File menu. When adding a feed individually, you will see the following form:
You put your feed in the Feed Address text box and press the get info button so RealNews can fetch the feed name from the URL for you.
You can put this feed in a category folder in the treeview by setting the category text.
If you set the Update Every number to a value greater than 0, this feed will update on its own schedule as well as the global schedule.
If you minimize RealNews, it goes down to the tray, and you can restore it or exit from there.
Controlling RealNews
All the settings for the application are in the form below, which is self-explanatory:
If you click on the style.css link, you can edit the feed item style that is used in the browser control.
Keyboard Shortcuts
You can control RealNews with the following shortcut keys:
Key | Description |
F2 | Edit feed information |
Ctrl+L | Show logs |
Ctrl+S | Toggle star for feed item |
Space | Move to next feed item |
Alt+Up | Move up in current feed |
Alt+Down | Move down in current feed |
Ctrl+D | Delete current or selected items |
The Internals
Most of the code is in frmMain.cs, which handles the complex UI for the application.
The Files and Folders
RealNews has the following folder structure:
The feed lists are JSON files holding the feed items, named after the feed name you have specified. The feed folder also contains two other files:
- downloadimg.list: the download queue for images extracted from feed items, in JSON
- feeds.list: the main list of all the feeds you have added, in JSON
Feed Reader
The feed reader code is from the URL at the top of the article; it has been modified to support extracting all the information in the "non-standard" standard RSS formats.
Internal Web Server
RealNews has an integrated web server to show the feed items in the web browser control. The web server serves the HTML for the feed item and also serves the images in the content via the ImageCache class.
You can also view the content in a normal browser by going to the localhost URL with the port from the settings file.
The generated content page replaces the image tag URLs with links back to this web server, so the images are served from the image cache and not from the internet.
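The image link rewriting described above might be sketched as follows. This is an illustration only, not the RealNews code: the class name, the `/image?url=` route and the `port` parameter are all assumptions, and a regular expression is a simplification that will miss unusual markup (for example, unquoted attributes).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: points every <img src="..."> at a local web server.
public static class ImageLinkRewriterSketch
{
    // Group 1: everything up to "src=", group 2: the quote character, group 3: the URL.
    private static readonly Regex ImgSrc = new Regex(
        "(<img\\b[^>]*?\\bsrc\\s*=\\s*)([\"'])(.*?)\\2",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static string Rewrite(string html, int port)
    {
        return ImgSrc.Replace(html, m =>
            m.Groups[1].Value + m.Groups[2].Value +
            // Assumed route: the original URL travels as an escaped query value.
            "http://localhost:" + port + "/image?url=" +
            Uri.EscapeDataString(m.Groups[3].Value) +
            m.Groups[2].Value);
    }
}
```

The server side would then unescape the `url` value and look the image up in the cache instead of going to the internet.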
Image Cache
Most feed readers use a database to store the content (images, etc.). I went back and forth on this and initially used RaptorDB, but I decided against it when I came up with this solution. First, let me state the problem:
- Images in RSS feeds can have really long URLs that may not end in an image extension.
- Using this URL as a file name on disk may not be feasible because of its length and because it may contain special characters that the file system does not allow.
An example of a URL is below (this was taken from downloadimg.list and is JSON-stringified):
"https://o.aolcdn.com/images/dims?crop=5916%2C3944%2C0%2C0&quality=85&
format=jpg&resize=1600%2C1067&image_uri=
http%3A%2F%2Fo.aolcdn.com%2Fhss%2Fstorage%2Fmidas%2Ff9c858ae2e3bcaa713b57e2e88624c74%
2F206499306%2Fside-view-of-a-bright-red-tesla-model-3-automobile-from-tesla-motors-
picture-id974894362&client=a1acac3e1b3290917d92&
signature=f1743691ec12a1171bdb6e03cdf1e407a93dcc58"
This is just one image URL and, as you can see, there is no reliable way to get an image name from it to save to disk.
So the solution I came up with is dead simple: instead of dealing with parsing headaches to get an image name to store on disk with all the limitations stated, I generate the hash of the URL and use that number as the file name on disk (the simplest hash is the one already defined for string in .NET).
To isolate images further, I also extract the domain name from the URL and store the hashed file in a folder of that name, which is handy for humans to browse through when needed.
All this removes the need for a database, which would essentially just be mapping a lookup URL to a file name anyway.
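A minimal sketch of the URL-to-file-name idea follows. It is not the RealNews implementation: the class and method names are hypothetical, and it swaps in SHA-1 for the built-in string hash, because `string.GetHashCode()` is not guaranteed to return the same value across .NET versions or processes.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, not the class used by RealNews.
public static class ImageCachePathSketch
{
    public static string GetCachePath(string cacheRoot, string imageUrl)
    {
        // The host name becomes the folder, so the cache can be browsed per site.
        string folder = new Uri(imageUrl).Host;

        // Assumption: SHA-1 of the full URL as a stable, file-system-safe name.
        using (var sha1 = SHA1.Create())
        {
            byte[] hash = sha1.ComputeHash(Encoding.UTF8.GetBytes(imageUrl));
            string fileName = BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
            return Path.Combine(cacheRoot, folder, fileName);
        }
    }
}
```

The same URL always maps to the same path, so checking whether an image is already cached is a plain `File.Exists` call with no lookup table involved.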
HTML Sanitizer
The HTML sanitizer's job is to strip out potentially problematic HTML tags from the feed item and replace them with safe ones, so that viewing the feed item is less of a security risk.
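To give a rough idea of what such stripping involves, here is a deliberately small sketch. It is not the sanitizer used in RealNews: the class name and the list of blocked tags are assumptions, it only removes content rather than replacing it, and regex-based filtering like this can be bypassed by malformed markup, so treat it as an illustration rather than a security control.

```csharp
using System.Text.RegularExpressions;

// Hypothetical, simplified sanitizer for illustration only.
public static class SanitizerSketch
{
    // Paired risky elements together with their content.
    private static readonly Regex BlockTags = new Regex(
        @"<(script|iframe|object|embed)\b[^>]*>.*?</\1\s*>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Any leftover unpaired opening or closing risky tag.
    private static readonly Regex LoneTags = new Regex(
        @"</?(script|iframe|object|embed)\b[^>]*>",
        RegexOptions.IgnoreCase);

    // Inline event handlers such as onclick="..." or onload=....
    // Note: this also removes matching text that appears outside of tags.
    private static readonly Regex EventAttrs = new Regex(
        @"\s+on\w+\s*=\s*(""[^""]*""|'[^']*'|[^\s>]+)",
        RegexOptions.IgnoreCase);

    public static string Clean(string html)
    {
        html = BlockTags.Replace(html, "");
        html = LoneTags.Replace(html, "");
        html = EventAttrs.Replace(html, "");
        return html;
    }
}
```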
Image Downloader
Between the start time and end time in the settings file, RealNews will download the images it has in the download queue. The image download task first checks the image size by requesting that information from the server; if the size is below the limit you have specified, it downloads the image and saves it to the cache.
If the download fails with a timeout, the image is re-queued for a later retry; on any other failure, the image is skipped.
The image downloader starts 200 tasks at once, with a break of 4 seconds between batches, until the queue is empty.
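The batching behaviour can be sketched as below. This is an assumption-laden illustration, not the RealNews code: the names are hypothetical, the actual download is passed in as a delegate, the sketch waits for a batch to finish before pausing, and it uses async/await, which needs a newer framework than .NET 4.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public enum DownloadResult { Saved, Skipped, TimedOut }

// Hypothetical batch runner for illustration only.
public static class BatchDownloaderSketch
{
    // Drains the queue in batches and returns the URLs that timed out,
    // so the caller can re-queue them for a later run.
    public static async Task<List<string>> DrainAsync(
        Queue<string> queue, int batchSize, TimeSpan pause,
        Func<string, Task<DownloadResult>> download)
    {
        var retryLater = new List<string>();
        while (queue.Count > 0)
        {
            var batch = new List<string>();
            while (batch.Count < batchSize && queue.Count > 0)
                batch.Add(queue.Dequeue());

            // Start the whole batch at once and wait for every task to finish.
            DownloadResult[] results = await Task.WhenAll(batch.Select(download));

            for (int i = 0; i < batch.Count; i++)
                if (results[i] == DownloadResult.TimedOut)
                    retryLater.Add(batch[i]);

            // Pause between batches, but not after the last one.
            if (queue.Count > 0)
                await Task.Delay(pause);
        }
        return retryLater;
    }
}
```

With the values from the article, a call would pass a batch size of 200 and a pause of `TimeSpan.FromSeconds(4)`.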
Points of Interest
WebClient and https
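Weirdly, for the WebClient to work correctly with https in .NET 4, you have to add the following code (the value is undocumented there and has no IntelliSense; 3072 is the numeric value that later .NET versions name SecurityProtocolType.Tls12):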
ServicePointManager.SecurityProtocol = (SecurityProtocolType)3072;
Also, to force the WebClient to use HTTP compression on connections, you need to change the request. The entire new code is as follows:
using System;
using System.Net;

public class mWebClient : WebClient
{
    public mWebClient()
    {
        Timeout = 10 * 1000; // milliseconds
        // 3072 = TLS 1.2
        ServicePointManager.SecurityProtocol = (SecurityProtocolType)3072;
        base.Encoding = System.Text.Encoding.UTF8;
        if (Settings.UseSytemProxy)
            Proxy = WebRequest.DefaultWebProxy;
    }

    public int Timeout { get; set; }

    protected override WebRequest GetWebRequest(Uri uri)
    {
        WebRequest request = base.GetWebRequest(uri);
        var http = request as HttpWebRequest;
        if (http != null)
        {
            // Request gzip/deflate and decompress the response transparently
            http.AutomaticDecompression = DecompressionMethods.GZip |
                                          DecompressionMethods.Deflate;
            http.ReadWriteTimeout = Timeout;
        }
        request.Timeout = Timeout;
        return request;
    }
}
Non Standard, Feed Standards
While RSS has its standards (there are several), each provider interprets them differently, which was a real headache to handle despite the excellent feed reader code used.
For example, The Guardian and CNN feeds have their images not in the main feed description but in the "extra" content.
Also, feeds like Android Central have their entire content in the extra section instead of the description section.
For all these non-standard "standards", I had to change the feed reader code to extract these extra sections too, for showing in the view area.
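As an illustration of what reading such extra sections can look like, the sketch below uses LINQ to XML. It rests on assumptions not confirmed by the article: that the "extra" content is the common `content:encoded` element and that images arrive as Media RSS `media:content` elements. The class and method names are hypothetical and this is not the modified feed reader code.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration only.
public static class FeedItemExtrasSketch
{
    private static readonly XNamespace Content = "http://purl.org/rss/1.0/modules/content/";
    private static readonly XNamespace Media = "http://search.yahoo.com/mrss/";

    // Returns whichever of <description> and <content:encoded> carries more text.
    public static string GetBody(XElement item)
    {
        string description = (string)item.Element("description") ?? "";
        string encoded = (string)item.Element(Content + "encoded") ?? "";
        return encoded.Length > description.Length ? encoded : description;
    }

    // Collects the url attribute of every <media:content> element under the item.
    public static List<string> GetMediaUrls(XElement item)
    {
        return item.Descendants(Media + "content")
                   .Select(e => (string)e.Attribute("url"))
                   .Where(u => !string.IsNullOrEmpty(u))
                   .ToList();
    }
}
```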
Single Instancing
To stop data conflicts, the app needed to run as a single instance, with the added requirement that this applies per running folder (you can have multiple instances if each runs from its own folder).
For this, I first created a "temp" file in the data folder on startup and deleted it on closing, and checked for its existence when starting to handle single instancing. The problem with this was that it failed if the computer died in a power outage, and I had to manually delete the temp file to get the app working again.
The final solution was to check the running processes on the computer and see whether the path of any running EXE matches this process's path:
_path = Path.GetDirectoryName(Assembly.GetEntryAssembly().Location);
var name = Process.GetCurrentProcess().MainModule.ModuleName.Split('.')[0];
if (_path.EndsWith("\\") == false) _path += "\\";
// All running processes that share this EXE's name
var pp = Process.GetProcessesByName(name);
var found = 0;
foreach (var pi in pp)
{
    // Count only the ones started from this folder
    if (pi.MainModule.FileName.StartsWith(_path))
        found++;
}
if (found == 1)
{
    // Only this process runs from this folder: continue startup
}
else
    MessageBox.Show("Only one instance at a time");
If the found count is 1 (this includes the current process itself), then we're OK to run; otherwise another instance is already running.
HTML Rendering vs Browser Control
At first, for security reasons, I decided to render the feed as HTML with one of the HTML render controls I found, but none of these were suited to the task and all had their limitations, so they were scrapped.
After a lot of testing, I fell back to the WinForms browser control (which is a really old IE wrapper), but for the purposes of RealNews it works fine, since the app's HTML sanitizer strips out all JavaScript and potential threat tags, leaving content the browser control can handle (for newer web pages, the IE control just chokes with a lot of errors for JavaScript it doesn't understand).
I also looked at the Chrome browser control for WinForms, but this added 50MB to the size of the application, which is excessive for the needs here.
Icons
The black & white icons are from the FontAwesome web icon font and were extracted by a little program I found on CodeProject, which generates .png files for the image dimensions you require.
a tags with target=_blank
If the HTML in the description has a tags with the target set, then the browser control opens IE to that URL, which is a disaster. To overcome this and open the links in your default browser, you need to add the following code (also add a COM reference to "Microsoft Internet Controls"):
(this.webBrowser1.ActiveXInstance as SHDocVw.WebBrowser).NewWindow3 += FrmMain_NewWindow3;

private void FrmMain_NewWindow3(ref object ppDisp, ref bool Cancel,
    uint dwFlags, string bstrUrlContext, string bstrUrl)
{
    Cancel = true;
    Process.Start(bstrUrl);
}
The Many Versions of MoveNextUnread()
This method went through a lot of revisions and tweaks. It started out as a lot of foreach loops without folder/category support and finally came down to some interesting LINQ statements, which are really cool:
var found = false;
if (_currentFeed.Folder != "")
{
    var f = _feeds.FindAll(x => x.UnreadCount > 0 &&
                                x.Folder != "" &&
                                x.FullTitle.CompareTo(_currentFeed.FullTitle) > 0)
                  .OrderBy(x => x.FullTitle)
                  .ToList();
    if (f.Count() > 0)
    {
        _currentFeed = f[0];
        found = true;
    }
}
if (found == false)
{
    var f = _feeds.FindAll(x => x.UnreadCount > 0 &&
                                x.Folder == "")
                  .OrderBy(x => x.Title)
                  .ToList();
    if (f.Count() > 0)
        _currentFeed = f[0];
    else
    {
        f = _feeds.FindAll(x => x.UnreadCount > 0 &&
                                x.Folder != "")
                  .OrderBy(x => x.FullTitle)
                  .ToList();
        if (f.Count() > 0)
            _currentFeed = f[0];
    }
}
This version seems to work intuitively moving down the tree, looping back to the top and finding the next unread feed (hopefully!)
History
- Initial Version v1.0: 2nd August, 2018
- Update v1.1: 23rd August, 2018
- Upgrade to fastJSON v2.2.0
- Search clear button visible only if text entered
- Fix search and current unread count update
- Added cleanup image cache for orphan images
- Update v1.3: 25th February, 2019
- Usability tweaks
- Check image already downloaded before downloading
- Code cleanup
- Added feed update timeout
- Local datetime in settings.config
- Selecting folder/category shows all feeds below it
- Update v1.4: 11th April, 2019
- Added alt+up and down to navigate the feed list
- Usability tweaks
- Fix for some sites returning 403
- Update v1.4.1: 8th June, 2019
- Added Google search the feed item title in the toolbar
- Cleaned up the UI style
- Updated to new zipstorer.cs
- Update v1.4.3: 23rd August 2019
- Added the use of a custom proxy
- Colour coding feeds in tree if failed to get data
- Search title pre-process plugin
- Update v1.4.4: 30th August 2019
- Add/edit feed form validation checks
- Added OnCloseMinimize config to close or minimize on form close click
- Adding a new feed with category updates the tree correctly
- Added SkipFeedItemsDaysOlderThan config for feeds that don't filter recent items when sending
- Singleton app now works correctly and opens existing window even if in tray
- Update v1.5.0: 11th December 2019
- bug fix filtering old items when item does not provide a publish date
- cleanup all, resets failed feed colours
- added dark theme
- Update v1.5.1 : 30th December 2019
- bug fix ensuring items are visible at the end of the feed list
- memory cleanup on load and save feeds
- bug fix running on WinXP
- Update v1.5.2 : 14th January 2020
- bug fix running on Win7
- check for 0 length images in cache
- Update v1.5.4 : 28th May 2020
- replaced ListView with custom ListBox because of theme problems
- image cache check for zero length files
- caching compiled plugin MethodInfo
- internal feature flag
- Update v1.5.5 : 29th May 2020
- bug fix sorted feed items
- Update v1.5.6 : 4th June 2020