Tracking Microsoft Solutions (a Newer SCM tool)

13 Feb 2014 | CPOL | 17 min read
A small utility that feature-creeped into Yet Another SCM tool - a useful app to examine
Image 1

Introduction

Newer is an application to compare snapshots of solutions taken over time. This tool can be helpful when your task is to quickly gauge where new development has occurred, where previous snapshots have become obsolete, and to find when these changes took place with or without connection to your SCM server. In a pinch, Newer can be used, not to supplement your SCM system, but as your SCM system.

Note: I use the term Software Configuration Management (SCM) instead of Revision or Version Control Software or System (VCS) since Newer is mostly a change-tracking tool and does not keep a database of file version differences.

Background

This tool started as a quick means of checking newer source files in a particular development environment. I wanted something that would simply let me see which files were "Newer" than a given date and hour. Hence the app's name, which I haven't changed despite feature creep. My motivation was the need to see, day by day, which files had been updated throughout a large solution. The SCM system we use currently, Seapine Surround SCM, lets you "Get" a copy of the latest version of files from a repository branch root or subfolder with check-in timestamps. The first version of Newer, shown above, had a button to choose the 'gotten-to' folder, a date picker and hour selector, and two buttons to show the list of files that were newer or older than the picked date and selected hour. This was all fine for tracking check-in timestamps in a single snapshot.
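As a rough illustration of the "newer than a date and hour" idea (a minimal sketch, not Newer's actual code), the filter can be written with a few LINQ calls. The class and method names here are hypothetical, and the sketch assumes every folder under the root is readable; it will throw on an access-denied folder rather than record it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class NewerFilter
{
    // Returns the files under root whose last-write time is later than cutoff,
    // sorted by path. Hypothetical helper; assumes all subfolders are readable.
    public static List<string> NewerThan(string root, DateTime cutoff)
    {
        return Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories)
                        .Where(path => File.GetLastWriteTime(path) > cutoff)
                        .OrderBy(path => path, StringComparer.OrdinalIgnoreCase)
                        .ToList();
    }
}
```

A call such as `NewerFilter.NewerThan(@"C:\Snapshots\Monday", new DateTime(2014, 1, 15, 8, 0, 0))` (the folder name is made up) would list everything written after 8 AM on that date. Flipping the comparison gives the "older" list.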

The First Creep

I thought I was through with a small one-off hack but then I started wanting a way to compare two snapshots taken on different days. Hey, why not extend that Newer utility and add that capability? (Sound a little familiar?) I added a second folder chooser and more buttons to list files unique in the first folder, unique in the second, common in both folders with the same content, and common but with differences -- aka updates. I added CheckBoxes to select which file types to show for each of these listings and counting logic to show the number of files of each type. Since the lists could be long, I added a search feature to find the first or next occurrence of a string in the file list (case insensitive) and a button to set the starting point of searches to whatever line had or started a text selection. Newer now looked like this:

Image 2
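The name-based part of that comparison boils down to set operations on relative paths. The following is a minimal sketch under my own assumptions (hypothetical class and method names, case-insensitive path matching as on a typical Windows file system); it is not taken from the Newer source.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SnapshotCompare
{
    // Collects paths relative to root so the two snapshots can be matched by name.
    // Case-insensitive matching is an assumption that suits Windows.
    public static HashSet<string> RelativeFiles(string root)
    {
        var names = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string path in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
        {
            names.Add(path.Substring(root.Length)
                          .TrimStart(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar));
        }
        return names;
    }

    // Splits two snapshots into: unique to root 1, unique to root 2, common to both.
    public static (List<string> Only1, List<string> Only2, List<string> Common)
        Classify(ISet<string> files1, ISet<string> files2)
    {
        var only1 = files1.Where(f => !files2.Contains(f)).OrderBy(f => f, StringComparer.Ordinal).ToList();
        var only2 = files2.Where(f => !files1.Contains(f)).OrderBy(f => f, StringComparer.Ordinal).ToList();
        var common = files1.Where(f => files2.Contains(f)).OrderBy(f => f, StringComparer.Ordinal).ToList();
        return (only1, only2, common);
    }
}
```

The "Common" group still has to be split by content into same and different files, which is where the equality and equivalency checks described later come in.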

More is Better

Good, but then I wanted to check on saved executables and a few other file types that came into play for our mixed platform, mixed-era solutions. Note the separate .vcproj and .vcxproj CheckBoxes. Why not add just one or two more CheckBoxes with counters? (This should also sound familiar.)

Also, I found I wanted a way to invalidate internal dictionaries whenever I separately added files, deleted files, or altered any file timestamp in a chosen folder. And I wanted to be sure I was correctly numbering lines when a search string was found. But this would mess up the 'select-and-copyability' of file lists if line numbering was hardwired. So for both of these, I added a "Renew" button to clear and rebuild internal dictionaries and a "line #'s" CheckBox. Showing "Diff" files of two snapshots, Newer now looked like this:

Image 3

Creeping Doubt

Speaking of wanting to be sure, I began to doubt how well Newer would handle file lists if it were challenged with comparing a 'special' folder, either inadvertently or on purpose. Wouldn't a Newer user get tired of OK'ing or X'ing dialogs if I simply used MessageBox.Show for exceptional cases? Wouldn't it be faster to just X out of Newer altogether, restart it, and avoid re-choosing the special folder?

This is something I wanted to avoid. So instead of showing a MessageBox (as I did originally), I added the feature of noting IO exceptions in the internal dictionaries along with normal file entries. The file listing code could then do something with all the exceptional entries at one time. As a minimum, the exceptional entries could be listed along with 'good' entries. Minimum sounded good to me. (Please note there are other areas where MessageBox.Show still made sense to me.)

The following shows a listing of my C:\Windows files that are older than 1/15/2014. Internally, a SortedDictionary<string, DateTime> is used. That is why all the exceptional "*" entries are shown together at the beginning of the list. The list has been scrolled down some to show where the numbered 'good' entries start.

Image 4

I wanted to point out where this is done in the code. The following is an abbreviated listing of the start of the TransverseTree method, which is essential to list generation in Newer and is a fairly straight up rip off of the non-recursive method of TraverseTree in MSDN. Please see reference (1).

C#
private DateTime TransverseTree(string root, SortedDictionary<string, DateTime> fileDict ... )
{
    Queue<string> dirs = new Queue<string>(20);

    if (!Directory.Exists(root))
    {
        throw new ArgumentException(); // let system handle
    }
    dirs.Enqueue(root);

    while (dirs.Count > 0)
    {
        string currentDir = dirs.Dequeue();
        string[] subDirs;
        try
        {
            subDirs = Directory.GetDirectories(currentDir); // absolute paths
        }
        catch (Exception e)
        {
            //Save exception in fileDict.
            fileDict["* " + currentDir + " GetDirectories Exception " + e.Message] = DateTime.Now;
            continue;
        }
    ...
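For readers who want something complete to experiment with, here is a self-contained sketch of the same pattern: a queue-based walk that stores failures as "* " keys next to ordinary file entries. It is not the elided remainder of TransverseTree. The class name, the narrower exception filter, and the ordinal comparer (chosen so that "*" keys reliably sort ahead of drive-letter paths) are all my assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class TreeWalk
{
    // Walks root without recursion. Files map to their last-write time;
    // folders that cannot be read are recorded under a key starting with "* ".
    public static SortedDictionary<string, DateTime> Collect(string root)
    {
        // Ordinal comparison (an assumption): '*' sorts before letters and digits.
        var fileDict = new SortedDictionary<string, DateTime>(StringComparer.Ordinal);
        var dirs = new Queue<string>();
        dirs.Enqueue(root);

        while (dirs.Count > 0)
        {
            string currentDir = dirs.Dequeue();
            try
            {
                foreach (string file in Directory.GetFiles(currentDir))
                    fileDict[file] = File.GetLastWriteTime(file);
                foreach (string sub in Directory.GetDirectories(currentDir))
                    dirs.Enqueue(sub);
            }
            catch (Exception e) when (e is IOException || e is UnauthorizedAccessException)
            {
                // Keep going; the failure becomes just another list entry.
                fileDict["* " + currentDir + " Exception " + e.Message] = DateTime.Now;
            }
        }
        return fileDict;
    }
}
```

Because the failures live in the same dictionary as the file names, a listing loop needs no special case: enumerating the dictionary yields the "*" entries first and the ordinary paths after them.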

Getting Seriously Creepy

Newer was useful and it got to a point where I wanted to seriously consider usability (but I didn't want to actually go so far as to redesign, refactor, or rewrite it). So with good intentions and best laid schemes, I decided to "enhance" it further.

Trick #1

The main window of Newer was already at a considerable Width of 1030. I didn't want to burden a hypothetical user with an app that was too wide and with features they didn't need or couldn't use. Here's my trick: Increase the Width of the main Window to say 1330, add additional feature controls and code-behind, hide that part of the UI again by setting the Window Width back to 1030, and tell potential users in a CodeProject article they can drag the app wider to see the added features if needed. Initialized Window Width is easily controlled in MainWindow.xaml. Here is the start of that file:

XML
<Window x:Name="Newer" x:Class="Newer.MainWindow"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    Title="Newer" Height="492" Width="1030"
    Icon="Newer.ico" SizeChanged="Newer_SizeChanged">

I'd like to explain the handler for SizeChanged but first let me show both sides of the tricked-out Newer interface showing a "Same" list:

Image 5 Image 6

You'll notice red text on two Buttons and a CheckBox on the right. When Newer is widened to see these controls, the Foreground of "Choose root 1" on the left is also set to red in the Newer_SizeChanged handler. This is an attempt to indicate that files and directories in the root 1 folder can actually be deleted if "Del EqEqvR1" or "Clean" is pressed. The route to the meaning of the "Del EqEqvR1" is a little tortured.

"I wish I were special"

Step 1: I wanted to eliminate files showing up as different if they compiled to the same code. We were going through a Doxygen craze where I work and the double and triple slashes were flying around like crazy in different file versions (.h, .hpp, .c, .cpp, .cs).

If you select a common file name from a listing and press the "Eqv1" button, Newer will rewrite the file selected in root 1 with content from the file in root 2 if they are equivalent line for line. 'Equivalent' here means syntactically equivalent as in possibly different white space or commenting but the same semantics. We can quibble whether this button's Content should also be red.

Step 2: Why require selecting each common file name and clicking a button if I want to eliminate unimportant differences throughout the solution? If the "Eqv1All on Diff" box is checked, then when "Diff" is pressed, the 'Eqv1' function is called for each possibly equivalent file pair and the root 1 file is replaced if they are equivalent, otherwise the file name shows up in the "Diff" list.

Step 3: Why retain files in root 1 that are a binary equal or syntactically equivalent to those in root 2? If you press "Del EqEqvR1", only essentially different selected file types will remain in root 1. Unselected file types will not be touched. This is the basis of an in-a-pinch SCM.

3a: Make or get a newer copy of a development root.
3b: In Newer, choose an older copy as root 1, the newer copy as root 2. Check all file types including "other". Click "Del EqEqvR1".
3c: Archive the now-reduced file set of the older copy and restore it when needed. In my case, a newer root is 300+ MB and reduced sets are on the order of 10 MB for a typical week of development.

Step 4: Creep further by retaining only restorable file differences instead of whole files (one creep too many for me -- I'll rely on the real SCM at this point).

Step 0: By default, Surround SCM's 'get' saves read-only copies. Be sure to change the root 1 folder to remove the read-only attribute if you are going to make updates or deletes in this folder. Choose 'Apply changes to this folder, subfolders and files'. Other SCM's (git clone, perforce Get Latest Revision, svn checkout, etc.) will have other considerations you may need to take into account.
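The "equivalent" test from Step 1 can be approximated as follows: strip comments, collapse white space, and compare what is left. This is a simplified sketch and not the CommentableTextFilesEqual method from the download. The names are hypothetical, it compares the text as one token stream rather than line for line, it does not understand C# verbatim or raw string literals, and it also collapses white space inside ordinary string literals, which a stricter checker would keep.

```csharp
using System;
using System.Text;

public static class CLikeEquivalence
{
    // Removes // and /* */ comments while copying quoted literals untouched.
    public static string StripComments(string src)
    {
        var sb = new StringBuilder(src.Length);
        int i = 0;
        while (i < src.Length)
        {
            char c = src[i];
            if (c == '"' || c == '\'')
            {
                // Copy a string or char literal verbatim, honoring backslash escapes.
                char quote = c;
                sb.Append(c);
                i++;
                while (i < src.Length && src[i] != quote && src[i] != '\n')
                {
                    if (src[i] == '\\' && i + 1 < src.Length) { sb.Append(src[i]); i++; }
                    sb.Append(src[i]);
                    i++;
                }
                if (i < src.Length && src[i] == quote) { sb.Append(quote); i++; }
            }
            else if (c == '/' && i + 1 < src.Length && src[i + 1] == '/')
            {
                while (i < src.Length && src[i] != '\n') i++;   // drop to end of line
            }
            else if (c == '/' && i + 1 < src.Length && src[i + 1] == '*')
            {
                int end = src.IndexOf("*/", i + 2, StringComparison.Ordinal);
                i = end < 0 ? src.Length : end + 2;
                sb.Append(' ');                                  // a comment acts as white space
            }
            else
            {
                sb.Append(c);
                i++;
            }
        }
        return sb.ToString();
    }

    // Collapses every run of white space (including line breaks) to one space.
    public static string Normalize(string src)
    {
        string[] tokens = StripComments(src)
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" ", tokens);
    }

    public static bool AreEquivalent(string a, string b)
    {
        return string.Equals(Normalize(a), Normalize(b), StringComparison.Ordinal);
    }
}
```

With this sketch, a file that gained a `/// doc` line or lost trailing spaces compares as equivalent to its earlier version, while any change to an identifier, literal value, or operator does not.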

Comparative Creeping

In case it hasn't been obvious, Newer is also an attempt to wrest features out of any single SCM. There are many SCM's. They all have some or many features. For one list of SCM's, please see the index in reference (2) below.

I think it's easier and more desirable to keep control of features outside any single SCM when they can be included in something like Newer. Maybe my next world won't use Surround SCM. Newer should still work.

So, the "Diff" button finds files that are different in two snapshots. What to do? Let's look at them. One good tool for that is Beyond Compare. Newer will first look for the executable "BCompare.exe" in Environment.SpecialFolder.ProgramFilesX86. If it doesn't find it there, it looks in Environment.SpecialFolder.ProgramFiles. If it doesn't find it there, it cries.

To compare file differences, select some text of the file name in a "Diff" or "Same" list. Then click the "BC" button. It's above the "other" CheckBox.

Why would you compare "Same" files? Well, there is identically the same and syntactically the same. The latter might be worth comparing to check comment changes. Wouldn't it be nice to know which files were identical and which just equivalent? The following is a detail of a "Same" listing:

Image 7

Notice the two .cpp file names preceded by "~". These two are only syntactically equivalent in root 1 and root 2. Comparing one of these would show commenting or white space differences.

Short aside: Beyond Compare has an "Ignore unimportant differences" view button, Image 8 or Image 9 in version 4.0, to treat unimportant differences as the same. This was a good second opinion I used to check my equivalency code. Beyond Compare also provides visual comparison of several image file types including .ico. Scooter Software had just started the public beta test of version 4.0 when this article was submitted. I am still examining its 'Ad Hoc Unimportant Text' feature. Version 4.0 can also access Subversion repositories and Dropbox storage within the program.

Not to love Beyond Compare too much, you may ask, "Why not do all of this in BC?" With a little bit of work, you can set up a "days ago" filter to check dates. I don't know how you can count files of a given type in a root folder. Renewing internal dictionaries requires folder selection and a menu selection (or hitting Shift-F5). With a few clicks, you can drill down a folder path to find some file differences but not all of them with one click. It's difficult or impossible to see a set of denied access files. Selecting and cleaning a set of file types is difficult. Removing all files that are equal or equivalent between two folders is not possible, or at least not in any way known to me. In short, Newer goes beyond Beyond Compare.

The other file names in the "Same" list, the ones for identical files, are actually preceded by a blank space. I decided to violate my unadorned select-and-copy objective to allow sorting identical from equivalent names in something like cmd sort. You can deal with this extra first character or you can change the code in the bw_Same handler. Here is the relevant section:

C#
string f1 = Path.Combine(root1, s);
string f2 = Path.Combine(root2, s);
if (FilesAreEqual(f1, f2)) // without regarding any BOM
   names.Add(" " + s); // denote identical with " "
else if (ext == ".cpp" || ext == ".cs" || 
	ext == ".h" || ext == ".c" || ext == ".hpp")
{
   if (CommentableTextFilesEqual(f1, f2, false))
      names.Add("~" + s); // denote equivalent with "~"
}

I did want to point out one feature in the code I didn't anticipate would be needed. (Feature creep begets creepettes.) When Beyond Compare was started as a separate process and eventually closed when you were through comparing or editing, the text selection in the Newer file list was lost. This is one of those "Where was I?" disruptions we like to avoid. The quick fix was to save a global last selection starting index and selection length. These are saved any time the file list TextBox loses focus and are restored right after Beyond Compare is started.

In case you want to use a different compare tool than Beyond Compare, I'll point out the handler where this happens:

C#
private void bcButton_Click(object sender, RoutedEventArgs e)
{
    string filename = CkComparable(); // verify a comparable file pair
    if (filename.Length == 0)
        return;

    string bcPath = GetBCPath(Environment.GetFolderPath(Environment.SpecialFolder.ProgramFilesX86));
    if (bcPath.Length == 0)
        bcPath = GetBCPath(Environment.GetFolderPath(Environment.SpecialFolder.ProgramFiles));

    if (bcPath.Length == 0)
    {
        searchString.Text = "wah";
        return;
    }

    string argString = "\"" + Path.Combine(root1, filename) + "\" \"" +
           Path.Combine(root2, filename) + "\"";
    try
    {
        System.Diagnostics.Process.Start("\"" + bcPath + "\"", argString); //start Beyond Compare
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.ToString());
    }
    textBox1.Focus();
    textBox1.SelectionStart = lastTextBoxSelectionStart;
    textBox1.SelectionLength = lastTextBoxSelectionLength;
}
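The article does not show GetBCPath, so the following is only a guess at the general shape of such a lookup, written so it can be pointed at a different compare tool. The names ToolLocator, FindInSubfolders, and FindTool are hypothetical, and searching just the immediate subfolders of each Program Files folder is my assumption, not the author's method.

```csharp
using System;
using System.IO;

public static class ToolLocator
{
    // Looks for exeName directly inside each immediate subfolder of baseDir.
    // Returns the full path, or "" when nothing is found, matching the
    // empty-string convention the click handler above relies on.
    public static string FindInSubfolders(string baseDir, string exeName)
    {
        if (string.IsNullOrEmpty(baseDir) || !Directory.Exists(baseDir))
            return "";
        try
        {
            foreach (string dir in Directory.GetDirectories(baseDir))
            {
                string candidate = Path.Combine(dir, exeName);
                if (File.Exists(candidate))
                    return candidate;
            }
        }
        catch (Exception e) when (e is IOException || e is UnauthorizedAccessException)
        {
            // Treat an unreadable folder the same as "not found".
        }
        return "";
    }

    // Tries the 32-bit Program Files folder first, then the native one.
    public static string FindTool(string exeName)
    {
        string path = FindInSubfolders(
            Environment.GetFolderPath(Environment.SpecialFolder.ProgramFilesX86), exeName);
        if (path.Length == 0)
            path = FindInSubfolders(
                Environment.GetFolderPath(Environment.SpecialFolder.ProgramFiles), exeName);
        return path;
    }
}
```

To swap tools, a handler like bcButton_Click would call something like `ToolLocator.FindTool("BCompare.exe")` with the other tool's executable name instead, and adjust the argument string to whatever that tool expects on its command line.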

Is a BOM a matter of concern? Well, yes. If we are comparing source files, one snapshot may have an ASCII version of a file and a second snapshot may have a UTF-8 or UNICODE version of the file. In the codebase I've examined, I've seen mixed BOM / no BOM cases for file types .h, .cpp, .cs, .xsd, .xml, .sln, .xaml, and .csproj but only one or two cases where the use of a BOM has changed for a given file.

The real problem is that many UTF-8 source files do not have a BOM. Without prescanning entire non-BOMed source files for non-ASCII characters (single bytes having a value > 127) it's difficult to know the correct Encoding to use in reading the file into memory where everything is Unicode characters.

For equivalency testing, Newer tries to account for encoding using BOM's, but it doesn't do a full prescan to find non-BOMed UTF-8 files. Very infrequently, "Same" may miss and "Diff" may list a file that is really equal or equivalent where BOM use has changed. I think the tradeoff in performance justifies not doing the full prescan.

To see this rare case, the file superAscii.txt (an "other" file type) is included without a BOM in CompareFolder1 and with a BOM in CompareFolder2 in the solution download. They are identical UTF-8 files. You may also be interested to check the table in reference (3).
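One way to experiment with the BOM question is to decode both files before comparing them. The sketch below is not Newer's FilesAreEqual. It leans on StreamReader's byte-order-mark detection and, as an explicit assumption of mine, treats any file without a BOM as UTF-8, which is a different tradeoff from the one described above. The prescan helper shows the "> 127" test mentioned earlier. All names are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomAwareCompare
{
    // The prescan idea: any byte above 127 means the data is not plain ASCII.
    public static bool HasNonAscii(byte[] data)
    {
        foreach (byte b in data)
            if (b > 127) return true;
        return false;
    }

    // Decodes using the BOM when one is present; otherwise assumes UTF-8.
    // The BOM itself does not appear in the returned text.
    public static string Decode(byte[] data)
    {
        using (var reader = new StreamReader(new MemoryStream(data), Encoding.UTF8, true))
            return reader.ReadToEnd();
    }

    // Compares decoded text, so BOM and encoding differences drop out.
    public static bool TextEqual(byte[] a, byte[] b)
    {
        return string.Equals(Decode(a), Decode(b), StringComparison.Ordinal);
    }
}
```

Feeding it `File.ReadAllBytes` results for the two superAscii.txt copies would be one way to try it. A BOM-less file in a legacy code page such as Windows-1252 would decode incorrectly under the UTF-8 assumption, so this is no substitute for knowing your codebase.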

Creepy Clean

If you build or analyze in a chosen root folder, you can accumulate files Visual Studio doesn't remove. I added a "Clean" button to do some cleanup so these won't get in the way of "Same" and "Diff". Two dictionaries are actually used to do this. One handles cleanable files that have a single extension. A second dictionary is needed to handle over-dotted files like MyCSharpFile.g.i.cs. These dictionaries are initialized right after InitializeComponent in the MainWindow constructor. Here is an abbreviated section of the initFixedDicts method:

C#
private void initFixedDicts()
{
    ...
    cleanAnyDict[".baml"] = 1;
    cleanAnyDict[".bi"] = 1;
    cleanAnyDict[".bsc"] = 1;
    cleanAnyDict[".cache"] = 1;
    cleanAnyDict[".cdf"] = 1;
    cleanAnyDict[".exp"] = 1;
    cleanAnyDict[".idb"] = 1;
    cleanAnyDict[".ilk"] = 1;
    ...
    cleanAnyDict2[".CodeAnalysisLog.xml"] = 1;
    cleanAnyDict2[".csproj.GenerateResource.Cache"] = 1;
    cleanAnyDict2[".csprojResolveAssemblyReference.cache"] = 1;
    cleanAnyDict2[".exe.config"] = 1;
    cleanAnyDict2[".g.cs"] = 1;
    cleanAnyDict2[".g.i.cs"] = 1;
    ...
}

Fair warning: There is no confirmation in Newer when you click "Clean". If you plan to use this button, please double-check that the file types in these two dictionaries don't include ones you don't want to clean. You can add or delete file types from these dictionaries to suit cleaning for your environment.
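To make the single-extension versus over-dotted distinction concrete, here is a small sketch of how a file name could be tested against two such rule sets. It uses a HashSet and an array instead of the article's dictionaries, copies only a handful of the entries listed above, and matches case-insensitively. Those choices and the names are my assumptions, not Newer's implementation.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CleanRules
{
    // Single-extension rules, in the form Path.GetExtension reports them.
    static readonly HashSet<string> SingleExt =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { ".baml", ".bsc", ".idb", ".ilk" };

    // Over-dotted suffixes that Path.GetExtension alone cannot recognize,
    // since it only returns the part after the last dot.
    static readonly string[] MultiDot = { ".g.i.cs", ".g.cs", ".exe.config" };

    public static bool IsCleanable(string fileName)
    {
        string name = Path.GetFileName(fileName);
        foreach (string suffix in MultiDot)
            if (name.EndsWith(suffix, StringComparison.OrdinalIgnoreCase))
                return true;
        return SingleExt.Contains(Path.GetExtension(name));
    }
}
```

The point of the second rule set is visible with MyCSharpFile.g.i.cs: its plain extension is ".cs", which must never be cleaned, so only the longer suffix can identify it as generated.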

There is also a "delEmpty dirs on clean" CheckBox that will remove bin, obj and other directories when cleaning leaves or finds that a subfolder is empty. If you are minimizing an older snapshot, I recommend checking this box and clicking "Clean" after clicking "Del EqEqvR1".

Is Newer Creeping Along? How Does It Work?

The last two enhancements show activity and a help form.

Creating a "Same" or "Diff" file listing or deleting equal and equivalent files can take a while. Newer displays an ellipsis when it is busy doing something:

Image 10 Image 11

But how do you know it hasn't hung? If you check the "Show activity" box before clicking "Root 1 unique", "Root 2 unique", ... "Root 2 older", or "Del EqEqvR1", Newer will display file names in the Search string TextBox while it checks files to show it's working. Don't check this box after starting one of these long compare or delete functions and expect activity to be shown immediately. Newer will be too busy to notice (for now).

Finally, if you forget all this description, clicking the "?" Button will show a cheat sheet for some of these features. Note: HelpForm.exe is its own executable and needs to be located in the directory from where you start Newer.exe or NewerPy.exe. The downloads should have this already handled.

Finally finally, I added two "->clip" buttons to put full paths of any file list on the clipboard. I decided not to add surrounding quotes for paths with spaces but it wouldn't be too hard to creep a little further. Beware if you have listed "Root 1 unique" files and click "->clip" for root 2, you will get paths for files that don't exist. But maybe that is what you want. This last, of course, proves article crafting takes longer than feature creeping.

Securing against Creeps

Just a note. If you are using Windows 8 or better, you may already have run into a problem running executables downloaded from CodeProject.

I have provided binaries before in addition to Visual Studio solutions. The good thing about building the solution anew on your system is that you won't see this problem. The bad thing is that you need Visual Studio to easily build programs that are "known" to your system.

I think it's still worthwhile providing pre-built binaries. It's an easy way to check out whether something is worth using or to follow along when you read an article. On Win 8 and above, it's just two steps further (one time) to do this if a program is unsigned.

Step 1. You double-click the .exe and "Windows SmartScreen" says "Windows protected your PC". Click "More info". There won't be an indication "More info" is a link, but it is and it takes you to Step 2.

Step 2. You should have the choice of "Run Anyway" or "Don't Run". Since the executable comes from CodeProject and since authors are identifiable, I suggest it's fairly safe to run anyway. It's as safe as it was before Win 8 and once you do run anyway, Win 8[+] will not bother you again about that program.

Perhaps we can both look into signing for next time. One place to start is the 5-star Tip at reference (4).

Jacking the Code

Well, this is all fine, but did I really address tracking solutions in other languages? (Not yet.) I'd like to present a small, simplified table showing some other languages:

Language     Commenting       Extensions
F#           // /// (*...*)   .fs .fsi
Python       #                .py
VB           '                .vb .vbs
HTML, XML    <!-- ... -->     .htm .html .xml
JavaScript   // /*...*/       .js

It would be tedious and unfun to show Newer snippets needing changes to handle new file extensions and equivalency checking for other languages. What I've done instead is include a NewerPy project in the downloadable solution.

As an example of places to change code, NewerPy has replaced the ".h, .hpp" CheckBox and counter with one for ".py". Where CommentableTextFilesEqual is used for c-like files, NewerPy has a CommentablePyFilesEqual method for comparing .py file equivalency.

BTW, a good source of some useful Python scripts is at reference (5). A CodeProject article about using Python and C# together is at reference (6).

This should help if you want to tailor a version of Newer suitable for your development environment. Just difference the Newer and NewerPy folders with Beyond Compare. Most changes are in the MainWindow.xaml and MainWindow.xaml.cs. I'd say use Newer itself to check the differences, but in this case Newer doesn't buy you much more than managing execution of Beyond Compare.

For a free differencing tool instead of Beyond Compare, there is always WinMerge. Please check reference (7) below.

If you just want to see the interface change and try comparing a couple Python snapshots, change the Startup Project to NewerPy in the Solution Explorer and give it a try.

References

(1) How to: Iterate Through a Directory Tree (C# Programming Guide) I believe a Queue was originally used for dirs (better directory order). Now a Stack is used. The method in MSDN is also named "TraverseTree" instead of Newer's "TransverseTree". "Traverse" is the more-accurate word.
(2) Using Beyond Compare with Version Control System
(3) Comparing Characters in Windows-1252, ISO-8859-1, ISO-8859-15, Comparison Table
(4) How to be your own Certificate Authority and create your own certificate to sign code files, 1 Mar 2013
(5) ActiveState Code >> Recipes, Popular Python Scripts
(6) Python, Visual Studio, and C#... So. Sweet, 23 Sep 2013
(7) WinMerge Command Line; see also the WinMerge home page
(8) On Having and Wanting Favor

History

  • Submitted to CodeProject 12 Feb 2014

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)