Introduction
Newer is an application to compare snapshots of
solutions taken over time. This tool can be helpful when
your task is to quickly gauge where new development has occurred,
where previous snapshots have become obsolete, and to find when
these changes took place, with or without a connection to your SCM
server. In a pinch, Newer can be used, not to
supplement your SCM system, but as your SCM system.
Note: I use the term Software Configuration Management (SCM)
instead of Revision or Version Control Software or System (VCS)
since Newer is mostly a change tracking tool and
not a database of file version differences.
Background
This tool started as a quick means of checking newer source files
in a particular development environment. I wanted something
that would simply let me see which files were "Newer" than a
given date and hour. Hence the app's name, which I haven't
changed despite feature creep. My motivation was the need to
see day-by-day, which files had been updated throughout a large
solution. The SCM system we use currently, Seapine
Surround SCM, lets you "Get" a copy of the latest version of
files from a repository branch root or subfolder with check-in
timestamps. The first version of Newer, shown
above, had a button to choose the 'gotten-to' folder, a date
picker and hour selector, and two buttons to show the list of
files that were newer or older than the picked date and selected
hour. This was all good to track check-in timestamps in a
single snapshot.
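The newer/older listing described above can be sketched in a few lines. This is a minimal sketch, not the code in Newer: the class and method names are hypothetical, and it assumes the check-in timestamp is what the file system reports as the last-write time of each gotten file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class NewerThanSketch
{
    // Hypothetical helper: path of a file relative to the chosen root.
    static string Relative(string root, string path)
    {
        return path.Substring(root.Length)
                   .TrimStart(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
    }

    // Lists files under root whose last-write time is later than cutoff.
    // Pass newer: false to list the files that are not later than cutoff.
    public static List<string> ListByDate(string root, DateTime cutoff, bool newer)
    {
        return Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories)
            .Where(path => (File.GetLastWriteTime(path) > cutoff) == newer)
            .Select(path => Relative(root, path))
            .OrderBy(name => name, StringComparer.OrdinalIgnoreCase)
            .ToList();
    }
}
```

Note that this version throws if it meets a folder it cannot read, which is exactly the problem taken up later under "Creeping Doubt".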
The First Creep
I thought I was through with a small one-off hack but then I
started wanting a way to compare two snapshots taken on different
days. Hey, why not extend that Newer utility
and add that capability? (Sound a little familiar?) I
added a second folder chooser and more buttons to list files
unique in the first folder, unique in the second, common in both
folders with the same content, and common but with differences --
aka updates. I added CheckBoxes to select which file types
to show for each of these listings and counting logic to show the
number of files of each type. Since the lists could be long,
I added a search feature to find the first or next occurrence of a
string in the file list (case insensitive) and a button to set the
starting point of searches to whatever line had or started a text
selection. Newer now looked like this:
More is Better
Good, but then I wanted to check on saved executables and a few
other file types that came into play for our mixed platform,
mixed-era solutions. Note the separate .vcproj and .vcxproj
CheckBoxes. Why not add just one or two more CheckBoxes with
counters? (This should also sound familiar.)
Also, I found I wanted a way to invalidate internal dictionaries
whenever I separately added files, deleted files, or altered any
file timestamp in a chosen folder. And I wanted to be sure I
was correctly numbering lines when a search string was
found. But this would mess up the 'select-and-copyability'
of file lists if line numbering was hardwired. So for both
of these, I added a "Renew" button to clear and rebuild internal
dictionaries and a "line #'s" CheckBox. Showing "Diff" files
of two snapshots, Newer now looked like this:
Creeping Doubt
Speaking of wanting to be sure, I began to doubt how well
Newer would handle file lists if it were challenged with comparing a
'special' folder, either inadvertently or on purpose.
Wouldn't a Newer user get tired of OK'ing or X'ing
dialogs if I simply used MessageBox.Show for
exceptional cases? Wouldn't it be faster to just X out of
Newer altogether, restart it, and avoid re-choosing the special folder?
This is something I wanted to avoid. So instead of showing a
MessageBox (originally), I added the feature of noting IO exceptions
in the internal dictionaries along with normal file entries. The
file listing code could then do something with all the exceptional
entries at one time. At a minimum, the exceptional entries
could be listed along with 'good' entries. Minimum sounded
good to me. (Please note there are other areas where MessageBox.Show
still made sense to me.)
The following shows a listing of my C:\Windows files that
are older than 1/15/2014. Internally, SortedDictionary<string, DateTime>
is used. That is why all the
exceptional "*" entries are shown together at the beginning of the
list. The list has been scrolled down some to show where the
numbered 'good' entries start.
I wanted to point out where this is done in the code. The
following is an abbreviated listing of the start of the TransverseTree
method, which is essential to list generation in Newer
and is a fairly straight-up rip-off of the non-recursive TraverseTree
method in MSDN. Please see reference (1).
private DateTime TransverseTree(string root, SortedDictionary<string, DateTime> fileDict ... )
{
    Queue<string> dirs = new Queue<string>(20);
    if (!Directory.Exists(root))
    {
        throw new ArgumentException();
    }
    dirs.Enqueue(root);
    while (dirs.Count > 0)
    {
        string currentDir = dirs.Dequeue();
        string[] subDirs;
        try
        {
            subDirs = Directory.GetDirectories(currentDir);
        }
        catch (Exception e)
        {
            // Record the failure as a "* " entry instead of showing a MessageBox
            fileDict["* " + currentDir + " GetDirectories Exception " + e.Message] = DateTime.Now;
            continue;
        }
        ...
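Since the listing above is abbreviated, here is a self-contained sketch of the same idea that can be run on its own. It is not the TransverseTree method: the name ListTree, the single try block, and the choice of an ordinal comparer are my assumptions for illustration. With an ordinal comparer, keys starting with "* " sort ahead of ordinary paths such as "C:\...".

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class TreeWalkSketch
{
    // Walks root without recursion and returns full path -> last-write time.
    // Folders that cannot be read become "* " entries instead of exceptions.
    public static SortedDictionary<string, DateTime> ListTree(string root)
    {
        if (!Directory.Exists(root))
            throw new ArgumentException("Folder not found: " + root, "root");

        var fileDict = new SortedDictionary<string, DateTime>(StringComparer.Ordinal);
        var dirs = new Queue<string>();
        dirs.Enqueue(root);

        while (dirs.Count > 0)
        {
            string currentDir = dirs.Dequeue();
            try
            {
                foreach (string sub in Directory.GetDirectories(currentDir))
                    dirs.Enqueue(sub);
                foreach (string file in Directory.GetFiles(currentDir))
                    fileDict[file] = File.GetLastWriteTime(file);
            }
            catch (Exception e) when (e is UnauthorizedAccessException || e is IOException)
            {
                // Keep going; the listing code can show these entries together.
                fileDict["* " + currentDir + " " + e.Message] = DateTime.Now;
            }
        }
        return fileDict;
    }
}
```

Collecting the failures in the dictionary means one pass over a folder like C:\Windows finishes without a single dialog.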
Getting Seriously Creepy
Newer was useful and it got to a point where I wanted
to seriously consider usability (but I didn't want to actually go so
far as to redesign, refactor, or rewrite it). So with good
intentions and best laid schemes, I decided to "enhance" it further.
Trick #1
The main window of Newer was already at a considerable
Width of 1030. I didn't want to burden a hypothetical user
with an app that was too wide and with features they didn't need or
couldn't use. Here's my trick: Increase the Width of the
main Window to, say, 1330, add additional feature
controls and code-behind, hide that part of the UI again by setting
the Window Width back to 1030, and tell potential
users in a CodeProject article they can drag the app wider to see
the added features if needed. The initial Window
Width is easily controlled in MainWindow.xaml. Here is
the start of that file:
<Window x:Name="Newer" x:Class="Newer.MainWindow"
xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
Title="Newer" Height="492" Width="1030"
Icon="Newer.ico" SizeChanged="Newer_SizeChanged">
I'd like to explain the handler for SizeChanged
but first let me show both sides of the tricked-out Newer
interface showing a "Same" list:
You'll notice red text on two Buttons and a CheckBox on the
right. When Newer is widened to see these
controls, the Foreground of "Choose root 1" on the left is also
set to red in the Newer_SizeChanged handler.
This is an attempt to indicate that files and directories in the
root 1 folder can actually be deleted if "Del EqEqvR1" or "Clean"
is pressed. The route to the meaning of the "Del EqEqvR1" is
a little tortured.
"I wish I were special"
Step 1: I wanted to eliminate files showing up as different
if they compiled to the same code. We were going through a
Doxygen craze where I work and the double and triple slashes were
flying around like crazy in different file versions (.h, .hpp, .c,
.cpp, .cs).
If you select a common file name from a listing and press the
"Eqv1" button, Newer will rewrite the file selected
in root 1 with content from the file in root 2 if they are
equivalent line for line. 'Equivalent' here means
syntactically equivalent as in possibly different white space or
commenting but the same semantics. We can quibble whether
this button's Content should also be red.
Step 2: Why require selecting each common file name and
clicking a button if I want to eliminate unimportant differences
throughout the solution? If the "Eqv1All on Diff" box is
checked, then when "Diff" is pressed, the 'Eqv1' function is
called for each possibly equivalent file pair and the root 1 file
is replaced if they are equivalent, otherwise the file name shows
up in the "Diff" list.
Step 3: Why retain files in root 1 that are a binary equal
or syntactically equivalent to those in root 2? If you press
"Del EqEqvR1", only essentially different selected file types will
remain in root 1. Unselected file types will not be
touched. This is the basis of an in-a-pinch SCM.
3a: Make or get a newer copy of a development root.
3b: In Newer, choose an older copy as root 1,
the newer copy as root 2. Check all file types including
"other". Click "Del EqEqvR1".
3c: Archive the now-reduced file set of the older copy and
restore it when needed. In my case, a newer root is 300+ MB
and reduced sets are on the order of 10 MB for a typical week of
development.
Step 4: Creep further by retaining only restorable file
differences instead of whole files (one creep too many for me --
I'll rely on the real SCM at this point).
Step 0: By default, Surround SCM's 'get' saves
read-only copies. Be sure to change the root 1 folder to
remove the read-only attribute if you are going to make updates or
deletes in this folder. Choose 'Apply changes to this
folder, subfolders and files'. Other SCM's (git clone,
perforce Get Latest Revision, svn checkout, etc.) will have other
considerations you may need to take into account.
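The 'equivalent' test from Step 1 can be sketched as follows. This is a simplified illustration, not the CommentableTextFilesEqual method in Newer: the names are hypothetical, it works on text already read into memory, and it assumes plain C-like syntax (// and /* */ comments, quoted literals with backslash escapes). C# verbatim strings and C++ raw strings are not handled, and spacing differences inside a line's tokens ("a+b" versus "a + b") still count as different.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

public static class EquivSketch
{
    // Removes // and /* */ comments but copies string and char literals as-is.
    public static string StripComments(string text)
    {
        var sb = new StringBuilder(text.Length);
        int i = 0;
        while (i < text.Length)
        {
            char c = text[i];
            if (c == '"' || c == '\'')
            {
                char quote = c;
                sb.Append(c);
                i++;
                while (i < text.Length)
                {
                    sb.Append(text[i]);
                    if (text[i] == '\\' && i + 1 < text.Length)
                    {
                        sb.Append(text[i + 1]);   // escaped character
                        i += 2;
                        continue;
                    }
                    bool done = text[i] == quote || text[i] == '\n';
                    i++;
                    if (done) break;              // closing quote, or give up at end of line
                }
            }
            else if (c == '/' && i + 1 < text.Length && text[i + 1] == '/')
            {
                while (i < text.Length && text[i] != '\n') i++;   // the newline itself is kept
            }
            else if (c == '/' && i + 1 < text.Length && text[i + 1] == '*')
            {
                int end = text.IndexOf("*/", i + 2, StringComparison.Ordinal);
                int stop = end < 0 ? text.Length : end + 2;
                sb.Append(' ');                   // keep neighboring tokens apart
                for (int k = i; k < stop; k++)
                    if (text[k] == '\n') sb.Append('\n');   // preserve line structure
                i = stop;
            }
            else
            {
                sb.Append(c);
                i++;
            }
        }
        return sb.ToString();
    }

    // Comment-free lines with whitespace runs collapsed and blank lines dropped.
    public static List<string> Normalize(string text)
    {
        return StripComments(text).Split('\n')
            .Select(line => Regex.Replace(line, @"\s+", " ").Trim())
            .Where(line => line.Length > 0)
            .ToList();
    }

    public static bool Equivalent(string text1, string text2)
    {
        return Normalize(text1).SequenceEqual(Normalize(text2));
    }
}
```

A second opinion from a tool with its own "unimportant differences" rules is still worth having before trusting a test like this to delete anything.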
Comparative Creeping
In case it hasn't been obvious, Newer is also an
attempt to wrest features out of any single SCM. There are
many SCM's. They all have some or many features. For
one list of SCM's, please see the Index in reference (2)
below.
I think it's easier and more desirable to keep control of
features outside any single SCM when they can be included in
something like Newer. Maybe my next world
won't use Surround SCM. Newer should
still work.
So, the "Diff" button finds files that are different in two
snapshots. What to do? Let's look at them. One
good tool for that is Beyond Compare. Newer
will first look for the executable "BCompare.exe" in Environment.SpecialFolder.ProgramFilesX86.
If it doesn't find it there, it looks in Environment.SpecialFolder.ProgramFiles.
If it doesn't find it there, it cries.
To compare file differences, select some text of the file
name in a "Diff" or "Same" list. Then click the "BC"
button. It's above the "other" CheckBox.
Why would you compare "Same" files? Well, there is
identically the same and syntactically the same. The latter
might be worth comparing to check comment changes. Wouldn't
it be nice to know which files were identical and which just
equivalent? The following is a detail of a "Same" listing:
Notice the two .cpp file names preceded by "~". These two
are only syntactically equivalent in root 1 and root 2.
Comparing one of these would show commenting or white space
differences.
Short aside: Beyond Compare has an "Ignore unimportant
differences" view button or, in version 4.0, a way to treat unimportant
differences as the same. This was a good second opinion I
used to check my equivalency code. Beyond Compare
also provides visual comparison of several image file types
including .ico. Scooter Software had just started public
beta test of version 4.0 when this article was submitted. I
am still examining its 'Ad Hoc Unimportant Text' feature.
4.0 can also access subversion repositories and Dropbox storage
within the program.
Not to love Beyond Compare too much, you may ask, "Why
not do all of this in BC?" With a little bit of work, you can
set up a "days ago" filter to check dates. I don't know how
you can count files of a given type in a root folder.
Renewing internal dictionaries requires folder selection and a
menu selection (or hitting Shift-F5). With a few clicks, you
can drill down a folder path to find some file differences but not
all of them with one click. It's difficult or impossible to
see a set of denied access files. Selecting and cleaning a
set of file types is difficult. Removing all files that are equal
or equivalent between two folders is not possible, as far as I
know. In short, Newer goes beyond Beyond
Compare.
The other file names in the "Same" list, the ones for identical
files, are actually preceded by a blank space. I decided to
violate my unadorned select-and-copy objective to allow sorting
identical from equivalent names in something like cmd sort.
You can deal with this extra first character or you can change the
code in the bw_Same handler. Here is the
relevant section:
string f1 = Path.Combine(root1, s);
string f2 = Path.Combine(root2, s);
if (FilesAreEqual(f1, f2))
    names.Add(" " + s);       // identical: leading blank
else if (ext == ".cpp" || ext == ".cs" ||
         ext == ".h" || ext == ".c" || ext == ".hpp")
{
    if (CommentableTextFilesEqual(f1, f2, false))
        names.Add("~" + s);   // equivalent only: leading "~"
}
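FilesAreEqual is not listed in this article. As a rough sketch of what a binary comparison and the four listings (unique in root 1, unique in root 2, same, different) involve, something like the following would do. All names here are hypothetical, the equivalence test is left out, and whole files are read into memory, which is fine for source files but wasteful for large binaries.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SnapshotCompareSketch
{
    public sealed class Result
    {
        public List<string> Unique1 = new List<string>();
        public List<string> Unique2 = new List<string>();
        public List<string> Same = new List<string>();
        public List<string> Diff = new List<string>();
    }

    // Relative paths of all files under root (names compared without case, as on Windows).
    static HashSet<string> RelativeFiles(string root)
    {
        var set = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string path in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
            set.Add(path.Substring(root.Length)
                        .TrimStart(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar));
        return set;
    }

    // Byte-for-byte comparison of two files.
    public static bool BytesEqual(string file1, string file2)
    {
        return File.ReadAllBytes(file1).SequenceEqual(File.ReadAllBytes(file2));
    }

    public static Result Compare(string root1, string root2)
    {
        HashSet<string> names1 = RelativeFiles(root1);
        HashSet<string> names2 = RelativeFiles(root2);
        var result = new Result();

        foreach (string name in names1.OrderBy(n => n, StringComparer.OrdinalIgnoreCase))
        {
            if (!names2.Contains(name))
                result.Unique1.Add(name);
            else if (BytesEqual(Path.Combine(root1, name), Path.Combine(root2, name)))
                result.Same.Add(name);
            else
                result.Diff.Add(name);
        }
        result.Unique2.AddRange(names2.Where(n => !names1.Contains(n))
                                      .OrderBy(n => n, StringComparer.OrdinalIgnoreCase));
        return result;
    }
}
```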
I did want to point out one feature in the code I didn't
anticipate would be needed. (Feature creep begets
creepettes.) When Beyond Compare was started as a
separate process and eventually closed when you were through
comparing or editing, the text selection in the Newer
file list was lost. This is one of those "Where was I?"
disruptions we like to avoid. The quick fix was to save a
global last selection starting index and selection length.
These are saved any time the file list TextBox loses focus and
restored right after starting Beyond Compare.
In case you want to use a different compare tool than Beyond
Compare, I'll point out the handler where this happens:
private void bcButton_Click(object sender, RoutedEventArgs e)
{
    string filename = CkComparable();
    if (filename.Length == 0)
        return;
    // Look in Program Files (x86) first, then in Program Files
    string bcPath = GetBCPath(Environment.GetFolderPath(Environment.SpecialFolder.ProgramFilesX86));
    if (bcPath.Length == 0)
        bcPath = GetBCPath(Environment.GetFolderPath(Environment.SpecialFolder.ProgramFiles));
    if (bcPath.Length == 0)
    {
        searchString.Text = "wah";
        return;
    }
    string argString = "\"" + Path.Combine(root1, filename) + "\" \"" +
                       Path.Combine(root2, filename) + "\"";
    try
    {
        System.Diagnostics.Process.Start("\"" + bcPath + "\"", argString);
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.ToString());
    }
    // Restore the selection saved when the file list TextBox lost focus
    textBox1.Focus();
    textBox1.SelectionStart = lastTextBoxSelectionStart;
    textBox1.SelectionLength = lastTextBoxSelectionLength;
}
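GetBCPath itself is not shown above. If you are swapping in another compare tool, a lookup along these lines may be a starting point. It is a sketch under assumptions: the FindTool name and its parameters are mine, and the folder pattern "Beyond Compare*" and the executable names in the usage note are guesses about typical installs, not something taken from Newer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CompareToolLocator
{
    // Returns the full path of exeName inside the first folder matching
    // folderPattern under any of the given roots, or "" if it is not found.
    public static string FindTool(IEnumerable<string> roots, string folderPattern, string exeName)
    {
        foreach (string root in roots)
        {
            if (string.IsNullOrEmpty(root) || !Directory.Exists(root))
                continue;   // e.g. no Program Files (x86) on a 32-bit system
            foreach (string dir in Directory.GetDirectories(root, folderPattern))
            {
                string candidate = Path.Combine(dir, exeName);
                if (File.Exists(candidate))
                    return candidate;
            }
        }
        return "";
    }
}
```

A call might pass the two Environment.GetFolderPath results used in bcButton_Click as roots, with "Beyond Compare*" and "BCompare.exe", or a WinMerge folder and its executable name instead.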
Is a BOM a matter of concern? Well, yes. If we are
comparing source files, one snapshot may have an ASCII version of
a file and a second snapshot may have a UTF-8 or UNICODE version
of the file. In the codebase I've examined, I've seen mixed
BOM / no BOM cases for file types .h, .cpp, .cs, .xsd, .xml, .sln,
.xaml, and .csproj but only one or two cases where the use of a
BOM has changed for a given file.
The real problem is that many UTF-8 source files do not have a
BOM. Without prescanning entire non-BOMed source files for
non-ASCII characters (single bytes having a value > 127) it's
difficult to know the correct Encoding to use in reading the file
into memory where everything is Unicode characters.
For equivalency testing, Newer tries to account for
encoding using BOM's, but it doesn't do a full prescan to find
non-BOMed UTF-8 files. Very infrequently, "Same" may miss
and "Diff" may list a file that is really equal or equivalent
where BOM use has changed. I think the tradeoff in
performance justifies not doing the full prescan.
To see this rare case, the file superAscii.txt (an
"other" file type) is included without a BOM in CompareFolder1
and with a BOM in CompareFolder2 in the solution
download. They are identical UTF-8 files. You may also
be interested to check the table in reference (3).
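To make the BOM tradeoff concrete, here is a small sketch of BOM-based decoding plus the full prescan that Newer skips. It is illustrative only: the names are hypothetical, the fallback encoding is whatever the caller guesses for BOM-less files, and UTF-32 BOMs are not distinguished from UTF-16.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BomSketch
{
    // Decodes bytes using the BOM if there is one, otherwise the fallback encoding.
    public static string Decode(byte[] bytes, Encoding fallback)
    {
        Encoding encoding = fallback;
        int skip = 0;
        if (bytes.Length >= 3 && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF)
        {
            encoding = Encoding.UTF8; skip = 3;                // UTF-8 BOM
        }
        else if (bytes.Length >= 2 && bytes[0] == 0xFF && bytes[1] == 0xFE)
        {
            encoding = Encoding.Unicode; skip = 2;             // UTF-16 little endian
        }
        else if (bytes.Length >= 2 && bytes[0] == 0xFE && bytes[1] == 0xFF)
        {
            encoding = Encoding.BigEndianUnicode; skip = 2;    // UTF-16 big endian
        }
        return encoding.GetString(bytes, skip, bytes.Length - skip);
    }

    // The "full prescan": true if any byte is outside the ASCII range.
    public static bool HasNonAscii(byte[] bytes)
    {
        return bytes.Any(b => b > 127);
    }
}
```

If the fallback guess for a BOM-less UTF-8 file is a single-byte encoding, any non-ASCII character decodes differently from the BOMed copy, and that is how an identical pair can land in "Diff".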
Creepy Clean
If you build or analyze in a chosen root folder, you can
accumulate files Visual Studio doesn't remove. I added a
"Clean" button to do some cleanup so these won't get in the way of
"Same" and "Diff". Two dictionaries are actually used to do
this. One handles cleanable files that have a single
extension. A second dictionary is needed to handle
over-dotted files like MyCSharpFile.g.i.cs. These
dictionaries are initialized right after InitializeComponent
in the MainWindow constructor. Here is an
abbreviated section of the initFixedDicts method:
private void initFixedDicts()
{
    ...
    // Cleanable files with a single extension
    cleanAnyDict[".baml"] = 1;
    cleanAnyDict[".bi"] = 1;
    cleanAnyDict[".bsc"] = 1;
    cleanAnyDict[".cache"] = 1;
    cleanAnyDict[".cdf"] = 1;
    cleanAnyDict[".exp"] = 1;
    cleanAnyDict[".idb"] = 1;
    cleanAnyDict[".ilk"] = 1;
    ...
    // Cleanable over-dotted files
    cleanAnyDict2[".CodeAnalysisLog.xml"] = 1;
    cleanAnyDict2[".csproj.GenerateResource.Cache"] = 1;
    cleanAnyDict2[".csprojResolveAssemblyReference.cache"] = 1;
    cleanAnyDict2[".exe.config"] = 1;
    cleanAnyDict2[".g.cs"] = 1;
    cleanAnyDict2[".g.i.cs"] = 1;
    ...
}
Fair warning: There is no confirmation in Newer
when you click "Clean". If you plan to use this button,
please double-check the file types in these two dictionaries don't
include ones you don't want to clean. You can add or delete
file types from these dictionaries to suit cleaning for your
environment.
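How the two dictionaries are consulted is not shown in this article. One way to do the lookup, sketched here with hypothetical names and only a few of the entries listed above, is to test the last extension against the first set and the end of the file name against the over-dotted suffixes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class CleanSketch
{
    // Single-extension entries (a subset of the list above).
    static readonly HashSet<string> SingleExtensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { ".baml", ".bsc", ".idb", ".ilk" };

    // Over-dotted entries (a subset of the list above).
    static readonly string[] MultiDotSuffixes = { ".g.cs", ".g.i.cs", ".exe.config" };

    public static bool IsCleanable(string fileName)
    {
        // Path.GetExtension only sees the last dot: "X.g.i.cs" gives ".cs".
        if (SingleExtensions.Contains(Path.GetExtension(fileName)))
            return true;
        return MultiDotSuffixes.Any(
            suffix => fileName.EndsWith(suffix, StringComparison.OrdinalIgnoreCase));
    }
}
```

The second test is what keeps MyCSharpFile.g.i.cs cleanable while ordinary .cs files are left alone.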
There is also a "delEmpty dirs on clean" CheckBox that will
remove bin, obj and other directories when cleaning leaves or
finds that a subfolder is empty. If you are minimizing an
older snapshot, I recommend checking this box and clicking "Clean"
after clicking "Del EqEqvR1".
Is Newer Creeping Along? How Does It Work?
The last two enhancements show activity and a help form.
Creating a "Same" or "Diff" file listing or deleting equal and
equivalent files can take a while. Newer
displays an ellipsis when it is busy doing something:
But how do you know it hasn't hung? If you check the "Show
activity" box before clicking "Root 1 unique", "Root 2 unique",
... "Root 2 older", or "Del EqEqvR1", Newer will
display file names in the Search string TextBox while it checks
files to show it's working. Don't check this box after
starting one of these long compare or delete functions and expect
activity to be shown immediately. Newer will
be too busy to notice (for now).
Finally, if you forget all this description, clicking the "?"
Button will show a cheat sheet for some of these features.
Note: HelpForm.exe is its own executable and needs to be
located in the directory from where you start Newer.exe or
NewerPy.exe. The downloads should have this already
handled.
Finally finally, I added two "->clip" buttons to put full
paths of any file list on the clipboard. I decided not to
add surrounding quotes for paths with spaces but it wouldn't be
too hard to creep a little further. Beware if you have
listed "Root 1 unique" files and click "->clip" for root 2, you
will get paths for files that don't exist. But maybe that is
what you want. This last, of course, proves article crafting
takes longer than feature creeping.
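For anyone who does want to creep that little bit further, quoting paths that contain spaces is a small change. This sketch only builds the text; the names are hypothetical, and in a WPF handler the result would then go to Clipboard.SetText.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class ClipSketch
{
    // Builds one full path per line, optionally quoting paths that contain spaces.
    public static string BuildClipText(string root, IEnumerable<string> names, bool quoteSpaces)
    {
        var sb = new StringBuilder();
        foreach (string name in names)
        {
            string full = Path.Combine(root, name);
            if (quoteSpaces && full.Contains(" "))
                full = "\"" + full + "\"";
            sb.AppendLine(full);
        }
        return sb.ToString();
    }
}
```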
Securing against Creeps
Just a note. If you are using Windows 8 or better, you may
already have run into a problem running executables downloaded from
CodeProject.
I have provided binaries before in addition to Visual Studio
solutions. The good thing about building the solution anew on
your system is that you won't see this problem. The bad thing
is that you need Visual Studio to easily build programs that are
"known" to your system.
I think it's still worthwhile providing pre-built binaries.
It's an easy way to check out whether something is worth using or to
follow along when you read an article. On Win 8 and above,
it's just two steps further (one time) to do this if a program is
unsigned.
Step 1. You double-click the .exe and "Windows Smart Screen"
says "Windows protected your PC". Click "More info".
There won't be an indication "More info" has a link, but it does and
it takes you to Step 2.
Step 2. You should have the choice of "Run Anyway" or "Don't
Run". Since the executable comes from CodeProject and since
authors are identifiable, I suggest it's fairly safe to run
anyway. It's as safe as it was before Win 8 and once you do
run anyway, Win 8[+] will not bother you again about that program.
Perhaps we can both look into signing for next time. One place
to start is the 5-star Tip at reference (4).
Jacking the Code
Well, this is all fine, but did I really address tracking
solutions in other languages? (Not yet.) I'd like to present
a small, simplified table showing some other languages:
Language | Commenting | Extensions
F# | // /// (*...*) | .fs .fsi
Python | # | .py
VB | ' | .vb .vbs
HTML, XML | <!-- ... --> | .htm .html .xml
JavaScript | // /*...*/ | .js
It would be tedious and unfun to show Newer
snippets needing changes to handle new file extensions and
equivalency checking for other languages. What I've done
instead is include a NewerPy project in the
downloadable solution.
As an example of places to change code, NewerPy has
replaced the ".h, .hpp" CheckBox and counter with one for
".py". Where CommentableTextFilesEqual is used
for C-like files, NewerPy has a CommentablePyFilesEqual
method for comparing .py file equivalency.
BTW, a good source of some useful Python scripts is at reference
(5). A CodeProject article about using Python and C#
together is at reference (6).
This should help if you want to tailor a version of Newer
suitable for your development environment. Just difference
the Newer and NewerPy folders with Beyond
Compare. Most changes are in MainWindow.xaml
and MainWindow.xaml.cs. I'd say use Newer
itself to check the differences, but in this case Newer
doesn't buy you much more than managing execution of Beyond
Compare.
For a free differencing tool instead of Beyond Compare,
there is always WinMerge. Please check reference (7)
below.
If you just want to see the interface change and try comparing a
couple Python snapshots, change the Startup Project to NewerPy
in the Solution Explorer and give it a try.
References
(1) How to: Iterate Through a Directory Tree (C# Programming Guide)
I believe a Queue was originally used for dirs
(better directory order). Now a Stack is used. The
method in MSDN is also named "TraverseTree" instead of Newer's
"TransverseTree". "Traverse" is the more-accurate word.
(2) Using Beyond Compare with Version Control System
(3) Comparing Characters in Windows-1252, ISO-8859-1, ISO-8859-15, Comparison Table
(4) How-to-be-your-own-Certificate-Authority-and-create your own certificate to sign code files. By Mike Meinz, 1 Mar 2013
(5) ActiveState Code >> Recipes Popular Python Scripts
(6) Python, Visual Studio, and C#... So. Sweet By Nick Cosentino, 23 Sep 2013
(7) WinMerge Command Line WinMerge home is here
(8) On Having and Wanting Favor
History
- Submitted to CodeProject 12 Feb 2014