Contents
When working on a project to display and compare source codes, I had to sort objects inside an ArrayList
. Because I never did this before, my knowledge about this issue has been zero. Therefore, I Google'd and got hundreds of results.
Two good articles I found in The Code Project:
But none of the articles I read contained the complete information I needed. So I spent several hours with many of the query results, and did a couple of tests, until I thought I knew enough.
Then I decided to write a summary here to help other newbies in the sort issue. You should download the demo program so that you can see the results of the operations described here.
To sort items, you can use every criteria you want. Besides the 'usual' ones like size, height, weight etc., it may be also brightness, sweetness, age, etc. The internal rules of the sort algorithm is a secret of Bill G. You only have to compare two items. Your decision is given by the integer result of the Compare
method:
- If you decide for item #2, the result is positive
- If the items are equivalent, the result is 0
- If you decide for item #1, the result is negative
A negative result is not necessarily -1. This brings an easy way to compare two numbers just by returning their difference.
Walking through the Google results, you will find the interface IComparable
. In the following articles, you will see the IComparer
.
A printing error?
No! Both interfaces can be used to sort simple types as well as to do very complex multi-step sorts. The differences:
| IComparer | IComparable |
Implemented in | Subclass in ArrayList or separate class | Item to be sorted |
Implemented by | IComparer class <comparername> Method "Compare(Object x, Object y)" | Method "CompareTo (object x) " |
Executed by | ArrayList.Sort(<comparername>) | ArrayList.Sort() |
Occurrence | Multiple possible | Single only |
International strings | Possible | Possible |
Practically, every result can be reached by use of either of these interfaces.
Practical experiments
Let's now sort some ArrayList
s. The leftmost ListBox
is generated by clicking one of the left or middle Button
s. The Click
event then copies all lstOrig
ListBox
items into a standard ArrayList
, sArl
, which does a sArl.Sort()
and writes the content of sArl
to the lstStandard
Listbox
.
Private Sub standardSort()
Dim sArl As New ArrayList
lblType.Text = lstOrig.Items.Item(0).GetType().ToString()
origToArray(sArl)
Try
sArl.Sort()
arrayToListBox(sArl, lstStandard)
Catch ex As Exception
MsgBox(ex.Message)
End Try
End Sub
As you can see, the first three simple types Int32
, String
, and DateTime
are sorted correctly without coding any sort algorithm anywhere. But when trying the Color
type, you get an exception. So for this type we have to do some coding:
We generate a class MyColor
. Because the structure System.Color
is sealed
, we cannot derive directly from System.Color
and have to define a field m_color
. The class MyColor
implements the interface IComparable
.
To code the implementation, we have to define a compare method, int IComparable.CompareTo(object x)
.
In this method, the class instance has to compare the 'delivered' object x
with itself.
public class MyColor : IComparable
{
private Color m_color;
#region Constructor
public MyColor(Color myColor)
{
m_color = myColor;
}
#endregion
#region Overrides
public override string ToString()
{
return m_color.ToString();
}
#endregion
#region Standard Comparer
int IComparable.CompareTo(object x)
{
MyColor myX = (MyColor)x;
return string.Compare(this.m_color.Name, myX.m_color.Name);
}
#endregion
}
This simple user defined class can be sorted by clicking the MyColor
button. The one and only one method int IComparable.CompareTo(object x)
has to be defined for the simple user defined class.
Here, the complexity is not the reason to use the IComparer
interface. The same could be done by using IComparable
. It is chosen here do demonstrate how to use this technique.
Generally, there are two ways to use ArrayList
s with an IComparer
comparer class.
- Define the comparer as an external class, and call the overloaded
Sort
method with <standardArrayList>.Sort(<myComparer>)
- Define the comparer as an internal class of a derived
ArrayList
, and overwrite the Sort()
method to use the internal comparer.
I prefer the second one because it is more transparent. The user does not necessarily need to know which comparer is used.
If you click the Ifline button, you get the result shown in the sample image above. If you think about the lines as strings, they are sorted correctly. But if you think about them as statements, the logical order is wrong. So we have to define a special sort algorithm for If-Lines.
First, let's create a class IfLine
, and a class IfLineList
derived from ArrayList
and containing IfLine
s.
To sort, we have to divide the IfLine
string into three parts:
- The leading
string
part, StartPart
- The
int
part, NumVal
- The trailing
string
part, EndPart
With these parts, we define a three-step compare algorithm:
private class IfLineSort : IComparer
{
int IComparer.Compare(object x, object y)
{
IfLine src = new IfLine(x.ToString());
IfLine trg = new IfLine(y.ToString());
int result;
result = string.Compare(src.StartPart, trg.StartPart);
if (result != 0)
return result;
result = src.NumVal - trg.NumVal;
if (result != 0)
return result;
result = string.Compare(src.EndPart, trg.EndPart);
return result;
}
}
Finally, in the IfLineList
class, we have to override the Sort()
method to use our IComparer
, IfLineSort
:
public override void Sort()
{
base.Sort(new IfLineSort());
}
Imagine you have some variable definitions. In your application, you dynamically want to have the choice to sort by name, type, and access level, or a combination of them. (For the sample, we restrict to these three items, no more attributes.)
A definition line looks like: "private int myValue;
". We create a class, VarLine
, representing such a definition line, and a class, VarLineList
, derived from ArrayList
and containing VarLine
s. The variable type and access level are described by enum
s. For the sample, the enum
s are restricted, too:
public enum EType
{
typInt = 1,
typString,
typBool
}
public enum EAccess
{
accessPublic = 1,
accessInternal,
accessPrivate
}
To tell the application about the sort methods and orders, we define:
T
for type sorting ascending
t
for type sorting descending
A
for access sorting descending
a
for access sorting descending
N
for name sorting descending
n
for name sorting descending
Any combination like 'TAN' or 'aTn' is possible. Because the names are unique, it makes no sense to add sort steps behind a 'N' or 'n'.
The 'combination string' is given to the SortAlgoritm
property of VarLineList
, which passes it to the IComparer
, VarLineSort
, inside the constructor.
int IComparer.Compare( object x, object y )
{
VarLine src = new VarLine(x.ToString());
VarLine trg = new VarLine(y.ToString());
int result = 0;
int i;
string step;
string stepU;
for (i=0;i<m_Algo.Length;i++)
{
step = m_Algo.Substring(i,1);
stepU = step.ToUpper();
switch (stepU) {
case "A":
result = src.VarAccess - trg.VarAccess;
break;
case "T":
result = src.VarType - trg.VarType;
break;
case "N":
result = string.Compare(src.VarName, trg.VarName);
break;
}
if (result != 0)
{
if (!step.Equals(stepU))
return -result;
else
return result;
}
}
return 0;
}
Preface: The following text is valid for character sets derived from the Latin alphabet. I do not kow whether it is valid for Arabic, East Asian, Greek, Cyrillic etc. char sets.
It is rather easy to sort strings containing foreign characters: When in the IComparer
or IComparable
, you use the string.Compare
method, and use an overloaded version with additional arguments 'bool ignoreCase
' and 'System.Globalization.CultureInfo culture
'.
int IComparer.Compare(object x, object y)
{
string src = x.ToString();
string trg = y.ToString();
int result = string.Compare(src, trg, m_ignoreCase, m_Culture);
return result;
}
Here, you can define the language (by defining a 'Culture
') to sort strings. You have two choices:
- A string with the culture name like 'en-US'
- A number with the culture identifier like '0x0409'
A complete list of all identifiers can be found at MSDN.
If you try the 'Intl. Text 1', you will (depending on your language) usually not see any difference between 'invariant', 'local PC', and 'en-US'. Only the 'da-DK' will bring a difference. The reason is that in most languages, special characters are sorted in the neighborhood of the characters they have correlation to. However, in Danish language, those characters are sorted as symbols, that means behind the standard characters.
As a conclusion:
Using the CultureInfo
for sorting strings, is, in most cases, not necessary. Of course, to use it will not do any harm.
Some languages have two different official sorting algorithms.
Example: The German language has:
- A Dictionary sort
- A Phone Book sort
The difference is in treating the umlauts (which are a, u, o with two dots on top, they look like ä Ä ö Ö ü Ü).
In Dictionary Sort (which is standard), the umlauts are sorted just behind the corresponding vowel:
T, U, Ü, V ...
In Phone Book sort, umlauts are replaced by the corresponding vowel with an additional 'e':
Ua, Ub, Uc, Ud, Ue, Ü, Uf ...
To perform the secondary sort algorithm, you have to define a special culture identifier (unfortunately, a special culture name does not exist). For the German language, it is 0x10407 instead of the standard 0x0407.
Try 'Intl. Text 2' now and see the results.
Here is a complete list of all secondary algorithms: MSDN.
In the .NET framework, there exists a class 'CaseInsensitiveComparer
' which I intentionally did not use or mention. In my opinion, the string.Compare
method offers every possibility which is needed.
- 17-Jun-2006 - First version published.