Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / VB

ArrayList Sort Tutorial

4.56/5 (12 votes)
23 Jun 2006CPOL7 min read 1   1K  
Theory and summary of different sort methods.

Beginners_Sort

Contents

Introduction

When working on a project to display and compare source codes, I had to sort objects inside an ArrayList. Because I never did this before, my knowledge about this issue has been zero. Therefore, I Google'd and got hundreds of results.

Two good articles I found in The Code Project:

But none of the articles I read contained the complete information I needed. So I spent several hours with many of the query results, and did a couple of tests, until I thought I knew enough.

Then I decided to write a summary here to help other newbies in the sort issue. You should download the demo program so that you can see the results of the operations described here.

Some theory

To sort items, you can use every criteria you want. Besides the 'usual' ones like size, height, weight etc., it may be also brightness, sweetness, age, etc. The internal rules of the sort algorithm is a secret of Bill G. You only have to compare two items. Your decision is given by the integer result of the Compare method:

  • If you decide for item #2, the result is positive
  • If the items are equivalent, the result is 0
  • If you decide for item #1, the result is negative

A negative result is not necessarily -1. This brings an easy way to compare two numbers just by returning their difference.

IComparable vs. IComparer

Walking through the Google results, you will find the interface IComparable. In the following articles, you will see the IComparer.

A printing error?

No! Both interfaces can be used to sort simple types as well as to do very complex multi-step sorts. The differences:

 IComparerIComparable
Implemented inSubclass in ArrayList or separate classItem to be sorted
Implemented byIComparer class <comparername>
Method "Compare(Object x, Object y)"
Method "CompareTo (object x)"
Executed byArrayList.Sort(<comparername>)ArrayList.Sort()
OccurrenceMultiple possibleSingle only
International stringsPossiblePossible

Practically, every result can be reached by use of either of these interfaces.

Practical experiments

Let's now sort some ArrayLists. The leftmost ListBox is generated by clicking one of the left or middle Buttons. The Click event then copies all lstOrig ListBox items into a standard ArrayList, sArl, which does a sArl.Sort() and writes the content of sArl to the lstStandard Listbox.

VB
Private Sub standardSort()
  Dim sArl As New ArrayList
  lblType.Text = lstOrig.Items.Item(0).GetType().ToString()
  ' puts all lstOrig items into the ArrayList sArl 
  origToArray(sArl)
  Try
    sArl.Sort()
    ' puts the sorted items into the middle Listbox
    arrayToListBox(sArl, lstStandard)
  Catch ex As Exception
    MsgBox(ex.Message)
  End Try
End Sub

Standard sorting of simple types

As you can see, the first three simple types Int32, String, and DateTime are sorted correctly without coding any sort algorithm anywhere. But when trying the Color type, you get an exception. So for this type we have to do some coding:

Standard sorting by using IComparable

We generate a class MyColor. Because the structure System.Color is sealed, we cannot derive directly from System.Color and have to define a field m_color. The class MyColor implements the interface IComparable.

To code the implementation, we have to define a compare method, int IComparable.CompareTo(object x).

In this method, the class instance has to compare the 'delivered' object x with itself.

C#
  public class MyColor : IComparable
  {
    private Color m_color;

    #region Constructor
    public MyColor(Color myColor)
    {
      m_color = myColor;
    }
    #endregion

    #region Overrides
    /// 
    /// base.ToString() returns "MyColor".
    /// So we return m_color.ToString()
    /// 
    public override string ToString()
    {
      return m_color.ToString();
    }
    #endregion

    #region Standard Comparer
    /// 
    /// MyColor Standard Compare method: compare names
    /// 
    int IComparable.CompareTo(object x)
    {
      MyColor myX = (MyColor)x;
      return string.Compare(this.m_color.Name, myX.m_color.Name);
    }
    #endregion
  }

This simple user defined class can be sorted by clicking the MyColor button. The one and only one method int IComparable.CompareTo(object x) has to be defined for the simple user defined class.

Complex sorting by using IComparer

Here, the complexity is not the reason to use the IComparer interface. The same could be done by using IComparable. It is chosen here do demonstrate how to use this technique.

Generally, there are two ways to use ArrayLists with an IComparer comparer class.

  • Define the comparer as an external class, and call the overloaded Sort method with <standardArrayList>.Sort(<myComparer>)
  • Define the comparer as an internal class of a derived ArrayList, and overwrite the Sort() method to use the internal comparer.

I prefer the second one because it is more transparent. The user does not necessarily need to know which comparer is used.

Lines with mixed string and numerical contents

If you click the Ifline button, you get the result shown in the sample image above. If you think about the lines as strings, they are sorted correctly. But if you think about them as statements, the logical order is wrong. So we have to define a special sort algorithm for If-Lines.

First, let's create a class IfLine, and a class IfLineList derived from ArrayList and containing IfLines.

To sort, we have to divide the IfLine string into three parts:

  • The leading string part, StartPart
  • The int part, NumVal
  • The trailing string part, EndPart

With these parts, we define a three-step compare algorithm:

C#
private class IfLineSort : IComparer
{
  /// <summary>
  /// IfLineSort Compare method
  /// </summary>
  int IComparer.Compare(object x, object y)
  {
    IfLine src = new IfLine(x.ToString());
    IfLine trg = new IfLine(y.ToString());
    int result;
    // first alpha sort by StartPart
    result = string.Compare(src.StartPart, trg.StartPart);
    if (result != 0)
      return result; // different, compare done
    // if result is 0, StartParts are identical.
    // then numerical sort by NumVal
    result = src.NumVal - trg.NumVal;
    if (result != 0)
      return result;// different, compare done
    // if result is 0, NumVals are identical, too.
    // then do final alpha sort by EndParts
    result = string.Compare(src.EndPart, trg.EndPart);
    return result;
  }
}

Finally, in the IfLineList class, we have to override the Sort() method to use our IComparer, IfLineSort:

C#
/// <summary>
/// IfLineList class standard sort
/// </summary> 
public override void Sort()
{
  base.Sort(new IfLineSort());
}

Lines to be sorted dynamically by different aspects

Imagine you have some variable definitions. In your application, you dynamically want to have the choice to sort by name, type, and access level, or a combination of them. (For the sample, we restrict to these three items, no more attributes.)

A definition line looks like: "private int myValue;". We create a class, VarLine, representing such a definition line, and a class, VarLineList, derived from ArrayList and containing VarLines. The variable type and access level are described by enums. For the sample, the enums are restricted, too:

C#
  public enum EType
  {
    // order is my personal choice
    typInt = 1,
    typString,
    typBool
  }
  public enum EAccess
    // order is 'by publicity'
  {
    accessPublic = 1,
    accessInternal,
    accessPrivate
  }

To tell the application about the sort methods and orders, we define:

  • T for type sorting ascending
  • t for type sorting descending
  • A for access sorting descending
  • a for access sorting descending
  • N for name sorting descending
  • n for name sorting descending

Any combination like 'TAN' or 'aTn' is possible. Because the names are unique, it makes no sense to add sort steps behind a 'N' or 'n'.

The 'combination string' is given to the SortAlgoritm property of VarLineList, which passes it to the IComparer, VarLineSort, inside the constructor.

C#
/// <summary>
/// CustVarLineSort Compare method
/// compare step by step, algorithm is stored as symbolic string in m_Algo
/// </summary>
int IComparer.Compare( object x, object y )
{
  VarLine src = new VarLine(x.ToString());
  VarLine trg = new VarLine(y.ToString());
  int result = 0;
  int i;
  string step;
  string stepU;
  for (i=0;i<m_Algo.Length;i++)
  {
    step = m_Algo.Substring(i,1);
    stepU = step.ToUpper();
    switch (stepU)     {
      case "A":
        result = src.VarAccess - trg.VarAccess;
        break;
      case "T":
        result = src.VarType - trg.VarType;
        break;
      case "N":
        result = string.Compare(src.VarName, trg.VarName);
        break;
    }
    if (result != 0)
    {
      if (!step.Equals(stepU))
        // descending sort
        return -result;
      else
        return result;
    }
  }
  // if we come here the objects are equivalent
  return 0;
}

Sorting of international strings

Preface: The following text is valid for character sets derived from the Latin alphabet. I do not kow whether it is valid for Arabic, East Asian, Greek, Cyrillic etc. char sets.

It is rather easy to sort strings containing foreign characters: When in the IComparer or IComparable, you use the string.Compare method, and use an overloaded version with additional arguments 'bool ignoreCase' and 'System.Globalization.CultureInfo culture'.

C#
/// <summary>
/// StringList Compare method
/// </summary>
int IComparer.Compare(object x, object y)
{
  string src = x.ToString();
  string trg = y.ToString();
  int result = string.Compare(src, trg, m_ignoreCase, m_Culture);
  return result;
}

Here, you can define the language (by defining a 'Culture') to sort strings. You have two choices:

  • A string with the culture name like 'en-US'
  • A number with the culture identifier like '0x0409'

A complete list of all identifiers can be found at MSDN.

Standard sorting

If you try the 'Intl. Text 1', you will (depending on your language) usually not see any difference between 'invariant', 'local PC', and 'en-US'. Only the 'da-DK' will bring a difference. The reason is that in most languages, special characters are sorted in the neighborhood of the characters they have correlation to. However, in Danish language, those characters are sorted as symbols, that means behind the standard characters.

As a conclusion:

Using the CultureInfo for sorting strings, is, in most cases, not necessary. Of course, to use it will not do any harm.

Secondary algorithm sorting

Some languages have two different official sorting algorithms.

Example: The German language has:

  • A Dictionary sort
  • A Phone Book sort

The difference is in treating the umlauts (which are a, u, o with two dots on top, they look like ä Ä ö Ö ü Ü).

In Dictionary Sort (which is standard), the umlauts are sorted just behind the corresponding vowel:

T, U, Ü, V ...

In Phone Book sort, umlauts are replaced by the corresponding vowel with an additional 'e':

Ua, Ub, Uc, Ud, Ue, Ü, Uf ...

To perform the secondary sort algorithm, you have to define a special culture identifier (unfortunately, a special culture name does not exist). For the German language, it is 0x10407 instead of the standard 0x0407.

Try 'Intl. Text 2' now and see the results.

Here is a complete list of all secondary algorithms: MSDN.

Remarks

In the .NET framework, there exists a class 'CaseInsensitiveComparer' which I intentionally did not use or mention. In my opinion, the string.Compare method offers every possibility which is needed.

History

  • 17-Jun-2006 - First version published.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)