Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / programming / algorithm

A Generic Implementation of Dijkstra’s Smoothsort in C#

5.00/5 (2 votes)
11 Dec 2023MIT27 min read 9.5K   93  
This article describes the implementation of Dijkstra's Smoothsort as a generic sorting algorithm in C#.
Through sample executions, the article demonstrates that Smoothsort outperforms Heapsort particularly when the original array is nearly or completely sorted, emphasizing Smoothsort's efficiency in such scenarios.

1. Introduction

In 1981, the late Professor Edsger Wybe Dijkstra wrote an article on a sorting algorithm that he named Smoothsort as an alternative to Heapsort for sorting in situ (i.e., in place) a one dimensional array of unspecified data type (Dijkstra 1981). Heapsort is one of the fastest sorting algorithms: for an array of size N, its guaranteed execution time is O( N log2 N ). However, a disadvantage is that its execution time is the same even if the original array is nearly (or completely) sorted. In contrast, Smoothsort executes in time O( N log2 N ) on an arbitrary array, but transitions smoothly to O( N ) if the array is nearly sorted.

2. Binary Trees and Heaps

In a binary tree, each parent node has at most two children. A node with at least one child is a parent node, while nodes with no children are leaves. A full binary tree is one in which no node has only one child, and the nodes of the tree satisfy the property # of leaves = # internal nodes + 1.

A perfect binary tree is a full tree in which all the leaves are at the same level. A binary tree is complete if the next-to-last level is full of nodes and the leaves at the last level are justified to the left. These trees are illustrated in Figures 1 and 2.

Image 1

Image 2

If h is the height (number of levels) of a perfect binary tree, then it has (2 ^ (h + 1)) - 1 total nodes, of which 2 ^ h are leaves and (2 ^ h) - 1 are internal nodes.

In an implementation of binary trees by means of one-dimensional arrays, the elements of the array correspond to the nodes of the tree, and indices into the array correspond to the edges linking a node to its children (if any). Given the index into a node there must be an efficient way to determine the indices of its children. In order to store an N-node binary tree in an N-element array, the tree can be traversed level-by-level, top-to-bottom, and left-to-right, storing the nodes in consecutive array elements. Such is the well-known order of breadth-first search.

A heap is a binary tree that satisfies two properties. In terms of its shape, a heap is a complete binary tree. In terms of the ordering of the elements stored, a heap is a max-heap if the value stored in every parent node is greater than or equal to the value stored in each of its children. On the other hand, in a min-heap the value stored in a parent node is less than or equal to the values in its children. Leaf nodes vaccuously satisfy the order property of heaps. Examples of max- and min-heaps containing prime numbers are shown in Figure 3.

Image 3

For a 0-based, N-element array x, given the index i into node x[ i ] satisfying the constraints 0 <= i <= (N - 3)/2, the left and right children of such a node are, respectively, x[ i * 2 + 1 ] and x[ i * 2 + 2 ].

3. Heapsort

J.W.J. Williams (1964) published his Algol implementation of Heapsort for 1-based arrays. This section describes the logic underlying Heapsort and the author’s generic implementation of this algorithm in C#. As a prelude to Smoothsort, the formal specification of the pre-condition before, and post-condition after the execution of Heapsort was given by van Gasteren, Dijkstra and Feijen (1981), as shown in Dijkstra’s handwriting in Figure 4.

Image 4

In words, the pre-condition states that for a 0-based, N-element array m of integers, the array contains an arbitrary collection (i.e., a bag) of integers and, after the execution of Heapsort, the post-condition states that for all indices i and j satisfying the conditions 0 <= i < j and 1 <= j < N the elements of m are in sorted order. Of course, the specification is valid for any type of array elements. An implementation of Heapsort for integer-element arrays is described next.

The logic of Heapsort is simple. Assuming that an array is to be sorted in ascending order, rearrange the elements to create a max-heap, put the maximum element in its proper place, restore the max-heap property from the first to the next-to-last element, and repeat the process until there are no more elements to process. The result of the sort is a min-heap. This verbal description is implemented as follows. The C# code is adapted from C++ code written by the author (Orejel 2003), and it is assumed to reside within a static class in a console application.

C#
/// <summary>Apply Heapsort to an {int} array. 
/// </summary>
/// <param name="m">Array reference.</param>
/// <param name="n">Index of LAST element of {m}.</param>
/// <remarks>
///           The sorting is done in ascending order.
///           The initial root element is at the midpoint of {m}.
/// </remarks>
static void HeapSort( int[] m )
{
   int N = m.Length;

   BuildHeap( m, N >> 1, N - 1 );
   SortNodes( m, N - 1 );
}// HeapSort

As noted in the remarks section of function HeapSort, the process of building a max-heap starts at the midpoint of the array. Once the heap has been built, the nodes can be sorted.

Image 5

Figure 5a shows a tail-recursive implementation of function BuildHeap. The function is tail-recursive because nothing must be done after the recursive call returns. Consequently, Figure 5b shows an iterative version in which the tail-recursive call has been replaced by its equivalent code. Assuming the existence of the proper functionality to display arrays and trees (to be dealt with later), Figure 6 shows a sample array and the tree structure that results after the execution of BuildHeap.

Image 6

Function ReheapDown is the workhorse of BuildHeap. As long as there are nodes (i.e., array elements) to be examined, if a parent node is greater than the maximum of its children, the nodes are interchanged and the process continues with the maximum child as the new parent. The function is also tail-recursive, and the comments in it indicate how to make the function iterative: uncomment the label, comment out the recursive call, and uncomment the code that replaces the call.

C#
/// <summary>Move a parent node down the heap until the max-heap property is restored.
/// </summary>
/// <param name="m">Array reference.</param>
/// <param name="parent">Index of a parent element in {m}.</param>
/// <param name="bottom">Index of last element in {m} to consider.
/// </param>
static void ReheapDown( int[] m, int parent, int bottom )
{
   int maxChild, leftChild;

   // L0: // Target label for {goto} statement.

   leftChild = ( parent << 1 ) + 1;

   if ( leftChild <= bottom )
   {
      maxChild = MaxChildIndex( m, leftChild, bottom );

      if ( m[ parent ] < m[ maxChild ] )
      {
         Swap( ref m[ parent ], ref m[ maxChild ] );

         ReheapDown( m, maxChild, bottom );

         // Uncomment the label {L0}, comment out the tail-recursive call,
         // and uncomment the following line.
         //
         // parent = maxChild; goto L0;
      }
   }
}// ReheapDown

The two auxiliary functions called by ReheapDown are MaxChildIndex which determines the index of the maximum child and, of course, Swap to perform the interchange of array elements. Observe that it is essential to pass to Swap its arguments by reference.

C#
/// <summary>Determine the index of the maximum child of a parent. 
/// </summary>
/// <param name="m">Array reference.</param>
/// <param name="leftChild">Index of left child.</param>
/// <param name="bottom">Index of last element in {m} to consider.</param>
/// <returns>Index of the maximum child.
/// </returns>
static int MaxChildIndex( int[] m, int leftChild, int bottom )
{
   int rightChild = leftChild + 1; // Index of right child.

   return ( leftChild == bottom )
          ? leftChild
          : ( m[ leftChild ] > m[ rightChild ] )
            ? leftChild
            : rightChild;
}// MaxChildIndex

/// <summary>Interchange the values of two {int} variables. 
/// </summary>
/// <param name="x">Reference to first variable.</param>
/// <param name="y"> Reference to second variable.</param>
/// <remarks>
///           The arguments to this function MUST be passed by
///           reference in order to properly alter them.
/// </remarks>
static void Swap( ref int x, ref int y )
{
   int t = x; x = y; y = t;
}// Swap

The last function to be written is SortNodes. It puts the maximum heap element in its place at the end of the array, calls ReheapDown to restore the max-heap property from the first to the next-to-last element, and finally calls itself recursively. As might have been anticipated, SortNodes is also tail-recursive and the comments in the code indicate how to make it iterative.

C#
/// <summary>Arrange the elements of an array in ascending order. 
/// </summary>
/// <param name="m">Array reference.</param>
/// <param name="bottom">Index of last element in {a} to consider.
/// </param>
static void SortNodes( int[] m, int bottom )
{
   // L0: // Target label for {goto} statement.
   if ( bottom > 0 )
   {
      // Put { m[ 0 ] } in its proper place.
      Swap( ref m[ 0 ], ref m[ bottom-- ] );

      ReheapDown( m, 0, bottom );
      SortNodes( m, bottom );

      // Uncomment the label {L0}, comment out the tail-recursive call,
      // and uncomment the following line.
      //
      // goto L0; {bottom} already decreased after call to {Swap}.
   }
}// SortNodes

Figure 7 shows the min-heap tree structure and the array at the end of the execution of SortNodes. As a final note, if HeapSort is called with the same array after sorting, it will perform the same actions as if the array had not been sorted. This concludes the description of Heapsort over an array of integer values.

Image 7

4. Generic Heapsort

This section is not a re-hash of the code in Section 3. First and foremost, the code is generic in that it can handle array elements of an arbitrary data type that satisfies a very simple constraint. Secondly, it uses some library functions (to be described as they are used) that are naturally common to sorting algorithms. And thirdly, the code is iterative without the use of labels and goto statements. So, this section is a prelude to things to come, namely GenericSmoothSort.

In order to handle arbitrary data types, the functions that implement Heapsort must have a generic type parameter <T>, and such a type must be constrained by the IComparable interface. The class name and the function signatures in the GenericHeapSort namespace are shown in Figure 8. The function names in Section 3 have been retained. There is a using statement indicating that the code has a reference to a library named UtilSortLib, which implements functions that are common to sorting algorithms. In particular, observe that the HS class does not include a Swap function because it is a fundamental operation in any sorting algorithm, and thus it belongs in that library.

The library UtilSortLib has three classes, one for static variables and two for functions: TypedUfn<T> (short for "Typed Utility function") and UntypedUfn (short for "Un-typed Utility function"). Most of the functions have to do with displaying results on the console and are described in Section 7, but some will be described as they are used.

Image 8

The specification of a type parameter after the name of each of the functions allows them to take a generic array as a parameter, and the where constraint indicates that the parameter must be subject to comparison. This means that there must be a CompareTo function for the data type of the array elements. According to the Microsoft documentation, there is such a function for all the types that can be ordered or sorted, which is the case for all C# primitive data types. However, as will be illustrated later, an arbitrary data type must explicitly implement the comparison function.

With the exception of displaying on the console two utility global variables (the count of swap operations and the count of comparisons), and the array in tree form (by calling TypedUfn<T>.DisplayTree), function HeapSort<T> is the generic version of its counterpart in Section 3.

C#
/// <summary>Implement the Heapsort algorithm over a generic array.
/// </summary>
/// <typeparam name="T">Type of the array elements.</typeparam>
/// <param name="arr">Reference to a generic array to be sorted in ascending order.
/// </param>
public static void HeapSort<T>( T[] arr ) where T : IComparable<T>
{
   int n = arr.Length - 1;

   BuildHeap( arr, n >> 1, n );

   Console.Write( "After {BuildHeap}: " );
   Console.WriteLine( "{0} swaps; {1} comparisons", Uvar.swapCount, Uvar.compCount );
   TypedUfn<T>.DisplayTree( "after {BuildHeap}", arr );

   SortNodes( arr, n );
}// HeapSort

Functions BuildHeap<T> and ReheapDown<T> are both iterative, and do not use a target label and corresponding goto statement. However, observe the use of a break statement in function ReheapDown. Without it, the while loop in the function would not terminate. This illustrates the fact that it is not trivial to eliminate labels and goto statements that implement loops.

C#
/// <summary>Move down the heap the contents of a parent node to its right position
///          in the ascending order of elements in an array.
/// </summary>
/// <typeparam name="T">Type of the array elements.</typeparam>
/// <param name="arr">Reference to a generic array containing the heap nodes.</param>
/// <param name="parent">Index into {arr} of a parent node.</param>
/// <param name="bottom">Index into {arr} of the bottom-most node.
/// <remarks>
///           On the initial call, {bottom} must index the LAST element of {arr}.
/// </remarks>
private static void ReheapDown<T>( T[] arr, int parent, int bottom )
                    where T : IComparable<T>
{
   int maxChild, leftChild;

   leftChild = ( parent << 1 ) + 1;

   while ( leftChild <= bottom )
   {
      maxChild = MaxChildIndex( arr, leftChild, bottom );

      int cmp = arr[ parent ].CompareTo( arr[ maxChild ] );

      if ( UntypedUfn.IsLT( cmp ) )
      {
         TypedUfn<T>.Swap( "ReheapDown", arr, parent, maxChild );

         parent = maxChild;
         leftChild = ( parent << 1 ) + 1;
      }
      else // UntypedUfn.IsGE( cmp )
      {
         break;
      }
   }
}// ReheapDown

Function ReheapDown<T> calls a CompareTo function on instances of a generic data type. By definition in the Microsoft documentation, if x and y are instances of the same data type, the call x.CompareTo( y ) returns an integer value which is less than 0 if x is less than y, 0 if x is equal to y, or greater than 0 if x is greater than y. The untyped utility function UntypedUfn.IsLT returns true if its argument is less than 0 (the assertion in the comment for the else part, states that function UntypedUfn.IsGE would return true given the same argument).

C#
/// <summary>Determine whether the result from a {CompareTo} function
///          is "less than".
/// </summary>
/// <param name="cmp">Result from a call to a {CompareTo} function.</param>
/// <returns>Whether or not {cmp} meets the requirement.
/// </returns>
public static bool IsLT( int cmp )
{
   ++Uvar.cmpCount;
   return cmp < 0;
}// IsLT

There are additional untyped comparison functions (IsLE, IsEQ, IsGT) with the obvious meanings. All of them increment the global count of comparisons performed.

In the if clause, ReheapDown calls one of the two versions of the typed utility function TypedUfn<T>.Swap, which inspects the value of a global variable set from an enumeration to control the output to the console.

C#
// Enumeration to control console output

public enum OutMode { quiet, verbose }

/// <summary>Swap two generic array elements.
/// </summary>
/// <param name="fName">Name of function calling {Swap}.</param>
/// <param name="m">Reference to a generic array.</param>
/// <param name="i">Index of element to be swapped.</param>
/// <param name="j">Index of element to be swapped.
/// </param>
public static void Swap( string fName, T[] m, int i, int j )
{
   ++Uvar.swapCount;
   if ( Uvar.OutMode == OutMode.verbose )
   {
      TypedUfn<T>.DisplayArray( "", m );
      DisplaySwap( fName, i, m[ i ], j, m[ j ] );
   }
   Swap( ref m[ i ], ref m[ j ] );
}// Swap

Observe that function Swap does not have a type parameter because it resides inside a typed class. This function increments the global count of swap operations and, when Uvar.outMode equals OutMode.verbose, it calls the typed function DisplayArray and the untyped function DisplaySwap. Finally, the actual swap is performed by the two-parameter function Swap.

C#
/// <summary>Swap two generic arguments passed by reference.
/// </summary>
/// <param name="x">First argument to be swapped.</param>
/// <param name="y">Second argument to be swapped.
/// </param>
public static void Swap( ref T x, ref T y )
{
   T tmp = x;

   x = y; y = tmp;
}// Swap

Function MaxChildIndex<T> has the same logic as in Section 3, but it calls the CompareTo function and the utility function UntypedUfn.IsGT. Function SortNodes<T> is the generic iterative version of its tail-recursive counterpart.

C#
/// <summary>Determine the index into {arr} of the maximum child of a parent
///          node in the heap.
/// </summary>
/// <typeparam name="T">Type of the array elements.</typeparam>
/// <param name="arr">Reference to a generic array containing the heap nodes.</param>
/// <param name="leftChild">Index into {arr} of the left child of a parent node.</param>
/// <param name="bottom">Index into {arr} of the bottom-most node.</param>
/// <returns>The index of the maximum child.
/// </returns>
private static int MaxChildIndex<T>( T[] arr, int leftChild, int bottom )
                   where T : IComparable<T>
{
   int rightChild = leftChild + 1;

   return ( leftChild == bottom )
          ? leftChild
          : UntypedUfn.IsGT( arr[ leftChild ].CompareTo( arr[ rightChild ] ) )
            ? leftChild
            : rightChild;
}// MaxChildIndex

/// <summary>Sort in ascending order the nodes of a heap stored in an array.
/// </summary>
/// <typeparam name="T">Type of the array elements.</typeparam>
/// <param name="arr">Reference to a generic array containing the heap nodes.</param>
/// <param name="bottom">Index into {arr} of the bottom-most heap node.
/// <remarks>
///           On the initial call, {bottom} must index the LAST element of {arr}.
/// </remarks>
private static void SortNodes<T>( T[] arr, int bottom ) where T : IComparable<T>
{
   while ( bottom > 0 )
   {
      // Put { arr[ 0 ] } in its proper place.

      TypedUfn<T>.Swap( " SortNodes", arr, 0, bottom-- );

      ReheapDown( arr, 0, bottom );
   }
}// SortNodes

For comparison purposes, the execution of generic HeapSort will be illustrated hand-in-hand with the execution of generic SmoothSort. But, for the curious reader, Figure 9 shows the result of HeapSort on an array of the first 25 prime numbers in reverse (descending) order. Observe that function BuildHeap performs no swaps because the original array is already a max-heap.

Image 9

To emphasize the fact that Heapsort has the same time complexity for an N-element array whether or not it is nearly or completely sorted, when run with the resulting array in Figure 9 function BuildHeap performs 21 swaps and 42 comparisons, and the totals at the end of the sort are 108 swaps and 167 comparisons.

Since function HeapSort<T> is generic, Figure 10 shows the results of its execution of on an array of characters created from a character string.

Image 10

5. Generic Smoothsort

This sorting algorithm, proposed by Edsger W. Dijkstra (EWD796a 1981) as an alternative to Heapsort, has been commented upon by several people. On the Stack Overflow site, it has been stated that the algorithm is not stable (a big word borrowed from Electrical Engineering and Control Theory) and that stability is more desirable than in-place sorting. Recall that an unstable sort does not preserve the order of identical elements (not a big deal for sorting purposes) and that Quicksort (another excellent sorting algorithm) is also unstable. So, the sorting in-place and the smooth transition from to execution time when an N-element array is nearly or completely sorted are excellent features of Smoothsort. On the other hand, it has been said that the resources on this algorithm are very scarce, and that those available are very hard to understand. As this section will demonstrate, Dijkstra’s design and implementation of Smoothsort are not hard to understand.

The author is aware of at least one implementation of Smoothsort in C++ (Schwarz 2011). After re-hashing the mathematics underlying its design, with some errors on the lengths of Leonardo sub-trees in two figures, Schwarz ends his article with a link to a smoothsort implementation of his own, that has nothing in common with Dijkstra’s code. The class name, global-variable names, and the function signatures in the GenericSmoothSort namespace implementing Dijkstra’s Smoothsort algorithm are shown in Figure 11.

Image 11

Dijkstra’s code was written in what later became the formalism used by him and W.H.J. Feijen in their book A Method of Programming (Dijkstra and Feijen 1988). With the exception of functions Up and Down, the code in this section is almost a direct translation of Dijkstra’s code into C#. The comments on the right of the C# code reproduce the original code.

C#
/// <summary>Generic implementation of Professor Dijkstra's Smoothsort
///          (see EWD796a in the E.W. Dijkstra Archive).
/// </summary>
/// <typeparam name="T">Type of array {m} elements.</typeparam>
/// <param name="m">Reference to a generic array to be sorted.</param>
/// <remarks>
///          The comments to the right of the C# code reproduce the original code
///             in the EWD796a reference.
///          The '^' character denotes the logical "and" operator.
/// </remarks>
public static void SmoothSort<T>( T[] m ) where T : IComparable<T>
                                           // smoothsort:
{
                                           // |[
   N = m.Length;
   r1 = b1 = c1 = -1;

   q = 1; r = 0; p = b = c = 1;            //  ; q := 1; r := 0; p, b, c := 1, 1, 1
   // P3' && P4'                           //  {invariant: P3' ^ P4'}

   while ( q != N )                        //  ; do q != N
   {
      r1 = r;                              //      -> r1 := r

      if ( ( p % 8 ) == 3 )                //       ; if p mod 8 = 3
      {
         b1 = b; c1 = c;                   //           -> b1, c1 := b, c
         Sift( m );                        //            ; sift
         p = ( p + 1 ) >> 2;               //            ; p := (p + 1)/4
         Up( ref b, ref c );               //            ; up
         Up( ref b, ref c );               //            ; up
      }
      else if ( ( p % 4 ) == 1 )           //         [] p mod 4 = 1
      {
         if ( ( q + c ) < N )              //           -> if q + c < N
         {
            b1 = b; c1 = c;                //                -> b1, c1 := b, c
            Sift( m );                     //                 ; sift
         }
         else if ( ( q + c ) >= N )        //              [] q + c >= N
         {
            Trinkle( m );                  //                -> trinkle
         }                                 //              fi
         Down( ref b, ref c ); p <<= 1;    //              ; down; p := 2 * p
         while ( b != 1 )                  //              ; do b != 1
         {
            Down( ref b, ref c ); p <<= 1; //                  -> down; p := 2 * p
         }                                 //                od
         ++p;                              //              ; p := p + 1
      }                                    //         fi;
      ++q; ++r;                            //         q := q + 1; r := r + 1
   }                                       //    od {P3' ^ P4'};
   // P3' && P4'

   r1 = r; Trinkle( m );                   //    r1 := r; trinkle {invariant: P3 ^ P4}
   // P3 && P4

   while ( q != 1 )                        //  ; do q != 1
   {
      --q;                                 //      -> q := q - 1
      if ( b == 1 )                        //      ;  if b = 1
      {
         --r; --p;                         //           -> r := r - 1; p :=  p - 1
         while ( UntypedUfn.IsEven( p ) )  //            ; do even(p)
         {
            p >>= 1;                       //                -> p := p/2
            Up( ref b, ref c );            //                 ; up
         }                                 //              od
      }
      else if ( b >= 3 )                   //         [] b >= 3
      {
         --p; r = r - b + c;               //           -> p := p - 1; r := r - b + c

         if ( p == 0 )                     //            ; if p = 0
         {
                                           //                -> skip
         }
         else if ( p > 0 )                 //              [] p > 0
         {
            SemiTrinkle( m );              //                -> semitrinkle
         }                                 //              fi
         Down( ref b, ref c );             //            ; down
         p = ( p << 1 ) + 1;               //            ; p := 2*p + 1
         r = r + c;                        //            ; r := r + c
         SemiTrinkle( m );                 //            ; semitrinkle
         Down( ref b, ref c );             //            ; down
         p = ( p << 1 ) + 1;               //            ; p := 2*p + 1
      }                                    //         fi
   }                                       //    od
}// SmoothSort                             // ]|

This article is not the place to re-hash Dijkstra’s mathematics underlying his design of Smoothsort. The reader is best referred to his original description (Dijkstra 1981). For completeness sake, some of the terminology and the invariants maintained by the while loops in function SmoothSort are reproduced in Figure 12.

Image 12

The beauty of Smoothsort is the re-arrangement of the array elements to create Leonardo trees in the first while loop. Leonardo trees are binary trees whose sizes are Leonardo numbers, which are the available stretch lengths (cf. EWD796a – 3), defined by the following recurrence relation.

$LP(0) = LP(1) = 1$
$LP(n+2)=LP(n+1)+LP(n)+1$

Quoting Dijkstra, ‘[f]or the sake of the recurrent stretch length computations, we introduce for each stretch length b its "companion" c, i.e., we maintain’

$(En:n\geq 0:b=LP(n) and c=LP(n-1)$

In words, there exists a positive integer n such that b and c are, respectively, the consecutive Leonardo numbers LP( n ) and LP( n - 1 ).

In a 0-based array, the root of a Leonardo tree of size LP( n ) nodes is at index LP( n ) – 1, its left child is at index LP( n – 1 ) – 1, and its right child is at index LP( n ) – 2. Thus, Dijkstra’s choice of Leonardo numbers guarantees that every parent node always has two children (cf. EWD796a – 7, Remark 2). As an illustration, Figures 13 and 14 show excerpts from the console output obtained by executing SmoothSort on the array of the first 25 prime numbers originally in descending order. The array declaration and the call to SmoothSort are as follows.

C#
int[] a1 = { 97, 89, 83, 79, 73, 71, 67, 61, 59, 53, 47, 43, 41, 37, 31, 29, 23,
             19, 17, 13, 11,  7,  5,  3,  2 },

TypedUfn<int>.SortArray( "Smooth Sort a1 (worst case)",
                         SmS.SmoothSort<int>, a1 ); // Worst case, O( N log2 N ) time.

Image 13

The Leonardo tree shown in Figure 13 is invalid because two children exceed their parent. The designations "invalid" and "valid" for trees correspond to Dijkstra’s "dubious" and "trusty" designations (cf. EWD796a – 6) for stretches, that is, sequences of consecutive array elements. Figure 14 shows the state of the array and the Leonardo trees after swap 25.

Image 14

The first while loop in function SmoothSort scans the array from left to right, and the second while loop scans it from right to left. Figure 15 illustrates the state of the array and the Leonardo tree just after swap 30.

Image 15

The rest of this section provides the translation into C# of the additional functions implementing Smoothsort. The summary sections in their Intellisense comments reproduce Dijkstra’s descriptions.

Functions Up and Down subsume, respectively, functions up/up1 and down/down1 in the original code. The temporary variables in both functions are necessary because, as in the Algol programming language, the original functions used multiple assignment to variables and C# does not support it.

C#
/// <summary>Function to subsume the {up} and {up1} functions.
/// </summary>
/// <param name="b">Reference to variable {b} or {b1} in the original code.</param>
/// <param name="c">Reference to variable {c} or {c1} in the original code.
/// </param>
/// <remarks>In the original code,
///             up:  b, c := b + c + 1, b
///             up1: b1, c1 := b1 + c1 + 1, b1  .
///          The variables {tmp1} and {tmp2} are necessary because, unlike Algol,
///          C# does not support multiple assignment.
/// </remarks>
private static void Up( ref int b, ref int c )
{
   int tmp1 = b + c + 1, tmp2 = b;

   b = tmp1; c = tmp2;
}// Up

/// <summary>Function to subsume the {down} and {down1} functions.
/// </summary>
/// <param name="b">Reference to variable {b} or {b1} in the original code.</param>
/// <param name="c">Reference to variable {c} or {c1} in the original code.
/// </param>
/// <remarks>In the original code,
///             down:  b, c := c, b - c - 1
///             down1: b1, c1 := c1, b1 - c1 - 1  .
///          The variables {tmp1} and {tmp2} are necessary because, unlike Algol,
///          C# does not support multiple assignment.
/// </remarks>
private static void Down( ref int b, ref int c )
{
   int tmp1 = c, tmp2 = b - c - 1;

   b = tmp1; c = tmp2;
}// Down

Functions Sift, Trinkle and SemiTrinkle implement the equivalent of the "reheap down" operation in HeapSort.

C#
/// <summary>Sift the root of a "stretch".
/// </summary>
/// <typeparam name="T">Type of array {m} elements.</typeparam>
/// <param name="m">Reference to the array in which the sift takes place.
/// </param>
/// <remarks>Routine {sift} is applied to the root { m(r1) } of a stretch of
///             length {b1}, of which {c1} is the "companion" (cf. EWD796a - 9).
///          The calls to function {DisplayState} are NOT in the original code.
/// </remarks>
private static void Sift<T>( T[] m ) where T : IComparable<T>
{                                              // sift:
   while ( b1 >= 3 )                           // do b1 >= 3
   {
      int r2 = r1 - b1 + c1, cmp;              //   |[ r2: int; r2 := r1 - b1 + c1

      cmp = m[ r2 ].CompareTo( m[ r1 - 1 ] );

      if ( UntypedUfn.IsGE( cmp ) )            //    ; if m(r2) >= m(r1 - 1)
      {
                                               //        -> skip
      }
      else if ( UntypedUfn.IsLE( cmp ) )       //      [] m(r2) <= m(r1 - 1)
      {
         r2 = r1 - 1;                          //        -> r2 := r1 - 1
         Down( ref b1, ref c1 );               //         ; down1
      }                                        //      fi
      cmp = m[ r1 ].CompareTo( m[ r2 ] );
      if ( UntypedUfn.IsGE( cmp ) )            //    ; if m(r1) >=  m(r2)
      {
         b1 = 1;                               //        -> b1 := 1
      }
      else if ( UntypedUfn.IsLT( cmp ) )       //      [] m(r1) < m(r2)
      {
         TypedUfn<T>.Swap( "       Sift",
                           m, r1, r2 );        //        -> m:swap(r1, r2)
         DisplayState( m );

         r1 = r2;                              //         ; r1 := r2
         Down( ref b1, ref c1 );               //         ; down1
      }                                        //      fi
                                               //   ]|
   }                                           // od
}// Sift

/// <summary>Synonym of "trickle". To flow down by drops.
/// </summary>
/// <typeparam name="T">Type of array {m} elements.</typeparam>
/// <param name="m">The array in which the trinkle takes place.
/// </param>
/// <remarks>Routine {trinkle} is applied to the root { m(r1) } of the last stretch of
///             the standard concatenation represented by the triple (p, b, c); this
///             representation need not be normalized (cf. EWD796a - 9).
///          The calls to function {DisplayState} are NOT in the original code.
/// </remarks>
private static void Trinkle<T>( T[] m ) where T : IComparable<T>
{                                                  // trinkle:
   int p1 = p; b1 = b; c1 = c;                     // |[ p1: int; p1, b1, c1 := p, b, c

   while ( p1 > 0 )                                //  ; do p1 > 0 ->
   {
      int r3;                                      //      |[ r3: int

      while ( UntypedUfn.IsEven( p1 ) )            //       ; do even(p1)
      {
         p1 >>= 1;                                 //           -> p1 := p1/2
         Up( ref b1, ref c1 );                     //            ; up1
      }                                            //         od
      r3 = r1 - b1;                                //       ; r3 := r1 - b1

      int cmp;

      if ( p1 == 1                                 //       ; if p1 = 1
           ||                                      //            cor
           UntypedUfn.IsLE(
              (cmp
               = m[ r3 ].CompareTo( m[ r1 ] )) ) ) //            m(r3) <= m(r1)
      {
         p1 = 0;                                   //           -> p1 := 0
      }
      else if ( p1 > 1 && UntypedUfn.IsGT( cmp ) ) //         [] p1 > 1 cand m(r3) > m(r1)
      {
         --p1;                                     //           -> p1 := p1 - 1
         if ( b1 == 1 )                            //            ; if b1 = 1
         {
            TypedUfn<T>.Swap( "    Trinkle",
                              m, r1, r3 );         //                -> m:swap(r1, r3)
            DisplayState( m );

            r1 = r3;                               //                 ; r1 := r3
         }
         else if ( b1 >= 3 )                       //              [] b1 >= 3
         {
            int r2 = r1 - b1 + c1;                 //                |[ r2: int; r2 := r1 - b1 + c1

            cmp
              = m[ r2 ].CompareTo( m[ r1 - 1 ] );

            if ( UntypedUfn.IsGE( cmp ) )          //                 ; if m(r2) >= m(r1 - 1)
            {
                                                   //                     -> skip
            }
            else if ( UntypedUfn.IsLE( cmp ) )     //                   [] m(r2) <= m(r1 - 1 )
            {
               r2 = r1 - 1;                        //                     -> r2 := r1 - 1
               Down( ref b1, ref c1 );             //                      ; down1
               p1 <<= 1;                           //                      ; p1 := 2 * p1
            }                                      //                   fi
            cmp = m[ r3 ].CompareTo( m[ r2 ] );

            if ( UntypedUfn.IsGE( cmp ) )          //                 ; if m(r3) >= m(r2)
            {
               TypedUfn<T>.Swap( "    Trinkle",
                                 m, r1, r3 );      //                     -> m:swap(r1, r3)
               DisplayState( m );

               r1 = r3;                            //                      ; r1 := r3
            }
            else if ( UntypedUfn.IsLE( cmp ) )     //                   [] m(r3) <= m(r2)
            {
               TypedUfn<T>.Swap( "    Trinkle",
                                 m, r1, r2 );      //                     -> m:swap(r1,r2)
               DisplayState( m );

               r1 = r2;                            //                      ; r1 := r2
               Down( ref b1, ref c1 );             //                      ; down1
               p1 = 0;                             //                      ; p1 := 0
            }                                      //                   fi
         }                                         //                ]|
      }                                            //              fi
   }                                               //      ]|
                                                   //    od
                                                   // ]|
   Sift( m );                                      // ; sift
}// Trinkle

/// <summary>Synonym of semi-trickle.
/// </summary>
/// <typeparam name="T">Type of array {m} elements.</typeparam>
/// <param name="m">Reference to the array in which the trinkle takes place.
/// </param>
/// <remarks>Routine {semitrinkle} is applied to the root { m(r) } of a
///             stretch of length {c} which is preceded by the nonempty
///             standard concatenation represented by the triple (p, b, c);
///             again this representation is not necessarily normalized
///             (cf. EWD796a - 9).
///          The call to function {DisplayState} is NOT in the original code.
/// </remarks>
private static void SemiTrinkle<T>( T[] m ) where T : IComparable<T>
{                                         // semitrinkle:
   r1 = r - c;                            //   r1 := r - c

   int cmp = m[ r1 ].CompareTo( m[ r ] );

   if ( UntypedUfn.IsLE( cmp ) )          // ; if m(r1) <= m(r)
   {
                                          //     -> skip
   }
   else if ( UntypedUfn.IsGT( cmp ) )     //   [] m(r1) > m(r)
   {
      TypedUfn<T>.Swap( "SemiTrinkle",
                        m, r, r1 );       //     -> m:swap(r, r1)
      DisplayState( m );

      Trinkle( m );                       //      ; trinkle
   }                                      //   fi
}// SemiTrinkle

There are two functions which were not in Dijkstra’s original code. They are used to display on the console the state variables (b, c, p, r, q) of SmoothSort and, as shown in Figures 13, 14 and 15, information on the Leonardo trees that occur in the array being sorted.

C#
// Functions NOT defined in the original code.

/// <summary>Display on the console the state of the global variables,
///          of the array being sorted, and the Leonardo tree(s).
/// </summary>
/// <typeparam name="T">Type of array {m} elements.</typeparam>
/// <param name="m">Reference to the array in which the sorting is taking place.
/// </param>
private static void DisplayState<T>( T[] m ) where T : IComparable<T>
{
   LPtree<T> tree = new LPtree<T>( m, b, c );

   if ( Uvar.outMode == OutMode.verbose )
   {
      DisplayVars();
   }
   tree.TrySaveAndDisplay();
}// DisplayState

/// <summary>Display on the console the global variables {p, b, c, r, q}.
/// </summary>
private static void DisplayVars()
{
   int idx, k = LPfn.InvLP( b );
   string fmtStr1 = Uvar.elementFormat, fmtStr2 = "LP({0})";

   UntypedUfn.SkipOB( out idx ); UntypedUfn.DisplayStr( " ", ref idx, b );
   // idx == b
   Console.Write( fmtStr1, "b" );
   Console.WriteLine( fmtStr2, k );

   UntypedUfn.SkipOB( out idx ); UntypedUfn.DisplayStr( " ", ref idx, c );
   // idx == c
   Console.Write( fmtStr1, "c" );
   Console.WriteLine( fmtStr2, k - 1 );

   Console.WriteLine( "p: {0} ({1})", p, UntypedUfn.ToBinaryStr( p ) );

   UntypedUfn.SkipOB( out idx ); UntypedUfn.DisplayStr( " ", ref idx, r );
   // idx == r
   Console.Write( fmtStr1, "r" );
   Console.WriteLine( fmtStr1, "q" );
   Console.WriteLine();
}// DisplayVars

As a final example in this section, Figures 16a and 16b show the state of an array containing the first 50 prime numbers, originally in descending order, just before and after swap 45. 

Image 16

Image 17

6. Smoothsort vs. Heapsort

An array that initially is in descending order represents a worst-case scenario for any algorithm that sorts elements in ascending order. In the example of the first 50 prime numbers, upon termination, SmoothSort performed 223 swaps and 1323 comparisons, while HeapSort performed 207 swaps and 379 comparisons. Figure 17 shows the declarations of five sample integer arrays, the templates for the calls to SmoothSort and HeapSort, and a table of the number of swaps and comparisons performed in each case. Clearly, SmoothSort outperforms HeapSort when the original array is nearly or completely sorted.

Image 18

7. Utility Functions

The library UtilSortLib contains typed and untyped functions that are common to sorting algorithms. The typed functions are in class TypedUfn<T> and the untyped ones are in class UntypedUfn. The most important typed function is the one called to sort an array using a reference to a particular sorting function, whose signature must adhere to the delegate specification.

C#
/// <summary>Delegate (i.e., function reference) for sorting functions.
/// </summary>
/// <param name="arr">Reference to a generic array to be sorted.
/// </param>
public delegate void SortingFunction( T[] arr );

/// <summary>Function to invoke a sorting function, displaying on the console
///          the sorting process (if so desired by the caller).
/// </summary>
/// <param name="msg">Message to display.</param>
/// <param name="SortFn">Reference (i.e., pointer) to a sorting function.</param>
/// <param name="arr">Reference to a generic array to be sorted.</param>
/// <param name="mode">Mode for console output.</param>
/// <param name="dispTree">Whether or not to display {arr} as a binary tree.
/// </param>
public static void SortArray( string msg, SortingFunction SortFn, T[] arr,
                              OutMode mode = OutMode.quiet, bool dispTree = false )
{
   SetVariables( arr, mode );

   Console.WriteLine();
   if ( ! String.IsNullOrEmpty( msg ) )
   {
      Console.WriteLine( "{0}\n", msg );
   }
   if ( dispTree )
   {
      TypedUfn<T>.DisplayTree( "before " + msg, arr );
   }
   DisplayArray( "Original", arr ); Console.WriteLine();

   SortFn( arr ); // Apply the sorting function to the array.

   DisplayArray( "Sorted", arr );
   if ( dispTree )
   {
      TypedUfn<T>.DisplayTree( "after " + msg, arr );
   }
   Console.WriteLine( "\n{0} swaps; {1} comparisons",
                      Uvar.swapCount, Uvar.compCount );
   Console.WriteLine();
} // SortArray

Function SetVariables is called to initialize the global variables used for display purposes during sorting. As indicated, this function must be called before calling functions DisplayArray and/or DisplayTree for the first time.

C#
/// <summary>Set the global variables in class {Uvar}.
/// </summary>
/// <param name="arr">Generic array.</param>
/// <param name="mode">Mode for console output.
/// </param>
/// <remarks>This function MUST be called before calling functions
///          {DisplayArray} {DisplayIndices} and {DisplayTree} for
///          the first time.
/// </remarks>
public static void SetVariables( T[] arr, OutMode mode )
{
   Uvar.elementWidth = MaxElementWidth( arr );

   // Generate the element-format {string} to be used by
   // functions {DisplayArray} and {DisplayTree}.

   Uvar.elementFormat
      = "{0"
        + String.Format( ",{0}", Uvar.elementWidth )
        + "} ";

   Uvar.elementParams
       = String.Format( "\nElement width: {0}, element format string: '{1}'",
                        Uvar.elementWidth, Uvar.elementFormat );
   Uvar.swapCount = 0;
   Uvar.compCount = 0;
   Uvar.outMode = mode;
}// SetVariables

Since the sorting function is expected to handle generic arrays, function MaxElementWidth is called to determine the maximum array-element width to be used for proper formatting during display.

C#
/// <summary>Compute the maximum width of array elements for display purposes.
/// </summary>
/// <param name="arr">Reference to a generic array.
/// </param>
private static int MaxElementWidth( T[] arr )
{
   int maxWidth = int.MinValue;

   if ( arr[ 0 ] is char )
   {
      maxWidth = 3; // Account for single quotes.
   }
   else
   {
      int N = arr.Length, width = 0;

      for ( int i = 0; i < N; ++i )
      {
         string str = arr[ i ].ToString();
         width = str.Length;

         if ( width > maxWidth )
         {
            maxWidth = width;
         }
      }
      if ( arr[ 0 ] is string )
      {
         maxWidth += 2; // Account for double quotes.
      }
   }
   return maxWidth;
}// MaxElementWidth

There are two versions of function Swap. The first one just takes two reference arguments and exchanges them. The second is the one that usually should be called in case the swap operation is to be displayed on the console, as illustrated in Figures 13, 14 and 15.

C#
/// <summary>Swap two generic arguments passed by reference.
/// </summary>
/// <param name="x">First argument to be swapped.</param>
/// <param name="y">Second argument to be swapped.
/// </param>
public static void Swap( ref T x, ref T y )
{
   T tmp = x;

   x = y; y = tmp;
}// Swap

/// <summary>Swap two generic array elements.
/// </summary>
/// <param name="fName">Name of function calling {Swap}.</param>
/// <param name="m">Reference to a generic array.</param>
/// <param name="i">Index of element to be swapped.</param>
/// <param name="j">Index of element to be swapped.
/// </param>
public static void Swap( string fName, T[] m, int i, int j )
{
   ++Uvar.swapCount;
   if ( Uvar.outMode == OutMode.verbose )
   {
      TypedUfn<T>.DisplayArray( "", m );
      DisplaySwap( fName, m, i, j ); //  i, m[ i ], j, m[ j ] );
   }
   Swap( ref m[ i ], ref m[ j ] );
}// Swap

/// <summary>Display on the console a message describing a {Swap} operation
///          to be performed.
/// </summary>
/// <param name="fName">Name of function that will call {Swap}.</param>
/// <param name="m">Reference to a generic array.</param>
/// <param name="idx1">Index of first argument to {Swap}.</param>
/// <param name="idx2">Index of second argument to {Swap}.</param>
/// </param>
public static void DisplaySwap( string fName, T[] m, int idx1, int idx2 )
{
   int k, _1st, _2nd;

   if ( idx1 > idx2 ) // As in a swap operation in Smoothsort.
   {
      _1st = idx2; _2nd = idx1;
   }
   else // idx1 < idx2
   {
      _1st = idx1; _2nd = idx2;
   }

   UntypedUfn.SkipOB( out k ); UntypedUfn.DisplayStr( " ", ref k, _1st );
   // k == _1st
   Console.Write( Uvar.elementFormat, "↑" );

   ++k;
   UntypedUfn.DisplayStr( " ", ref k, _2nd );
   // k == _2nd
   Console.WriteLine( Uvar.elementFormat, "↑" );

   Console.WriteLine(
          "\t{0}: swap( m[ {1,2} ]:{2,2}, m[ {3,2} ]:{4,2} ), swapCount: {5}\n",
          fName, idx1, ElementToString( m, idx1 ), idx2, ElementToString( m, idx2 ),
          Uvar.swapCount );
}// DisplaySwap

Two of the last four typed utility functions are DisplayArray to display on the console an array in linear form, and DisplayTree to display an array as a binary tree. Both functions take care of displaying array elements in a suitable format, the second one also taking care of the spacing of tree nodes. Function DisplayArray calls function DisplayIndices to show the indices above the array elements.

C#
/// <summary>Display on the console an array of generic elements.
/// </summary>
/// <param name="msg">Description of {arr}.</param>
/// <param name="arr">Array to be displayed.</param>
/// <param name="leadSpace">Optional lead spacing.
/// </param>
public static void DisplayArray( string msg, T[] arr, string leadSpace = "" )
{
   int N = arr.Length;

   if ( !String.IsNullOrEmpty( msg ) )
   {
      Console.WriteLine( "\n{0} array:\n", msg );
   }
   DisplayIndices( N, leadSpace );
   if ( !String.IsNullOrEmpty( leadSpace ) )
   {
      Console.Write( leadSpace );
   }
   Console.Write( "[ " );
   for ( int i = 0; i < N; ++i )
   {
      Console.Write( Uvar.elementFormat, ElementToString( arr, i ) );
   }
   Console.WriteLine( "]" );
}// DisplayArray

/// <summary>Display the indices for the elements of an array.
/// </summary>
/// <param name="N">Array length.</param>
/// <param name="leadSpace">Optional lead spacing.
/// </param>
private static void DisplayIndices( int N, string leadSpace = "" )
{
   if ( !String.IsNullOrEmpty( leadSpace ) )
   {
      Console.Write( leadSpace );
   }
   Console.Write( "  " ); // Go past "[ ".
   for ( int i = 0; i < N; ++i )
   {
      Console.Write( Uvar.elementFormat, i );
   }
   Console.WriteLine();
}// DisplayIndices

/// <summary>Display a complete binary tree whose nodes are
///          stored in a generic array.
/// </summary>
/// <param name="msg">Message describing the tree.</param>
/// <param name="a">Generic array containing a complete binary tree.</param>
/// <remarks>
///           If {N} is the length of {a}, then for every index {i} in the interval
///           [ 0 .. ( {N} - 3 ) / 2) ] of a parent node in the complete binary
///           tree, the left child is at index ( 2 * {i} + 1 ) and the right child
///           is at index ( 2 * {i} + 2 ).
/// </remarks>
public static void DisplayTree( string msg, T[] a )
{
   if ( !String.IsNullOrEmpty( msg ) )
   {
      Console.WriteLine( "\nTree {0}:", msg );
   }
   int N = a.Length, // Number of array elements.
       maxLevel,     // Maximum level of a perfect binary tree
                     //    containing at least {N} nodes.
       level = 0,    // Current tree level.
       nodes = 1,    // Number of nodes to display at {level}.

       // Lead spacing for first node at {level}.
       leadSp = UntypedUfn.CeilPof2forTree( N, out maxLevel ),
       // Spacing between consecutive nodes at {level}.
       elemSp = 0;

   int k = 0; // Count of nodes displayed.

   do
   {
      Console.WriteLine();
      UntypedUfn.Spaces( leadSp );

      for ( int j = 1; j <= nodes && k < N; ++j )
      {
         Console.Write( Uvar.elementFormat, ElementToString( a, k ) );
         ++k;
         UntypedUfn.Spaces( elemSp - 2 );
      }
      Console.WriteLine( "\n" );
      elemSp = leadSp;
      leadSp >>= 1;
      ++level;
      nodes <<= 1;
   } while ( level <= maxLevel );
   Console.WriteLine();
}// DisplayTree

Both DisplayArray and DisplayTree call function ElementToString to account for the cases when array elements are of type char or string. Observe that if the array elements are of a non-primitive data type the definition of such type must override function ToString, and must implement an integer-valued CompareTo function.

C#
/// <summary>Generate a {string} representation of a generic array-element.
/// </summary>
/// <param name="arr">Reference to a generic array.</param>
/// <param name="i">Index into {arr}.</param>
/// <returns>The generated string.
/// </returns>
public static string ElementToString( T[] arr, int i )
{
   string str = arr[ i ].ToString();

   if ( arr[ i ] is char )
   {
      return Uvar._1quote + str + Uvar._1quote;
   }
   if ( arr[ i ] is string )
   {
      return Uvar._2quote + str + Uvar._2quote;
   }
   return str;
}// ElementToString

Finally, function DisplayTree calls function UntypedUfn.CeilPof2forTree to compute the smallest power of two for a perfect tree that can contain a given number of nodes, and the maximum level of such a tree. Those numbers are used in the spacing and the display of nodes per level of the tree.

C#
/// <summary>Compute the smallest power of two for a perfect binary
///          tree containing at least a given number of nodes.
/// </summary>
/// <param name="nodes">Minimum number of nodes in the tree.</param>
/// <param name="maxLevel">{out} parameter: Maximum 0-based level of the tree.</param>
/// <returns>The smallest power of two required.
/// </returns>
/// <remarks>Usually (as in Heapsort) {nodes} is the number of nodes in a
///          complete binary tree.
/// </remarks>
public static int CeilPof2forTree( int nodes, out int maxLevel )
{
   maxLevel = 0;

   int pOf2 = 1, // == 2 ↑ maxLevel
       sum = 1;  // Sum of nodes in the tree.

   while ( sum < nodes )
   {
      ++maxLevel;
      pOf2 <<= 1; // == 2 ↑ maxLevel
      sum += pOf2;
   }
   // sum >= nodes
   return pOf2;
}// CeilPof2forTree

Most of the untyped utility functions (in class UntypedUfn) are self-explanatory via their Intellisense comments and will not be described here.

8. Sorting Other Data Types

In this section, SmoothSort is shown to be able to handle arbitrary data types. To begin with, consider sorting an array of characters. The following code is suitable to provide a simple example. Observe that the values of the two optional arguments to function SortArray have been overridden.

C#
string str2 = "a sample string";
char[] c2 = str2.ToCharArray();

TypedUfn<char>.SortArray( "Smooth Sort c2", SmS.SmoothSort<char>, c2,
                          OutMode.verbose, true );

The execution of SortArray with the given arguments produces a lot of console output. The array is displayed as a tree before and after the sort, the calls to function Swap are documented, and the state variables (b, c, p, r, q) and unique Leonardo trees are displayed after each swap. Figure 18 shows excerpts from the console output.

Image 19

In order to demonstrate the sorting of elements from an arbitrary data type, consider the definition of a class to represent points in the two-dimensional, integer x-y plane. (The author is aware of the struct Point defined in the System.Drawing namespace. However, that struct does not define a CompareTo function, which is what was needed to be illustrated for a user-defined data type.)

C#
/// <summary>Class to encapsulate points on the two-dimensional {int} x-y plane.
/// </summary>
public class _2Dpoint : IComparable<_2Dpoint>
{
   private int x, y;

   public _2Dpoint( int _x, int _y )
   {
      x = _x; y = _y;
   }// _2Dpoint

   public override string ToString()
   {
      return String.Format( "({0} {1})", x, y );
   }// ToString

   /// <summary>Point comparison by lexicographic (i.e., dictionary) order.
   /// </summary>
   /// <param name="p">{_2Dpoint} to compare against.</param>
   /// <returns>  -1 if {this} is less than {p},
   ///             0 if {this} is equal to {p},
   ///            +1 if {this} is greater than {p}.
   /// </returns>
   public int CompareTo( _2Dpoint p )
   {
      return   ( x < p.x || ( x == p.x && y < p.y ) )
             ? -1
             : ( x == p.x && y == p.y ) ? 0 : 1;
   }// CompareTo

   public static _2Dpoint[] _2DpointArray( params int[] xyPairs )
   {
      int n = xyPairs.Length;

      if ( ( n % 2 ) != 0 )
      {
         throw new Exception( "_2DpointArray: argument list length must be even" );
      }
      _2Dpoint[] arr = new _2Dpoint[ n >> 1 ];

      int i = 0, j = 0;

      while ( i < n )
      {
         arr[ j ] = new _2Dpoint( xyPairs[ i ], xyPairs[ i + 1 ] );
         i += 2;
         ++j;
      }
      return arr;
   }// _2DpointArray
}// _2Dpoint (class)

The class _2Dpoint is very simple. It defines the basic constructor, overrides function ToString and, most importantly, defines the required integer-valued CompareTo function for sorting purposes. The comparison of two points is done according to lexicographic ordering. The class also defines a static function to generate an array of _2Dpoint instances. Such a function expects a series of pairs of numbers defining points.

The following code creates an array of 13 points, and applies SmoothSort to the array. Figure 19 shows excerpts from the console output. Observe that instances of class _2Dpoint are shown inside braces, as in {3 0}, to avoid confusion with their encoding with LP_Index instances, as with ( 1 6). (This is shown by red highlighting.)

C#
_2Dpoint[] g = _2Dpoint._2DpointArray(
                       8, 3, 5, 5, 4, 1, 1, 9, 5, 4, 1, 0,
                       4, 4, 3, 0, 4, 2, 4, 1, 1, 0, 0, 0, -1, 0 );

TypedUfn<_2Dpoint>.SortArray( "Smooth Sort g", SmS.SmoothSort<_2Dpoint>, g, OutMode.verbose );

Image 20

9. Functions on Leonardo Numbers and Leonardo Trees

By Dijkstra’s design, Leonardo numbers and trees are an integral part of Smoothsort. The class library LPlib contains several classes and a lot of functions to deal with Leonardo numbers and Leonardo trees. For ease of testing and editing, they were written within the GenericSmoothSort namespace but now they reside in a library of their own. The classes and their functions will be presented in the order of their development.

It should be noted that the LPlib functionality is NOT necessary for the execution of SmoothSort. If all the calls to function DisplayState after each call to Swap in file SmS.cs were commented out, the LP-related classes and functions would not be needed and the sort would work as expected. They are used solely to document on the console the operation of SmoothSort. So, the reader may skip this section if so desired.

9.1. Class LPfn

This static class contains a static variable and static functions related to Leonardo numbers. Function LP computes if necessary and returns a Leonardo number. As stated in Section 5, Leonardo numbers are defined by the following recurrence relation:

$LP(0) = LP(1) = 1$
$LP(n+2)=LP(n+1)+LP(n)+1$

Function LP is memoized by the use of a list of Leonardo numbers, Lpnumber, whose count of elements can increase during execution. As a consequence, Leonardo numbers are not computed recursively and are generated on demand.

C#
// List of generated Leonardo numbers. Used to memoize function {LP}.

private static List<int> LPnumber = new List<int>() { 1, 1, 3 };

/// <summary>Compute the Leonardo number indexed by {k}.
/// </summary>
/// <param name="k">Index into list {LPnumber}.</param>
/// <returns>The corresponding Leonardo number.</returns>
/// <remarks>
///          The sequence of Leonardo numbers is defined as (cf. EWD796a - 3):
///
///          LP(-1) LP(0) LP(1) LP(2) LP(3) LP(4) LP(5) LP(6) LP(7) ...
///            (-1)     1     1     3     5     9    15    25    41 ...
/// </remarks>
public static int LP( int k )
{
   if ( k < 0 )
   {
      return -1;
   }
   else
   {
      int n = LPnumber.Count - 1; // Index of maximum Leonardo number generated.

      while ( n < k ) // Generate { LPnumber[ n + 1 ] }.
      {
            LPnumber.Add( LPnumber[ n ] + LPnumber[ n - 1 ] + 1 );
            ++n;
      }
      // {n == k}
      return LPnumber[ k ];
   }
}// LP

Function InvLP computes the inverse, that is, the index, of a Leonardo number if it exists. This function is not strictly necessary, for it has been used only once (by function SmS.DisplayVars) for display purposes.

C#
/// <summary>Compute the inverse of a Leonardo number if it exists.
/// </summary>
/// <param name="lp">Leonardo number.
/// </param>
/// <returns>The index into array {LPnumber} for index {lp},
///          if it exists.
/// </returns>
/// <remarks>The function returns {-1} if {lp} is determined
///             NOT to be a Leonardo number.
///          The array {LPnumber} is NOT extended during the
///             search for the index corresponding to {lp}.
/// </remarks>
public static int InvLP( int lp )
{
   int i = -1;

   if ( lp > 0 )
   {
      int n = LPnumber.Count;

      i = 1;
      while ( ( i < n ) && ( LPnumber[ i ] != lp ) )
      {
         if ( LPnumber[ i ] > lp )
         {
            break; // {lp} is NOT a Leonardo number.
         }
         ++i;
      }
   }
   return i;
}// InvLP

There are two versions of a function (LPnumbers) to compute two consecutive Leonardo numbers. The first one determines an index n such that the values of the global variables b and c in SmoothSort are equal to two consecutive Leonardo numbers indexed by n (cf. EWD796a – 3). The second, computes two consecutive Leonardo numbers given the index of the larger one.

C#
/// <summary>Compute a value of { n >= 0 } such that
///          { LP( n ) == b} && { LP( n - 1 ) == c}.
/// </summary>
/// <param name="b">Value of {b}.</param>
/// <param name="c">Value of {c}.</param>
/// <returns>Computed value of {n}.</returns>
/// <remarks>
///    The computation of {n} is due to the following predicate
///    (cf. EWD796a - 3):
///
///             (E n: n >= 0: b = LP( n ) && c = LP( n - 1 )
///
///    In words, there exists a non-negative {n} such that
///    { b == LP( n ) } && { c == LP( n - 1 ) }, that is,
///    {b} and {c} are consecutive Leonardo numbers.
/// </remarks>
public static int LPnumbers( int b, int c )
{
   int n = 0, LPn = 1, LPn_1 = -1;

   // LPn == LP( n ); LPn_1 == LP( n - 1 )

   while ( !( n > b ) && !( LPn == b && LPn_1 == c ) )
   {
      ++n;

      int tmp = LPn;

      LPn = LPn + LPn_1 + 1; // LP( n ).
      LPn_1 = tmp;           // LP( n - 1 ).
   }
   if ( n > b ) // {b} and {c} are NOT consecutive Leonardo numbers.
   {
      n = -1; LPn = -1; LPn_1 = -1; // Return failure.
   }
   return n;
}// LPnumbers

/// <summary>Compute two consecutive Leonardo numbers for a given index.
/// </summary>
/// <param name="k">Index of the second (larger) Leonardo number.</param>
/// <param name="LPn">{out} parameter: {LP( k )}.</param>
/// <param name="LPn_1">{out} parameter: {LP( k - 1 )}.
/// </param>
public static void LPnumbers( int k, out int LPn, out int LPn_1 )
{
   int n = 0; LPn = 1; LPn_1 = -1;

   while ( n < k )
   {
      ++n;

      int tmp = LPn;

      LPn = LPn + LPn_1 + 1; // == LP( n ).
      LPn_1 = tmp;           // == LP( n - 1 ).
   }
   // {n == k}
}// LPnumbers

9.2. Class LP_Index

This class encapsulates a Leonardo number and an index into an array. It is the basic building block for encoding nodes of a Leonardo tree. The class overrides function ToString, and provides a CompareTo function and a function to determine if an encoding trivially corresponds to a singleton node.

C#
public class LP_Index
{
   private int lp;    // LP (or Lk) Leonardo number.
   private int index; // Index into a linear array.

   // Properties
   public int LP { get { return lp; } }
   public int Index { get { return index; } }

   public LP_Index( int _lp, int _index )
   {
      lp = _lp;
      index = _index;
   }// LP_Index (constructor)

   /// <summary>Overriden function {ToString}.
   /// </summary>
   /// <returns>Formatted {string} "({lp}, {index})".
   /// </returns>
   public override string ToString()
   {
      return String.Format( "({0,2} {1,2})", lp, index );
   }// ToString

   /// <summary>Lexicographic-order comparison of {this} instance and
   ///          another {LP_Index} instance.
   /// </summary>
   /// <param name="other">{LP_Index} instance to compare against.</param>
   /// <returns>{-1} if {this} is less than {other}.
   ///           {0} if {this} and {other} are equal.
   ///          {+1} if {this} is greater than {other}.
   /// </returns>
   public int CompareTo( LP_Index other )
   {
      if ( (lp < other.lp ) || ( lp == other.lp && index < other.index ) )
      {
         return -1;
      }
      if ( lp == other.lp && index == other.index )
      {
         return 0;
      }
      return +1;
   }// CompareTo

   /// <summary>Determine whether the node encoded by {this} is a
   ///          singleton (leaf) node.
   /// </summary>
   /// <returns>{true} if the node is singleton; {false} otherwise.
   /// </returns>
   /// <remarks>A node in a Leonardo tree is singleton if its leonardo
   ///          number equals 1.
   /// </remarks>
   public bool Is_Singleton()
   {
      return lp == 1;
   }// Is_Singleton
}// LP_Index (class)

9.3. Class LPtriple

This class encapsulates three LP_Index instances that encode nodes of a Leonardo tree: a left child, a right child and their parent. Instances of this class are used by class LPtree to describe the Leonardo trees over a span of the array being sorted, as delimited by global variable b and its "companion" c in SmoothSort.

C#
public class LPtriple<T> where T : IComparable<T>
{
   // Encoding of three nodes of a Leonardo tree.

   private LP_Index parent, leftChild, rightChild; // Node encodings.
   private string pKey, lChKey, rChKey;            // Keys into a dictionary.
   private string strRepr;                         // String representation of the triple.

   /// <summary>Empty constructor.
   /// </summary>
   public LPtriple()
   {
      parent = leftChild = rightChild = null;
      pKey = lChKey = rChKey = null;
      Displayed = false;
      strRepr = null;
   }// LPtriple

   /// <summary>// Encode three nodes of a possible Leonardo tree.
   /// </summary>
   /// <param name="pairs">Sequence of three {LP, Index} pairs.
   /// </param>
   /// <remarks>The tree is labeled "possible" because the validity
   ///          of its nodes must be verified by the caller.
   /// </remarks>
   /// <returns>Array encoding three nodes of a possible Leonardo tree.
   /// </returns>
   public LPtriple( params int[] pairs )
   {
      parent = leftChild = rightChild = null;
      pKey = lChKey = rChKey = null;

      if ( pairs.Length != 6 )
      {
         throw new Exception( "LPtriple constructor: 6 {int} parameters expected" );
      }
      parent = new LP_Index( pairs[ 0 ], pairs[ 1 ] );     // Encoding of parent node.
      leftChild = new LP_Index( pairs[ 2 ], pairs[ 3 ] );  // Encoding of left child node.
      rightChild = new LP_Index( pairs[ 4 ], pairs[ 5 ] ); // Encoding of right child node.
      Displayed = false;

      // Keys into  a dictionary

      pKey = parent.ToString();
      lChKey = leftChild.ToString();
      rChKey = rightChild.ToString();

      strRepr = ToString(); // {string} representation.
   }// LPtriple

   // Properties.

   public LP_Index Parent { get { return parent; } }
   public LP_Index LeftChild { get { return leftChild; } }
   public LP_Index RightChild { get { return rightChild; } }

   public string ParentKey { get { return pKey; } }
   public string LchildKey { get { return lChKey; } }
   public string RchildKey { get { return rChKey; } }

   public bool Displayed { get; set; }

   public int Width { get { return strRepr.Length; } }
   public string StrRepr { get { return strRepr; } }

   /// <summary>Generate a {string} representation of {this} triple.
   /// </summary>
   /// <returns>The generated string.
   /// </returns>
   public override string ToString()
   {
      return String.Format( "< {0} {1} {2} >",
                            LchildKey, RchildKey, ParentKey,
                            Displayed ? " *" : "" );
   }// ToString

   /// <summary>Generate a long {string} representation of {this} triple.
   /// </summary>
   /// <param name="m">Reference to a generic array.</param>
   /// <returns>The generated string.
   /// </returns>
   /// <remarks>The generated string contains the string representations
   ///          of the array {m} elements indexed by the triple.
   /// </remarks>
   public string ToLongString( T[] m )
   {
      return String.Format( "< {0}: {1}, {2}: {3}, {4}: {5} >{6}",
                            LchildKey, TypedUfn<T>.ElementToString( m, leftChild.Index ),
                            RchildKey, TypedUfn<T>.ElementToString( m, rightChild.Index ),
                            ParentKey, TypedUfn<T>.ElementToString( m, parent.Index ),
                            Displayed ? " *" : "" );
   }// ToLongString

   /// <summary>Make a fresh copy of {this} triple.
   /// </summary>
   /// <returns>Copy of {this} triple.
   /// </returns>
   public LPtriple<T> Copy()
   {
      LPtriple<T> _copy = new LPtriple<T>();

      _copy.parent = this.parent;
      _copy.leftChild = this.leftChild;
      _copy.rightChild = this.rightChild;
      _copy.pKey = this.pKey;
      _copy.lChKey = this.lChKey;
      _copy.rChKey = this.rChKey;
      _copy.strRepr = this.strRepr;
      _copy.Displayed = this.Displayed;

      return _copy;
   }// Copy

   /// <summary>Determine whether {this} triple is a valid Leonardo-tree triple.
   /// </summary>
   /// <param name="m">Reference to a generic array.</param>
   /// <returns>{true} if {this} triple is valid; {false} otherwise.
   /// </returns>
   /// <remarks>A triple is considered valid if the children nodes in {m} do not exceed
   ///           their parent node.
   /// </remarks>

   public bool IsValid( T[] m )
   {
      return UntypedUfn.IsLE( m[ leftChild.Index ].CompareTo( m[ parent.Index ] ) )
             &&
             UntypedUfn.IsLE( m[ rightChild.Index ].CompareTo( m[ parent.Index ] ) );
   }// IsValid
}// LPtriple (class)

The class overrides the standard ToString function and provides a function ToLongString which includes the values of the array elements indexed by the triple, and an asterisk if the triple has been displayed on the console. There is a Copy constructor and a function to determine whether or not a triple encodes a valid Leonardo tree or sub-tree, that is, whether the children do not exceed their parent.

9.4. Classes LPvar and LPtree

The static class LPvar contains a one-time global variable that was removed from SmoothSort because there was no reason for it to be there since it was not used in Dijkstra’s original code.

C#
using System;
using System.Collections.Generic;

namespace LPlib
{
   /// <summary>Class to encapsulate a one-time global variable. 
   /// </summary>
   public static class LPvar
   {
      // The following global variable is used to keep track of generated dictionaries of the 
      // elements from a generic array indexed by Leonardo trees, so that (sub-)trees of saved 
      // and displayed trees are not displayed again on the console.
      //
      // The index-dictionary keys are {string} representations of {LP_Index} instances, and the
      // values are arrays containing the {string} representation of left-child, right-child and 
      // parent nodes, in that order, from the array elements indexed by {LPtriple} instances.

      public static List<Dictionary<string, string[]>> nodeValuesDictList = null;

   }// LPvar (static class)
}// LPlib (namespace)

Variable nodeValuesDictList is used to keep track of dictionaries generated during the execution of SmoothSort to prevent displaying trees or sub-trees that already have been displayed. A natural choice would have been to use instances of the LP_Index class as dictionary keys. However, they cannot be used because the Dictionary class in C# computes hash codes for the keys, and can only apply hashing to primitive data types. Therefore, the string representations of LP_Index instances were used as dictionary keys. Since the class LPtree has several fields and many functions, they will be described separately.

The class LPtree contains several private fields, declared in the following partial description.

C#
using System;
using System.Collections.Generic;

using UtilSortLib;

namespace LPlib
{
   public class LPtree<T> where T : IComparable<T>
   {
      private T[] mSnap; // Snapshot of a generic array.
      private int b, c;  // Values of global variables {b} and {c}
                         //    declared in class {SmS}.
  
      private List<List<LPtriple<T>>> 
              cTriples, // List of {LPtriple} triples lists for upper bound {c}.
              bTriples; // List of {LPtriple} triples lists for upper bound {b}.

      private string cBounds, bBounds; // Interval bounds for {cTriples} and {bTriples}.
      private LP_Index rootNode;       // Reference to the root of a "b-c" Leonardo tree.

      // Dictionary of triples for (possibly disjoint) Leonardo trees.
      private Dictionary<string, LPtriple<T>> triplesDict;

      // Dictionary of values from {mSnap} elements.
      private Dictionary<string, string[]> nodeValuesDict;

      // Queue of {LP_Index} node descriptors.
      private Queue<LP_Index> nodeQueue;

 ... 
   }// LPtree (class)
}// LPlib (namespace)

The class constructor initializes all the fields and the single global variable LPvar.nodeValuesDictList on the first call. Since this constructor is called by function SmS.DisplayState after each call to TypedUfn<T>.Swap, it takes as arguments the current state of the array being sorted and the values of the global variables b and c in SmoothSort. Then it ends by calling GenerateTriples to generate the triples in the half-open intervals [ 0, c ) and [ c, b ) over the array.

C#
public LPtree( T[] _m, int _b, int _c )
{
   mSnap = new T[ _m.Length ]; // Take a snapshot of array
   _m.CopyTo( mSnap, 0 );      //    referenced by {_m}.
   b = _b; c = _c;             // Copy arguments from class {SmS}
                               //    global variables.
   if ( LPvar.nodeValuesDictList == null )
   {
      // Initialization on first call to {LPtree} constructor.

      LPvar.nodeValuesDictList = new List<Dictionary<string, string[]>>();
   }
   cTriples = new List<List<LPtriple<T>>>();   // Empty lists of triples.
   bTriples = new List<List<LPtriple<T>>>();
   cBounds = bBounds = String.Empty;
   triplesDict
      = new Dictionary<string, LPtriple<T>>(); // Empty dictionary of triples.
   nodeValuesDict
      = new Dictionary<string, string[]>();   // Empty dictionary of {string} array values.
   rootNode = null;                           // Root node of a "b-c" Leonardo tree.
   nodeQueue = null;                          // Queue of nodes at same level.

   GenerateTriples();
}// LPtree (constructor)

Triples are generated over the specified intervals as long as c is greater than 1 and then if b – c is greater than 2. They are generated by function IntervalTriples, and added to two dictionaries by function AddToDictionaries. This last function returns the LP_Index encoding of the root node of a "b-c" Leonardo tree.

C#
private void GenerateTriples()
{
   // ( E n: n >= 0 && b == LPfn.LP( n ) && c == LPfn.LP( n - 1 ) )
   // &&
   // b >= c

   triplesDict = new Dictionary<string, LPtriple<T>>();
   rootNode = null;

   int i = LPfn.InvLP( c );

   if ( c > 1 )
   {
      cTriples = IntervalTriples( 0, c );
      cBounds = String.Format( "[{0}, {1})", 0, c );

      AddNewList( RootTriple( i, c ), cTriples );

      rootNode = AddToDictionaries( cTriples, triplesDict );
   }
   if ( ( b - c ) > 2 )
   {
      bTriples = IntervalTriples( c, b );
      bBounds = String.Format( "[{0}, {1})", c, b );

      ++i;
      AddNewList( RootTriple( i, b ), bTriples );

      rootNode = AddToDictionaries( bTriples, triplesDict );
   }
   nodeQueue = new Queue<LP_Index>();

   if ( rootNode != null )
   {
      nodeQueue.Enqueue( rootNode );
   }
}// GenerateTriples

Function IntervalTriples generates a list of LPtriple lists over a half-open interval of the array being sorted. The reason for creating a list of lists is that some triples encode nodes at an upper level in the tree hierarchy and other triples encode nodes at the same level as other nodes. Figure 20 shows an example of both cases.

C#
/// <summary>Generate a list of {LPtriple} lists for a half-open interval
///          [ {lb}, {ub} ) of {mSnap}.
/// </summary>
/// <param name="lb">Lower bound on array {mSnap}.</param>
/// <param name="ub">Upper bound on array {mSnap}.</param>
/// <returns>The list of lists of triples generated.
/// </returns>
/// <remarks>The upper bound {ub} is NOT included in the last triple.
/// </remarks>
private List<List<LPtriple<T>>> IntervalTriples( int lb, int ub )
{
   List<List<LPtriple<T>>> listOfTriplesLists = new List<List<LPtriple<T>>>();

   int i = 1,              // Index of Leonardo number.
       LPi = LPfn.LP( i ), // Leonardo number {i}.
       k = lb;             // Running index into array {m}.

   LPtriple<T> triple0 = new LPtriple<T>(), triple1 = new LPtriple<T>();

   while ( LPi < ub )
   {
      ++i;
      LPi = LPfn.LP( i );

      if ( lb + LPi >= ub )
      {
         break;
      }
      k = lb + LPi - 1;

      MakeNewTriple( i, k, ref triple0, ref triple1, listOfTriplesLists );
   };

   i = 1;

   while ( k < ub )
   {
      ++i;
      LPi = LPfn.LP( i );
      k += LPi;

      if ( k >= ub )
      {
         break;
      }
      MakeNewTriple( i, k, ref triple0, ref triple1, listOfTriplesLists );
   };
   return listOfTriplesLists;
}// IntervalTriples

Image 21

The decision to create a new level of triples or to extend a list of triples at the current last level is carried out by function MakeNewTriple, which calls either AddNewList or ExtendLastList to perform those tasks.

C#
/// <summary>Generate a parent, left-child, right-child triple to index
///          elements from array {mSnap}.
/// </summary>
/// <param name="i">Index of a Leonardo number.</param>
/// <param name="k">Index into array {mSnap}.</param>
/// <param name="triple0">Reference to the last triple generated (if any).</param>
/// <param name="triple1">Reference to the variable that will get the new triple.</param>
/// <param name="listOfTriplesLists">Reference to a list of lists of triples.
/// </param>
private void MakeNewTriple( int i, int k,
                            ref LPtriple<T> triple0, ref LPtriple<T> triple1,
                            List<List<LPtriple<T>>> listOfTriplesLists )
{
   triple0 = triple1.Copy();

   triple1
      = new LPtriple<T>( LPfn.LP( i ), k,           // for mSnap[ k ],
                         LPfn.LP( i - 1 ), k - 2,   // for mSnap[ k - 2 ],
                         LPfn.LP( i - 2 ), k - 1 ); // for mSnap[ k - 1 ] );

   if ( ( listOfTriplesLists.Count == 0 )
        ||
        UntypedUfn.IsEQ(
                  mSnap[ triple1.LeftChild.Index ].CompareTo( mSnap[ triple0.Parent.Index ] ) ) )
   {
      AddNewList( triple1, listOfTriplesLists );
   }
   else
   {
      ExtendLastList( listOfTriplesLists, triple1 );
   }
}// MakeNewTriple

/// <summary>Create a new list containing a triple and add the list to a list
///          of lists of triples.
/// </summary>
/// <param name="triple">Triple to add to the new list.</param>
/// <param name="listOfTriplesLists">List of lists of triples.
/// </param>
private void AddNewList( LPtriple<T> triple, List<List<LPtriple<T>>> listOfTriplesLists )
{
   List<LPtriple<T>> tripleList = new List<LPtriple<T>>();

   tripleList.Add( triple );
   listOfTriplesLists.Add( tripleList );
}// AddNewList

/// <summary>Add a triple to the last list of triples.
/// </summary>
/// <param name="listOfTriplesLists">List of lists of triples.</param>
/// <param name="triple">Triple to add.
/// </param>
private void ExtendLastList( List<List<LPtriple<T>>> listOfTriplesLists, LPtriple<T> triple )
{
   List<LPtriple<T>> tripleList = listOfTriplesLists[ listOfTriplesLists.Count - 1 ];

   tripleList.Add( triple );
}// ExtendLastList

Function RootTriple is called by GenerateTriples to create the root-node triple for the generated lists of c- and b-triples, adding a new list with that node.

C#
/// <summary>Create the root triple for a list of triples.
/// </summary>
/// <param name="i">Index of Leonardo number.</param>
/// <param name="k">Index into array {mSnap}.</param>
/// <returns>The triple generated.
/// </returns>
private LPtriple<T> RootTriple( int i, int k )
{
   // root index        = k - 1
   // left child index  = LP( i - 1 ) - 1
   // right child index = k - 2

   int LPi_1 = LPfn.LP( i - 1 );

   return new LPtriple<T>( LPfn.LP( i ), k - 1,       // for mSnap[ k - 1 ],
                           LPi_1, LPi_1 - 1,          // for mSnap[ LPfn.LP( i - 1 ) - 1 ],
                           LPfn.LP( i - 2 ), k - 2 ); // for mSnap[ k - 2 ] );
}// RootTriple

When the lists of c- and b-triples are complete, function AddToDictionaries is called to add the triples from each list to the dictionary of triples that will be used to display Leonardo trees, and to the dictionary nodeValuesDict which is used to avoid displaying Leonardo trees more than once. The string representation of the LP_Index coding of the parent from each triple is used as the key in both dictionaries. The values in the first dictionary are triples, while in the second the values are arrays containing the string representations of the indexed elements from the snapshot (mSnap) of the array being sorted.

C#
/// <summary>Add to two dictionaries the corresponding elements from
///           a list of lists of triples.
/// </summary>
/// <param name="listOfTriplesLists">List of lists of triples.</param>
/// <param name="dictionary">Reference to a dictionary.</param>
/// <returns>Reference to the {LP_Index} encoding the parent
///           (a root node) from the last triple (if any).
/// </returns>
private LP_Index AddToDictionaries( List<List<LPtriple<T>>> listOfTriplesLists,
                                    Dictionary<string, LPtriple<T>> dictionary )
{
   LPtriple<T> lastTriple = null;

   foreach ( List<LPtriple<T>> triplesList in listOfTriplesLists )
   {
      foreach ( LPtriple<T> triple in triplesList )
      {
         string key = triple.Parent.ToString();

         dictionary.Add( key, triple );

         string[] valueArr =
            new string[ 3 ] { mSnap[ triple.LeftChild.Index ].ToString(),
                              mSnap[ triple.RightChild.Index ].ToString(),
                              mSnap[ triple.Parent.Index ].ToString() };

         nodeValuesDict.Add( key, valueArr );
         lastTriple = triple;
      }
   }
   return lastTriple == null ? null : lastTriple.Parent;
}// AddToDictionaries

As mentioned in the caption of Figure 20, if the value of the optional argument to function TrySaveAndDisplay is true, the c- and b-triples are displayed on the console by calling function DisplayTriples. Then, if there are nodes to be displayed and if the dictionary nodeValuesDict is not in LPvar.nodeValuesDictList (the global dictionary list) it is added to that list, and the Leonardo tree(s) are displayed, starting with the tree spanned by the c and b sub-trees and continuing with the disjoint trees.

C#
/// <summary>Try to save in a list, and display on the console the c- and b-triples
///          (if requested), the dictionary of triples, the (L, R, ↑) labels for
///          array elements, and the nodes of the Leonardo tree(s).
/// </summary>
/// <param name="dispTriples">Optional parameter indicating whether to display
///                           the generated c- and b-triples (if any).
/// </param>
public void TrySaveAndDisplay( bool dispTriples = false )
{
   if ( triplesDict.Count > 0 )
   {
      if ( dispTriples )
      {
         if ( cTriples.Count > 0 )
         {
            DisplayTriples( "c", cBounds, cTriples );
            if ( bTriples.Count > 0 )
            {
               DisplayTriples( "b", bBounds, bTriples );
            }
         }
      }
      if ( ( nodeQueue.Count > 0 ) )
      {
         if ( !InDictionaryList( LPvar.nodeValuesDictList ) )
         {
            LPvar.nodeValuesDictList.Add( nodeValuesDict );

            DisplayLabels( triplesDict );

            DisplayNodes( nodeQueue, triplesDict ); // Tree spanned by the
                                                    //    "b" and "c" sub-trees.
            DisplayDisjointTrees( triplesDict );
         }
      }
   }
}// TrySaveAndDisplay

Function DisplayTriples is called by TrySaveAndDisplay to display the lists of c and b triples. Function InDictionaryList determines, with the help of functions SubTreeOf and SameElementValues, whether a dictionary of node values is in the global list of dictionaries. The Intellisense comments of those functions are pretty much self- explanatory.

C#
/// <summary>Display on the console a list of triples encoding a Leonardo tree.
/// </summary>
/// <param name="msg">Name of the list.</param>
/// <param name="listOfTriplesLists">List of lists of triples.
/// </param>
private void DisplayTriples( string name, string bounds,
                             List<List<LPtriple<T>>> listOfTriplesLists )
{
   Console.WriteLine( "\n{0} sub-tree triples {1} :\n", name, bounds );

   foreach ( List<LPtriple<T>> triplesList in listOfTriplesLists )
   {
      foreach ( LPtriple<T> triple in triplesList )
      {
         Console.Write( triple.ToLongString( mSnap ) );
         Console.Write( " " );
      }
      Console.WriteLine();
   }
}// DisplayTriples

/// <summary>Determine whether the dictionary {nodeValuesDict} contains the
///          same node values as another dictionary in the global list of
///          node-values dictionaries.
/// </summary>
/// <returns>{true} if there is a saved dictionary with the same node values.
///          {false} otherwise.
/// </returns>
private bool InDictionaryList( List<Dictionary<string, string[]>> nodeValsDictList )
{
   bool inList = false;

   if ( nodeValsDictList.Count > 0 )
   {
      foreach ( Dictionary<string, string[]> otherDict in nodeValsDictList )
      {
         if ( SubTreeOf( otherDict ) )
         {
            inList = true; break;
         }
      }
   }
   return inList;
}// InDictionaryList

/// <summary>Determine whether dictionary {nodeValuesDict} and another
///          dictionary contain the same node values.
/// </summary>
/// <param name="otherDict">Reference to another dictionary.</param>
/// <returns>{true} if both dictionaries contain the same node values.
///          {false} otherwise.
/// </returns>
/// <remarks>By definition, a tree is a sub-tree of itself.  Hence, this
///          function also tests tree equality.
/// </remarks>
private bool SubTreeOf( Dictionary<string, string[]> otherDict )
{
   foreach ( KeyValuePair<string, string[]> kvp in nodeValuesDict )
   {
      if ( otherDict.ContainsKey( kvp.Key ) )
      {
         if ( !SameElementValues( kvp.Value, otherDict[ kvp.Key ] ) )
         {
            return false;
         }
      }
      else
      {
         return false;
      }
   }
   return true;
}// SubTreeOf

/// <summary>Determine whether two {string} arrays contain the same values.
/// </summary>
/// <param name="idxArr1">Reference to first array.</param>
/// <param name="idxArr2">Reference to second array.</param>
/// <returns>{true} if the arrays contain the same values; {false} otherwise.
/// </returns>
private bool SameElementValues( string[] idxArr1, string[] idxArr2 )
{
   return idxArr1[ 0 ].Equals( idxArr2[ 0 ] )  // Left child.
          &&
          idxArr1[ 1 ].Equals( idxArr2[ 1 ] )  // Right child.
          &&
          idxArr1[ 2 ].Equals( idxArr2[ 2 ] ); // Parent
}// SameElementValues

If a dictionary of node values is not in the global list, function TrySaveAndDisplay adds it to the list and calls DisplayLabels to display, with some leading space, the snap of the array being sorted and, one line at a time, a triple of node encodings and labels ("L", "R" and "↑") to mark the left-child, right-child and root nodes of Leonardo trees encoded by the triple. The leading space is computed by function MaxTripleWidth.

C#
/// <summary>Display on the console the array being sorted, the triples
///          in a dictionary, and "L", "R" and "↑" labels under the
///          array elements.
/// </summary>
/// <param name="dictionary">Dictionary of {LPtriple} instances.
/// </param>
private void DisplayLabels( Dictionary<string, LPtriple<T>> dictionary )
{
   string fmtStr = Uvar.elementFormat,
          leadSpFmt = "{0," + MaxTripleWidth( dictionary ).ToString() + "}",
          leadSpace = String.Format( leadSpFmt, " " );

   Console.WriteLine();
   TypedUfn<T>.DisplayArray( "", mSnap, leadSpace );

   foreach ( KeyValuePair<string, LPtriple<T>> kvp in dictionary )
   {
      LPtriple<T> triple = kvp.Value;

      int idx;

      Console.Write( triple.StrRepr );
      UntypedUfn.SkipOB( out idx );
      UntypedUfn.DisplayStr( " ", ref idx, triple.LeftChild.Index );
      Console.Write( fmtStr, "L" );
      UntypedUfn.DisplayStr( " ", ref idx, triple.RightChild.Index - 1 );
      Console.Write( fmtStr, "R" );
      UntypedUfn.DisplayStr( " ", ref idx, triple.Parent.Index - 2 );
      Console.WriteLine( fmtStr, "↑" );
   }
   Console.WriteLine();
}// DisplayLabels

/// <summary>Compute the maximum width (number of characters) of the {string}
///          representations of triples in a dictionary.
/// </summary>
/// <param name="dictionary">Dictionary of triples.</param>
/// <returns>The maximum width.
/// </returns>
private static int MaxTripleWidth( Dictionary<string, LPtriple<T>> dictionary )
{
   int maxWidth = 0;

   foreach ( KeyValuePair<string, LPtriple<T>> kvp in dictionary )
   {
      LPtriple<T> triple = kvp.Value;

      if ( triple.Width > maxWidth )
      {
         maxWidth = triple.Width;
      }
   }

   return maxWidth;
}// MaxTripleWidth

The last two functions called by TrySaveAndDisplay are DisplayNodes and DisplayDisjointTrees. The first is very similar to function DisplayTree in class TypedUfn<T> of UtilSortLib, but it uses queues to handle the level-by-level traversal of a tree (that is, the technique used in breadth-first search). After displaying the "b-c" Leonardo tree, function DisplayDisjointTrees is called to display any trees not accessible by that tree, as illustrated in Figures 14, 16a and 16b.

C#
/// <summary>Level-by-level (i.e., breadth-first) display on the console
///           the nodes of a Leonardo tree.
/// </summary>
/// <param name="nodeQueue">Queue of {LP_Index} elements encoding nodes from
///                         array {mSnap} to be displayed.</param>
/// <param name="dictionary">Dictionary of node-encoding triples.</param>
/// <remarks>The elements in the queue encode nodes at the same level
///          of a binary tree.
/// </remarks>
private void DisplayNodes( Queue<LP_Index> nodeQueue,
                           Dictionary<string, LPtriple<T>> dictionary )
{
   int maxLevel,
      // Leading space for first node.
       leadSp = UntypedUfn.CeilPof2forTree( 2 * dictionary.Count + 2,
                                            out maxLevel ),
      // Space between consecutive nodes at the same level.
       elemSp = 0;

   bool valid = true; // Whether or not the tree is valid.

   Queue<LP_Index> childrenQueue;
   List<string> absentKeys = new List<string>();

   Console.WriteLine();
   do
   {
      UntypedUfn.Spaces( leadSp );
      childrenQueue = new Queue<LP_Index>();

      while ( nodeQueue.Count > 0 )
      {
         LP_Index node = nodeQueue.Dequeue();
         string key = node.ToString();

         Console.Write( Uvar.elementFormat,
                        TypedUfn<T>.ElementToString( mSnap, node.Index ) );
         UntypedUfn.Spaces( elemSp - 2 );

         if ( dictionary.ContainsKey( key ) )  // {node} is a parent.
         {
            LPtriple<T> triple = dictionary[ node.ToString() ];

            childrenQueue.Enqueue( triple.LeftChild );
            childrenQueue.Enqueue( triple.RightChild );

            if ( valid ) // The tree is valid so far.
            {
               valid = triple.IsValid( mSnap );
            }
            triple.Displayed = true;
         }
         else if ( !node.Is_Singleton() ) // Not a singleton node.
         {
            absentKeys.Add( key );
         }
      }
      Console.WriteLine( "\n\n" );

      if ( childrenQueue.Count > 0 )
      {
         nodeQueue = childrenQueue;

         elemSp = leadSp;
         leadSp >>= 1;
      }
   } while ( childrenQueue.Count > 0 );

   Console.WriteLine( "{0} Leonardo tree\n", valid ? "Valid" : "Invalid" );
}// DisplayNodes
/// <summary>Display on the console the isolated Leonardo trees (if any).
/// </summary>
/// <param name="dictionary">Dictionary of triples.
/// </param>
private void DisplayDisjointTrees( Dictionary<string, LPtriple<T>> dictionary )
{
   Dictionary<string, LPtriple<T>> subDict;
   Queue<LP_Index> nodeQueue;
   LP_Index lastParent;

   do
   {
      subDict = new Dictionary<string, LPtriple<T>>();
      nodeQueue = new Queue<LP_Index>();
      lastParent = null;

      foreach ( KeyValuePair<string, LPtriple<T>> kvp in dictionary )
      {
         LPtriple<T> triple = kvp.Value;

         if ( !triple.Displayed )
         {
            subDict.Add( kvp.Key, kvp.Value );
            lastParent = triple.Parent;
         }
      }
      if ( lastParent != null )
      {
         nodeQueue.Enqueue( lastParent );
         DisplayNodes( nodeQueue, subDict );
      }
   } while ( lastParent != null );
}// DisplayDisjointTrees

10. Conclusion

This article has presented an almost literal translation into C# of Edsger W. Dijkstra’s Smoothsort algorithm. It was implemented as a collection of generic C# functions. Since Smoothsort was proposed as an alternative to Heapsort for sorting in place, a generic implementation of the latter in C# was also presented. Both algorithms were tested on integer and character data types, and on instances of a class to handle points on the integer x-y plane. In sample executions, Smoothsort was shown to outperform Heapsort when the original array is nearly or completely sorted.

11. Using the Code

The code described in this article is in several directories. The paths for the directories are listed and explained in Figure 21, as the code was stored in a USB drive with drive letter "F:". Each of the terminal directories in the paths contains a complete Visual Studio solution, so that the code is ready to use. The solutions under the TESTS directory (TestHeapSort and TestSmoothSort) are console applications. All the other solutions are libraries, which cannot be executed by themselves. The test applications provide examples of how to use the code, and can be run without changes to the libraries. Most of the examples are commented out and the reader can uncomment particular ones to run tests of the sorting algorithms.

Image 22

If any changes are made in the libraries, they should be re-built. Libraries _2DpointsLib and UtilSortLib are independent from each other and can be built in any order, but they MUST be built before GenericHeapSort. Library LPlib must be built after UtilSortLib and before GenericSmoothSort. The two sorting libraries are also independent from each other. With the re-built libraries, the test applications can be re-built in any order.

Each of the Visual Studio solutions listed in Figure 21 contains a _TXT sub-directory in which there are one or more text files describing the update history of the solution’s C# source file(s).

12. References

Dijkstra, E.W., 1981.
"Smoothsort, an alternative for sorting in situ." Article EWD796a, 16 August 1981. In the E.W.Dijkstra Archive, available from the Department of Computer Sciences of the University of Texas at Austin.

Dijkstra, E.W. and W.H.J. Feijen, 1988.
A Method of Programming. Addison-Wesley Publishing Co., Inc. Reading, Massachusetts. First English edition printed in 1988. Reprinted in 1989.

Orejel, J.L., 2003.
"Applied Algorithms and Data Structures." Unpublished textbook. Appendix VI Solutions to Chapter 6 Exercises VI.2.1, Section 6.4.1, pp. VI-8 - VI.9.

Schwarz, K., 2011.
"Smoothsort Demystified." Unpublished article. Posted at https://www.keithschwarz.com/smoothsort/.

van Gasteren, A.J.M, E.W. Dijkstra, and W.H.J Feijen, 1981.
"Heapsort." Article AvG8/EWD794/WF39, 22 June 1981. In the E.W.Dijkstra Archive, available from the Department of Computer Sciences of the University of Texas at Austin.

Williams, J,W.J., 1964.
"Algorithm 232 HEAPSORT." Communications of the ACM, Vol. 7, No. 6, June 1964, pp. 347-348. Reprinted in Collected Algorithms from ACM, Vol. II, Algorithms 221-492, Association for Computing Machinery, Inc., New York, 1980, p. 232-P 1-0.

13. Postscript

Image 23

In the Spring semester of 1986-87, as a doctoral student, the author had the privilege of taking Professor Edsger Wybe Dijkstra’s course "Capita Selecta in Computing Science" (Selected Topics in Computing Science) in the Department of Computer Sciences of the University of Texas at Austin. (Yes, I took the gruelling but fascinating final oral exam, having to prove the famous Knaster-Tarski theorem, and passed the course with a grade of "A"). This article is respectfully dedicated to Edsger’s memory.

History

  • 11th December, 2023: Initial version

License

This article, along with any associated source code and files, is licensed under The MIT License