This activity will present two insertion based sorting algorithms, the insertion sort and the binary tree sort algorithms.

Activity Details

Insertion sort Algorithm

The insertion sort is good for sorting moderately sized arrays (up to some tens of elements). For larger arrays, we will see later a much faster algorithm. The basic idea of insertion sort is as follows: Assume that at the start of iteration i of the algorithm, the first i elements of the array are already sorted. The array therefore consists of a left part which is sorted and a right part which is unsorted, as shown in Figure 3.1.1
.

Next the algorithm takes the (i+1)-th element x and inserts it into the appropriate position of the sorted part to the left of x by shifting the elements larger than x one position to the right. Thus, we have that the first i+1 element sorted. After n iterations (where n is the length of the array), the whole array is sorted. The insertion sort algorithm is presented below:

insertionSort(int[] a)
{ int n = a.length;
for (int i = 1; i < n; i++)
{ // a is sorted from 0 to i-1
int x = a[i];
int j = i-1;
while (j >= 0 && a[j] > x)
{ a[j+1] = a[j];
j = j-1; }
a[j+1] = x; }

The algorithm uses the variable x as a temporary buffer to hold the value of a[i] while the larger elements in the sorted part of the array are shifted one position to the right. When all elements have been shifted, this value is written back into the appropriate position of the array.

In the first iteration of the outer loop, the algorithm performs at most one comparison operation, in the second iteration at most two, and so on. The worst case occurs when the input array is sorted in descending order, in which we have that

1+2+...+(n-1) = n*(n-1)/2 = n2/2

comparisons (the number of element swap operations is approximately the same). Therefore, the time complexity of the algorithm is quadratic in the length n of the array. In average, however, in each iteration of the outer loop only half of the maximum number of comparisons has to be performed until the appropriate insertion point is found. Why? In this case the algorithm performs an average of approximately n2/4 steps.

Sorting an array by this algorithm is therefore much slower than searching for an element in an array (which takes at most n steps, if the array is unsorted, and at most log2 n steps, if the array is sorted). For instance, if the size of the array is 1024, approximately 262000 comparison operations have to be performed to sort the array. If the array has 8192 elements, approximately 16 million comparisons have to be performed.

However, for data that are already sorted or almost sorted, the insertion sort does much better. When data is in order, the inner loop is never executed. Why? In this case, the algorithm needs only n steps. If the data is almost sorted, the inner loop is only very infrequently executed and the algorithm runs in “almost” n steps. It is therefore a simple and efficient way to order a collection that is only slightly out of order.

Binary Tree Sort Algorithm

Given an array of elements to sort, the tree sort algorithms proceeds as follows:

Build a binary search tree out of the elements

Traverse the tree in order and copy the elements back into the array.

This algorithm time complexity is as follows:

Worst case - Occurs when the binary search tree is unbalanced, for example when the data is already sorted. Building the binary search tree takes O(n2), and to traverse the binary tree takes O(n) time. The total is O(n2) + O(n) = O(n2).

Best case - Building the binary search tree takes O(n log n) and to traverse the binary tree takes O(n). The total is O(n log n) + O(n) = O(n log n).

Conclusion

This activity presented two sorting algorithms based on the insertion method. Both algorithms' performance were quadratic in the worst case. Such algorithms may be suited to sorting small data collections.

Assessment

Write one to three sentences to answer each of the following questions:

Insertion sort is better for nearly-sorted lists than reverse-ordered lists. Why?

How much extra memory is required for each sort?

If the list is composed of four identical elements, how many elements change position when insertion sort is used? Why?

For each of the following lists, draw the result of each insertion of the insertion sorting routine. You do not need to show the result of each comparison, just the final insertion of the element.

[ 5 | 4 | 1 ]

[ 3 | 1 | 3 ]

[ 2 | 5 | 4 | 0 ]

Consider the following unsorted list: [ 6 | 8 | 3 | 5 | 1 | 9 | 2 | 2 ]. Show the steps used by the tree sort algorithm to obtain a sorted list.