For those who have a hard time understanding it.
Many code fragments has been copypasted from https://rosettacode.org/wiki/Sorting_algorithms/Quicksort and reworked.
This is not a Quicksort, but the algorithm I've devised right now.
We find an arithmetic mean of the input array and call it "pivot". All elements of array smaller than pivot, are stacked at left. All elements larger than pivot - at right. All elements equals to pivot - at the center.
#!/usr/bin/env python def arith_mean(lst): return sum(lst)/len(lst) def my_sort(arr, padding): if len(arr) <= 1: return arr less = [] pivots = [] more = [] pivot=arith_mean(arr) print " "*padding + "start: pivot=", pivot, "input=", arr for i in arr: if i < pivot: less.append(i) elif i > pivot: more.append(i) else: pivots.append(i) print " "*padding + "finish: less=", less, "pivots=", pivots, "more=", more return my_sort(less, padding+1) + pivots + my_sort(more, padding+1) a = [478, 184, 59, 265, 608, 37, 789, 0, 372, 296, 11, 251, 28, 92, 283, 832] r=my_sort(a, 0) print "result=", r
And how it works:
start: pivot= 286 input= [478, 184, 59, 265, 608, 37, 789, 0, 372, 296, 11, 251, 28, 92, 283, 832] finish: less= [184, 59, 265, 37, 0, 11, 251, 28, 92, 283] pivots= [] more= [478, 608, 789, 372, 296, 832] start: pivot= 121 input= [184, 59, 265, 37, 0, 11, 251, 28, 92, 283] finish: less= [59, 37, 0, 11, 28, 92] pivots= [] more= [184, 265, 251, 283] start: pivot= 37 input= [59, 37, 0, 11, 28, 92] finish: less= [0, 11, 28] pivots= [37] more= [59, 92] start: pivot= 13 input= [0, 11, 28] finish: less= [0, 11] pivots= [] more= [28] start: pivot= 5 input= [0, 11] finish: less= [0] pivots= [] more= [11] start: pivot= 75 input= [59, 92] finish: less= [59] pivots= [] more= [92] start: pivot= 245 input= [184, 265, 251, 283] finish: less= [184] pivots= [] more= [265, 251, 283] start: pivot= 266 input= [265, 251, 283] finish: less= [265, 251] pivots= [] more= [283] start: pivot= 258 input= [265, 251] finish: less= [251] pivots= [] more= [265] start: pivot= 562 input= [478, 608, 789, 372, 296, 832] finish: less= [478, 372, 296] pivots= [] more= [608, 789, 832] start: pivot= 382 input= [478, 372, 296] finish: less= [372, 296] pivots= [] more= [478] start: pivot= 334 input= [372, 296] finish: less= [296] pivots= [] more= [372] start: pivot= 743 input= [608, 789, 832] finish: less= [608] pivots= [] more= [789, 832] start: pivot= 810 input= [789, 832] finish: less= [789] pivots= [] more= [832] result= [0, 11, 28, 37, 59, 92, 184, 251, 265, 283, 296, 372, 478, 608, 789, 832]
I use arithmetic mean to make left/right parts almost equal in size.
It works because you make a "sandwich" of input elements, which is in plain English can be described like that:
sort_recursively(elements smaller than pivot) + elements equal to pivot + sort_recursively(elements larger than pivot)"
If left and right parts are sorted (recursively), the resulting "sandwich" is also sorted.
Now the problem: computing arithmetic mean in sorting algorithm is absurdic and slow.
We can pick pivot just randomly.
#!/usr/bin/env python import random def quickSort(arr, padding): if len(arr) <= 1: return arr less = [] pivots = [] more = [] pivot = arr[random.randint(0, len(arr)-1)] #pivot = arr[len(arr)-1] # or last #pivot = arr[0] # or first print " "*padding + "start: pivot=", pivot, "input", arr for i in arr: if i < pivot: less.append(i) elif i > pivot: more.append(i) else: pivots.append(i) print " "*padding + "finish: less=", less, "pivots=", pivots, "more=", more return quickSort(less, padding+1) + pivots + quickSort(more, padding+1) a = [478, 184, 59, 265, 608, 37, 789, 0, 372, 296, 11, 251, 28, 92, 283, 832] r=quickSort(a, 0) print "result=", r
And it works (each time differently, though, but result is always sorted):
start: pivot= 608 input [478, 184, 59, 265, 608, 37, 789, 0, 372, 296, 11, 251, 28, 92, 283, 832] finish: less= [478, 184, 59, 265, 37, 0, 372, 296, 11, 251, 28, 92, 283] pivots= [608] more= [789, 832] start: pivot= 184 input [478, 184, 59, 265, 37, 0, 372, 296, 11, 251, 28, 92, 283] finish: less= [59, 37, 0, 11, 28, 92] pivots= [184] more= [478, 265, 372, 296, 251, 283] start: pivot= 37 input [59, 37, 0, 11, 28, 92] finish: less= [0, 11, 28] pivots= [37] more= [59, 92] start: pivot= 0 input [0, 11, 28] finish: less= [] pivots= [0] more= [11, 28] start: pivot= 28 input [11, 28] finish: less= [11] pivots= [28] more= [] start: pivot= 59 input [59, 92] finish: less= [] pivots= [59] more= [92] start: pivot= 296 input [478, 265, 372, 296, 251, 283] finish: less= [265, 251, 283] pivots= [296] more= [478, 372] start: pivot= 251 input [265, 251, 283] finish: less= [] pivots= [251] more= [265, 283] start: pivot= 265 input [265, 283] finish: less= [] pivots= [265] more= [283] start: pivot= 478 input [478, 372] finish: less= [372] pivots= [478] more= [] start: pivot= 832 input [789, 832] finish: less= [789] pivots= [832] more= [] result= [0, 11, 28, 37, 59, 92, 184, 251, 265, 283, 296, 372, 478, 608, 789, 832]
This is randomized quicksort, which is sometimes used. The randomized version is not the most popular. But I used it here to stress the fact that any element of the input array can be used as pivot. You can use the first or the last (by commenting/uncommenting corresponding lines).
Now you can understand the Haskell one-liner from rosettacode.com:
qsort [] = [] qsort (x:xs) = qsort [y | y <- xs, y < x] ++ [x] ++ qsort [y | y <- xs, y >= x]
x here is just the first element of the list. And simultaneously, it's a pivot.
In plain English, it can be described as:
pivot = first element of array sort_recursively(bunch of elements from array, smaller than pivot) + pivot + sort_recursively(...elements larger than pivot)
So is Python one-liner from the same website:
def qsort(L): return (qsort([y for y in L[1:] if y < L[0]]) + L[:1] + qsort([y for y in L[1:] if y >= L[0]])) if len(L) > 1 else L
Now the problem: you can rewrite Python/Haskell versions to pure C, but you will need to allocate/free/copy chunks of data. No good. You want to do this in-place.
This is the classic version written by C.A.R. Hoare in 1962, with my printing statements. It's the most popular implementation of Quicksort, and also, the most used sorting algorithm today.
And how it works: there are two pointers, first at left (begin) and second at right (end). Pointers moves toward each other to the middle, until they reach two elements, which are not in place: the first is larger than pivot and the second is smaller than pivot. When you find them two, you swap to in-place.
And when these two pointers met somewhere in the middle, a partitioning happens at this point. The left part is then sorted recursively, so is the right part.
#include <stdio.h> void quicksort(int *A, int len, int padding); void print_array(int *A, int len) { for (int i = 0; i < len; i++) printf("%3d ", A[i]); } int main (void) { int a[] = {478, 184, 59, 265, 608, 37, 789, 0, 123}; int n = sizeof a / sizeof a[0]; quicksort(a, n, 0); printf ("sorted: "); print_array(a, n); printf("\n"); return 0; } void swap(int *a, int *b) { int t=*a; *a=*b; *b=t; }; void print_n_tabs(int n) { for (int i=0; i<n; i++) printf ("\t"); }; /* print array like: 92 265 59 184 251 296 283 *** *** elements at indices i and j are underlined by *** */ void print_array_with_underlinings (int *A, int len, int u1, int u2, int padding) { print_n_tabs(padding); print_array(A, len); printf("\n"); print_n_tabs(padding); for (int z=0; z<len; z++) { if (z==u1 || z==u2) printf ("*** "); else printf (" "); }; printf ("\n"); }; void quicksort(int *A, int len, int padding) { if (len < 2) return; int pivot = A[len / 2]; print_n_tabs(padding); printf ("begin, len=%d, pivot=%d input= ", len, pivot); print_array(A, len); printf("\n"); int i, j; for (i = 0, j = len - 1; ; i++, j--) { while (A[i] < pivot) i++; print_n_tabs(padding); printf ("an element larger than pivot found: A[i=%d]=%d\n", i, A[i]); while (A[j] > pivot) j--; print_n_tabs(padding); printf ("an element smaller than pivot found: A[j=%d]=%d\n", j, A[j]); if (i >= j) { print_n_tabs(padding); printf ("i >= j, so we stop here. size of the first partition = %d, second = %d\n", i, len-i); break; }; print_n_tabs(padding); printf ("swapping at i=%d,j=%d\n", i, j); print_array_with_underlinings (A, len, i, j, padding); swap(&A[i], &A[j]); } print_n_tabs(padding); printf ("end, less= "); print_array(A, i); printf ("more= "); print_array(A+i, len-i); printf("\n"); quicksort(A, i, padding+1); quicksort(A + i, len - i, padding+1); }
The result:
begin, len=9, pivot=608 input= 478 184 59 265 608 37 789 0 123 an element larger than pivot found: A[i=4]=608 an element smaller than pivot found: A[j=8]=123 swapping at i=4,j=8 478 184 59 265 608 37 789 0 123 *** *** an element larger than pivot found: A[i=6]=789 an element smaller than pivot found: A[j=7]=0 swapping at i=6,j=7 478 184 59 265 123 37 789 0 608 *** *** an element larger than pivot found: A[i=7]=789 an element smaller than pivot found: A[j=6]=0 i >= j, so we stop here. size of the first partition = 7, second = 2 end, less= 478 184 59 265 123 37 0 more= 789 608 begin, len=7, pivot=265 input= 478 184 59 265 123 37 0 an element larger than pivot found: A[i=0]=478 an element smaller than pivot found: A[j=6]=0 swapping at i=0,j=6 478 184 59 265 123 37 0 *** *** an element larger than pivot found: A[i=3]=265 an element smaller than pivot found: A[j=5]=37 swapping at i=3,j=5 0 184 59 265 123 37 478 *** *** an element larger than pivot found: A[i=5]=265 an element smaller than pivot found: A[j=4]=123 i >= j, so we stop here. size of the first partition = 5, second = 2 end, less= 0 184 59 37 123 more= 265 478 begin, len=5, pivot=59 input= 0 184 59 37 123 an element larger than pivot found: A[i=1]=184 an element smaller than pivot found: A[j=3]=37 swapping at i=1,j=3 0 184 59 37 123 *** *** an element larger than pivot found: A[i=2]=59 an element smaller than pivot found: A[j=2]=59 i >= j, so we stop here. size of the first partition = 2, second = 3 end, less= 0 37 more= 59 184 123 begin, len=2, pivot=37 input= 0 37 an element larger than pivot found: A[i=1]=37 an element smaller than pivot found: A[j=1]=37 i >= j, so we stop here. size of the first partition = 1, second = 1 end, less= 0 more= 37 begin, len=3, pivot=184 input= 59 184 123 an element larger than pivot found: A[i=1]=184 an element smaller than pivot found: A[j=2]=123 swapping at i=1,j=2 59 184 123 *** *** an element larger than pivot found: A[i=2]=184 an element smaller than pivot found: A[j=1]=123 i >= j, so we stop here. size of the first partition = 2, second = 1 end, less= 59 123 more= 184 begin, len=2, pivot=123 input= 59 123 an element larger than pivot found: A[i=1]=123 an element smaller than pivot found: A[j=1]=123 i >= j, so we stop here. size of the first partition = 1, second = 1 end, less= 59 more= 123 begin, len=2, pivot=478 input= 265 478 an element larger than pivot found: A[i=1]=478 an element smaller than pivot found: A[j=1]=478 i >= j, so we stop here. size of the first partition = 1, second = 1 end, less= 265 more= 478 begin, len=2, pivot=608 input= 789 608 an element larger than pivot found: A[i=0]=789 an element smaller than pivot found: A[j=1]=608 swapping at i=0,j=1 789 608 *** *** an element larger than pivot found: A[i=1]=789 an element smaller than pivot found: A[j=0]=608 i >= j, so we stop here. size of the first partition = 1, second = 1 end, less= 608 more= 789 sorted: 0 37 59 123 184 265 478 608 789
Quicksort is recursive, but at some depth you can switch the algorithm to another.
For example, the standard Go language implementation of Quicksort switching to ShellSort for arrays smaller or equal to 12 elements. Perhaps, the Go developers thought recursion would be too expensive for such a short array?
func quickSort(data Interface, a, b, maxDepth int) { for b-a > 12 { // Use ShellSort for slices <= 12 elements if maxDepth == 0 { heapSort(data, a, b) return } maxDepth-- mlo, mhi := doPivot(data, a, b) // Avoiding recursion on the larger subproblem guarantees // a stack depth of at most lg(b-a). if mlo-a < b-mhi { quickSort(data, a, mlo, maxDepth) a = mhi // i.e., quickSort(data, mhi, b) } else { quickSort(data, mhi, b, maxDepth) b = mlo // i.e., quickSort(data, a, mlo) } } if b-a > 1 { // Do ShellSort pass with gap 6 // It could be written in this simplified form cause b-a <= 12 for i := a + 6; i < b; i++ { if data.Less(i, i-6) { data.Swap(i, i-6) } } insertionSort(data, a, b) } }
( https://golang.org/src/sort/sort.go )
Some feedback: https://news.ycombinator.com/item?id=21612736.