Friday, August 5, 2016

Optimized quick sort

In the previous article we discussed about quick sort algorithm. In this article we are going to discuss about further optimizing quick sort.

The optimized version of quick sort algorithm we are going to use is known as QuickIn Sort. QuickIn Sort is significantly faster in practice than other O (n log n) algorithms.


Previous article explains how quick sort works. As described in it, quick sort do the real sorting in it's partitioning step. Every time when the recursive function is called, the size of the partition gets smaller. 


Even for a very large data set, the recursive function call occur till the partition size is the smallest possible. When this partitioning occur, there is a threshold size of partition after which, the partitioning costs more than other O (N^2) algorithms. This is an inefficiency of quick sort algorithm.




where A, B, k and c are constants , N is the number of elements in the data set.


The L.H.S of the inequality gives the time complexity of quick sort while R.H.S gives the time complexity of other sorting algorithms.


Insertion sort is an algorithm which has an average time complexity of O (N^2). But the time efficiency of insertion sort depends on how sorted the data set is.


For small N values above inequality becomes true.


eg:






The quickIn sort algorithm use insertion sort for partitions of size less than the threshold value. The threshold value needs to be determined by testing.

Following is an implementation of quickInSort function


The partition function will be the same as in normal quick sort algorithm. Please refer previous article for the partition function. 

Following is an implementation of insertion sort function 

You can find the complete implementation of QuickIn Sort algorithm with tests on github here.

Tuesday, August 2, 2016

Learn Quick Sort

Quick Sort


Quick Sort is a divide and conquer recursive algorithm used in sorting a data set. It is one of the fastest sorting algorithms with an average time complexity of O(Nlog(N)) and a worst case time complexity of O(N²).


Algorithm

  • Pick a pivot ( element of the data set in the range)
  • Iterate through the data set once and re-arrange data so that all the elements larger than the pivot are on the right side of the pivot and elements smaller than the pivot are on the left. 
  • Apply the same re-arranging method recursively to the left and the right parts of the data set separated by the pivot.


Picking the pivot


The pivot can be any element of the data set within the range of the partition. But the pivot will decide the running time of the sorting process. Because if you choose the smallest/largest element of the data set every time, the partitioning will be useless and hence running time will be O(N²). Therefore spending some time to pick a good pivot won't be a waste. 

As mentioned above, choosing left-most element, right-most element or the element at middle as the pivot may lead to time inefficiency as the arrangement of the data set is not known. Choosing a random element also has the same risk. The best pivot will be the median of the data set. But finding it would not be wise considering the time efficiency. 

The easiest way to choose a pivot which is not a corner element is, picking the median element of the first, middle and the last element of the data set. This will ensure that you are not at the worst case every time. 

Advantages

  • No additional memory required. ( Sort in place )
  • Very fast sorting for average case

Disadvantage

  • Worst-case complexity is O(N²)
  • Sorting time not guaranteed.


Following is an implementation of quick sort algorithm in C



Partition function which does the sorting



Here is the complete implementation of quick sort algorithm with example tests on github. 

Optimized quick sort

In the previous article  we discussed about quick sort algorithm. In this article we are going to discuss about further optimizing quick sor...