data structures - Sorting Algorithms & Time Complexity - Stack Overflow
I have multiple questions about the time complexity of different sorting algorithms.
I was asking myself this question a while ago, so I decided to write some code to find out. The chart displays the number of inputs on the x-axis and time on the y-axis.
As you can see from the image, RadixSort is generally the fastest, followed by QuickSort. Their time complexities are:
- RadixSort: O(N*W), where N is the number of elements to sort and W is the number of bits required to store each key.
- QuickSort: O(N*logN), where N is the number of elements to sort.
RadixSort's speed comes at a cost, though. The space complexities of the two algorithms are the following:
- RadixSort: O(N+W), where N is the number of elements to sort and W is the number of bits required to store each key.
- QuickSort: O(logN), or O(N) depending on how the pivots are chosen: https://cs.stackexchange.com/questions/138335/what-is-the-space-complexity-of-quicksort.
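The original benchmark code isn't shown, but a minimal sketch of such a comparison might look like this (the 8-bit-per-pass LSD radix sort, the random-pivot strategy, and the input size are my assumptions, not the original code):

```python
import random
import time

def radix_sort(arr, key_bits=32):
    """LSD radix sort on non-negative integers, 8 bits per pass: O(N*W) time, O(N+W) space."""
    for shift in range(0, key_bits, 8):
        buckets = [[] for _ in range(256)]
        for x in arr:
            buckets[(x >> shift) & 0xFF].append(x)  # stable bucketing by the current byte
        arr = [x for b in buckets for x in b]
    return arr

def quick_sort(arr):
    """Quicksort with a random pivot: O(N log N) expected, O(N^2) worst case."""
    if len(arr) <= 1:
        return arr
    pivot = random.choice(arr)
    lo = [x for x in arr if x < pivot]
    eq = [x for x in arr if x == pivot]
    hi = [x for x in arr if x > pivot]
    return quick_sort(lo) + eq + quick_sort(hi)

data = [random.getrandbits(32) for _ in range(100_000)]
for name, fn in [("radix", radix_sort), ("quick", quick_sort)]:
    t0 = time.perf_counter()
    out = fn(list(data))
    print(f"{name}: {time.perf_counter() - t0:.3f}s, correct={out == sorted(data)}")
```

On fixed-width integer keys like these, W is a constant, which is why radix sort can pull ahead of the comparison sorts as N grows.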
Algorithm Time Complexities

| Algorithm      | Best       | Average    | Worst      |
|----------------|------------|------------|------------|
| Selection Sort | Ω(n²)      | Θ(n²)      | O(n²)      |
| Bubble Sort    | Ω(n)       | Θ(n²)      | O(n²)      |
| Insertion Sort | Ω(n)       | Θ(n²)      | O(n²)      |
| Heap Sort      | Ω(n log n) | Θ(n log n) | O(n log n) |
| Quick Sort     | Ω(n log n) | Θ(n log n) | O(n²)      |
| Merge Sort     | Ω(n log n) | Θ(n log n) | O(n log n) |
| Bucket Sort    | Ω(n+k)     | Θ(n+k)     | O(n²)      |
| Radix Sort     | Ω(nk)      | Θ(nk)      | O(nk)      |
The time complexity of Quicksort is O(n log n) in the best case, O(n log n) in the average case, and O(n²) in the worst case. But because it has the best performance in the average case for most inputs, Quicksort is generally considered the "fastest" sorting algorithm.
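One way to see both cases concretely is to count comparisons under different pivot strategies. This sketch (my own illustration, not from any of the linked posts) pits a degenerate first-element pivot on already-sorted input against a random pivot:

```python
import random

def quicksort_comparisons(arr, pick_pivot):
    """Count the comparisons quicksort makes under a given pivot strategy."""
    if len(arr) <= 1:
        return 0
    pivot = pick_pivot(arr)
    lo = [x for x in arr if x < pivot]
    hi = [x for x in arr if x > pivot]
    # roughly len(arr) comparisons to partition, plus the recursive work
    return len(arr) + quicksort_comparisons(lo, pick_pivot) + quicksort_comparisons(hi, pick_pivot)

n = 500
already_sorted = list(range(n))
worst = quicksort_comparisons(already_sorted, lambda a: a[0])  # first element: degenerate on sorted input
good = quicksort_comparisons(already_sorted, random.choice)    # random pivot: expected O(n log n)
print(f"first-element pivot: {worst} comparisons (~n^2/2)")
print(f"random pivot:        {good} comparisons (~n log n)")
```

With the first-element pivot on sorted input, every partition peels off a single element, so the counts sum to roughly n²/2; the random pivot keeps the recursion balanced in expectation.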
How is it determined, exactly?
Do they do an actual calculation to determine it, or do they determine it from the structure of the algorithm? (For example, if there's a loop inside another loop, do they immediately determine its worst-case complexity to be O(n²)?)
Is it rounded? Take selection sort, for example: its worst-case complexity is O(n²), which seems understandable, as there's a loop inside another loop. Yet the inner loop doesn't go through the entire list, since it skips the part that was already sorted, so shouldn't the complexity be lower than n²?
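For reference, the shrinking inner loop can be made concrete by counting: the total is (n-1) + (n-2) + ... + 1 = n(n-1)/2, which is about n²/2 and therefore still Θ(n²), because Big-O drops constant factors. A quick empirical check:

```python
def selection_sort_comparisons(arr):
    """Selection sort that counts every comparison it makes."""
    a, comparisons = list(arr), 0
    n = len(a)
    for i in range(n - 1):
        min_idx = i
        for j in range(i + 1, n):      # the inner loop really does shrink each pass
            comparisons += 1
            if a[j] < a[min_idx]:
                min_idx = j
        a[i], a[min_idx] = a[min_idx], a[i]
    return a, comparisons

for n in (10, 100, 1000):
    _, c = selection_sort_comparisons(list(range(n, 0, -1)))
    print(n, c, n * (n - 1) // 2)      # the count is exactly n(n-1)/2
```

Doubling n quadruples the count, which is exactly what Θ(n²) predicts even though the constant is 1/2 rather than 1.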
Next, for the quicksort algorithm: Wikipedia says its average complexity is O(n log n), but does it usually go lower than that? Because if not, why should one use it over merge sort, which always has a complexity of O(n log n) (at least I think it does)?
I'm asking these questions here because I didn't manage to find answers to them anywhere else on the internet. Thanks in advance, you would really help me out! :)
Make sure you have seen this:
visualizing sort algorithms - it helped me decide which sorting algorithm to use.
It depends on the data. For example, for integers (or anything that can be expressed as an integer), the fastest is radix sort, which for fixed-length values has a worst-case complexity of O(n). The best general comparison-based sorting algorithms have a complexity of O(n log n).
In general terms, there are:
- the $O(n^2)$ sorting algorithms, such as insertion sort, bubble sort, and selection sort, which you should typically use only in special circumstances;
- Quicksort, which is worst-case $O(n^2)$ but quite often $O(n\log n)$, with good constants and properties, and which can be used as a general-purpose sorting procedure;
- the $O(n\log n)$ algorithms, like merge sort and heap sort, which are also good general-purpose sorting algorithms;
- and the $O(n)$, or linear, sorting algorithms for lists of integers, such as radix, bucket, and counting sorts, which may be suitable depending on the nature of the integers in your lists.
If the elements in your list are such that all you know about them is the total order relationship between them, then optimal sorting algorithms will have complexity $\Omega(n\log n)$. This is a fairly cool result and one for which you should be able to easily find details online. The linear sorting algorithms exploit further information about the structure of elements to be sorted, rather than just the total order relationship among elements.
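For reference, the standard proof is the decision-tree argument: a comparison sort must distinguish all $n!$ possible orderings of the input, so its decision tree has at least $n!$ leaves, and a binary tree with $n!$ leaves has height

```latex
h \;\ge\; \log_2(n!) \;\ge\; \log_2\!\left(\frac{n}{2}\right)^{\!n/2}
  \;=\; \frac{n}{2}\log_2\frac{n}{2} \;=\; \Omega(n \log n).
```

The height of the tree is the worst-case number of comparisons, which gives the $\Omega(n\log n)$ bound; Stirling's approximation of $\log_2(n!)$ yields the same result.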
Even more generally, optimality of a sorting algorithm depends intimately upon the assumptions you can make about the kind of lists you're going to be sorting (as well as the machine model on which the algorithm will run, which can make even otherwise poor sorting algorithms the best choice; consider bubble sort on machines with a tape for storage). The stronger your assumptions, the more corners your algorithm can cut. Under very weak assumptions about how efficiently you can determine "sortedness" of a list, the optimal worst-case complexity can even be $\Omega(n!)$.
This answer deals only with complexities. Actual running times of implementations of algorithms will depend on a large number of factors which are hard to account for in a single answer.
The answer, as is often the case for such questions, is "it depends". It depends upon things like (a) how large the integers are, (b) whether the input array contains integers in a random order or in a nearly-sorted order, (c) whether you need the sorting algorithm to be stable or not, as well as other factors, (d) whether the entire list of numbers fits in memory (in-memory sort vs external sort), and (e) the machine you run it on.
In practice, the sorting algorithm in your language's standard library will probably be pretty good (pretty close to optimal), if you need an in-memory sort. Therefore, in practice, just use whatever sort function is provided by the standard library, and measure running time. Only if you find that (i) sorting is a large fraction of the overall running time, and (ii) the running time is unacceptable, should you bother messing around with the sorting algorithm. If those two conditions do hold, then you can look at the specific aspects of your particular domain and experiment with other fast sorting algorithms.
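As a concrete sketch of "just use the standard library and measure" (Python here; the data size is an arbitrary choice for illustration):

```python
import random
import timeit

data = [random.random() for _ in range(1_000_000)]

# CPython's sorted() uses Timsort: O(n log n) worst case, O(n) on
# already-sorted input, and stable. Measure before replacing it.
elapsed = timeit.timeit(lambda: sorted(data), number=5) / 5
print(f"sorted() on 1M floats: {elapsed:.3f}s per run")
```

Only if this number is a meaningful slice of your program's total running time is it worth experimenting with anything fancier.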
But realistically, in practice, the sorting algorithm is rarely a major performance bottleneck.
As the title states, I have an assignment that needs me to create the fastest algorithm to sort a range of N numbers, where 1000 <= N <= 100000000. My prof also said to consider various distributions of the input data; for instance, the values can be randomly distributed or focused on a certain range. My thought would be to do a heap sort, as it is always O(n log n), but I could be wrong. Any ideas on how I should approach this question?
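Given the hint about distributions, one way to approach it is to inspect the value range first and pick the strategy from the data: counting sort is O(n + k) when the range k is focused, and the library's O(n log n) sort covers everything else. A sketch under the assumption that the input is a list of integers (the 4·n threshold is an arbitrary heuristic, not a tuned constant):

```python
import random

def adaptive_sort(nums):
    """Pick a strategy from the data: counting sort, O(n + k), when the
    value range k is small relative to n; otherwise the O(n log n) built-in."""
    if not nums:
        return []
    lo, hi = min(nums), max(nums)
    k = hi - lo + 1
    if k <= 4 * len(nums):              # range is "focused": counting sort wins
        counts = [0] * k
        for x in nums:
            counts[x - lo] += 1
        return [lo + v for v in range(k) for _ in range(counts[v])]
    return sorted(nums)                 # random / wide-range data: Timsort

focused = [random.randint(0, 1000) for _ in range(100_000)]
wide = [random.getrandbits(60) for _ in range(100_000)]
print(adaptive_sort(focused) == sorted(focused), adaptive_sort(wide) == sorted(wide))
```

The min/max scan costs one extra O(n) pass, which is cheap insurance compared to committing to a single algorithm for every distribution.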
