              flash_sort (another hybrid algorithm), by comparison, is
              𝑶(N) for evenly distributed lists. The problem
              is, flash_sort is merely an MSD radix
              sort combined with 𝑶(N*N) insertion sort
              to deal with small subsets where the MSD Radix Sort is inefficient,
              so it is inefficient with chunks of data around the size at which it
              switches to insertion_sort, and ends up operating as an
              enhanced MSD Radix Sort. For uneven distributions this makes it especially
              inefficient.
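
              The hybrid structure described above can be sketched as follows.
              This is a simplified illustration of an MSD radix sort with an
              insertion_sort fallback, not flash_sort itself; the fallback
              threshold is an arbitrary assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified sketch of the hybrid the text describes: an MSD radix sort
// on bytes that switches to O(N*N) insertion sort once a bucket shrinks
// below a threshold. The threshold is an arbitrary illustration, not
// flash_sort's actual tuning.
const std::size_t kFallbackSize = 16;

void insertion_sort(std::vector<unsigned>& v, std::size_t lo, std::size_t hi) {
    for (std::size_t i = lo + 1; i < hi; ++i) {
        unsigned key = v[i];
        std::size_t j = i;
        while (j > lo && v[j - 1] > key) { v[j] = v[j - 1]; --j; }
        v[j] = key;
    }
}

void msd_radix(std::vector<unsigned>& v, std::size_t lo, std::size_t hi, int shift) {
    // Small chunk (or all bytes consumed): fall back to insertion sort.
    if (hi - lo <= kFallbackSize || shift < 0) {
        insertion_sort(v, lo, hi);
        return;
    }
    // Scatter into 256 buckets by the current (most significant) byte.
    std::vector<std::vector<unsigned> > buckets(256);
    for (std::size_t i = lo; i < hi; ++i)
        buckets[(v[i] >> shift) & 0xFF].push_back(v[i]);
    // Copy buckets back in order and recurse on the next byte down.
    std::size_t out = lo;
    for (int b = 0; b < 256; ++b) {
        std::size_t start = out;
        for (std::size_t i = 0; i < buckets[b].size(); ++i) v[out++] = buckets[b][i];
        msd_radix(v, start, out, shift - 8);
    }
}

void hybrid_sort(std::vector<unsigned>& v) { msd_radix(v, 0, v.size(), 24); }
```

              Data clustered in chunks just over kFallbackSize is exactly the
              bad case described above: each radix pass does a full scatter
              while making little sorting progress.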
            
              flash_sort's 𝑶(N)
              performance for even distributions comes at the cost of cache misses,
              which on modern architectures are extremely expensive; in testing
              on modern systems, flash_sort ends up being slower than cutting up
              the data in multiple, cache-friendly steps. Also worth noting is that on most modern
              computers, log2(available RAM)/log2(L1 cache size) is
              around 3, where a cache miss takes more than 3 times as long as an
              in-cache random-access, and the size of max_splits
              is tuned to the size of the cache. On a computer where cache misses
              aren't this expensive, max_splits could be increased
              to a large value, or eliminated entirely, and integer_sort/float_sort
              would have the same 𝑶(N) performance on even distributions.
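
              The tuning idea can be sketched as follows. The cap of 11 mirrors
              the default mentioned in this section; the helper names and the
              rest are illustrative assumptions, not Boost.Sort's actual code.

```cpp
#include <cassert>

// Illustrative sketch: cap the number of bucket-splitting bits per radix
// pass at max_splits so the bucket structures stay small enough to be
// cache-friendly. The value 11 mirrors the default mentioned in the text;
// the helper functions are assumptions for illustration only.
const unsigned max_splits = 11;

// Bits of the key to split on in one pass, given how many bits remain.
// An uncapped splitter would use all remaining bits at once, creating a
// bucket array far larger than the L1 cache.
unsigned splits_this_pass(unsigned remaining_bits) {
    return remaining_bits < max_splits ? remaining_bits : max_splits;
}

// With max_splits == 11, each pass touches at most 2^11 == 2048 buckets.
unsigned buckets_this_pass(unsigned remaining_bits) {
    return 1u << splits_this_pass(remaining_bits);
}
```

              On hardware where cache misses are cheap, raising the cap (or
              removing it) trades cache behavior for fewer passes, which is
              the trade-off the paragraph above describes.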
            
              Adaptive Left Radix (ALR) is similar to flash_sort, but
              more cache-friendly. It still uses insertion_sort as its fallback.
              Because insertion_sort is 𝑶(N*N), it isn't efficient
              to apply that comparison-based fallback to large lists, and if the
              data is clustered in small chunks just over the fallback size with
              a few outliers, radix-based sorting iterates many times doing little
              sorting with high overhead. Asymptotically, ALR is still 𝑶(N*log(K/S
              + S)), but with a very small S (about
              2 in the worst case), which compares unfavorably with Spreadsort's
              default max_splits value of 11.
            
              ALR also does not have the 𝑶(N*log(N)) fallback,
              so for small lists that are not evenly distributed it is extremely
              inefficient. See the alrbreaker and binaryalrbreaker
              testcases for examples; either replace the call to sort with a call
              to ALR and update the ALR_THRESHOLD at the top, or as a quick comparison
              make get_max_count return ALR_THRESHOLD (20 by default
              based upon the paper). These small tests take 4-10 times as long with
              ALR as std::sort
              in the author's testing, depending on the test system, because they
              are trying to sort a highly uneven distribution. Normal Spreadsort
              does much better with them, because get_max_count is designed
              around minimizing worst-case runtime.
            
              burst_sort is an efficient hybrid algorithm for strings
              that uses substantial additional memory.
            
              string_sort, by comparison, uses minimal additional memory.

              postal_sort and string_sort are similar; an efficient
              implementation of postal_sort was not found in a
              search for source.
            
              string_sort is most similar to the American flag sort algorithm.
              One difference is the equality check: string_sort tests whether
              all characters at the current index are equal and, if so, moves
              on to the next index rather than performing a bucketing pass.
              Another difference is not applying the stack-size restriction:
              because of the equality check, forcing string_sort into a deep
              recursion stack would require an impractically large amount of
              string data. In testing, a plain MSD radix sort (without the
              equality check, on std::strings) had comparable runtime
              to introsort,
              but making a hybrid of the two allows reduced overhead and
              substantially superior performance.
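
              A minimal sketch of such a hybrid, assuming an illustrative
              fallback threshold: MSD radix sort on characters, an equality
              check that skips character positions shared by the whole bucket,
              and a std::sort fallback for small buckets. This is not
              Boost.Sort's actual string_sort implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified sketch of the hybrid the text describes, not Boost.Sort's
// actual string_sort: MSD radix sort on characters, an equality check
// that skips character positions shared by the whole bucket, and a
// std::sort fallback for small buckets. The threshold is an assumption.
const std::size_t kFallback = 32;

void msd_string_sort(std::vector<std::string>& v, std::size_t lo,
                     std::size_t hi, std::size_t pos) {
    while (true) {
        if (hi - lo <= kFallback) {  // small bucket: comparison-based fallback
            std::sort(v.begin() + lo, v.begin() + hi);
            return;
        }
        // Every string in this bucket shares its first pos characters, so
        // strings of length exactly pos are identical; group them in front.
        std::size_t start = lo;
        for (std::size_t i = lo; i < hi; ++i)
            if (v[i].size() == pos) std::swap(v[i], v[start++]);
        lo = start;
        if (hi - lo <= 1) return;
        // Equality check: if all remaining strings share the character at
        // pos, advance pos instead of doing a near-useless bucketing pass.
        char c = v[lo][pos];
        bool all_equal = true;
        for (std::size_t i = lo + 1; i < hi; ++i)
            if (v[i][pos] != c) { all_equal = false; break; }
        if (!all_equal) break;
        ++pos;
    }
    // Scatter by the character at pos, then recurse one position deeper.
    std::vector<std::vector<std::string> > buckets(256);
    for (std::size_t i = lo; i < hi; ++i)
        buckets[(unsigned char)v[i][pos]].push_back(v[i]);
    std::size_t out = lo;
    for (int b = 0; b < 256; ++b) {
        std::size_t start = out;
        for (std::size_t i = 0; i < buckets[b].size(); ++i) v[out++] = buckets[b][i];
        msd_string_sort(v, start, out, pos + 1);
    }
}

void hybrid_string_sort(std::vector<std::string>& v) {
    msd_string_sort(v, 0, v.size(), 0);
}
```

              The equality check is what keeps the recursion shallow on
              strings with long shared prefixes, which is why the stack-size
              restriction discussed above becomes unnecessary.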