<?xml version="1.0" encoding="UTF-8" standalone="yes"?><oembed><version><![CDATA[1.0]]></version><provider_name><![CDATA[Experimental chill]]></provider_name><provider_url><![CDATA[http://danlark.org]]></provider_url><author_name><![CDATA[Danila Kutenin]]></author_name><author_url><![CDATA[https://danlark.org/author/kutdanila/]]></author_url><title><![CDATA[Miniselect: Practical and Generic Selection&nbsp;Algorithms]]></title><type><![CDATA[link]]></type><html><![CDATA[
<p>Today I present a big effort from my side to publish <a href="https://github.com/danlark1/miniselect">miniselect</a> &#8212; generic C++ library to support multiple selection and partial sorting algorithms. It is already <a href="https://github.com/ClickHouse/ClickHouse/pull/16825">used</a> in <a href="https://clickhouse.tech/">ClickHouse</a> with huge performance benefits. Exact benchmarks and results will be later in this post and now let&#8217;s tell some stories about how it all arose. I publish this library under Boost License and any contributions are highly welcome.</p>



<h2>It all started with sorting</h2>



<p>While reading lots of articles, papers, and posts from Hacker News, I found it pretty funny each several months new &#8220;shiny&#8221;, &#8220;fastest&#8221;, &#8220;generic&#8221; sorting algorithms to come or remembered from old papers such as the recent paper on <a href="https://blog.acolyer.org/2020/10/19/the-case-for-a-learned-sorting-algorithm/">learned sorting</a>, <a href="https://sortingsearching.com/2020/06/06/kirkpatrick-reisch.html">Kirkpatrick-Reisch</a> sort or <a href="https://news.ycombinator.com/item?id=14661659">pdqsort</a>. It is that we are essentially 65+ years into writing sorting algorithms, and we still find improvements. Shouldn&#8217;t sorting items be a &#8220;solved&#8221; problem by now? Unfortunately, not. New hardware features come, we find that sorting numbers can be actually done faster than best comparison <img src="https://s0.wp.com/latex.php?latex=O%28n+%5Clog+n%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="O(n &#92;log n)" class="latex" /> time complexity and we still find improvements in sorting algorithms like avoiding <a href="https://www.researchgate.net/publication/301614727_BlockQuicksort_How_Branch_Mispredictions_don't_affect_Quicksort">branches in partitions</a> and trying to find good pivots as pdqsort does. Also, there are many open questions in that area as &#8220;what is the minimum number of comparisons needed?&#8221;.</p>



<p>Huge competition is still going on in sorting algorithms and I believe we are not near the optimal sorting and learned sorting looks like the next step. But it uses the fundamental fact that no one expects sorting to be completed in a couple of passes and we can understand something about data during first array passes. We will understand why it matters later.</p>



<p>My favorite general sorting is <a href="https://github.com/orlp/pdqsort">pdqsort</a>, it proves to be currently the best general sorting algorithm and it shows a significant boost over all standard sorts that are provided in C++. It is also <a href="https://docs.rs/pdqsort/1.0.3/pdqsort/">used</a> in Rust.</p>



<h2>Selection and Partial Sorting</h2>



<p>Nearly a couple of months ago I started thinking about a slightly different approach when it comes to sorting &#8212; partial sorting algorithms. It means that you don&#8217;t need to sort all <img src="https://s0.wp.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n" class="latex" /> elements but only find <img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" /> smallest and sort them. For example, it is widely used in SQL queries when you do <code>ORDER BY LIMIT N</code> and <code>N</code> is often small, from 1-10 to ideally couple of thousands, bigger values still happen but rare. And, oh god, how little engineering and theoretical research has been done there compared to full sorting algorithms. In fact, the question of specifically finding <img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" />th order statistics when <img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" /> is small is open and no good solution is presented. Also, partial sorting is quite easy to obtain after that, you need to sort the first <img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" /> elements by some sorting algorithm to get optimal <img src="https://s0.wp.com/latex.php?latex=O%28n+%2B+k+%5Clog+k%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="O(n + k &#92;log k)" class="latex" /> comparisons and we will look at only one example when it is not the case. Yes, there are a bunch of median algorithms that can be generalized to find the <img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" />th smallest element. So, what are they? Yeah, you may know some of them but let&#8217;s revise, it is useful to know your enemies.</p>



<h3>QuickSelect</h3>



<p>This is almost the very first algorithm for finding the <img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" />th smallest element, just do like <a href="https://en.wikipedia.org/wiki/Quicksort">QuickSort</a> but don&#8217;t go recursively in two directions, that&#8217;s it. Pick middle or even random element and partition by this element, see in which of two parts <img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" /> is located, update the one of the borders, voila, after maximum of <img src="https://s0.wp.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n" class="latex" /> partitions you will find <img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" />th smallest element. Good news that on average it takes <img src="https://s0.wp.com/latex.php?latex=O%28n%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="O(n)" class="latex" /> comparisons if we pick random pivot. That is because if we define <img src="https://s0.wp.com/latex.php?latex=C%28n%2C+k%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="C(n, k)" class="latex" /> is the expected number of comparisons for finding <img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" />th element in <img src="https://s0.wp.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n" class="latex" /> elements and <img src="https://s0.wp.com/latex.php?latex=C%28n%29+%3D+%5Cmax_%7B1%7D%5E%7Bn%7D+C%28n%2C+k%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="C(n) = &#92;max_{1}^{n} C(n, k)" class="latex" />, then during one stage we do <img src="https://s0.wp.com/latex.php?latex=n+-+1&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n - 1" class="latex" /> comparisons and uniformly pick any pivot, then even if we pick the biggest part on each step</p>



<p class="has-text-align-center"><img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle+C%28n%29+%5Cleq+n+-+1+%2B+%5Cfrac%7B2%7D%7Bn%7D%5Csum_%7Bn%2F2%7D%5E%7Bn+-+1%7D+C%28i%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;displaystyle C(n) &#92;leq n - 1 + &#92;frac{2}{n}&#92;sum_{n/2}^{n - 1} C(i)" class="latex" /></p>



<p class="has-text-align-center"><img src="https://s0.wp.com/latex.php?latex=%5Cdisplaystyle+C%28n%29+%5Cleq+%28n+-+1%29+%2B+%5Cmathrm%7Bavg%7D%28C%28n%2F2%29%2C+%5Cldots%2C+C%28n%29%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;displaystyle C(n) &#92;leq (n - 1) + &#92;mathrm{avg}(C(n/2), &#92;ldots, C(n))" class="latex" /></p>



<p>If assuming by induction that <img src="https://s0.wp.com/latex.php?latex=C%28i%29+%5Cleq+4i&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="C(i) &#92;leq 4i" class="latex" /> with an obvious induction base, we get</p>



<p class="has-text-align-center"><img src="https://s0.wp.com/latex.php?latex=C%28n%29+%5Cleq+%28n+-+1%29+%2B+%5Cmathrm%7Bavg%7D%28C%28n%2F2%29%2C+%5Cldots%2C+C%28n%29%29+%5Cleq+n+-+1+%2B+4%283n%2F4%29+%3C+4n&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="C(n) &#92;leq (n - 1) + &#92;mathrm{avg}(C(n/2), &#92;ldots, C(n)) &#92;leq n - 1 + 4(3n/4) &lt; 4n" class="latex" /> </p>



<p class="has-text-align-left">Bad news is that the worst case will still be <img src="https://s0.wp.com/latex.php?latex=O%28n%5E2%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="O(n^2)" class="latex" /> if we are unfortunate and always pick the biggest element as a pivot, thus partitioning .</p>



<p>In that sense that algorithm provides lots of pivot &#8220;strategies&#8221; that are used nowadays, for example, picking pivot as a <img src="https://s0.wp.com/latex.php?latex=n%2F2&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n/2" class="latex" /> element of the array or picking pivot from 3 random elements . Or do like <code>std::nth_element</code> from libcxx &#8212; choose the middle out out of <img src="https://s0.wp.com/latex.php?latex=A%5B0%5D%2C+A%5Bn%2F2%5D%2C+A%5Bn+-+1%5D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="A[0], A[n/2], A[n - 1]" class="latex" />.</p>



<p>I decided to visualize all algorithms I am going to talk about today, so quickselect with a median of 3 strategy on random input looks something like this:</p>



<figure class="wp-block-image size-large"><img data-attachment-id="579" data-permalink="https://danlark.org/nth-element-clang-2020-11-09_11-18-38/" data-orig-file="https://danlarkorg.files.wordpress.com/2020/11/nth-element-clang-2020-11-09_11.18.38.gif" data-orig-size="1720,1034" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="nth-element-clang-2020-11-09_11.18.38" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2020/11/nth-element-clang-2020-11-09_11.18.38.gif?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2020/11/nth-element-clang-2020-11-09_11.18.38.gif?w=1024" src="https://danlarkorg.files.wordpress.com/2020/11/nth-element-clang-2020-11-09_11.18.38.gif?w=1024" alt="" class="wp-image-579" /><figcaption>nth_element in libcxx, median of 3 strategies</figcaption></figure>



<p>And random pivot out of 3 elements works similar</p>



<figure class="wp-block-image size-large"><img data-attachment-id="581" data-permalink="https://danlark.org/median-of-3-random-2020-11-09_11-06-02/" data-orig-file="https://danlarkorg.files.wordpress.com/2020/11/median-of-3-random-2020-11-09_11.06.02.gif" data-orig-size="1720,1034" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="median-of-3-random-2020-11-09_11.06.02" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2020/11/median-of-3-random-2020-11-09_11.06.02.gif?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2020/11/median-of-3-random-2020-11-09_11.06.02.gif?w=1024" src="https://danlarkorg.files.wordpress.com/2020/11/median-of-3-random-2020-11-09_11.06.02.gif?w=1024" alt="" class="wp-image-581" /><figcaption>Finding median in median of 3 random algorithm</figcaption></figure>



<p>For a strategy like <a href="https://github.com/llvm/llvm-project/blob/3ed89b51da38f081fedb57727076262abb81d149/libcxx/include/algorithm#L5159">libcxx</a> (C++ llvm standard library) does, there are quadratic counterexamples that are pretty easy to detect, such patterns also appear in real data. The counterexample looks like that:</p>



<figure class="wp-block-embed is-type-rich is-provider-embed wp-block-embed-embed"><div class="wp-block-embed__wrapper">
<style>.gist table { margin-bottom: 0; }</style><div style="tab-size: 8" id="gist106376435" class="gist">
    <div class="gist-file" translate="no">
      <div class="gist-data">
        <div class="js-gist-file-update-container js-task-list-container file-box">
  <div id="file-median_3_killer-h" class="file my-2">
    
  <div itemprop="text" class="Box-body p-0 blob-wrapper data type-c  ">

      
<table class="highlight tab-size js-file-line-container" data-tab-size="8" data-paste-markdown-skip>
      <tr>
        <td id="file-median_3_killer-h-L1" class="blob-num js-line-number" data-line-number="1"></td>
        <td id="file-median_3_killer-h-LC1" class="blob-code blob-code-inner js-file-line"><span class="pl-k">struct</span> <span class="pl-en">Median3Killer</span> {</td>
      </tr>
      <tr>
        <td id="file-median_3_killer-h-L2" class="blob-num js-line-number" data-line-number="2"></td>
        <td id="file-median_3_killer-h-LC2" class="blob-code blob-code-inner js-file-line">  <span class="pl-k">static</span> std::vector&lt;<span class="pl-c1">uint32_t</span>&gt; <span class="pl-en">Gen</span>(<span class="pl-c1">size_t</span> size) {</td>
      </tr>
      <tr>
        <td id="file-median_3_killer-h-L3" class="blob-num js-line-number" data-line-number="3"></td>
        <td id="file-median_3_killer-h-LC3" class="blob-code blob-code-inner js-file-line">    <span class="pl-c1">size_t</span> k = size / <span class="pl-c1">2</span>;</td>
      </tr>
      <tr>
        <td id="file-median_3_killer-h-L4" class="blob-num js-line-number" data-line-number="4"></td>
        <td id="file-median_3_killer-h-LC4" class="blob-code blob-code-inner js-file-line">    std::vector&lt;<span class="pl-c1">uint32_t</span>&gt; v;</td>
      </tr>
      <tr>
        <td id="file-median_3_killer-h-L5" class="blob-num js-line-number" data-line-number="5"></td>
        <td id="file-median_3_killer-h-LC5" class="blob-code blob-code-inner js-file-line">    v.<span class="pl-c1">reserve</span>(size);</td>
      </tr>
      <tr>
        <td id="file-median_3_killer-h-L6" class="blob-num js-line-number" data-line-number="6"></td>
        <td id="file-median_3_killer-h-LC6" class="blob-code blob-code-inner js-file-line">    <span class="pl-k">for</span> (<span class="pl-c1">size_t</span> i = <span class="pl-c1">1</span>; i &lt; k + <span class="pl-c1">1</span>; ++i) {</td>
      </tr>
      <tr>
        <td id="file-median_3_killer-h-L7" class="blob-num js-line-number" data-line-number="7"></td>
        <td id="file-median_3_killer-h-LC7" class="blob-code blob-code-inner js-file-line">      <span class="pl-k">if</span> (i &amp; <span class="pl-c1">1</span>) {</td>
      </tr>
      <tr>
        <td id="file-median_3_killer-h-L8" class="blob-num js-line-number" data-line-number="8"></td>
        <td id="file-median_3_killer-h-LC8" class="blob-code blob-code-inner js-file-line">        v.<span class="pl-c1">push_back</span>(i);</td>
      </tr>
      <tr>
        <td id="file-median_3_killer-h-L9" class="blob-num js-line-number" data-line-number="9"></td>
        <td id="file-median_3_killer-h-LC9" class="blob-code blob-code-inner js-file-line">      } <span class="pl-k">else</span> {</td>
      </tr>
      <tr>
        <td id="file-median_3_killer-h-L10" class="blob-num js-line-number" data-line-number="10"></td>
        <td id="file-median_3_killer-h-LC10" class="blob-code blob-code-inner js-file-line">        v.<span class="pl-c1">push_back</span>(k + i &#8211; <span class="pl-c1">1</span>);</td>
      </tr>
      <tr>
        <td id="file-median_3_killer-h-L11" class="blob-num js-line-number" data-line-number="11"></td>
        <td id="file-median_3_killer-h-LC11" class="blob-code blob-code-inner js-file-line">      }</td>
      </tr>
      <tr>
        <td id="file-median_3_killer-h-L12" class="blob-num js-line-number" data-line-number="12"></td>
        <td id="file-median_3_killer-h-LC12" class="blob-code blob-code-inner js-file-line">    }</td>
      </tr>
      <tr>
        <td id="file-median_3_killer-h-L13" class="blob-num js-line-number" data-line-number="13"></td>
        <td id="file-median_3_killer-h-LC13" class="blob-code blob-code-inner js-file-line">    <span class="pl-k">for</span> (<span class="pl-c1">size_t</span> i = <span class="pl-c1">1</span>; i &lt; k + <span class="pl-c1">1</span>; ++i) {</td>
      </tr>
      <tr>
        <td id="file-median_3_killer-h-L14" class="blob-num js-line-number" data-line-number="14"></td>
        <td id="file-median_3_killer-h-LC14" class="blob-code blob-code-inner js-file-line">      v.<span class="pl-c1">push_back</span>(<span class="pl-c1">2</span> * i);</td>
      </tr>
      <tr>
        <td id="file-median_3_killer-h-L15" class="blob-num js-line-number" data-line-number="15"></td>
        <td id="file-median_3_killer-h-LC15" class="blob-code blob-code-inner js-file-line">    }</td>
      </tr>
      <tr>
        <td id="file-median_3_killer-h-L16" class="blob-num js-line-number" data-line-number="16"></td>
        <td id="file-median_3_killer-h-LC16" class="blob-code blob-code-inner js-file-line">    <span class="pl-k">return</span> v;</td>
      </tr>
      <tr>
        <td id="file-median_3_killer-h-L17" class="blob-num js-line-number" data-line-number="17"></td>
        <td id="file-median_3_killer-h-LC17" class="blob-code blob-code-inner js-file-line">  }</td>
      </tr>
      <tr>
        <td id="file-median_3_killer-h-L18" class="blob-num js-line-number" data-line-number="18"></td>
        <td id="file-median_3_killer-h-LC18" class="blob-code blob-code-inner js-file-line">};</td>
      </tr>
</table>


  </div>

  </div>
</div>

      </div>
      <div class="gist-meta">
        <a href="https://gist.github.com/danlark1/c70d6a1b3f7d523d341cf2f4f720f31b/raw/fd5fd5d57388a78db5aff5e3baf35532cfe4b710/median_3_killer.h" style="float:right">view raw</a>
        <a href="https://gist.github.com/danlark1/c70d6a1b3f7d523d341cf2f4f720f31b#file-median_3_killer-h">median_3_killer.h</a>
        hosted with &#10084; by <a href="https://github.com">GitHub</a>
      </div>
    </div>
</div>

</div></figure>



<figure class="wp-block-image size-large"><img data-attachment-id="584" data-permalink="https://danlark.org/nth_element_clang_median_3_killer-2020-11-10_22-55-30/" data-orig-file="https://danlarkorg.files.wordpress.com/2020/11/nth_element_clang_median_3_killer-2020-11-10_22.55.30.gif" data-orig-size="1720,1034" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="nth_element_clang_median_3_killer-2020-11-10_22.55.30" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2020/11/nth_element_clang_median_3_killer-2020-11-10_22.55.30.gif?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2020/11/nth_element_clang_median_3_killer-2020-11-10_22.55.30.gif?w=1024" src="https://danlarkorg.files.wordpress.com/2020/11/nth_element_clang_median_3_killer-2020-11-10_22.55.30.gif?w=1024" alt="" class="wp-image-584" /><figcaption>std::nth_element in libcxx for Medianof3Killer</figcaption></figure>



<p>This is definitely quadratic. By the way, this is perfectly ok with the C++ standard wording as it says:</p>



<figure class="wp-block-image size-large"><img data-attachment-id="587" data-permalink="https://danlark.org/2020-11-10-230126_795x63_scrot/" data-orig-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-10-230126_795x63_scrot.png" data-orig-size="795,63" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="2020-11-10-230126_795x63_scrot" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-10-230126_795x63_scrot.png?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-10-230126_795x63_scrot.png?w=795" src="https://danlarkorg.files.wordpress.com/2020/11/2020-11-10-230126_795x63_scrot.png?w=795" alt="" class="wp-image-587" srcset="https://danlarkorg.files.wordpress.com/2020/11/2020-11-10-230126_795x63_scrot.png 795w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-10-230126_795x63_scrot.png?w=150 150w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-10-230126_795x63_scrot.png?w=300 300w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-10-230126_795x63_scrot.png?w=768 768w" sizes="(max-width: 795px) 100vw, 795px" /><figcaption><a href="https://eel.is/c++draft/alg.nth.element#5">https://eel.is/c++draft/alg.nth.element#5</a></figcaption></figure>



<h2>Median of Medians</h2>



<p>For a long time, computer scientists thought that it is impossible to find medians in worst-case linear time, however, Blum, Floyd, Pratt, Rivest, Tarjan came up with BFPRT algorithm or like sometimes it is called, median of medians algorithm.</p>



<p>Median of medians algorithm: Given array <img src="https://s0.wp.com/latex.php?latex=A&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="A" class="latex" /> of size <img src="https://s0.wp.com/latex.php?latex=n&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n" class="latex" /> and integer <img src="https://s0.wp.com/latex.php?latex=k+%5Cleq+n&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k &#92;leq n" class="latex" />,</p>



<ol><li>Group the array into <img src="https://s0.wp.com/latex.php?latex=n%2F5&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n/5" class="latex" /> groups of size 5 and find the median of each group. (For simplicity, we will ignore integrality issues.)</li><li>Recursively, find the true median of the medians. Call this <img src="https://s0.wp.com/latex.php?latex=p&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="p" class="latex" />.</li><li>Use <img src="https://s0.wp.com/latex.php?latex=p&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="p" class="latex" /> as a pivot to partition the array.</li><li>Recurse on the appropriate piece.</li></ol>



<p>When we find the median <img src="https://s0.wp.com/latex.php?latex=p&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="p" class="latex" /> of <img src="https://s0.wp.com/latex.php?latex=g+%3D+n%2F5&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="g = n/5" class="latex" /> groups, at least <img src="https://s0.wp.com/latex.php?latex=g%2F2&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="g/2" class="latex" /> of them have at least 3 out of 5 elements that are smaller or equal than <img src="https://s0.wp.com/latex.php?latex=p&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="p" class="latex" />, that said the biggest out of 2 partitioned chunks have size <img src="https://s0.wp.com/latex.php?latex=7n%2F10&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="7n/10" class="latex" /> and we have the reccurence</p>



<p class="has-text-align-center"><img src="https://s0.wp.com/latex.php?latex=C%28n%29+%5Cleq+cn+%2B+C%28n%2F5%29+%2B+C%287n%2F10%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="C(n) &#92;leq cn + C(n/5) + C(7n/10)" class="latex" /></p>



<p>If we appropriately build the recurse tree we will see that</p>



<figure class="wp-block-image size-large"><img data-attachment-id="589" data-permalink="https://danlark.org/2020-11-10-232345_1330x458_scrot/" data-orig-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-10-232345_1330x458_scrot.png" data-orig-size="1330,458" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="2020-11-10-232345_1330x458_scrot" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-10-232345_1330x458_scrot.png?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-10-232345_1330x458_scrot.png?w=1024" src="https://danlarkorg.files.wordpress.com/2020/11/2020-11-10-232345_1330x458_scrot.png?w=1024" alt="" class="wp-image-589" srcset="https://danlarkorg.files.wordpress.com/2020/11/2020-11-10-232345_1330x458_scrot.png?w=1024 1024w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-10-232345_1330x458_scrot.png?w=150 150w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-10-232345_1330x458_scrot.png?w=300 300w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-10-232345_1330x458_scrot.png?w=768 768w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-10-232345_1330x458_scrot.png 1330w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>This is the geometric series with <img src="https://s0.wp.com/latex.php?latex=cn%281+%2B+9%2F10+%2B+%289%2F10%29%5E2+%2B+%289%2F10%29%5E3+%2B+%5Cldots+...%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="cn(1 + 9/10 + (9/10)^2 + (9/10)^3 + &#92;ldots ...)" class="latex" /> which gives us the result <img src="https://s0.wp.com/latex.php?latex=C%28n%29+%5Cleq+10+c+n&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="C(n) &#92;leq 10 c n" class="latex" />.</p>



<p>Actually, this constant 10 is really big. For example, if we look a bit closer, <img src="https://s0.wp.com/latex.php?latex=c&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="c" class="latex" /> is at least 1 because we need to partition the array, then finding median out of 5 elements cannot be done in less than 6 comparisons (can be proven by only brute-forcing) and in 6 comparisons it can be done in the following way</p>



<ol><li>Use three comparisons and shuffle around the numbers so that <img src="https://s0.wp.com/latex.php?latex=a%5B1%5D+%3C+a%5B2%5D%2C+a%5B4%5D+%3C+a%5B5%5D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="a[1] &lt; a[2], a[4] &lt; a[5]" class="latex" />, and <img src="https://s0.wp.com/latex.php?latex=a%5B1%5D+%3C+a%5B4%5D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="a[1] &lt; a[4]" class="latex" />.</li><li>If <img src="https://s0.wp.com/latex.php?latex=a%5B3%5D+%3E+a%5B2%5D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="a[3] &gt; a[2]" class="latex" />, then the problem is fairly easy. If <img src="https://s0.wp.com/latex.php?latex=a%5B2%5D+%3C+a%5B4%5D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="a[2] &lt; a[4]" class="latex" />, the median value is the smaller of <img src="https://s0.wp.com/latex.php?latex=a%5B3%5D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="a[3]" class="latex" /> and <img src="https://s0.wp.com/latex.php?latex=a%5B4%5D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="a[4]" class="latex" />. If not, the median value is the smaller of <img src="https://s0.wp.com/latex.php?latex=a%5B2%5D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="a[2]" class="latex" /> and <img src="https://s0.wp.com/latex.php?latex=a%5B5%5D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="a[5]" class="latex" />.</li><li>So <img src="https://s0.wp.com/latex.php?latex=a%5B3%5D+%3C+a%5B2%5D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="a[3] &lt; a[2]" class="latex" />. If <img src="https://s0.wp.com/latex.php?latex=a%5B3%5D+%3E+a%5B4%5D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="a[3] &gt; a[4]" class="latex" />, then the solution is the smaller of <img src="https://s0.wp.com/latex.php?latex=a%5B3%5D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="a[3]" class="latex" /> and <img src="https://s0.wp.com/latex.php?latex=a%5B5%5D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="a[5]" class="latex" />. Otherwise, the solution is the smaller of <img src="https://s0.wp.com/latex.php?latex=a%5B2%5D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="a[2]" class="latex" /> and <img src="https://s0.wp.com/latex.php?latex=a%5B4%5D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="a[4]" class="latex" />.</li></ol>



<p>So that maximum <img src="https://s0.wp.com/latex.php?latex=c&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="c" class="latex" /> can be <img src="https://s0.wp.com/latex.php?latex=11%2F5&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="11/5" class="latex" /> and it gives us the upper bound <img src="https://s0.wp.com/latex.php?latex=22n&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="22n" class="latex" /> comparisons which looks like it can be achieved. Some other tricks can be done in place to achieve a bit lower constants like <img src="https://s0.wp.com/latex.php?latex=18n&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="18n" class="latex" /> (for example, sorting arrays of 5 and comparing less afterwards). In practice, the constant is really big and you can see it from the following demonstration which was even fastened because it took quite a few seconds:</p>



<figure class="wp-block-image size-large"><img data-attachment-id="593" data-permalink="https://danlark.org/median-of-medians-2020-11-09_11-04-33/" data-orig-file="https://danlarkorg.files.wordpress.com/2020/11/median-of-medians-2020-11-09_11.04.33.gif" data-orig-size="1720,1034" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="median-of-medians-2020-11-09_11.04.33" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2020/11/median-of-medians-2020-11-09_11.04.33.gif?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2020/11/median-of-medians-2020-11-09_11.04.33.gif?w=1024" src="https://danlarkorg.files.wordpress.com/2020/11/median-of-medians-2020-11-09_11.04.33.gif?w=1024" alt="" class="wp-image-593" /><figcaption>Median of medians for random input</figcaption></figure>



<h2>HeapSelect</h2>



<p>Another approach to finding <img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" />th element is to create a <a href="https://en.wikipedia.org/wiki/Heap_(data_structure)">heap</a> on an array of size <img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" /> and push other <img src="https://s0.wp.com/latex.php?latex=n+-+k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n - k" class="latex" /> elements into this heap. C++ <code>std::partial_sort</code> works that way (with additional heap sorting of the first heap). It shows good results for very small <img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" /> and random/ascending arrays, however starts to significantly degrade with growing <img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" /> and becomes impractical. Best case <img src="https://s0.wp.com/latex.php?latex=O%28n%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="O(n)" class="latex" />, worst <img src="https://s0.wp.com/latex.php?latex=O%28n+%5Clog+k%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="O(n &#92;log k)" class="latex" />, average <img src="https://s0.wp.com/latex.php?latex=O%28n+%5Clog+k%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="O(n &#92;log k)" class="latex" />.</p>



<figure class="wp-block-image size-large"><img data-attachment-id="596" data-permalink="https://danlark.org/partial_sort-2020-11-09_12-28-40/" data-orig-file="https://danlarkorg.files.wordpress.com/2020/11/partial_sort-2020-11-09_12.28.40.gif" data-orig-size="1720,1034" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="partial_sort-2020-11-09_12.28.40" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2020/11/partial_sort-2020-11-09_12.28.40.gif?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2020/11/partial_sort-2020-11-09_12.28.40.gif?w=1024" src="https://danlarkorg.files.wordpress.com/2020/11/partial_sort-2020-11-09_12.28.40.gif?w=1024" alt="" class="wp-image-596" /><figcaption>std::partial_sort, two stages, first HeapSelect then heap sort of the first half, accelerated for speed</figcaption></figure>



<h2>IntroSelect</h2>



<p>As the previous algorithm is not very much practical and QuickSelect is really good on average, in 1997 <a href="http://www.cs.rpi.edu/~musser/gp/introsort.ps">&#8220;Introspective Sorting and Selection Algorithms&#8221;</a>  from David Musser came out with a sorting algorithm called &#8220;IntroSelect&#8221;. </p>



<p>IntroSelect works by optimistically starting out with QuickSelect and only switching to MedianOfMedians if it recurses too many times without making sufficient progress. Simply limiting the recursion to constant depth is not good enough, since this would make the algorithm switch on all sufficiently large arrays. Musser discusses a couple of simple approaches:</p>



<ul><li>Keep track of the list of sizes of the subpartitions processed so far. If at any point <img src="https://s0.wp.com/latex.php?latex=t&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="t" class="latex" />&nbsp;recursive calls have been made without halving the list size, for some small positive <img src="https://s0.wp.com/latex.php?latex=t&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="t" class="latex" />, switch to the worst-case linear algorithm.</li><li>Sum the size of all partitions generated so far. If this exceeds the list size times some small positive constant <img src="https://s0.wp.com/latex.php?latex=t&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="t" class="latex" />, switch to the worst-case linear algorithm.</li></ul>



<p>This algorithm came into <a href="https://github.com/gcc-mirror/gcc/blob/e0af865ab9d9d5b6b3ac7fdde26cf9bbf635b6b4/libstdc%2B%2B-v3/include/bits/stl_algo.h#L4748">libstdcxx</a> and guess which strategy was chosen? Correct, none of them. Instead, they try <img src="https://s0.wp.com/latex.php?latex=2%5Clog+n&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="2&#92;log n" class="latex" /> QuickSelect steps and if not successful, fallback to HeapSelect algorithm. So, worst case <img src="https://s0.wp.com/latex.php?latex=O%28n+%5Clog+n%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="O(n &#92;log n)" class="latex" />, average <img src="https://s0.wp.com/latex.php?latex=O%28n%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="O(n)" class="latex" /></p>



<figure class="wp-block-image size-large"><img data-attachment-id="598" data-permalink="https://danlark.org/nth-element-gcc-2020-11-09_11-06-37/" data-orig-file="https://danlarkorg.files.wordpress.com/2020/11/nth-element-gcc-2020-11-09_11.06.37.gif" data-orig-size="1720,1034" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="nth-element-gcc-2020-11-09_11.06.37" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2020/11/nth-element-gcc-2020-11-09_11.06.37.gif?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2020/11/nth-element-gcc-2020-11-09_11.06.37.gif?w=1024" src="https://danlarkorg.files.wordpress.com/2020/11/nth-element-gcc-2020-11-09_11.06.37.gif?w=1024" alt="" class="wp-image-598" /><figcaption>std::nth_element in libstdcxx, &#8220;IntroSelect&#8221;</figcaption></figure>



<h2>PDQSelect</h2>



<p>Now that most of the known algorithms come to an end 😈, we can start looking into something special and extraordinary. And the first one to look at is pdqselect which comes pretty straightforward from <a href="https://github.com/orlp/pdqsort">pdqsort</a>, the algorithm is basically QuickSelect but with some interesting ideas on how to choose an appropriate pivot:</p>



<ol><li>If there are <img src="https://s0.wp.com/latex.php?latex=n+%3C+24&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n &lt; 24" class="latex" /> elements, use <a href="https://en.wikipedia.org/wiki/Insertion_sort">insertion sort</a> to partition or even sort them. As insertion sort is really fast for a small amount of elements, it is reasonable</li><li>If it is more, choose <img src="https://s0.wp.com/latex.php?latex=p&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="p" class="latex" /> &#8212; pivot:<ol><li>If there are less or equal than 128 elements, choose pseudomedian (or &#8220;ninther&#8221;, or median of medians which are all them same) of the following 3 groups:<ol><li>begin, mid, end</li><li>begin + 1, mid &#8211; 1, end &#8211; 1</li><li>begin + 2, mid + 1, end &#8211; 2</li></ol></li><li>If there are more than 128 elements, choose median of 3 from begin, mid, end</li></ol></li><li>Partition the array by the chosen pivot with avoiding <a href="https://www.researchgate.net/publication/301614727_BlockQuicksort_How_Branch_Mispredictions_don't_affect_Quicksort">branches</a>:<ol><li>The partition is called bad if it splits less than <img src="https://s0.wp.com/latex.php?latex=1%2F8n&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="1/8n" class="latex" /> elements</li><li>If the total number of bad partitions exceeds <img src="https://s0.wp.com/latex.php?latex=%5Clog+n&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;log n" class="latex" />, use <code>std::nth_element</code> or any other fallback algorithm and return</li><li>Otherwise, try to defeat some patterns in the partition by (sizes are l_size and r_size respectively):<ol><li>Swapping begin, begin + l_size / 4</li><li>Swapping p &#8211; 1 and p &#8211; l_size / 4</li><li>And if the number of elements is more than 128<ol><li>begin + 1, begin + l_size / 4 + 1</li><li>begin + 2, begin + l_size / 4 + 2</li><li>p &#8211; 2, p &#8211; l_size / 4 + 1</li><li>p &#8211; 3, p &#8211; l_size / 4 + 2</li></ol></li><li>Do the same with the right partition</li></ol></li></ol></li><li>Choose the right partition part and repeat like in QuickSelect</li></ol>



<figure class="wp-block-image size-large"><img data-attachment-id="605" data-permalink="https://danlark.org/pdqselect-2020-11-09_11-03-23/" data-orig-file="https://danlarkorg.files.wordpress.com/2020/11/pdqselect-2020-11-09_11.03.23.gif" data-orig-size="1720,1034" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="pdqselect-2020-11-09_11.03.23" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2020/11/pdqselect-2020-11-09_11.03.23.gif?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2020/11/pdqselect-2020-11-09_11.03.23.gif?w=1024" src="https://danlarkorg.files.wordpress.com/2020/11/pdqselect-2020-11-09_11.03.23.gif?w=1024" alt="" class="wp-image-605" /><figcaption>pdqselect on random input</figcaption></figure>



<h2>Median Of Ninthers or Alexandrescu algorithm</h2>



<p>For a long time, there were no practical improvements in finding <img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" />th element, and only in 2017 very well recognized among C++ community Andrei Alexandrescu published a paper on <a href="https://erdani.com/research/sea2017.pdf">Fast Deterministic Selection</a> where worst case median algorithm becomes practical and can be used in real code.</p>



<p>There are 2 key ideas:</p>



<ul><li>We now find the pseudomedian (or ninther, or median of medians which are all the same) of 9 elements as it was done similarly in pdqsort. Use that partition when <img src="https://s0.wp.com/latex.php?latex=n+%2F+6+%5Cleq+k+%5Cleq+5n%2F6&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n / 6 &#92;leq k &#92;leq 5n/6" class="latex" /></li></ul>



<figure class="wp-block-image size-large"><img data-attachment-id="608" data-permalink="https://danlark.org/2020-11-11-011822_1115x344_scrot/" data-orig-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-011822_1115x344_scrot.png" data-orig-size="1115,344" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="2020-11-11-011822_1115x344_scrot" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-011822_1115x344_scrot.png?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-011822_1115x344_scrot.png?w=1024" src="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-011822_1115x344_scrot.png?w=1024" alt="" class="wp-image-608" srcset="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-011822_1115x344_scrot.png?w=1024 1024w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-011822_1115x344_scrot.png?w=150 150w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-011822_1115x344_scrot.png?w=300 300w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-011822_1115x344_scrot.png?w=768 768w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-011822_1115x344_scrot.png 1115w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<ul><li>Introduce MedianOfMinima for <img src="https://s0.wp.com/latex.php?latex=k+%3C+n%2F6&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k &lt; n/6" class="latex" />. MedianOfMedians computes medians of small groups and takes their median to find a pivot approximating the median of the array. In this case, we pursue an order statistic skewed to the left, so instead of the median of each group, we compute its minimum; then, we obtain the pivot by computing the median of those groupwise minima.</li></ul>



<figure class="wp-block-image size-large"><img data-attachment-id="610" data-permalink="https://danlark.org/2020-11-11-011920_1127x457_scrot/" data-orig-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-011920_1127x457_scrot.png" data-orig-size="1127,457" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="2020-11-11-011920_1127x457_scrot" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-011920_1127x457_scrot.png?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-011920_1127x457_scrot.png?w=1024" src="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-011920_1127x457_scrot.png?w=1024" alt="" class="wp-image-610" srcset="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-011920_1127x457_scrot.png?w=1024 1024w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-011920_1127x457_scrot.png?w=150 150w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-011920_1127x457_scrot.png?w=300 300w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-011920_1127x457_scrot.png?w=768 768w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-011920_1127x457_scrot.png 1127w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p><img src="https://s0.wp.com/latex.php?latex=n%2F6&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n/6" class="latex" /> is not chosen arbitrarily because in order to preserve the linearity of the algorithm we need to make sure that while recursing on <img src="https://s0.wp.com/latex.php?latex=2k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="2k" class="latex" /> elements we partition more than <img src="https://s0.wp.com/latex.php?latex=2k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="2k" class="latex" /> elements and thus <img src="https://s0.wp.com/latex.php?latex=n%2F2+-+k+%3E+2k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n/2 - k &gt; 2k" class="latex" />. MedianOfMaxima is done the same way and for <img src="https://s0.wp.com/latex.php?latex=k+%3E+5n%2F6&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k &gt; 5n/6" class="latex" />. The resulting algorithm turns out to be the following</p>



<figure class="wp-block-image size-large"><img data-attachment-id="612" data-permalink="https://danlark.org/2020-11-11-012412_1181x570_scrot/" data-orig-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-012412_1181x570_scrot.png" data-orig-size="1181,570" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="2020-11-11-012412_1181x570_scrot" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-012412_1181x570_scrot.png?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-012412_1181x570_scrot.png?w=1024" src="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-012412_1181x570_scrot.png?w=1024" alt="" class="wp-image-612" srcset="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-012412_1181x570_scrot.png?w=1024 1024w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-012412_1181x570_scrot.png?w=150 150w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-012412_1181x570_scrot.png?w=300 300w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-012412_1181x570_scrot.png?w=768 768w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-012412_1181x570_scrot.png 1181w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>Turns out it is a better algorithm than all above (except it did not know about pdqselect) and shows good results. My advice that if you need a deterministic worst-case linear algorithm this one is the best (we will talk about a couple of more randomized algorithms later).</p>



<figure class="wp-block-image size-large"><img data-attachment-id="614" data-permalink="https://danlark.org/median-of-ninthers-2020-11-09_11-05-25/" data-orig-file="https://danlarkorg.files.wordpress.com/2020/11/median-of-ninthers-2020-11-09_11.05.25.gif" data-orig-size="1720,1034" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="median-of-ninthers-2020-11-09_11.05.25" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2020/11/median-of-ninthers-2020-11-09_11.05.25.gif?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2020/11/median-of-ninthers-2020-11-09_11.05.25.gif?w=1024" src="https://danlarkorg.files.wordpress.com/2020/11/median-of-ninthers-2020-11-09_11.05.25.gif?w=1024" alt="" class="wp-image-614" /><figcaption>QuickSelect adaptive on random data</figcaption></figure>



<h2>Small k</h2>



<p>All these algorithms are good and linear but they require lots of comparisons, like, minimum <img src="https://s0.wp.com/latex.php?latex=2n&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="2n" class="latex" /> for all <img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" />. However, I know a good algorithm for <img src="https://s0.wp.com/latex.php?latex=k+%3D+2&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k = 2" class="latex" /> which requires only <img src="https://s0.wp.com/latex.php?latex=n+%2B+%5Clceil+%5Clog+n+%5Crceil+-+2&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n + &#92;lceil &#92;log n &#92;rceil - 2" class="latex" /> comparisons (I am also not going to prove it is minimal but it is). Let&#8217;s quickly revise how it works.</p>



<p>For finding a minimum you just compare linearly the winner with all others and basically the second place can be anyone who lost to the winner, so we need to compare them within each other. Unfortunately, the winner may have won linear number of others and we will not get the desired amount of comparisons. To mitigate this, we need to make a knockout tournament where the winner only plays <img src="https://s0.wp.com/latex.php?latex=%5Clog+n&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;log n" class="latex" /> games like that:</p>



<figure class="wp-block-image size-large"><img data-attachment-id="618" data-permalink="https://danlark.org/2020-11-11-014957_1515x388_scrot/" data-orig-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-014957_1515x388_scrot.png" data-orig-size="1515,388" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="2020-11-11-014957_1515x388_scrot" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-014957_1515x388_scrot.png?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-014957_1515x388_scrot.png?w=1024" src="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-014957_1515x388_scrot.png?w=1024" alt="" class="wp-image-618" srcset="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-014957_1515x388_scrot.png?w=1024 1024w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-014957_1515x388_scrot.png?w=150 150w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-014957_1515x388_scrot.png?w=300 300w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-014957_1515x388_scrot.png?w=768 768w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-014957_1515x388_scrot.png 1515w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>And all we need to do next is to compare all losers to the winner</p>



<figure class="wp-block-image size-large"><img data-attachment-id="620" data-permalink="https://danlark.org/imageedit_7_4753659525/" data-orig-file="https://danlarkorg.files.wordpress.com/2020/11/imageedit_7_4753659525.png" data-orig-size="1515,388" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="imageedit_7_4753659525" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2020/11/imageedit_7_4753659525.png?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2020/11/imageedit_7_4753659525.png?w=1024" src="https://danlarkorg.files.wordpress.com/2020/11/imageedit_7_4753659525.png?w=1024" alt="" class="wp-image-620" srcset="https://danlarkorg.files.wordpress.com/2020/11/imageedit_7_4753659525.png?w=1024 1024w, https://danlarkorg.files.wordpress.com/2020/11/imageedit_7_4753659525.png?w=150 150w, https://danlarkorg.files.wordpress.com/2020/11/imageedit_7_4753659525.png?w=300 300w, https://danlarkorg.files.wordpress.com/2020/11/imageedit_7_4753659525.png?w=768 768w, https://danlarkorg.files.wordpress.com/2020/11/imageedit_7_4753659525.png 1515w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>And any of them can be the second. And we use only <img src="https://s0.wp.com/latex.php?latex=%5Clog+n+-+1&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;log n - 1" class="latex" /> comparisons for that.</p>



<p>What can we do to find the third and other elements? Possibly not optimal in comparison count but at least not so bad can follow the strategy:</p>



<p>First set up a binary tree for a knockout tournament on <img src="https://s0.wp.com/latex.php?latex=n+-+k+%2B+2&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n - k + 2" class="latex" /> items. (This takes <img src="https://s0.wp.com/latex.php?latex=n+-+k+%2B+1&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n - k + 1" class="latex" /> comparisons.) The largest item is greater than <img src="https://s0.wp.com/latex.php?latex=n+-+k+%2B+1&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n - k + 1" class="latex" /> others, so it can’t be <img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" />th largest. Replace it, where it appears at an external node of the tree, by one of the <img src="https://s0.wp.com/latex.php?latex=k+-+2&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k - 2" class="latex" /> elements held in reserve, and find the largest element of the resulting <img src="https://s0.wp.com/latex.php?latex=n+-+k+%2B+2&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n - k + 2" class="latex" />; this requires at most <img src="https://s0.wp.com/latex.php?latex=%5Clog%28n+%2B+2+-+k%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;log(n + 2 - k)" class="latex" /> comparisons because we need to recompute only one path in the tree. Repeat this operation <img src="https://s0.wp.com/latex.php?latex=k+-+2&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k - 2" class="latex" /> times in all, for each element held in reserve.</p>



<p>It will give us the estimation of <img src="https://s0.wp.com/latex.php?latex=n+-+k+%2B+%28k+-+1%29%5Clog+%28n+%2B+2+-+k%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n - k + (k - 1)&#92;log (n + 2 - k)" class="latex" /> comparisons. Assume you need to find top 10 out of millions of long strings and this might be a good solution to this instead of comparing at least <img src="https://s0.wp.com/latex.php?latex=2n&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="2n" class="latex" /> times.<em> However, it requires additional memory to remember the path of the winner and I currently do not know how to remove it thus making the algorithm impractical because of allocations or additional level of indirections.</em></p>



<p>At that time my knowledge of selection algorithms ended and I decided to address one known guy.</p>



<div class="wp-block-image"><figure class="aligncenter"><img src="https://www.premiosfronterasdelconocimiento.es/wp-content/uploads/sites/2/2019/06/FBBVA-10-tic-Knuth.jpg" alt="Donald E. Knuth - Premios Fronteras" /><figcaption>You know him</figcaption></figure></div>



<p>In The Art of Computer Programming, Volume 3, Sorting and Searching I read almost 100-150 pages in order to understand what the world knows about minimal comparison sorting and selection algorithms and found a pretty interesting one called Floyd-Rivest algorithm. Actually, even Alexandrescu paper cites it but in an unusual way:</p>



<figure class="wp-block-image size-large"><img data-attachment-id="627" data-permalink="https://danlark.org/2020-11-11-023517_1100x120_scrot/" data-orig-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-023517_1100x120_scrot.png" data-orig-size="1100,120" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="2020-11-11-023517_1100x120_scrot" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-023517_1100x120_scrot.png?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-023517_1100x120_scrot.png?w=1024" src="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-023517_1100x120_scrot.png?w=1024" alt="" class="wp-image-627" srcset="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-023517_1100x120_scrot.png?w=1024 1024w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-023517_1100x120_scrot.png?w=150 150w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-023517_1100x120_scrot.png?w=300 300w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-023517_1100x120_scrot.png?w=768 768w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-023517_1100x120_scrot.png 1100w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<h2>Floyd-Rivest algorithm</h2>



<p>This algorithm is essentially the following</p>



<p>The general steps are:</p>



<ol><li>Select a small random sample <img src="https://s0.wp.com/latex.php?latex=S&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="S" class="latex" />&nbsp;of size <img src="https://s0.wp.com/latex.php?latex=n%5E%7B2%2F3%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n^{2/3}" class="latex" />&nbsp;from the array.</li><li>From&nbsp;<img src="https://s0.wp.com/latex.php?latex=S&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="S" class="latex" />, recursively select two elements,&nbsp;<img src="https://s0.wp.com/latex.php?latex=u&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="u" class="latex" />&nbsp;and&nbsp;<img src="https://s0.wp.com/latex.php?latex=v&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="v" class="latex" />, such that <img src="https://s0.wp.com/latex.php?latex=u+%3C+v&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="u &lt; v" class="latex" /> (essentially they take <img src="https://s0.wp.com/latex.php?latex=k+-+n%5E%7B1%2F3%7D%5Clog%5E%7B1%2F2%7Dn&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k - n^{1/3}&#92;log^{1/2}n" class="latex" /> and <img src="https://s0.wp.com/latex.php?latex=k+%2B+n%5E%7B1%2F3%7D%5Clog%5E%7B1%2F2%7Dn&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k + n^{1/3}&#92;log^{1/2}n" class="latex" />). These two elements will be the&nbsp;<strong>pivots</strong>&nbsp;for the partition and are expected to contain the<em> </em><img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" />th smallest element of the entire list between them.</li><li>Using&nbsp;<img src="https://s0.wp.com/latex.php?latex=u&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="u" class="latex" />&nbsp;and&nbsp;<img src="https://s0.wp.com/latex.php?latex=v&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="v" class="latex" />, partition&nbsp;<img src="https://s0.wp.com/latex.php?latex=S&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="S" class="latex" />&nbsp;into three sets (less, more and between).</li><li>Partition the remaining elements in<em> </em>the array&nbsp;by comparing them to&nbsp;<em>u</em>&nbsp;or&nbsp;<em>v</em>&nbsp;and placing them into the appropriate set. If&nbsp;<img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" />&nbsp;is smaller than half the number of the elements in&nbsp;the array, then the remaining elements should be compared to&nbsp;<img src="https://s0.wp.com/latex.php?latex=v&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="v" class="latex" />&nbsp;first and then only to&nbsp;<img src="https://s0.wp.com/latex.php?latex=u&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="u" class="latex" />&nbsp;if they are smaller than&nbsp;<img src="https://s0.wp.com/latex.php?latex=v&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="v" class="latex" />. Otherwise, the remaining elements should be compared to&nbsp;<img src="https://s0.wp.com/latex.php?latex=u&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="u" class="latex" />&nbsp;first and only to&nbsp;<img src="https://s0.wp.com/latex.php?latex=v&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="v" class="latex" />&nbsp;if they are greater than&nbsp;<img src="https://s0.wp.com/latex.php?latex=u&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="u" class="latex" />.</li><li>Apply the algorithm recursively to the appropriate set to select the&nbsp;<img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" />th smallest in the array.</li></ol>



<p>Then in 2004 it was <a href="https://core.ac.uk/download/pdf/82672439.pdf">proven</a> that this method (slighly modified in bound selection) will have <img src="https://s0.wp.com/latex.php?latex=n+%2B+%5Cmin%28k%2C+n+-+k%29+%2B+O%28%5Csqrt%7Bn+%5Clog+n%7D%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n + &#92;min(k, n - k) + O(&#92;sqrt{n &#92;log n})" class="latex" /> comparisons with probability at least <img src="https://s0.wp.com/latex.php?latex=1+-+2n%5E%7B-1%2F2%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="1 - 2n^{-1/2}" class="latex" /> (and the constant in the power can be tuned).</p>



<p>This algorithm tries to find the appropriate subsamples and proves that the <img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" />th element will be there with high probability.</p>



<figure class="wp-block-image size-large"><img data-attachment-id="632" data-permalink="https://danlark.org/floyd-rivest-2020-11-09_11-03-59/" data-orig-file="https://danlarkorg.files.wordpress.com/2020/11/floyd-rivest-2020-11-09_11.03.59.gif" data-orig-size="1720,1034" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="floyd-rivest-2020-11-09_11.03.59" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2020/11/floyd-rivest-2020-11-09_11.03.59.gif?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2020/11/floyd-rivest-2020-11-09_11.03.59.gif?w=1024" src="https://danlarkorg.files.wordpress.com/2020/11/floyd-rivest-2020-11-09_11.03.59.gif?w=1024" alt="" class="wp-image-632" /><figcaption>Floyd-Rivest median on random data</figcaption></figure>



<figure class="wp-block-image size-large"><img data-attachment-id="636" data-permalink="https://danlark.org/median_of_medians-2020-11-09_12-37-26/" data-orig-file="https://danlarkorg.files.wordpress.com/2020/11/median_of_medians-2020-11-09_12.37.26.gif" data-orig-size="1720,1034" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="median_of_medians-2020-11-09_12.37.26" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2020/11/median_of_medians-2020-11-09_12.37.26.gif?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2020/11/median_of_medians-2020-11-09_12.37.26.gif?w=1024" src="https://danlarkorg.files.wordpress.com/2020/11/median_of_medians-2020-11-09_12.37.26.gif?w=1024" alt="" class="wp-image-636" /><figcaption>Floyd-Rivest k=n/10 order statistics</figcaption></figure>



<p>Yet the worst case of the algorithm is still <img src="https://s0.wp.com/latex.php?latex=O%28n%5E2%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="O(n^2)" class="latex" /> but it tries to optimize the minimum amount of comparisons on average case, not the worst case.</p>



<p>For small <img src="https://s0.wp.com/latex.php?latex=k&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k" class="latex" /> it is really good as it only used <img src="https://s0.wp.com/latex.php?latex=n+%2B+k+%2B+o%28n%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="n + k + o(n)" class="latex" /> comparisons which is significantly better than all &#8220;at least <img src="https://s0.wp.com/latex.php?latex=2n&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="2n" class="latex" /> comparisons&#8221; algorithms and even for median it is <img src="https://s0.wp.com/latex.php?latex=3n%2F2&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="3n/2" class="latex" /> which is significantly better.</p>



<h2>What it resulted in</h2>



<p>So I decided to code all these algorithms with the C++ standard API and to test against each other and possibly submit to something performance heavy as DBMS for ORDER BY LIMIT N clauses.</p>



<p>I ended up doing <a href="https://github.com/danlark1/miniselect">miniselect</a> library. For now, it is header-only but I don&#8217;t guarantee that in the future, it contains almost all algorithms except the tournament one which is very hard to do in the general case.</p>



<p>I tested on <code>Intel(R) Core(TM) i5-4200H CPU @ 2.80GHz</code> (yeah, a bit old, sorry). We are going to find median and 1000th elements out of <img src="https://s0.wp.com/latex.php?latex=10%5E7&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="10^7" class="latex" /> and <img src="https://s0.wp.com/latex.php?latex=2%5E16&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="2^16" class="latex" /> arrays. Benchmark data:</p>



<ul><li>Ascending</li><li>Descending</li><li>Median3Killer which I described above</li><li>PipeOrgan: half of the values are ascending, the other half is descending</li><li>PushFront: ascending but the smallest element added to the end</li><li>PushMiddle: ascending but the middle element added to the end</li><li>Random</li><li>Random01: random only with values 0 and 1</li><li>Shuffled16: random only with 16 overall values</li></ul>



<p>You can see other values also <a href="https://drive.google.com/drive/folders/1DHEaeXgZuX6AJ9eByeZ8iQVQv0ueP8XM">here</a>.</p>



<div class="wp-block-jetpack-slideshow alignwide" data-effect="slide"><div class="wp-block-jetpack-slideshow_container swiper-container"><ul class="wp-block-jetpack-slideshow_swiper-wrapper swiper-wrapper"><li class="wp-block-jetpack-slideshow_slide swiper-slide"><figure><img alt="" class="wp-block-jetpack-slideshow_image wp-image-651" data-id="651" src="https://danlarkorg.files.wordpress.com/2020/11/result_10000000_1000-2.png" /></figure></li><li class="wp-block-jetpack-slideshow_slide swiper-slide"><figure><img alt="" class="wp-block-jetpack-slideshow_image wp-image-650" data-id="650" src="https://danlarkorg.files.wordpress.com/2020/11/result_10000000_5000000-2.png" /></figure></li><li class="wp-block-jetpack-slideshow_slide swiper-slide"><figure><img alt="" class="wp-block-jetpack-slideshow_image wp-image-652" data-id="652" src="https://danlarkorg.files.wordpress.com/2020/11/result_comparisons_10000000_1000-2.png" /></figure></li><li class="wp-block-jetpack-slideshow_slide swiper-slide"><figure><img alt="" class="wp-block-jetpack-slideshow_image wp-image-653" data-id="653" src="https://danlarkorg.files.wordpress.com/2020/11/result_comparisons_10000000_5000000-2.png" /></figure></li><li class="wp-block-jetpack-slideshow_slide swiper-slide"><figure><img alt="" class="wp-block-jetpack-slideshow_image wp-image-655" data-id="655" src="https://danlarkorg.files.wordpress.com/2020/11/result_65536_1000.png" /></figure></li><li class="wp-block-jetpack-slideshow_slide swiper-slide"><figure><img alt="" class="wp-block-jetpack-slideshow_image wp-image-656" data-id="656" src="https://danlarkorg.files.wordpress.com/2020/11/result_65536_32768.png" /></figure></li><li class="wp-block-jetpack-slideshow_slide swiper-slide"><figure><img alt="" class="wp-block-jetpack-slideshow_image wp-image-657" data-id="657" src="https://danlarkorg.files.wordpress.com/2020/11/result_comparisons_65536_1000.png" /></figure></li><li class="wp-block-jetpack-slideshow_slide swiper-slide"><figure><img alt="" class="wp-block-jetpack-slideshow_image wp-image-658" data-id="658" src="https://danlarkorg.files.wordpress.com/2020/11/result_comparisons_65536_32768.png" /></figure></li></ul><a class="wp-block-jetpack-slideshow_button-prev swiper-button-prev swiper-button-white" role="button"></a><a class="wp-block-jetpack-slideshow_button-next swiper-button-next swiper-button-white" role="button"></a><a aria-label="Pause Slideshow" class="wp-block-jetpack-slideshow_button-pause" role="button"></a><div class="wp-block-jetpack-slideshow_pagination swiper-pagination swiper-pagination-white"></div></div></div>



<p>As you see, Floyd-Rivest outperforms in time all other algorithms and even Alexandrescu in most of the cases. One downside is that Floyd-Rivest performs worse on data where there are many equal elements, that is expected and probably can be fixed as it is pointed out in <a href="https://core.ac.uk/download/pdf/82672439.pdf">Kiwiel&#8217;s paper</a>.</p>



<h2>Real World</h2>



<p>This effort also resulted in a <a href="https://github.com/ClickHouse/ClickHouse/pull/16825">patch</a> to <a href="https://clickhouse.tech/">ClickHouse</a> where we got the following <a href="https://clickhouse-test-reports.s3.yandex.net/16825/811c3e5cd18a19200edcf720b70754fc0536c07b/performance_comparison/report.html#fail1">benchmarks</a>:</p>



<figure class="wp-block-image size-large"><img data-attachment-id="661" data-permalink="https://danlark.org/2020-11-11-032224_1440x796_scrot/" data-orig-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-032224_1440x796_scrot.png" data-orig-size="1440,796" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="2020-11-11-032224_1440x796_scrot" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-032224_1440x796_scrot.png?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-032224_1440x796_scrot.png?w=1024" src="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-032224_1440x796_scrot.png?w=1024" alt="" class="wp-image-661" srcset="https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-032224_1440x796_scrot.png?w=1024 1024w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-032224_1440x796_scrot.png?w=150 150w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-032224_1440x796_scrot.png?w=300 300w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-032224_1440x796_scrot.png?w=768 768w, https://danlarkorg.files.wordpress.com/2020/11/2020-11-11-032224_1440x796_scrot.png 1440w" sizes="(max-width: 1024px) 100vw, 1024px" /><figcaption>Time in the first columns in seconds</figcaption></figure>



<p>As you see, most of the queries have ORDER BY LIMIT N. Topmost queries were significantly optimized because <code>std::partial_sort</code> works worst when the data is descending, real-world usage as you can see in string_sort benchmarks was optimized by 10-15%. Also, there were many queries that have been optimized by 5% where sorting is not the bottleneck but still, it is nice. It ended up with a <strong>2.5%</strong> overall performance boost across all queries.</p>



<p>Other algorithms showed worse performance and you can see it from the benchmarks above. So, now Floyd-Rivest is used in production, and for a good reason. But, of course, it does not diminish the results of Mr. Alexandrescu and his contributions are very valuable when it comes to determinism and worst case.</p>



<h2>Library</h2>



<p>I publish <a href="https://github.com/danlark1/miniselect">miniselect</a> under Boost License for everybody to use and adjust to the projects. I spent several days in the last week trying to make it work everywhere (except Windows for now but it should be good). We support C++11 and further with GCC 7+ and Clang 6+. We carefully test the algorithms for standard compliance, with fuzzing, sanitizers, etc. Fuzzing managed to find bugs that I thought only at the last moment so it really helps.</p>



<p>If you want to use it, please read the API but it should be an easy <code>sed</code>/<code>perl</code>/<code>regex</code> replacement with zero issues except for the ties between elements might resolve in a different way, however, C++ standard says it is ok and you should not rely on that.</p>



<p>Any contributions and other algorithms that I might miss are highly welcome, I intend to get this library as a reference to many implementations of selection algorithms so that the users can try and choose the best options for them. (Should I do the same for sorting algorithms?)</p>



<h2>Conclusion</h2>



<p>This was a long story of me trying to find the solutions to the problems that were puzzling me for the past month or so. I probably read 300+ pages of math/algorithms/discussions/forums to finally find everything that the world knows about it and to make it work for real-world applications with huge performance benefits. I think I can come up with something better in a long term after researching this stuff for a while longer but I will let you know in this blog if anything arises 🙂</p>



<h4>References I find useful (together with zero mentions in the post) </h4>



<ol><li><a href="http://www.cs.uwm.edu/faculty/ad/select.pdf">Selection Algorithms with Small Groups</a> where the Median of Medians algorithms are a little bit twisted for groups of size 3 and 4 with linear time worst-case guarantee</li><li><a href="https://core.ac.uk/download/pdf/82672439.pdf">Understandable proof of second biggest element</a> minimum comparisons bound</li><li><a href="https://core.ac.uk/download/pdf/82672439.pdf">On Floyd and Rivest’s SELECT algorithm</a> by Krzysztof C. Kiwiel</li><li>The Art of Computer Programming, Volume 3, Sorting and Searching, Chapter about Minimum-Comparison Sorting and Selection</li><li><a href="http://people.csail.mit.edu/rivest/pubs/FR75b.pdf">Original Floyd-Rivest paper</a> </li><li><a href="https://www.math.ucsd.edu/~sbuss/CourseWeb/Math261C_2014S/2014_04_02_Marco_FRSelect.pdf">Simple proof</a> of a bit worse bound for Floyd-Rivest algorithm by Sam Buss</li><li><a href="https://erdani.com/research/sea2017.pdf">Fast Deterministic Selection</a> by Andrei Alexandrescu</li><li><a href="https://web.cs.wpi.edu/~hofri/medsel.pdf">An efficient algorithm for the approximate median selection problem</a> by S. Battiato, D. Cantone, D. Catalano and G. Cincotti</li><li><a href="https://www.yumpu.com/en/document/view/2165810/an-efficient-implementation-of-blum-floyd-pratt-rivest-moonflare">An efficient implementation of Blum, Floyd, Pratt, Rivest, and Tarjan’s worst-case linear selection algorithm</a></li><li><a href="https://pdf.sciencedirectassets.com/271538/1-s2.0-S0304397500X00357/1-s2.0-0304397595002251/main.pdf?X-Amz-Security-Token=IQoJb3JpZ2luX2VjEBoaCXVzLWVhc3QtMSJGMEQCIHyTLl7ECCIse%2BIyy5pVafANw%2Bdsfl%2BKJK%2BkBlReoEuRAiAEPdXeMENfFwoPWJILAPN21Q4R5ev3aVUaUaanaCxWnCq9AwiC%2F%2F%2F%2F%2F%2F%2F%2F%2F%2F8BEAMaDDA1OTAwMzU0Njg2NSIMmFsIn%2BYz%2FYKeDoBbKpED7fSpH6vh8fKNnwVTRo0YGHbh8vaMCgFfVYApW3EZv0lhtIUuHmELYT47qx9zG9JaMVpIPtesrbfnqngJLD6aT%2Bsn2%2BSQEPVNjBpVOS6HVxyo9T2sPrcmaoiyQcVF5F4kX3Y49MyTVnrNKdjue6eUKrBhsEkYE6RAhyKebsDpOiLjq8KAgCN4e%2FPBfZhJhDU%2FAt4mh26u6eBpTNpqTytvdCcsb7E%2B0uYdmA52oZEKN5KDXw8xmzM9usjfElvnXNe0lMcI8aD%2F6zLqk4kOtPXUvOiJEWCU7e%2BidRnBQRY%2BeAiAiyUhZrcGxT9D6wXdi9AY3DwYOag2vqUxoL89DuWh8OSiMSsHzl%2Fz4cTMTLPfnMohOMSqGmUWsSkP6NjLfrOHeD3xhhUXK6vfBAu0nbGLKw1Ky%2BwP1VTpJWpX1bxcKtMzKgVfvcwXXDhlgGi6aZevLCV43Jc%2F7GmRolo0rfn7dJ34LGADTyGW2GoPRT89SkIuwnLMRL9iOda6gkbj8AQu298hBJ4NZvvP%2B%2BS73pbde2Ywufes%2FQU67AFGzLToXD%2BZMsM4IgBuyg2VT2BJe4SAZAJRdjt8UpMmfsRXFF0mseyGxXov85SUG%2FTqktGHO4bd3iTxBnQfXWKW2yoLMqoJaVkiGrim41lCvAWjxNhjSmTjJi%2FH1YTqINR10Y%2B6z9oa9FHnI8lPfaYVUA3XaDvqvTmvj1%2FLPgAzImImN0vhbC25Gqsu%2BcRWFqsC11s5ZG6cCdUkTvTuNJVWV9v%2BEmk669fZVxTmOLE2gp761oa1UrE%2B%2BAiFDt%2FTfZfk3M8iyyabf%2Bzh3Pa32QNS2OwTKH0Qj5WEBa6Tm5JCIvu%2F2D8UpHw8wkR%2BiQ%3D%3D&amp;X-Amz-Algorithm=AWS4-HMAC-SHA256&amp;X-Amz-Date=20201111T030451Z&amp;X-Amz-SignedHeaders=host&amp;X-Amz-Expires=300&amp;X-Amz-Credential=ASIAQ3PHCVTY5GVWCQFK%2F20201111%2Fus-east-1%2Fs3%2Faws4_request&amp;X-Amz-Signature=061a4b10958c92d3185624598c0aa819cadf8a8f9c4921559687067b22cd5461&amp;hash=88c95693aa6239d3ae74411d89b40165107bf4605c56a0641efa879c3a775b05&amp;host=68042c943591013ac2b2430a89b270f6af2c76d8dfd086a07176afe7c76c2c61&amp;pii=0304397595002251&amp;tid=spdf-a13af2b7-d4d2-4db0-807f-67ce504d3b8c&amp;sid=5c5a21f39f882947952841c747512811d391gxrqb&amp;type=client">Selection from read-only memory and sorting with minimum data movement</a> by J. Ian Munro, Venkatesh Raman</li></ol>



<figure class="wp-block-embed aligncenter is-type-rich is-provider-twitter wp-block-embed-twitter"><div class="wp-block-embed__wrapper">
<div class="embed-twitter"><blockquote class="twitter-tweet" data-width="500" data-dnt="true"><p lang="en" dir="ltr">Today I present miniselect &#8212; generic C++ library for various selection algorithms <a href="https://t.co/NAJmqMOE8a">https://t.co/NAJmqMOE8a</a>. Aaand post about my efforts and research in that area <a href="https://t.co/YrNgZshrAJ">https://t.co/YrNgZshrAJ</a>. Many insights and gifs are included!</p>&mdash; Danila Kutenin (@Danlark1) <a href="https://twitter.com/Danlark1/status/1326371416934068224?ref_src=twsrc%5Etfw">November 11, 2020</a></blockquote><script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script></div>
</div></figure>
]]></html><thumbnail_url><![CDATA[https://danlarkorg.files.wordpress.com/2020/11/result_comparisons_65536_32768.png?fit=440%2C330]]></thumbnail_url><thumbnail_width><![CDATA[180]]></thumbnail_width><thumbnail_height><![CDATA[329]]></thumbnail_height></oembed>