<?xml version="1.0" encoding="UTF-8" standalone="yes"?><oembed><version><![CDATA[1.0]]></version><provider_name><![CDATA[Experimental chill]]></provider_name><provider_url><![CDATA[http://danlark.org]]></provider_url><author_name><![CDATA[Danila Kutenin]]></author_name><author_url><![CDATA[https://danlark.org/author/kutdanila/]]></author_url><title><![CDATA[How a Bug(?) in the Linux CRC-32 Checksum Turned out not to be a&nbsp;Bug]]></title><type><![CDATA[link]]></type><html><![CDATA[
<p>A friend of mine nerd sniped me with a good story to investigate which kinda made me giggle and want to tell to the public some cool things happened to me during my experience with hashing and other stuff.</p>



<p>The story is simple and complex at the same time, I&#8217;d try to stick to a simple version with the links for those who do understand the underlying computations better.</p>



<p>A Cyclic Redundancy Check (CRC) is the remainder of binary division of a potentially long message, by a CRC polynomial. This technique is employed in network and storage applications due to its effectiveness at detecting errors. It has some good properties as linearity. </p>



<p>Speaking in mathematical language CRC is calculated in the following way:</p>



<p><img src="https://s0.wp.com/latex.php?latex=CRC+%28M+%28x%29%29+%3D+x%5E%7Bdeg%28P%28x%29%29%7D+%5Ccdot+M%28x%29+%5C++%5Cmathrm%7B+mod%7D+%5C++P+%28x%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="CRC (M (x)) = x^{deg(P(x))} &#92;cdot M(x) &#92;  &#92;mathrm{ mod} &#92;  P (x)" class="latex" /></p>



<p>where the polynomial <img src="https://s0.wp.com/latex.php?latex=P%28x%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="P(x)" class="latex" /> defines the CRC algorithm and the symbol <img src="https://s0.wp.com/latex.php?latex=%5Ccdot+&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="&#92;cdot " class="latex" /> denotes <a href="https://en.wikipedia.org/wiki/Carry-less_product">carry-less multiplication</a>. Most of the times the polynomial <img src="https://s0.wp.com/latex.php?latex=P%28x%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="P(x)" class="latex" /> is irreducible, it just leads to fewer errors in some applications. CRC-32 is basically a CRC with <img src="https://s0.wp.com/latex.php?latex=P%28x%29&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="P(x)" class="latex" /> of degree 32 over Galois Field GF(2). Nevermind, it still does not matter much, mostly for the context.</p>



<p>x86 processors for quite some time have instructions <a href="https://en.wikipedia.org/wiki/CLMUL_instruction_set">CLMUL</a> which are responsible for multiplication in GF(2) and can be enabled with <code>-mplcmul</code> in your C/C++ compilers. They are useful, for example, in <a href="https://d-nb.info/1172811326/34">erasure coding</a> to speed up the recovery from the parity parts. As CRC uses something similar to this, there are algorithms that speed up the CRC computation with such instructions.</p>



<p>The main and almost the only one source of knowledge how to do it properly with other SIMD magic is <a href="https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/fast-crc-computation-generic-polynomials-pclmulqdq-paper.pdf">Fast CRC Computation for Generic Polynomials Using PCLMULQDQ Instruction</a> from Intel which likely hasn&#8217;t been changed since 2009, it has lots of pictures and it is nice to read when you understand the maths. But even if you don&#8217;t, it has a well defined algorithm which you can implement and check with other naive approaches that it is correct.</p>



<p>When it tries to define the final result, it uses so called scary <a href="https://en.wikipedia.org/wiki/Barrett_reduction">Barrett reduction</a> which is basically the subtraction of the inverse times the value. Just some maths.</p>



<figure class="wp-block-image size-large"><img data-attachment-id="695" data-permalink="https://danlark.org/2021-03-08-110917_951x651_scrot/" data-orig-file="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-110917_951x651_scrot.png" data-orig-size="951,651" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="2021-03-08-110917_951x651_scrot" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-110917_951x651_scrot.png?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-110917_951x651_scrot.png?w=951" src="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-110917_951x651_scrot.png?w=951" alt="" class="wp-image-695" srcset="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-110917_951x651_scrot.png 951w, https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-110917_951x651_scrot.png?w=150 150w, https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-110917_951x651_scrot.png?w=300 300w, https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-110917_951x651_scrot.png?w=768 768w" sizes="(max-width: 951px) 100vw, 951px" /></figure>



<p>Still not so relevant though. But here comes the most interesting part. In the guide there are some constants for that reduction for <a href="https://en.wikipedia.org/wiki/Gzip">gzip</a> CRC. They are looking this way</p>



<figure class="wp-block-image size-large"><img data-attachment-id="698" data-permalink="https://danlark.org/2021-03-08-111339_781x396_scrot/" data-orig-file="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-111339_781x396_scrot.png" data-orig-size="781,396" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="2021-03-08-111339_781x396_scrot" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-111339_781x396_scrot.png?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-111339_781x396_scrot.png?w=781" src="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-111339_781x396_scrot.png?w=781" alt="" class="wp-image-698" srcset="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-111339_781x396_scrot.png 781w, https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-111339_781x396_scrot.png?w=150 150w, https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-111339_781x396_scrot.png?w=300 300w, https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-111339_781x396_scrot.png?w=768 768w" sizes="(max-width: 781px) 100vw, 781px" /><figcaption>Look at k6&#8242; and u&#8217;</figcaption></figure>



<p>If we look closely, we would notice that k6&#8242; and u&#8217; are ending in <code>640 and 641</code> respectively. So far, so good, yet, in the Linux kernel the constants are slightly different, let me show you</p>



<figure class="wp-block-image size-large"><img data-attachment-id="700" data-permalink="https://danlark.org/2021-03-08-111836_622x169_scrot/" data-orig-file="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-111836_622x169_scrot.png" data-orig-size="622,169" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="2021-03-08-111836_622x169_scrot" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-111836_622x169_scrot.png?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-111836_622x169_scrot.png?w=622" src="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-111836_622x169_scrot.png?w=622" alt="" class="wp-image-700" srcset="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-111836_622x169_scrot.png 622w, https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-111836_622x169_scrot.png?w=150 150w, https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-111836_622x169_scrot.png?w=300 300w" sizes="(max-width: 622px) 100vw, 622px" /><figcaption><a href="https://github.com/torvalds/linux/blob/v5.11/arch/x86/crypto/crc32-pclmul_asm.S#L72" rel="nofollow">https://github.com/torvalds/linux/blob/v5.11/arch/x86/crypto/crc32-pclmul_asm.S#L72</a></figcaption></figure>



<p>It is stated to be written from the guide in the header, so they should be same. The constant is <code>0x1DB710641</code> vs <code>0x1DB710640</code> stated in the guide, the off by 1 but with the same 3 digits in the end as u&#8217;.</p>



<p>Two is that there&#8217;s also this line a little earlier on the same page: <img src="https://s0.wp.com/latex.php?latex=P%28x%29%27&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="P(x)&#039;" class="latex" /> = 0x1DB710641 This <img src="https://s0.wp.com/latex.php?latex=P%28x%29%27&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="P(x)&#039;" class="latex" /> number differs from <img src="https://s0.wp.com/latex.php?latex=k6%27&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k6&#039;" class="latex" /> only in the final bit: 0x41 vs 0x40. This isn&#8217;t coincidental. Calculating <img src="https://s0.wp.com/latex.php?latex=k6%27&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="k6&#039;" class="latex" /> (in the Galois Field GF(2) space) means dividing (i.e. bitwise XOR-ing) a 33-bit polynomial (<img src="https://s0.wp.com/latex.php?latex=x%5E%7B32%7D&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="x^{32}" class="latex" />, with only one &#8216;on&#8217; bit) by the 33-bit polynomial <img src="https://s0.wp.com/latex.php?latex=P%28x%29%27&#038;bg=ffffff&#038;fg=000&#038;s=0&#038;c=20201002" alt="P(x)&#039;" class="latex" />.</p>



<p>I thought for a while this is a legitimate bug. But wait a second, how can this be even possible given that CRC is used worldwide.</p>



<p>0x1DB710640 camp:</p>



<ul><li><a href="https://github.com/srijs/rust-crc32fast/blob/0ec13c8b3fd90a31bb18d5338d9148e1a73b9c39/src/specialized/pclmulqdq.rs#L64">Rust</a></li><li>Intel with a guide and <a href="https://github.com/intel/soft-crc/blob/34a84bfd8278ff48e6aa67f0746618242266c8a2/crc_ether.c#L43">code</a></li><li>Other &#8220;clean&#8221; implementations</li></ul>



<p>0x1DB710641 camp:</p>



<ul><li><a href="https://github.com/torvalds/linux/blob/v5.11/arch/x86/crypto/crc32-pclmul_asm.S#L72">Linux</a></li><li><a href="https://source.chromium.org/chromium/chromium/src/+/master:third_party/zlib/crc32_simd.c;l=36;drc=9d4ec9349a1bf609eedb917c44c69eb0df9ff6bb">Zlib</a></li><li><a href="https://github.com/golang/go/blob/414fa8c35e7c2f65e2c767d6db2f25791e53b5c1/src/hash/crc32/crc32_amd64.s#L143">Golang</a></li><li><a href="https://lists.gnupg.org/pipermail/gcrypt-devel/2019-September/004807.html">GNUPG</a></li></ul>



<p>I decided that this is too good to be true to be a bug, even though the number is changed, I also tried 0x42 and the tests fail. After that I started looking at the code and managed to prove that this constant +-1 does not matter</p>



<p> Let&#8217;s look the snippet where this constant is used from <a href="https://source.chromium.org/chromium/chromium/src/+/master:third_party/zlib/crc32_simd.c;l=36;drc=9d4ec9349a1bf609eedb917c44c69eb0df9ff6bb">zlib</a>:</p>



<figure class="wp-block-embed is-type-rich is-provider-embed wp-block-embed-embed"><div class="wp-block-embed__wrapper">
<style>.gist table { margin-bottom: 0; }</style><div style="tab-size: 8" id="gist108328473" class="gist">
    <div class="gist-file" translate="no">
      <div class="gist-data">
        <div class="js-gist-file-update-container js-task-list-container file-box">
  <div id="file-crc_strange_value-h" class="file my-2">
    
  <div itemprop="text" class="Box-body p-0 blob-wrapper data type-c  ">

      
<table class="highlight tab-size js-file-line-container" data-tab-size="8" data-paste-markdown-skip>
      <tr>
        <td id="file-crc_strange_value-h-L1" class="blob-num js-line-number" data-line-number="1"></td>
        <td id="file-crc_strange_value-h-LC1" class="blob-code blob-code-inner js-file-line"><span class="pl-k">static</span> <span class="pl-k">const</span> <span class="pl-c1">uint64_t</span> <span class="pl-en">zalign</span>(<span class="pl-c1">16</span>) poly[] = { <span class="pl-c1">0x01db710641</span>, <span class="pl-c1">0x01f7011641</span> };</td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L2" class="blob-num js-line-number" data-line-number="2"></td>
        <td id="file-crc_strange_value-h-LC2" class="blob-code blob-code-inner js-file-line"><span class="pl-c"><span class="pl-c">//</span>                                          ^^^^^^^^^^^^</span></td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L3" class="blob-num js-line-number" data-line-number="3"></td>
        <td id="file-crc_strange_value-h-LC3" class="blob-code blob-code-inner js-file-line"><span class="pl-c"><span class="pl-c">//</span> No use of poly</span></td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L4" class="blob-num js-line-number" data-line-number="4"></td>
        <td id="file-crc_strange_value-h-LC4" class="blob-code blob-code-inner js-file-line"><span class="pl-c"><span class="pl-c">//</span> &#8230;</span></td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L5" class="blob-num js-line-number" data-line-number="5"></td>
        <td id="file-crc_strange_value-h-LC5" class="blob-code blob-code-inner js-file-line"><span class="pl-c"><span class="pl-c">//</span> x0, x1, x2, x3 are 128 bit registers</span></td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L6" class="blob-num js-line-number" data-line-number="6"></td>
        <td id="file-crc_strange_value-h-LC6" class="blob-code blob-code-inner js-file-line">x3 = _mm_setr_epi32(~<span class="pl-c1">0</span>, <span class="pl-c1">0</span>, ~<span class="pl-c1">0</span>, <span class="pl-c1">0</span>); <span class="pl-c"><span class="pl-c">//</span> Set reverse, i.e. bits from 0 to 31 are 1s, from 32 to 63 are zeros, etc</span></td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L7" class="blob-num js-line-number" data-line-number="7"></td>
        <td id="file-crc_strange_value-h-LC7" class="blob-code blob-code-inner js-file-line"><span class="pl-c"><span class="pl-c">//</span> No reassignment of x3</span></td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L8" class="blob-num js-line-number" data-line-number="8"></td>
        <td id="file-crc_strange_value-h-LC8" class="blob-code blob-code-inner js-file-line"><span class="pl-c"><span class="pl-c">//</span> &#8230;</span></td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L9" class="blob-num js-line-number" data-line-number="9"></td>
        <td id="file-crc_strange_value-h-LC9" class="blob-code blob-code-inner js-file-line"><span class="pl-c"><span class="pl-c">/*</span></span></td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L10" class="blob-num js-line-number" data-line-number="10"></td>
        <td id="file-crc_strange_value-h-LC10" class="blob-code blob-code-inner js-file-line"><span class="pl-c"> * Barret reduce to 32-bits.</span></td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L11" class="blob-num js-line-number" data-line-number="11"></td>
        <td id="file-crc_strange_value-h-LC11" class="blob-code blob-code-inner js-file-line"><span class="pl-c"> <span class="pl-c">*/</span></span></td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L12" class="blob-num js-line-number" data-line-number="12"></td>
        <td id="file-crc_strange_value-h-LC12" class="blob-code blob-code-inner js-file-line">x0 = _mm_load_si128((__m128i*)poly); <span class="pl-c"><span class="pl-c">//</span> Load 16 bytes</span></td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L13" class="blob-num js-line-number" data-line-number="13"></td>
        <td id="file-crc_strange_value-h-LC13" class="blob-code blob-code-inner js-file-line">
</td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L14" class="blob-num js-line-number" data-line-number="14"></td>
        <td id="file-crc_strange_value-h-LC14" class="blob-code blob-code-inner js-file-line">x2 = _mm_and_si128(x1, x3);              <span class="pl-c"><span class="pl-c">//</span> Do logical AND with x3</span></td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L15" class="blob-num js-line-number" data-line-number="15"></td>
        <td id="file-crc_strange_value-h-LC15" class="blob-code blob-code-inner js-file-line">x2 = _mm_clmulepi64_si128(x2, x0, <span class="pl-c1">0x10</span>); <span class="pl-c"><span class="pl-c">//</span> Do CLMUL with the high 8 bytes of poly</span></td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L16" class="blob-num js-line-number" data-line-number="16"></td>
        <td id="file-crc_strange_value-h-LC16" class="blob-code blob-code-inner js-file-line">x2 = _mm_and_si128(x2, x3);              <span class="pl-c"><span class="pl-c">//</span> Do logical AND with x3</span></td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L17" class="blob-num js-line-number" data-line-number="17"></td>
        <td id="file-crc_strange_value-h-LC17" class="blob-code blob-code-inner js-file-line">x2 = _mm_clmulepi64_si128(x2, x0, <span class="pl-c1">0x00</span>); <span class="pl-c"><span class="pl-c">//</span> Do CLMUL with the low 8 bytes of poly</span></td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L18" class="blob-num js-line-number" data-line-number="18"></td>
        <td id="file-crc_strange_value-h-LC18" class="blob-code blob-code-inner js-file-line">x1 = _mm_xor_si128(x1, x2);              <span class="pl-c"><span class="pl-c">//</span> XOR x1 and x2</span></td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L19" class="blob-num js-line-number" data-line-number="19"></td>
        <td id="file-crc_strange_value-h-LC19" class="blob-code blob-code-inner js-file-line">
</td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L20" class="blob-num js-line-number" data-line-number="20"></td>
        <td id="file-crc_strange_value-h-LC20" class="blob-code blob-code-inner js-file-line"><span class="pl-c"><span class="pl-c">/*</span></span></td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L21" class="blob-num js-line-number" data-line-number="21"></td>
        <td id="file-crc_strange_value-h-LC21" class="blob-code blob-code-inner js-file-line"><span class="pl-c"> * Return the crc32.</span></td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L22" class="blob-num js-line-number" data-line-number="22"></td>
        <td id="file-crc_strange_value-h-LC22" class="blob-code blob-code-inner js-file-line"><span class="pl-c"> <span class="pl-c">*/</span></span></td>
      </tr>
      <tr>
        <td id="file-crc_strange_value-h-L23" class="blob-num js-line-number" data-line-number="23"></td>
        <td id="file-crc_strange_value-h-LC23" class="blob-code blob-code-inner js-file-line"><span class="pl-k">return</span> _mm_extract_epi32(x1, <span class="pl-c1">1</span>); <span class="pl-c"><span class="pl-c">//</span> Return bytes from 4 to 8 of x1, i.e. extract the second 32 bit integer </span></td>
      </tr>
</table>


  </div>

  </div>
</div>

      </div>
      <div class="gist-meta">
        <a href="https://gist.github.com/danlark1/0e1a2aaed0920a49071b5c7bec7d1cce/raw/e65993ca589df79e50509010b1fa18cddfa7dc6e/crc_strange_value.h" style="float:right">view raw</a>
        <a href="https://gist.github.com/danlark1/0e1a2aaed0920a49071b5c7bec7d1cce#file-crc_strange_value-h">crc_strange_value.h</a>
        hosted with &#10084; by <a href="https://github.com">GitHub</a>
      </div>
    </div>
</div>

</div></figure>



<p>Speaking about CLMUL instruction, it has the following signature <code>__m128i&nbsp;_mm_clmulepi64_si128&nbsp;(__m128i&nbsp;a,&nbsp;__m128i&nbsp;b,&nbsp;const int&nbsp;imm8)</code>, i.e. takes two 16 byte registers and mask and returns 16 byte register. The algorithm is the following</p>



<div class="wp-block-image"><figure class="aligncenter size-large"><img data-attachment-id="707" data-permalink="https://danlark.org/2021-03-08-115915_522x458_scrot/" data-orig-file="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-115915_522x458_scrot.png" data-orig-size="522,458" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="2021-03-08-115915_522x458_scrot" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-115915_522x458_scrot.png?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-115915_522x458_scrot.png?w=522" src="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-115915_522x458_scrot.png?w=522" alt="" class="wp-image-707" srcset="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-115915_522x458_scrot.png 522w, https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-115915_522x458_scrot.png?w=150 150w, https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-115915_522x458_scrot.png?w=300 300w" sizes="(max-width: 522px) 100vw, 522px" /></figure></div>



<p>While executing the 15th line in gist <code>x2 = _mm_clmulepi64_si128(x2, x0, 0x10);</code> the mask imm8 takes the high 8 bytes in TEMP2 and thus the result isn&#8217;t changed, no need to worry here.</p>



<p>The most interesting part is the second _mm_clmulepi64_si128 with the third argument 0x00 which takes first 8 bytes from the operation in TEMP2. Actually the resulting values would be different but all we need to prove is that the bytes from 4 to 8 are the same because return happens with _mm_extract_epi32 which returns exactly uint32_t of that bytes (to be clear, the xor from x1 and x2 but if we prove bytes from 4 to 8 are the same for x2, it would be sufficient).</p>



<p>The bytes from 4 to 8 are only used in one loop in the operation:</p>



<div class="wp-block-image"><figure class="aligncenter size-large"><img data-attachment-id="711" data-permalink="https://danlark.org/2021-03-08-120407_506x120_scrot/" data-orig-file="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-120407_506x120_scrot.png" data-orig-size="506,120" data-comments-opened="1" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="2021-03-08-120407_506x120_scrot" data-image-description="" data-image-caption="" data-medium-file="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-120407_506x120_scrot.png?w=300" data-large-file="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-120407_506x120_scrot.png?w=506" src="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-120407_506x120_scrot.png?w=506" alt="" class="wp-image-711" srcset="https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-120407_506x120_scrot.png 506w, https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-120407_506x120_scrot.png?w=150 150w, https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-120407_506x120_scrot.png?w=300 300w" sizes="(max-width: 506px) 100vw, 506px" /></figure></div>



<p>TEMP2 is now our &#8220;magic&#8221; k6 value and TEMP1 is just some input. Note that when changing from 0x1DB710640 to 0x1DB710641 we only swap bit TEMP2[0]. Given it makes AND with all bits when <code>i</code> equals to <code>j</code>, the result would not change if and only if TEMP1[<code>j</code>] is zero for all j from 32 to 63.</p>



<p>And this turns out to be true because before the second CLMUL happens the following: <code>x2 = _mm_and_si128(x2, x3);</code>. And as you can see, x3 has bits zero from 32 to 63. And the returning result isn&#8217;t changed. What a coincidence! Given the conditions, if the last byte is changed to 0x42, only the highest bit can differ at the very most as it changes TEMP2[1].</p>



<p>For now I don&#8217;t know for 100% if it was made on purpose, to me looks like a human issue where the value was copy pasted and accidentally it worked. I wish during the interviews I also may miss any +-1 because of such bit magic 🙂</p>



<h2>Bonus: speaking about CRC</h2>



<p>This is not for the first time I face some weird bit magic issues with CRC, for example, look at the following code:</p>



<figure class="wp-block-embed is-type-rich is-provider-embed wp-block-embed-embed"><div class="wp-block-embed__wrapper">
<style>.gist table { margin-bottom: 0; }</style><div style="tab-size: 8" id="gist108328634" class="gist">
    <div class="gist-file" translate="no">
      <div class="gist-data">
        <div class="js-gist-file-update-container js-task-list-container file-box">
  <div id="file-crc_fnv_fiasco-c" class="file my-2">
    
  <div itemprop="text" class="Box-body p-0 blob-wrapper data type-c  ">

      
<table class="highlight tab-size js-file-line-container" data-tab-size="8" data-paste-markdown-skip>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L1" class="blob-num js-line-number" data-line-number="1"></td>
        <td id="file-crc_fnv_fiasco-c-LC1" class="blob-code blob-code-inner js-file-line"><span class="pl-c"><span class="pl-c">//</span> For a fully compiled version look at <a href="https://gist.github.com/danlark1/69c9d5e1d356eaa32dbdfccb2e19f401" rel="nofollow">https://gist.github.com/danlark1/69c9d5e1d356eaa32dbdfccb2e19f401</a></span></td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L2" class="blob-num js-line-number" data-line-number="2"></td>
        <td id="file-crc_fnv_fiasco-c-LC2" class="blob-code blob-code-inner js-file-line">#<span class="pl-k">include</span> <span class="pl-s"><span class="pl-pds">&lt;</span>stdint.h<span class="pl-pds">&gt;</span></span></td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L3" class="blob-num js-line-number" data-line-number="3"></td>
        <td id="file-crc_fnv_fiasco-c-LC3" class="blob-code blob-code-inner js-file-line">#<span class="pl-k">include</span> <span class="pl-s"><span class="pl-pds">&lt;</span>stddef.h<span class="pl-pds">&gt;</span></span></td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L4" class="blob-num js-line-number" data-line-number="4"></td>
        <td id="file-crc_fnv_fiasco-c-LC4" class="blob-code blob-code-inner js-file-line">#<span class="pl-k">include</span> <span class="pl-s"><span class="pl-pds">&lt;</span>stdio.h<span class="pl-pds">&gt;</span></span></td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L5" class="blob-num js-line-number" data-line-number="5"></td>
        <td id="file-crc_fnv_fiasco-c-LC5" class="blob-code blob-code-inner js-file-line">
</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L6" class="blob-num js-line-number" data-line-number="6"></td>
        <td id="file-crc_fnv_fiasco-c-LC6" class="blob-code blob-code-inner js-file-line">#<span class="pl-k">define</span> <span class="pl-en">FNV32INIT</span> <span class="pl-c1">2166136261U</span></td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L7" class="blob-num js-line-number" data-line-number="7"></td>
        <td id="file-crc_fnv_fiasco-c-LC7" class="blob-code blob-code-inner js-file-line">#<span class="pl-k">define</span> <span class="pl-en">FNV32PRIME</span> <span class="pl-c1">16777619U</span></td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L8" class="blob-num js-line-number" data-line-number="8"></td>
        <td id="file-crc_fnv_fiasco-c-LC8" class="blob-code blob-code-inner js-file-line">#<span class="pl-k">define</span> <span class="pl-en">FNV64INIT</span> <span class="pl-c1">14695981039346656037ULL</span></td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L9" class="blob-num js-line-number" data-line-number="9"></td>
        <td id="file-crc_fnv_fiasco-c-LC9" class="blob-code blob-code-inner js-file-line">#<span class="pl-k">define</span> <span class="pl-en">FNV64PRIME</span> <span class="pl-c1">1099511628211ULL</span></td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L10" class="blob-num js-line-number" data-line-number="10"></td>
        <td id="file-crc_fnv_fiasco-c-LC10" class="blob-code blob-code-inner js-file-line">
</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L11" class="blob-num js-line-number" data-line-number="11"></td>
        <td id="file-crc_fnv_fiasco-c-LC11" class="blob-code blob-code-inner js-file-line"><span class="pl-c"><span class="pl-c">//</span> Any table</span></td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L12" class="blob-num js-line-number" data-line-number="12"></td>
        <td id="file-crc_fnv_fiasco-c-LC12" class="blob-code blob-code-inner js-file-line"><span class="pl-k">extern</span> <span class="pl-k">unsigned</span> <span class="pl-k">long</span> crc64table[<span class="pl-c1">256</span>];</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L13" class="blob-num js-line-number" data-line-number="13"></td>
        <td id="file-crc_fnv_fiasco-c-LC13" class="blob-code blob-code-inner js-file-line">
</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L14" class="blob-num js-line-number" data-line-number="14"></td>
        <td id="file-crc_fnv_fiasco-c-LC14" class="blob-code blob-code-inner js-file-line"><span class="pl-k">unsigned</span> <span class="pl-k">long</span> <span class="pl-en">fnv64</span>(<span class="pl-k">const</span> <span class="pl-k">void</span> *p, <span class="pl-c1">size_t</span> len) {</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L15" class="blob-num js-line-number" data-line-number="15"></td>
        <td id="file-crc_fnv_fiasco-c-LC15" class="blob-code blob-code-inner js-file-line">  <span class="pl-k">unsigned</span> <span class="pl-k">long</span> init = FNV64INIT;</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L16" class="blob-num js-line-number" data-line-number="16"></td>
        <td id="file-crc_fnv_fiasco-c-LC16" class="blob-code blob-code-inner js-file-line">  <span class="pl-k">const</span> <span class="pl-k">unsigned</span> <span class="pl-k">char</span> *_p = (<span class="pl-k">const</span> <span class="pl-k">unsigned</span> <span class="pl-k">char</span> *)p;</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L17" class="blob-num js-line-number" data-line-number="17"></td>
        <td id="file-crc_fnv_fiasco-c-LC17" class="blob-code blob-code-inner js-file-line">  <span class="pl-k">for</span> (<span class="pl-c1">size_t</span> i = <span class="pl-c1">0</span>; i &lt; len; ++i) {</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L18" class="blob-num js-line-number" data-line-number="18"></td>
        <td id="file-crc_fnv_fiasco-c-LC18" class="blob-code blob-code-inner js-file-line">    init = (init * FNV64PRIME) ^ _p[i];</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L19" class="blob-num js-line-number" data-line-number="19"></td>
        <td id="file-crc_fnv_fiasco-c-LC19" class="blob-code blob-code-inner js-file-line">  }</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L20" class="blob-num js-line-number" data-line-number="20"></td>
        <td id="file-crc_fnv_fiasco-c-LC20" class="blob-code blob-code-inner js-file-line">  <span class="pl-k">return</span> init;</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L21" class="blob-num js-line-number" data-line-number="21"></td>
        <td id="file-crc_fnv_fiasco-c-LC21" class="blob-code blob-code-inner js-file-line">}</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L22" class="blob-num js-line-number" data-line-number="22"></td>
        <td id="file-crc_fnv_fiasco-c-LC22" class="blob-code blob-code-inner js-file-line">
</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L23" class="blob-num js-line-number" data-line-number="23"></td>
        <td id="file-crc_fnv_fiasco-c-LC23" class="blob-code blob-code-inner js-file-line"><span class="pl-k">unsigned</span> <span class="pl-k">long</span> <span class="pl-en">crc64</span>(<span class="pl-k">unsigned</span> <span class="pl-k">long</span> crc, <span class="pl-k">const</span> <span class="pl-k">void</span> *p, <span class="pl-c1">size_t</span> len) {</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L24" class="blob-num js-line-number" data-line-number="24"></td>
        <td id="file-crc_fnv_fiasco-c-LC24" class="blob-code blob-code-inner js-file-line">  <span class="pl-c1">size_t</span> i, t;</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L25" class="blob-num js-line-number" data-line-number="25"></td>
        <td id="file-crc_fnv_fiasco-c-LC25" class="blob-code blob-code-inner js-file-line">  <span class="pl-k">const</span> <span class="pl-k">unsigned</span> <span class="pl-k">char</span> *_p = (<span class="pl-k">const</span> <span class="pl-k">unsigned</span> <span class="pl-k">char</span> *)p;</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L26" class="blob-num js-line-number" data-line-number="26"></td>
        <td id="file-crc_fnv_fiasco-c-LC26" class="blob-code blob-code-inner js-file-line">  <span class="pl-k">for</span> (i = <span class="pl-c1">0</span>; i &lt; len; i++) {</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L27" class="blob-num js-line-number" data-line-number="27"></td>
        <td id="file-crc_fnv_fiasco-c-LC27" class="blob-code blob-code-inner js-file-line">    t = ((crc &gt;&gt; <span class="pl-c1">56</span>) ^ (*_p++)) &amp; <span class="pl-c1">0xFF</span>;</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L28" class="blob-num js-line-number" data-line-number="28"></td>
        <td id="file-crc_fnv_fiasco-c-LC28" class="blob-code blob-code-inner js-file-line">    crc = crc64table[t] ^ (crc &lt;&lt; <span class="pl-c1">8</span>);</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L29" class="blob-num js-line-number" data-line-number="29"></td>
        <td id="file-crc_fnv_fiasco-c-LC29" class="blob-code blob-code-inner js-file-line">  }</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L30" class="blob-num js-line-number" data-line-number="30"></td>
        <td id="file-crc_fnv_fiasco-c-LC30" class="blob-code blob-code-inner js-file-line">  <span class="pl-k">return</span> crc;</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L31" class="blob-num js-line-number" data-line-number="31"></td>
        <td id="file-crc_fnv_fiasco-c-LC31" class="blob-code blob-code-inner js-file-line">}</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L32" class="blob-num js-line-number" data-line-number="32"></td>
        <td id="file-crc_fnv_fiasco-c-LC32" class="blob-code blob-code-inner js-file-line">
</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L33" class="blob-num js-line-number" data-line-number="33"></td>
        <td id="file-crc_fnv_fiasco-c-LC33" class="blob-code blob-code-inner js-file-line"><span class="pl-k">int</span> <span class="pl-en">main</span>() {</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L34" class="blob-num js-line-number" data-line-number="34"></td>
        <td id="file-crc_fnv_fiasco-c-LC34" class="blob-code blob-code-inner js-file-line">  <span class="pl-k">const</span> <span class="pl-k">char</span>* data1 = <span class="pl-s"><span class="pl-pds">&quot;</span>l.im/8ca<span class="pl-pds">&quot;</span></span>;</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L35" class="blob-num js-line-number" data-line-number="35"></td>
        <td id="file-crc_fnv_fiasco-c-LC35" class="blob-code blob-code-inner js-file-line">  <span class="pl-k">const</span> <span class="pl-k">char</span>* data2 = <span class="pl-s"><span class="pl-pds">&quot;</span>l.im/8cf<span class="pl-pds">&quot;</span></span>;</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L36" class="blob-num js-line-number" data-line-number="36"></td>
        <td id="file-crc_fnv_fiasco-c-LC36" class="blob-code blob-code-inner js-file-line">  <span class="pl-c1">size_t</span> size = <span class="pl-c1">8</span>;</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L37" class="blob-num js-line-number" data-line-number="37"></td>
        <td id="file-crc_fnv_fiasco-c-LC37" class="blob-code blob-code-inner js-file-line">  <span class="pl-k">unsigned</span> <span class="pl-k">long</span> crc = <span class="pl-c1">crc64</span>(<span class="pl-c1">fnv64</span>(data1, size), data1, size);</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L38" class="blob-num js-line-number" data-line-number="38"></td>
        <td id="file-crc_fnv_fiasco-c-LC38" class="blob-code blob-code-inner js-file-line">  <span class="pl-c1">printf</span>(<span class="pl-s"><span class="pl-pds">&quot;</span><span class="pl-c1">%lu</span><span class="pl-cce">\n</span><span class="pl-pds">&quot;</span></span>, crc); <span class="pl-c"><span class="pl-c">//</span> 6193082687915471267</span></td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L39" class="blob-num js-line-number" data-line-number="39"></td>
        <td id="file-crc_fnv_fiasco-c-LC39" class="blob-code blob-code-inner js-file-line">  crc = <span class="pl-c1">crc64</span>(<span class="pl-c1">fnv64</span>(data2, size), data2, size);</td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L40" class="blob-num js-line-number" data-line-number="40"></td>
        <td id="file-crc_fnv_fiasco-c-LC40" class="blob-code blob-code-inner js-file-line">  <span class="pl-c1">printf</span>(<span class="pl-s"><span class="pl-pds">&quot;</span><span class="pl-c1">%lu</span><span class="pl-cce">\n</span><span class="pl-pds">&quot;</span></span>, crc); <span class="pl-c"><span class="pl-c">//</span> 6193082687915471267</span></td>
      </tr>
      <tr>
        <td id="file-crc_fnv_fiasco-c-L41" class="blob-num js-line-number" data-line-number="41"></td>
        <td id="file-crc_fnv_fiasco-c-LC41" class="blob-code blob-code-inner js-file-line">}</td>
      </tr>
</table>


  </div>

  </div>
</div>

      </div>
      <div class="gist-meta">
        <a href="https://gist.github.com/danlark1/33e7206e0d708cb6e5c5fdf5d6afd19a/raw/73118b232c30353407e8b4de78fc7be113c9f464/crc_fnv_fiasco.c" style="float:right">view raw</a>
        <a href="https://gist.github.com/danlark1/33e7206e0d708cb6e5c5fdf5d6afd19a#file-crc_fnv_fiasco-c">crc_fnv_fiasco.c</a>
        hosted with &#10084; by <a href="https://github.com">GitHub</a>
      </div>
    </div>
</div>

</div></figure>



<p>It seeds CRC64 hash with <a href="https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function">FNV</a> hash and the results are the same. Though the last byte of the 8 word strings are different. I faced it once in URL hashing and it was failing for urls with very short domain and path.<em> The proof&nbsp;is left as an exercise to the&nbsp;reader</em>, try it out, really, it is some good bit twiddling. More hashing is sometimes bad hashing, be careful 🙂</p>



<p></p>



<p>Thanks to <a href="https://nigeltao.github.io/">Nigel Tao</a> who first suspected an issue with the Linux kernel and described it in the <a href="https://github.com/google/wuffs/blob/9e5817e45d6ef222e1c7919ae3b6565a7146bdc2/std/crc32/common_up_x86_sse42.wuffs#L138-L195">wuffs</a> repository. I think no one should do anything as it perfectly works or at the very most fix the constants to better match the guideline.</p>
]]></html><thumbnail_url><![CDATA[https://danlarkorg.files.wordpress.com/2021/03/2021-03-08-110917_951x651_scrot.png?fit=440%2C330]]></thumbnail_url><thumbnail_width><![CDATA[440]]></thumbnail_width><thumbnail_height><![CDATA[301]]></thumbnail_height></oembed>