update dht_sec document

2014-01-11 07:50:01 +00:00 · 2014-01-11 07:50:01 +00:00 · 1179c137d6
parent d86c8dcc4d
commit 1179c137d6
2 changed files with 61 additions and 50 deletions
--- a/docs/dht_sec.html
+++ b/docs/dht_sec.html
@@ -55,14 +55,14 @@
 <div class="contents topic" id="table-of-contents">
 <p class="topic-title first">Table of contents</p>
 <ul class="simple">
-<li><a class="reference internal" href="#id1" id="id3">BitTorrent DHT security extension</a></li>
-<li><a class="reference internal" href="#considerations" id="id4">considerations</a></li>
-<li><a class="reference internal" href="#node-id-restriction" id="id5">Node ID restriction</a></li>
-<li><a class="reference internal" href="#bootstrapping" id="id6">bootstrapping</a></li>
-<li><a class="reference internal" href="#rationale" id="id7">rationale</a></li>
-<li><a class="reference internal" href="#enforcement" id="id8">enforcement</a></li>
-<li><a class="reference internal" href="#backwards-compatibility-and-transition" id="id9">backwards compatibility and transition</a></li>
-<li><a class="reference internal" href="#forward-compatibility" id="id10">forward compatibility</a></li>
+<li><a class="reference internal" href="#id1" id="id2">BitTorrent DHT security extension</a></li>
+<li><a class="reference internal" href="#considerations" id="id3">considerations</a></li>
+<li><a class="reference internal" href="#node-id-restriction" id="id4">Node ID restriction</a></li>
+<li><a class="reference internal" href="#bootstrapping" id="id5">bootstrapping</a></li>
+<li><a class="reference internal" href="#rationale" id="id6">rationale</a></li>
+<li><a class="reference internal" href="#enforcement" id="id7">enforcement</a></li>
+<li><a class="reference internal" href="#backwards-compatibility-and-transition" id="id8">backwards compatibility and transition</a></li>
+<li><a class="reference internal" href="#forward-compatibility" id="id9">forward compatibility</a></li>
 </ul>
 </div>
 <div class="section" id="id1">
@@ -109,21 +109,21 @@ forced to run their DHT nodes on the same node ID.</p>
 of IPs, as well as allowing more than one node ID per external IP, the node
 ID can be restricted at each class level of the IP.</p>
 <p>Another important property of the restriction put on node IDs is that the
-distribution of the IDs remoain uniform. This is why CRC32 was chosen
-as the hash function. See <a class="reference external" href="http://blog.libtorrent.org/2012/12/dht-security/">comparisons of hash functions</a>.</p>
+distribution of the IDs remains uniform. This is why CRC32C (Castagnoli) was
+chosen as the hash function.</p>
 <p>The expression to calculate a valid ID prefix (from an IPv4 address) is:</p>
 <pre class="literal-block">
-crc32((ip &amp; 0x030f3fff) .. r)
+crc32c((ip &amp; 0x030f3fff) | (r &lt;&lt; 29))
 </pre>
 <p>And for an IPv6 address (<tt class="docutils literal">ip</tt> is the high 64 bits of the address):</p>
 <pre class="literal-block">
-crc32((ip &amp; 0x0103070f1f3f7fff) ..  r)
+crc32c((ip &amp; 0x0103070f1f3f7fff) | (r &lt;&lt; 61))
 </pre>
 <p><tt class="docutils literal">r</tt> is a random number in the range [0, 7]. The resulting integer,
 representing the masked IP address is supposed to be big-endian before
-hashed. The &quot;..&quot; means concatenation.</p>
+hashed. The &quot;|&quot; operator means bit-wise OR.</p>
 <p>The details of implementing this is to evaluate the expression, store the
-result in a big endian 64 bit integer and hash those 8 bytes with CRC32.</p>
+result in a big endian 64 bit integer and hash those 8 bytes with CRC32C.</p>
 <p>The first (most significant) 21 bits of the node ID used in the DHT MUST
 match the first 21 bits of the resulting hash. The last byte of the hash MUST
 match the random number (<tt class="docutils literal">r</tt>) used to generate the hash.</p>
@@ -144,10 +144,10 @@ for (int i = 0; i &lt; num_octets; ++i)

 uint32_t rand = std::rand() &amp; 0xff;
 uint8_t r = rand &amp; 0x7;
+ip[0] |= r &lt;&lt; 5;

-uint32_t crc = crc32(0, nullptr, 0);
-crc = crc32(crc, ip, num_octets);
-crc = crc32(crc, &amp;r, 1);
+uint32_t crc = 0;
+crc = crc32c(crc, ip, num_octets);

 // only take the top 21 bits from crc
 node_id[0] = (crc &gt;&gt; 24) &amp; 0xff;
@@ -160,15 +160,15 @@ node_id[19] = rand;
 <pre class="literal-block">
 IP           rand  example node ID
 ============ ===== ==========================================
-124.31.75.21   1   <strong>d2a6df</strong> f10c5d6a4ec8a88e4c6ab4c28b95eee4 <strong>01</strong>
-21.75.31.124  86   <strong>51d029</strong> c14e7a08645677bbd1cfe7d8f956d532 <strong>56</strong>
-65.23.51.170  22   <strong>fd334a</strong> 20bc8f112a3d426c84764f8c2a1150e6 <strong>16</strong>
-84.124.73.14  65   <strong>6aa169</strong> dd1bb1fe518101ceef99462b947a01ff <strong>41</strong>
-43.213.53.83  90   <strong>eb6434</strong> bf5b7c4be0237986d5243b87aa6d5130 <strong>5a</strong>
+124.31.75.21   1   <strong>5fbfbf</strong> f10c5d6a4ec8a88e4c6ab4c28b95eee4 <strong>01</strong>
+21.75.31.124  86   <strong>5a3ce9</strong> c14e7a08645677bbd1cfe7d8f956d532 <strong>56</strong>
+65.23.51.170  22   <strong>a5d432</strong> 20bc8f112a3d426c84764f8c2a1150e6 <strong>16</strong>
+84.124.73.14  65   <strong>1b0321</strong> dd1bb1fe518101ceef99462b947a01ff <strong>41</strong>
+43.213.53.83  90   <strong>e56f6c</strong> bf5b7c4be0237986d5243b87aa6d5130 <strong>5a</strong>
 </pre>
 <p>The bold parts of the node ID are the important parts. The rest are
 random numbers. The last bold number of each row has only its most significant
-bit pulled from the CRC function. The lower 3 bits are random.</p>
+bit pulled from the CRC32C function. The lower 3 bits are random.</p>
 </div>
 <div class="section" id="bootstrapping">
 <h1>bootstrapping</h1>
@@ -191,17 +191,19 @@ nodes, from separate searches, tells you your node ID is incorrect.</p>
 </div>
 <div class="section" id="rationale">
 <h1>rationale</h1>
-<p>The choice of using CRC32 instead of a more traditional cryptographic hash
+<p>The choice of using CRC32C instead of a more traditional cryptographic hash
 function is justified primarily of these reasons:</p>
 <ol class="arabic simple">
 <li>it is a fast function</li>
 <li>produces well distributed results</li>
 <li>there is no need for the hash function to be one-way (the input set is
 so small that any hash function could be reversed).</li>
+<li>CRC32C (Castagnoli) is supported in hardware by SSE 4.2, which can
+significantly speed up computation</li>
 </ol>
-<p>There are primarily two tests run on SHA-1 and CRC32 to establish the
+<p>There are primarily two tests run on SHA-1 and CRC32C to establish the
 distribution of results. The first one is the number of bits in the output
-set that contain every possible combination of bits. The CRC function
+set that contain every possible combination of bits. The CRC32C function
 has a longer such prefix in its output than SHA-1. This means nodes will still
 have well uniformly distributed IDs, even when IP addresses in use are not
 uniformly distributed.</p>
@@ -213,7 +215,7 @@ reserved for local networks, multicast and other things. It also takes into
 account that some /8 blocks are not in use by end-users and exremely unlikely
 to ever run a DHT node. This makes the results likely to be very similar to
 what we would see in the wild.</p>
-<p>These results indicate that CRC32 provides the best uniformity in the results
+<p>These results indicate that CRC32C provides the best uniformity in the results
 in terms of bit prefixes where all possibilities are represented, and that
 no more than 21 bits should be used from the result. If more than 21 bits
 were to be used, there would be certain node IDs that would be impossible to
@@ -223,9 +225,13 @@ The target space (32 bit interger) is divided up into 1000 buckets. Every valid
 IP and <tt class="docutils literal">r</tt> input is run through the algorithm and the result is put in the
 bucket it falls in. The expectation is that each bucket has roughly an equal
 number of results falling into it. The following graph shows the resulting
-histogram, comparing SHA-1 and CRC32.</p>
+histogram, comparing SHA-1 and CRC32C.</p>
 <img alt="hash_distribution.png" src="hash_distribution.png" />
 <p>The source code for these tests can be found <a class="reference external" href="https://github.com/arvidn/hash_complete_prefix">here</a>.</p>
+<p>The reason to use CRC32C instead of the CRC32 implemented by zlib is that
+Intel CPUs have hardware support for the CRC32C calculations. The input
+being exactly 4 bytes is also deliberate, to make it fit in a single
+instruction.</p>
 </div>
 <div class="section" id="enforcement">
 <h1>enforcement</h1>
--- a/docs/dht_sec.rst
+++ b/docs/dht_sec.rst
@@ -64,25 +64,23 @@ of IPs, as well as allowing more than one node ID per external IP, the node
 ID can be restricted at each class level of the IP.

 Another important property of the restriction put on node IDs is that the
-distribution of the IDs remoain uniform. This is why CRC32 was chosen
-as the hash function. See `comparisons of hash functions`__.
-
-__ http://blog.libtorrent.org/2012/12/dht-security/
+distribution of the IDs remains uniform. This is why CRC32C (Castagnoli) was
+chosen as the hash function.

 The expression to calculate a valid ID prefix (from an IPv4 address) is::

-	crc32((ip & 0x030f3fff) .. r)
+	crc32c((ip & 0x030f3fff) | (r << 29))

 And for an IPv6 address (``ip`` is the high 64 bits of the address)::

-	crc32((ip & 0x0103070f1f3f7fff) ..  r)
+	crc32c((ip & 0x0103070f1f3f7fff) | (r << 61))

 ``r`` is a random number in the range [0, 7]. The resulting integer,
 representing the masked IP address is supposed to be big-endian before
-hashed. The ".." means concatenation.
+hashed. The "|" operator means bit-wise OR.

 The details of implementing this is to evaluate the expression, store the
-result in a big endian 64 bit integer and hash those 8 bytes with CRC32.
+result in a big endian 64 bit integer and hash those 8 bytes with CRC32C.

 The first (most significant) 21 bits of the node ID used in the DHT MUST
 match the first 21 bits of the resulting hash. The last byte of the hash MUST
@@ -106,10 +104,10 @@ Example code code for calculating a valid node ID::

 	uint32_t rand = std::rand() & 0xff;
 	uint8_t r = rand & 0x7;
+	ip[0] |= r << 5;

-	uint32_t crc = crc32(0, nullptr, 0);
-	crc = crc32(crc, ip, num_octets);
-	crc = crc32(crc, &r, 1);
+	uint32_t crc = 0;
+	crc = crc32c(crc, ip, num_octets);

 	// only take the top 21 bits from crc
 	node_id[0] = (crc >> 24) & 0xff;
@@ -124,15 +122,15 @@ test vectors:

 	IP           rand  example node ID
 	============ ===== ==========================================
-	124.31.75.21   1   **d2a6df** f10c5d6a4ec8a88e4c6ab4c28b95eee4 **01**
-	21.75.31.124  86   **51d029** c14e7a08645677bbd1cfe7d8f956d532 **56**
-	65.23.51.170  22   **fd334a** 20bc8f112a3d426c84764f8c2a1150e6 **16**
-	84.124.73.14  65   **6aa169** dd1bb1fe518101ceef99462b947a01ff **41**
-	43.213.53.83  90   **eb6434** bf5b7c4be0237986d5243b87aa6d5130 **5a**
+	124.31.75.21   1   **5fbfbf** f10c5d6a4ec8a88e4c6ab4c28b95eee4 **01**
+	21.75.31.124  86   **5a3ce9** c14e7a08645677bbd1cfe7d8f956d532 **56**
+	65.23.51.170  22   **a5d432** 20bc8f112a3d426c84764f8c2a1150e6 **16**
+	84.124.73.14  65   **1b0321** dd1bb1fe518101ceef99462b947a01ff **41**
+	43.213.53.83  90   **e56f6c** bf5b7c4be0237986d5243b87aa6d5130 **5a**

 The bold parts of the node ID are the important parts. The rest are
 random numbers. The last bold number of each row has only its most significant
-bit pulled from the CRC function. The lower 3 bits are random.
+bit pulled from the CRC32C function. The lower 3 bits are random.

 bootstrapping
 -------------
@@ -160,17 +158,19 @@ nodes, from separate searches, tells you your node ID is incorrect.
 rationale
 ---------

-The choice of using CRC32 instead of a more traditional cryptographic hash
+The choice of using CRC32C instead of a more traditional cryptographic hash
 function is justified primarily of these reasons:

 1. it is a fast function
 2. produces well distributed results
 3. there is no need for the hash function to be one-way (the input set is
   so small that any hash function could be reversed).
+4. CRC32C (Castagnoli) is supported in hardware by SSE 4.2, which can
+   significantly speed up computation

-There are primarily two tests run on SHA-1 and CRC32 to establish the
+There are primarily two tests run on SHA-1 and CRC32C to establish the
 distribution of results. The first one is the number of bits in the output
-set that contain every possible combination of bits. The CRC function
+set that contain every possible combination of bits. The CRC32C function
 has a longer such prefix in its output than SHA-1. This means nodes will still
 have well uniformly distributed IDs, even when IP addresses in use are not
 uniformly distributed.
@@ -186,7 +186,7 @@ account that some /8 blocks are not in use by end-users and exremely unlikely
 to ever run a DHT node. This makes the results likely to be very similar to
 what we would see in the wild.

-These results indicate that CRC32 provides the best uniformity in the results
+These results indicate that CRC32C provides the best uniformity in the results
 in terms of bit prefixes where all possibilities are represented, and that
 no more than 21 bits should be used from the result. If more than 21 bits
 were to be used, there would be certain node IDs that would be impossible to
@@ -197,7 +197,7 @@ The target space (32 bit interger) is divided up into 1000 buckets. Every valid
 IP and ``r`` input is run through the algorithm and the result is put in the
 bucket it falls in. The expectation is that each bucket has roughly an equal
 number of results falling into it. The following graph shows the resulting
-histogram, comparing SHA-1 and CRC32.
+histogram, comparing SHA-1 and CRC32C.

 .. image:: hash_distribution.png

@@ -205,6 +205,11 @@ The source code for these tests can be found here_.

 .. _here: https://github.com/arvidn/hash_complete_prefix

+The reason to use CRC32C instead of the CRC32 implemented by zlib is that
+Intel CPUs have hardware support for the CRC32C calculations. The input
+being exactly 4 bytes is also deliberate, to make it fit in a single
+instruction.
+
 enforcement
 -----------
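
The IPv4 node ID calculation described in this change, together with the matching check, could be sketched in C++ roughly as follows. This is a minimal illustration, not the library's implementation: the helper names ``ipv4_prefix_hash`` and ``node_id_matches_ip`` are hypothetical, and the bit-at-a-time ``crc32c`` below is a portable stand-in for the SSE 4.2 instruction mentioned in the rationale. It assumes the standard CRC32C convention (initial value and final XOR of 0xFFFFFFFF).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Portable bit-at-a-time CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
// Stand-in for the hardware instruction: slower, but the same result.
std::uint32_t crc32c(const std::uint8_t* data, std::size_t len)
{
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ ((crc & 1u) ? 0x82F63B78u : 0u);
    }
    return crc ^ 0xFFFFFFFFu;
}

// Hypothetical helper: crc32c((ip & 0x030f3fff) | (r << 29)) for an IPv4
// address given as 4 bytes in network (big-endian) order.
std::uint32_t ipv4_prefix_hash(std::array<std::uint8_t, 4> ip, std::uint8_t r)
{
    static const std::uint8_t mask[4] = {0x03, 0x0f, 0x3f, 0xff};
    for (std::size_t i = 0; i < ip.size(); ++i)
        ip[i] &= mask[i];
    // r occupies the top 3 bits of the first byte (bits 29..31 of the word)
    ip[0] = static_cast<std::uint8_t>(ip[0] | ((r & 0x7) << 5));
    return crc32c(ip.data(), ip.size());
}

// Hypothetical helper: true if the first 21 bits of node_id match the hash
// of ip, with r taken from the low 3 bits of the last node ID byte.
bool node_id_matches_ip(const std::array<std::uint8_t, 20>& node_id,
                        const std::array<std::uint8_t, 4>& ip)
{
    const std::uint32_t crc = ipv4_prefix_hash(ip, node_id[19] & 0x7);
    return node_id[0] == ((crc >> 24) & 0xff)
        && node_id[1] == ((crc >> 16) & 0xff)
        && (node_id[2] & 0xf8) == ((crc >> 8) & 0xf8);
}
```

For the first test vector (124.31.75.21 with rand 1), the top 21 bits returned by ``ipv4_prefix_hash`` agree with the ``5fbfbf`` prefix listed in the table; the low 3 bits of the third byte are random and are not compared.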