initial support for DHT RSS feeds

This commit is contained in:
Arvid Norberg 2011-01-19 05:57:44 +00:00
parent 3095b2367e
commit ba0aed2282
15 changed files with 1253 additions and 429 deletions

View File

@ -1,3 +1,4 @@
* added support for DHT rss feeds (storing only)
* added support for RSS feeds
* fixed up some edge cases in DHT routing table and improved unit test of it
* added error category and error codes for HTTP errors

View File

@ -78,6 +78,8 @@ DOCS_PAGES = \
docs/projects.rst \
docs/python_binding.html \
docs/python_binding.rst \
docs/dht_rss.html \
docs/dht_rss.rst \
docs/running_tests.html \
docs/running_tests.rst \
docs/tuning.html \

View File

@ -183,6 +183,8 @@ void bind_session_settings()
.def_readwrite("service_port", &dht_settings::service_port)
#endif
.def_readwrite("max_fail_count", &dht_settings::max_fail_count)
.def_readwrite("max_torrents", &dht_settings::max_torrents)
.def_readwrite("max_feed_items", &dht_settings::max_feed_items)
.def_readwrite("restrict_routing_ips", &dht_settings::restrict_routing_ips)
.def_readwrite("restrict_search_ips", &dht_settings::restrict_search_ips)
;

View File

@ -57,16 +57,24 @@
<div class="contents topic" id="table-of-contents">
<p class="topic-title first">Table of contents</p>
<ul class="simple">
<li><a class="reference internal" href="#target-id" id="id1">target ID</a></li>
<li><a class="reference internal" href="#messages" id="id2">messages</a><ul>
<li><a class="reference internal" href="#requesting-items" id="id3">requesting items</a></li>
<li><a class="reference internal" href="#request-item-response" id="id4">request item response</a></li>
<li><a class="reference internal" href="#announcing-items" id="id5">announcing items</a></li>
<li><a class="reference internal" href="#example" id="id6">example</a></li>
<li><a class="reference internal" href="#terminology" id="id1">terminology</a></li>
<li><a class="reference internal" href="#linked-lists" id="id2">linked lists</a></li>
<li><a class="reference internal" href="#skip-lists" id="id3">skip lists</a></li>
<li><a class="reference internal" href="#list-head" id="id4">list-head</a></li>
<li><a class="reference internal" href="#messages" id="id5">messages</a><ul>
<li><a class="reference internal" href="#requesting-items" id="id6">requesting items</a></li>
<li><a class="reference internal" href="#request-item-response" id="id7">request item response</a></li>
<li><a class="reference internal" href="#announcing-items" id="id8">announcing items</a></li>
</ul>
</li>
<li><a class="reference internal" href="#uri-scheme" id="id7">URI scheme</a></li>
<li><a class="reference internal" href="#rationale" id="id8">rationale</a></li>
<li><a class="reference internal" href="#re-announcing" id="id9">re-announcing</a></li>
<li><a class="reference internal" href="#timeouts" id="id10">timeouts</a></li>
<li><a class="reference internal" href="#rss-feeds" id="id11">RSS feeds</a><ul>
<li><a class="reference internal" href="#example" id="id12">example</a></li>
</ul>
</li>
<li><a class="reference internal" href="#rss-feed-uri-scheme" id="id13">RSS feed URI scheme</a></li>
<li><a class="reference internal" href="#rationale" id="id14">rationale</a></li>
</ul>
</div>
<p>This is a proposal for an extension to the BitTorrent DHT to allow
@ -84,195 +92,303 @@ as the validity of the signatures of the items. Every recipient of a
message with feed items in it MUST verify both the validity of the public
key against the target ID it is stored under, as well as the validity of
the signatures of each individual item.</p>
<p>Any peer who is subscribing to a DHT feed SHOULD also participate in
regularly re-announcing items that it knows about. Every participant
SHOULD store items in long term storage, across sessions, in order to
keep items alive for as long as possible, with as few sources as possible.</p>
<p>As with normal DHT announces, the write-token mechanism is used to
prevent spoof attacks.</p>
prevent IP spoof attacks.</p>
<p>There are two new proposed messages, <tt class="docutils literal"><span class="pre">announce_item</span></tt> and <tt class="docutils literal"><span class="pre">get_item</span></tt>.
Every valid item that is announced, should be stored. In a request to get items,
as many items as can fit in a normal UDP packet size should be returned. If
there are more items than can fit, a random sub-set should be returned.</p>
<p><em>Is there a better heuristic here? Should there be a bias towards newer items?
If so, there needs to be a signed timestamp as well, which might get messy</em></p>
<div class="section" id="target-id">
<h1>target ID</h1>
<p>The target, i.e. the ID in the DHT key space feeds are announced to, MUST always
be SHA-1(<em>feed_name</em> + <em>public_key</em>). Any request where this condition is not met
MUST be dropped.</p>
<p>Using the feed name as part of the target means a feed publisher only needs one
public-private keypair for any number of feeds, as long as the feeds have different
names.</p>
Every valid item that is announced, should be stored.</p>
<div class="section" id="terminology">
<h1>terminology</h1>
<p>In this document, a <em>storage node</em> refers to a node in the DHT to which
an item is being announced. A <em>subscribing node</em> refers to a node which
performs lookups in the DHT to find the storage nodes, in order to request items
from them.</p>
</div>
<div class="section" id="linked-lists">
<h1>linked lists</h1>
<p>Items are chained together in a general singly linked list. A linked
list does not necessarily contain RSS items, and no RSS related items
are mandatory. However, RSS items will be used as examples in this BEP:</p>
<pre class="literal-block">
key = SHA1(name + key)
+---------+
| head | key = SHA1(bencode(item))
| +---------+ +---------+
| | next |--------&gt;| item | key = SHA1(bencode(item))
| | key | | +---------+ +---------+
| | name | | | next |-------&gt;| item |
| | seq | | | key | | +---------+
| | ... | | | ... | | | next |---&gt;0
| +---------+ | +---------+ | | key |
| sig | | sig | | | ... |
+---------+ +---------+ | +---------+
| sig |
+---------+
</pre>
<p>The <tt class="docutils literal"><span class="pre">next</span></tt> pointer is at least a 20 byte ID in the DHT key space pointing to where the next
item in the list is announced. The list is terminated with an ID of all zeroes.</p>
<p>The ID an item is announced to is determined by the SHA1 hash of the bencoded representation
of the item itself. This contains all fields in the item, except the signature.
The only mandatory fields in an item are <tt class="docutils literal"><span class="pre">next</span></tt>, <tt class="docutils literal"><span class="pre">key</span></tt> and <tt class="docutils literal"><span class="pre">sig</span></tt>.</p>
<p>The <tt class="docutils literal"><span class="pre">key</span></tt> field MUST match the public key of the list head node. The <tt class="docutils literal"><span class="pre">sig</span></tt> field
MUST be the signature of the bencoded representation of <tt class="docutils literal"><span class="pre">item</span></tt> or <tt class="docutils literal"><span class="pre">head</span></tt> (whichever
is included in the message).</p>
<p>All subscribers MUST verify that the item is announced under the correct DHT key
and MUST verify the signature is valid and MUST verify the public key is the same
as the list-head. If an item fails any of these checks, it MUST be ignored and the
chain of items considered terminated.</p>
<p>Each item holds a bencoded dictionary with arbitrary keys, except two mandatory keys:
<tt class="docutils literal"><span class="pre">next</span></tt> and <tt class="docutils literal"><span class="pre">key</span></tt>. The signature <tt class="docutils literal"><span class="pre">sig</span></tt> is transferred outside of this dictionary
and is the signature of all of it. An implementation should store any arbitrary keys that
are announced for an item, within reasonable restrictions such as nesting depth, size and numeric
range of integers.</p>
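<p>As a rough illustration of how an item's DHT key is derived, here is a minimal
sketch in Python. The toy bencoder and the placeholder values are assumptions for
the example; it is not libtorrent's implementation.</p>

```python
import hashlib

def bencode(obj):
    # Minimal bencoder covering the types a list item uses:
    # byte strings, integers and dictionaries (keys sorted, as bencoding requires).
    if isinstance(obj, bytes):
        return str(len(obj)).encode() + b":" + obj
    if isinstance(obj, int):
        return b"i" + str(obj).encode() + b"e"
    if isinstance(obj, dict):
        out = b"d"
        for key in sorted(obj):
            out += bencode(key) + bencode(obj[key])
        return out + b"e"
    raise TypeError("unsupported type: %r" % type(obj))

def item_target(item):
    # The DHT key an item is announced to: the SHA-1 of the bencoded
    # item dictionary (the signature is carried outside the dictionary).
    return hashlib.sha1(bencode(item)).digest()

item = {
    b"key": b"\x01" * 64,   # placeholder 64 byte public key
    b"next": b"\x00" * 20,  # all-zero ID terminates the list
}
target = item_target(item)
assert len(target) == 20
```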
</div>
<div class="section" id="skip-lists">
<h1>skip lists</h1>
<p>The <tt class="docutils literal"><span class="pre">next</span></tt> key stored in the list head and the items is a string of at least
20 bytes; it may be any length divisible by 20. Each 20 bytes are the ID of the next
item in the list, the item 2 hops away, 4 hops away, 8 hops away, and so on. For
simplicity, only the first ID (1 hop) in the <tt class="docutils literal"><span class="pre">next</span></tt> field is illustrated above.</p>
<p>A publisher of an item SHOULD include as many IDs in the <tt class="docutils literal"><span class="pre">next</span></tt> field as the remaining
size of the list warrants, within reason.</p>
<p>These skip lists allow for parallelized lookups of items and make it more efficient
to search for specific items. They also keep a list traversable when some items are missing.</p>
<p>Figure of the skip list in the first list item:</p>
<pre class="literal-block">
n Item0 Item1 Item2 Item3 Item4 Item5 Item6 Item7 Item8 Item9 Item10
0 O-----&gt;
20 O------------&gt;
40 O--------------------------&gt;
60 O------------------------------------------------------&gt;
</pre>
<p><em>n</em> refers to the byte offset into the <tt class="docutils literal"><span class="pre">next</span></tt> field.</p>
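<p>Decoding the <tt class="docutils literal"><span class="pre">next</span></tt> field could be sketched as follows. The helper
name is hypothetical; only the hop-distance interpretation (1, 2, 4, 8, ... hops) follows
the text above.</p>

```python
def next_ids(next_field: bytes):
    # Split the variable-length ``next`` field into 20-byte IDs.
    # Entry k (at byte offset 20*k) points to the item 2**k hops away.
    if len(next_field) == 0 or len(next_field) % 20 != 0:
        raise ValueError("next field must be a non-empty multiple of 20 bytes")
    return {2 ** k: next_field[i:i + 20]
            for k, i in enumerate(range(0, len(next_field), 20))}

# Four IDs: pointers to the items 1, 2, 4 and 8 hops down the list.
hops = next_ids(bytes(range(20)) + b"\x01" * 20 + b"\x02" * 20 + b"\x03" * 20)
assert sorted(hops) == [1, 2, 4, 8]
```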
</div>
<div class="section" id="list-head">
<h1>list-head</h1>
<p>The list head item is special in that it can be updated, without changing its
DHT key. This is required to prepend new items to the linked list. To authenticate
that only the original publisher can update the head, the whole linked list head
is signed. In order to prevent a malicious node from overwriting the list head with an old
version, the sequence number <tt class="docutils literal"><span class="pre">seq</span></tt> MUST be monotonically increasing for each update,
and a node hosting the list head MUST NOT downgrade a list head from a higher sequence
number to a lower one, only upgrade.</p>
<p>The list head's DHT key (which it is announced to) MUST be the SHA1 hash of the name
(<tt class="docutils literal"><span class="pre">n</span></tt>) and <tt class="docutils literal"><span class="pre">key</span></tt> fields concatenated.</p>
<p>Any node MUST reject any list head which is announced under any other ID.</p>
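<p>The two rules above (key derivation and sequence-number monotonicity) could be
sketched like this. The helper names are hypothetical; the treatment of an equal
sequence number as a rejected no-op is an assumption, since the text only forbids
downgrades.</p>

```python
import hashlib

def head_target(name: bytes, public_key: bytes) -> bytes:
    # The list head is announced to SHA-1(name + public key); a head
    # announced under any other ID must be rejected.
    return hashlib.sha1(name + public_key).digest()

def accept_head(stored_seq, new_seq) -> bool:
    # A storage node must never replace a head with one carrying a lower
    # sequence number, which stops replay of stale heads. Equal sequence
    # numbers are treated here as a no-op (an assumption).
    return stored_seq is None or new_seq > stored_seq

assert len(head_target(b"my stuff", b"\x01" * 64)) == 20
assert accept_head(5, 6) and not accept_head(5, 5)
```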
</div>
<div class="section" id="messages">
<h1>messages</h1>
<p>These are the proposed new message formats.</p>
<p>These are the messages to deal with linked lists.</p>
<p>The <tt class="docutils literal"><span class="pre">id</span></tt> field in these messages has the same semantics as the standard DHT messages,
i.e. the node ID of the node sending the message, to maintain the structure of the DHT
network.</p>
<p>The <tt class="docutils literal"><span class="pre">token</span></tt> field also has the same semantics as the standard DHT message <tt class="docutils literal"><span class="pre">get_peers</span></tt>
and <tt class="docutils literal"><span class="pre">announce_peer</span></tt>, when requesting an item and to write an item respectively.</p>
<p><tt class="docutils literal"><span class="pre">nodes</span></tt> and <tt class="docutils literal"><span class="pre">nodes6</span></tt> have the same semantics as in the <tt class="docutils literal"><span class="pre">get_peers</span></tt> response.</p>
<div class="section" id="requesting-items">
<h2>requesting items</h2>
<p>This message can be used to request both a list head and a list item. When requesting
a list head, the <tt class="docutils literal"><span class="pre">n</span></tt> (name) field MUST be specified. When requesting a list item the
<tt class="docutils literal"><span class="pre">n</span></tt> field is not required.</p>
<pre class="literal-block">
{
&quot;a&quot;:
{
&quot;filter&quot;: <em>&lt;variable size bloom-filter&gt;</em>,
&quot;id&quot;: <em>&lt;20 byte id of origin node&gt;</em>,
&quot;key&quot;: <em>&lt;64 byte public curve25519 key for this feed&gt;</em>,
&quot;n&quot;: <em>&lt;feed-name&gt;</em>
&quot;target&quot;: <em>&lt;target-id as derived from public key&gt;</em>
},
&quot;q&quot;: &quot;get_item&quot;,
&quot;t&quot;: <em>&lt;transaction-id&gt;</em>,
&quot;y&quot;: &quot;q&quot;,
&quot;a&quot;:
{
&quot;id&quot;: <em>&lt;20 byte ID of sending node&gt;</em>,
&quot;key&quot;: <em>&lt;64 byte public curve25519 key for this list&gt;</em>,
&quot;n&quot;: <em>&lt;list name&gt;</em>
&quot;target&quot;: <em>&lt;target-id for 'head' or 'item'&gt;</em>
},
&quot;q&quot;: &quot;get_item&quot;,
&quot;t&quot;: <em>&lt;transaction-id&gt;</em>,
&quot;y&quot;: &quot;q&quot;,
}
</pre>
<p>The <tt class="docutils literal"><span class="pre">target</span></tt> MUST always be SHA-1(<em>feed_name</em> + <em>public_key</em>). Any request where
this condition is not met MUST be dropped.</p>
<p>The <tt class="docutils literal"><span class="pre">n</span></tt> field is the name of this feed. It MUST be a UTF-8 encoded string and it
MUST match the name of the feed on the receiving node.</p>
<p>The bloom filter argument (<tt class="docutils literal"><span class="pre">filter</span></tt>) in the <tt class="docutils literal"><span class="pre">get_item</span></tt> requests is optional.
If included in a request, it represents info-hashes that should be excluded from
the response. In this case, the response should be a random subset of the non-excluded
items, or all of the non-excluded items if they all fit within a packet size.</p>
<p>If the bloom filter is specified, its size MUST be an even multiple of 8 bits. The size
is implied by the length of the string. For each info-hash to exclude from the response,
bits are set in the filter as described below.</p>
<p>There are no hash functions for the bloom filter. Since the info-hash is already a
hash digest, pairs of bytes, starting with the first bytes (MSB), are used as the
results of the imaginary hash functions for the bloom filter. k is 3 for this bloom
filter, which means the first 6 bytes of the info-hash are used to set 3 bits in the bloom
filter. Each pair of bytes pulled out of the info-hash is interpreted as a big-endian
16 bit value.</p>
<p>Bits are indexed in bytes from left to right, and within bytes from LSB to MSB. i.e., to
set bit 12: <tt class="docutils literal"><span class="pre">bitfield[12/8]</span> <span class="pre">|=</span> <span class="pre">1</span> <span class="pre">&lt;&lt;</span> <span class="pre">(12</span> <span class="pre">%</span> <span class="pre">8)</span></tt>.</p>
<dl class="docutils">
<dt>Example:</dt>
<dd>Suppose you are not interested in knowing about the info-hash that
starts with 0x4f7d25a... and you choose a bloom filter of 80 bits. Take the
first three big-endian 16 bit values of the info-hash (the first of which is
0x4f7d) and set bit (<em>value</em> % 80) in the bloom filter bitmask for each of them.</dd>
</dl>
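<p>A minimal sketch of this filter in Python (the helper names are hypothetical;
only the bit layout and the k&nbsp;=&nbsp;3 pair scheme follow the text above):</p>

```python
import struct

def add_to_filter(bloom: bytearray, info_hash: bytes) -> None:
    # k = 3: the first three big-endian 16 bit values of the info-hash
    # (i.e. its first 6 bytes) act as the "hash functions".
    nbits = len(bloom) * 8
    for (value,) in struct.iter_unpack(">H", info_hash[:6]):
        bit = value % nbits
        # bytes are indexed left to right, bits within a byte LSB to MSB
        bloom[bit // 8] |= 1 << (bit % 8)

def maybe_contains(bloom: bytearray, info_hash: bytes) -> bool:
    # True if all three bits are set (which may be a false positive).
    nbits = len(bloom) * 8
    return all(bloom[(v % nbits) // 8] & (1 << ((v % nbits) % 8))
               for (v,) in struct.iter_unpack(">H", info_hash[:6]))

bloom = bytearray(10)  # an 80-bit filter, an even multiple of 8 bits
ih = bytes.fromhex("4f7d25a4") + b"\x00" * 16  # placeholder info-hash
add_to_filter(bloom, ih)
assert maybe_contains(bloom, ih)
```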
<p>When requesting a list-head the <tt class="docutils literal"><span class="pre">target</span></tt> MUST always be SHA-1(<em>feed_name</em> + <em>public_key</em>).
When requesting a list item, <tt class="docutils literal"><span class="pre">target</span></tt> is the node ID the item was written to.</p>
<p>The <tt class="docutils literal"><span class="pre">n</span></tt> field is the name of the list. If specified, it MUST be a UTF-8 encoded string
and it MUST match the name of the feed on the receiving node.</p>
</div>
<div class="section" id="request-item-response">
<h2>request item response</h2>
<p>This is the format of a response of a list head:</p>
<pre class="literal-block">
{
&quot;r&quot;:
{
&quot;ih&quot;:
[
<em>&lt;n * 20 byte(s) info-hash&gt;</em>,
...
],
&quot;sig&quot;:
[
<em>&lt;64 byte curve25519 signature of info-hash&gt;</em>,
...
],
&quot;id&quot;: <em>&lt;20 byte id of origin node&gt;</em>,
&quot;token&quot;: <em>&lt;write-token&gt;</em>
&quot;nodes&quot;: <em>&lt;n * compact IPv4-port pair&gt;</em>
&quot;nodes6&quot;: <em>&lt;n * compact IPv6-port pair&gt;</em>
},
&quot;t&quot;: <em>&lt;transaction-id&gt;</em>,
&quot;y&quot;: &quot;r&quot;,
&quot;r&quot;:
{
&quot;head&quot;:
{
&quot;key&quot;: <em>&lt;64 byte public curve25519 key for this list&gt;</em>,
&quot;next&quot;: <em>&lt;20 bytes item ID&gt;</em>,
&quot;n&quot;: <em>&lt;name of the linked list&gt;</em>,
&quot;seq&quot;: <em>&lt;monotonically increasing sequence number&gt;</em>
},
&quot;sig&quot;: <em>&lt;curve25519 signature of 'head' entry (in bencoded form)&gt;</em>,
&quot;id&quot;: <em>&lt;20 byte id of sending node&gt;</em>,
&quot;token&quot;: <em>&lt;write-token&gt;</em>,
&quot;nodes&quot;: <em>&lt;n * compact IPv4-port pair&gt;</em>,
&quot;nodes6&quot;: <em>&lt;n * compact IPv6-port pair&gt;</em>
},
&quot;t&quot;: <em>&lt;transaction-id&gt;</em>,
&quot;y&quot;: &quot;r&quot;,
}
</pre>
<p>Since the data being signed already is a hash (i.e.
an info-hash), the signature of each hash-entry is simply the hash encrypted
by the feed's private key.</p>
<p>The <tt class="docutils literal"><span class="pre">ih</span></tt> and <tt class="docutils literal"><span class="pre">sig</span></tt> lists MUST have equal number of items. Each item in <tt class="docutils literal"><span class="pre">sig</span></tt>
is the signature of the full string in the corresponding item in the <tt class="docutils literal"><span class="pre">ih</span></tt> list.</p>
<p>Each item in the <tt class="docutils literal"><span class="pre">ih</span></tt> list may contain any positive number of 20 byte info-hashes.</p>
<p>The rationale behind using lists of strings where the strings contain multiple
info-hashes is to allow the publisher of a feed to sign multiple info-hashes
together, thus saving space in the UDP packets and allowing nodes to transfer more
info-hashes per packet. Original publishers of a feed MAY re-announce items lumped
together over time to make the feed more efficient.</p>
<p>A client receiving a <tt class="docutils literal"><span class="pre">get_item</span></tt> response MUST verify each signature in the <tt class="docutils literal"><span class="pre">sig</span></tt>
list against each corresponding item in the <tt class="docutils literal"><span class="pre">ih</span></tt> list using the feed's public key.
Any item whose signature does not check out MUST be ignored.</p>
<p><tt class="docutils literal"><span class="pre">nodes</span></tt> and <tt class="docutils literal"><span class="pre">nodes6</span></tt> are optional and have the same semantics as the standard
<tt class="docutils literal"><span class="pre">get_peers</span></tt> request. The intention is to be able to use this <tt class="docutils literal"><span class="pre">get_item</span></tt> request
in the same way, searching for the nodes responsible for the feed.</p>
<p>This is the format of a response of a list item:</p>
<pre class="literal-block">
{
&quot;r&quot;:
{
&quot;item&quot;:
{
&quot;key&quot;: <em>&lt;64 byte public curve25519 key for this list&gt;</em>,
&quot;next&quot;: <em>&lt;20 bytes item ID&gt;</em>,
...
},
&quot;sig&quot;: <em>&lt;curve25519 signature of 'item' entry (in bencoded form)&gt;</em>,
&quot;id&quot;: <em>&lt;20 byte id of sending node&gt;</em>,
&quot;token&quot;: <em>&lt;write-token&gt;</em>,
&quot;nodes&quot;: <em>&lt;n * compact IPv4-port pair&gt;</em>,
&quot;nodes6&quot;: <em>&lt;n * compact IPv6-port pair&gt;</em>
},
&quot;t&quot;: <em>&lt;transaction-id&gt;</em>,
&quot;y&quot;: &quot;r&quot;,
}
</pre>
<p>A client receiving a <tt class="docutils literal"><span class="pre">get_item</span></tt> response MUST verify the signature in the <tt class="docutils literal"><span class="pre">sig</span></tt>
field against the bencoded representation of the <tt class="docutils literal"><span class="pre">item</span></tt> field, using the <tt class="docutils literal"><span class="pre">key</span></tt> as
the public key. The <tt class="docutils literal"><span class="pre">key</span></tt> MUST match the public key of the feed.</p>
<p>The <tt class="docutils literal"><span class="pre">item</span></tt> dictionary MAY contain arbitrary keys, and all keys MUST be stored
along with the item.</p>
</div>
<div class="section" id="announcing-items">
<h2>announcing items</h2>
<p>The message format for announcing a list head:</p>
<pre class="literal-block">
{
&quot;a&quot;:
{
&quot;ih&quot;:
[
<em>&lt;n * 20 byte info-hash(es)&gt;</em>,
...
],
&quot;sig&quot;:
[
<em>&lt;64 byte curve25519 signature of info-hash(es)&gt;</em>,
...
],
&quot;id&quot;: <em>&lt;20 byte node-id of origin node&gt;</em>,
&quot;key&quot;: <em>&lt;64 byte public curve25519 key for this feed&gt;</em>,
&quot;n&quot;: <em>&lt;feed name&gt;</em>
&quot;target&quot;: <em>&lt;target-id as derived from public key&gt;</em>,
&quot;token&quot;: <em>&lt;write-token as obtained by previous req.&gt;</em>
},
&quot;y&quot;: &quot;q&quot;,
&quot;q&quot;: &quot;announce_item&quot;,
&quot;t&quot;: <em>&lt;transaction-id&gt;</em>
&quot;a&quot;:
{
&quot;head&quot;:
{
&quot;key&quot;: <em>&lt;64 byte public curve25519 key for this list&gt;</em>,
&quot;next&quot;: <em>&lt;20 bytes item ID&gt;</em>,
&quot;n&quot;: <em>&lt;name of the linked list&gt;</em>,
&quot;seq&quot;: <em>&lt;monotonically increasing sequence number&gt;</em>
},
&quot;sig&quot;: <em>&lt;curve25519 signature of 'head' entry (in bencoded form)&gt;</em>,
&quot;id&quot;: <em>&lt;20 byte node-id of origin node&gt;</em>,
&quot;target&quot;: <em>&lt;target-id as derived from public key and name&gt;</em>,
&quot;token&quot;: <em>&lt;write-token as obtained by previous request&gt;</em>
},
&quot;y&quot;: &quot;q&quot;,
&quot;q&quot;: &quot;announce_item&quot;,
&quot;t&quot;: <em>&lt;transaction-id&gt;</em>
}
</pre>
<p>An announce can include any number of items, as long as they fit in a packet.</p>
<p>Subscribers to a feed SHOULD also announce items that they know of, to the feed.
In order to make the repository of torrents as reliable as possible, subscribers
SHOULD announce random items from their local repository of items. When re-announcing
items, a random subset of all known items should be announced, randomized
independently for each node it's announced to. This makes it harder
to determine the IP address an item originated from, since doing so requires
seeing the very first announce and knowing that the item wasn't announced
anywhere else first.</p>
<p>Any subscriber and publisher SHOULD re-announce items every 30 minutes. If
a feed does not receive any announced items in 60 minutes, a peer MAY time
it out and remove it.</p>
<p>Subscribers and publishers SHOULD announce random items.</p>
<p>The message format for announcing a list item:</p>
<pre class="literal-block">
{
&quot;a&quot;:
{
&quot;item&quot;:
{
&quot;key&quot;: <em>&lt;64 byte public curve25519 key for this list&gt;</em>,
&quot;next&quot;: <em>&lt;20 bytes item ID&gt;</em>,
...
},
&quot;sig&quot;: <em>&lt;curve25519 signature of 'item' entry (in bencoded form)&gt;</em>,
&quot;id&quot;: <em>&lt;20 byte node-id of origin node&gt;</em>,
&quot;target&quot;: <em>&lt;target-id as derived from item dict&gt;</em>,
&quot;token&quot;: <em>&lt;write-token as obtained by previous request&gt;</em>
},
&quot;y&quot;: &quot;q&quot;,
&quot;q&quot;: &quot;announce_item&quot;,
&quot;t&quot;: <em>&lt;transaction-id&gt;</em>
}
</pre>
<p>A storage node MAY reject items and heads whose bencoded representation is
greater than 1024 bytes.</p>
</div>
</div>
<div class="section" id="re-announcing">
<h1>re-announcing</h1>
<p>In order to keep feeds alive, subscriber nodes SHOULD help out in announcing
items they have downloaded to the DHT.</p>
<p>Every subscriber node SHOULD store items in long term storage, across sessions,
in order to keep items alive for as long as possible, with as few sources as possible.</p>
<p>Subscribers to a feed SHOULD also announce items that they know of, to the feed.
Since a feed may have many subscribers and many items, subscribers should re-announce
items according to the following algorithm.</p>
<pre class="literal-block">
1. pick one random item (<em>i</em>) from the local repository (except
items already announced this round)
2. If all items in the local repository have been announced
2.1 terminate
3. look up item <em>i</em> in the DHT
4. If fewer than 8 nodes returned the item
4.1 announce <em>i</em> to the DHT
4.2 goto 1
</pre>
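<p>The algorithm above could be sketched as follows. The two DHT callbacks are
hypothetical placeholders for a node's lookup and announce machinery.</p>

```python
import random

def reannounce(local_items, dht_get_item, dht_announce_item):
    # Re-announce random items from the local repository until one is
    # found that is already hosted by 8 or more nodes (or the repository
    # is exhausted), per the algorithm above.
    remaining = list(local_items)
    random.shuffle(remaining)  # step 1: pick items in random order
    for item in remaining:
        # step 3: dht_get_item is assumed to return the storage nodes
        # currently holding the item
        nodes = dht_get_item(item)
        if len(nodes) >= 8:
            break                    # item is well seeded; end this round
        dht_announce_item(item)      # step 4.1: announce it ourselves
```

<p>Terminating as soon as a well-seeded item is found is what balances the load:
popular items are skipped cheaply while rare items keep getting announced.</p>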
<p>This ensures a balanced load on the DHT while still keeping items alive.</p>
</div>
<div class="section" id="timeouts">
<h1>timeouts</h1>
<p>Items SHOULD be announced to the DHT every 30 minutes. A storage node MAY time
out an item after 60 minutes of no one announcing it.</p>
<p>A storage node MAY extend the timeout when it receives a request for the item. Since
items are immutable, the data doesn't go stale. Therefore it doesn't matter if
the storage node is no longer among the 8 nodes closest to the target.</p>
</div>
<div class="section" id="rss-feeds">
<h1>RSS feeds</h1>
<p>For RSS feeds, the following keys are mandatory in the list item's <tt class="docutils literal"><span class="pre">item</span></tt> dictionary.</p>
<dl class="docutils">
<dt>ih</dt>
<dd>The torrent's info hash</dd>
<dt>size</dt>
<dd>The size (in bytes) of all files in the torrent</dd>
<dt>n</dt>
<dd>name of the torrent</dd>
</dl>
<div class="section" id="example">
<h2>example</h2>
<p>This is an example of an <tt class="docutils literal"><span class="pre">announce_item</span></tt> message:</p>
<pre class="literal-block">
{
&quot;a&quot;:
{
&quot;ih&quot;:
[
&quot;7ea94c240691311dc0916a2a91eb7c3db2c6f3e4&quot;,
&quot;0d92ad53c052ac1f49cf4434afffafa4712dc062e4168d940a48e45a45a0b10808014dc267549624&quot;
],
&quot;sig&quot;:
[
&quot;980774404e404941b81aa9da1da0101cab54e670cff4f0054aa563c3b5abcb0fe3c6df5dac1ea25266035f09040bf2a24ae5f614787f1fe7404bf12fee5e6101&quot;,
&quot;3fee52abea47e4d43e957c02873193fb9aec043756845946ec29cceb1f095f03d876a7884e38c53cd89a8041a2adfb2d9241b5ec5d70268714d168b9353a2c01&quot;
],
&quot;id&quot;: &quot;b46989156404e8e0acdb751ef553b210ef77822e&quot;,
&quot;key&quot;: &quot;6bc1de5443d1a7c536cdf69433ac4a7163d3c63e2f9c92d78f6011cf63dbcd5b638bbc2119cdad0c57e4c61bc69ba5e2c08b918c2db8d1848cf514bd9958d307&quot;,
&quot;n&quot;: &quot;my stuff&quot;
&quot;target&quot;: &quot;b4692ef0005639e86d7165bf378474107bf3a762&quot;
&quot;token&quot;: &quot;23ba&quot;
},
&quot;y&quot;: &quot;q&quot;,
&quot;q&quot;: &quot;announce_item&quot;,
&quot;t&quot;: &quot;a421&quot;
&quot;a&quot;:
{
&quot;item&quot;:
{
&quot;key&quot;: &quot;6bc1de5443d1a7c536cdf69433ac4a7163d3c63e2f9c92d
78f6011cf63dbcd5b638bbc2119cdad0c57e4c61bc69ba5e2c08
b918c2db8d1848cf514bd9958d307&quot;,
&quot;info-hash&quot;: &quot;7ea94c240691311dc0916a2a91eb7c3db2c6f3e4&quot;,
&quot;size&quot;: 24315329,
&quot;n&quot;: &quot;my stuff&quot;,
&quot;next&quot;: &quot;c68f29156404e8e0aas8761ef5236bcagf7f8f2e&quot;
}
&quot;sig&quot;: <em>&lt;signature&gt;</em>
&quot;id&quot;: &quot;b46989156404e8e0acdb751ef553b210ef77822e&quot;,
&quot;target&quot;: &quot;b4692ef0005639e86d7165bf378474107bf3a762&quot;
&quot;token&quot;: &quot;23ba&quot;
},
&quot;y&quot;: &quot;q&quot;,
&quot;q&quot;: &quot;announce_item&quot;,
&quot;t&quot;: &quot;a421&quot;
}
</pre>
<p>Strings are printed in hex for printability, but actual encoding is binary. The
response contains 3 feed items, starting with &quot;7ea94c&quot;, &quot;0d92ad&quot; and &quot;e4168d&quot;.
These 3 items are not published optimally. If they were to be merged into a single
string in the <tt class="docutils literal"><span class="pre">ih</span></tt> list, more than 64 bytes would be saved (because of having
one less signature).</p>
<p>Note that <tt class="docutils literal"><span class="pre">target</span></tt> is in fact SHA1('my stuff' + 'key'). The private key
used in this example is 980f4cd7b812ae3430ea05af7c09a7e430275f324f42275ca534d9f7c6d06f5b.</p>
<p>Strings are printed in hex for printability, but actual encoding is binary.</p>
<p>Note that <tt class="docutils literal"><span class="pre">target</span></tt> is in fact SHA1 hash of the same data the signature <tt class="docutils literal"><span class="pre">sig</span></tt>
is the signature of, i.e.:</p>
<pre class="literal-block">
d9:info-hash20:7ea94c240691311dc0916a2a91eb7c3db2c6f3e43:key64:6bc1de5443d1
a7c536cdf69433ac4a7163d3c63e2f9c92d78f6011cf63dbcd5b638bbc2119cdad0c57e4c61
bc69ba5e2c08b918c2db8d1848cf514bd9958d3071:n8:my stuff4:next20:c68f29156404
e8e0aas8761ef5236bcagf7f8f2e4:sizei24315329ee
</pre>
<p>(note that binary data is printed as hex)</p>
</div>
</div>
<div class="section" id="uri-scheme">
<h1>URI scheme</h1>
<div class="section" id="rss-feed-uri-scheme">
<h1>RSS feed URI scheme</h1>
<p>The proposed URI scheme for DHT feeds is:</p>
<pre class="literal-block">
magnet:?xt=btfd:<em>&lt;base16-curve25519-public-key&gt;</em> &amp;dn= <em>&lt;feed name&gt;</em>
@ -284,10 +400,9 @@ calculating the target ID.</p>
</div>
<div class="section" id="rationale">
<h1>rationale</h1>
<p>The reason to use <a class="reference external" href="http://cr.yp.to/ecdh.html">curve25519</a> instead of, for instance, RSA is to fit more signatures
(i.e. items) in a single DHT packet. One packet is typically restricted to between
1280 - 1480 bytes. According to <a class="reference external" href="http://cr.yp.to/">http://cr.yp.to/</a>, curve25519 is free from patent claims
and there are open implementations in both C and Java.</p>
<p>The reason to use <a class="reference external" href="http://cr.yp.to/ecdh.html">curve25519</a> instead of, for instance, RSA is compactness. According to
<a class="reference external" href="http://cr.yp.to/">http://cr.yp.to/</a>, curve25519 is free from patent claims and there are open implementations
in both C and Java.</p>
</div>
</div>
<div id="footer">

View File

@ -27,224 +27,350 @@ message with feed items in it MUST verify both the validity of the public
key against the target ID it is stored under, as well as the validity of
the signatures of each individual item.
Any peer who is subscribing to a DHT feed SHOULD also participate in
regularly re-announcing items that it knows about. Every participant
SHOULD store items in long term storage, across sessions, in order to
keep items alive for as long as possible, with as few sources as possible.
As with normal DHT announces, the write-token mechanism is used to
prevent spoof attacks.
prevent IP spoof attacks.
There are two new proposed messages, ``announce_item`` and ``get_item``.
Every valid item that is announced, should be stored. In a request to get items,
as many items as can fit in a normal UDP packet size should be returned. If
there are more items than can fit, a random sub-set should be returned.
Every valid item that is announced, should be stored.
*Is there a better heuristic here? Should there be a bias towards newer items?
If so, there needs to be a signed timestamp as well, which might get messy*
terminology
-----------
target ID
In this document, a *storage node* refers to a node in the DHT to which
an item is being announced. A *subscribing node* refers to a node which
performs lookups in the DHT to find the storage nodes, in order to request items
from them.
linked lists
------------
Items are chained together in a general singly linked list. A linked
list does not necessarily contain RSS items, and no RSS related items
are mandatory. However, RSS items will be used as examples in this BEP::
key = SHA1(name + key)
+---------+
| head | key = SHA1(bencode(item))
| +---------+ +---------+
| | next |-------->| item | key = SHA1(bencode(item))
| | key | | +---------+ +---------+
| | name | | | next |------->| item |
| | seq | | | key | | +---------+
| | ... | | | ... | | | next |--->0
| +---------+ | +---------+ | | key |
| sig | | sig | | | ... |
+---------+ +---------+ | +---------+
| sig |
+---------+
The ``next`` pointer is at least a 20 byte ID in the DHT key space pointing to where the next
item in the list is announced. The list is terminated with an ID of all zeroes.
The ID an item is announced to is determined by the SHA1 hash of the bencoded representation
of the item itself. This contains all fields in the item, except the signature.
The only mandatory fields in an item are ``next``, ``key`` and ``sig``.
The ``key`` field MUST match the public key of the list head node. The ``sig`` field
MUST be the signature of the bencoded representation of ``item`` or ``head`` (whichever
is included in the message).
All subscribers MUST verify that the item is announced under the correct DHT key
and MUST verify the signature is valid and MUST verify the public key is the same
as the list-head. If a node fails any of these checks, it must be ignored and the
chain of items considered terminated.
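The target derivation described above can be sketched as follows. This is a
minimal illustration, not libtorrent's implementation; the bencoder is
simplified and the field names (``next``, ``key``, ``ih``) are taken from this
spec's examples:

```python
import hashlib

def bencode(obj):
    # minimal bencoder, sufficient for flat dicts of str/bytes/int values
    if isinstance(obj, int):
        return b"i%de" % obj
    if isinstance(obj, str):
        obj = obj.encode("utf-8")
    if isinstance(obj, bytes):
        return b"%d:%s" % (len(obj), obj)
    if isinstance(obj, dict):
        out = b"d"
        for k in sorted(obj):  # bencoding requires sorted dictionary keys
            out += bencode(k) + bencode(obj[k])
        return out + b"e"
    raise TypeError(type(obj))

def item_target(item):
    # the item is announced to SHA1(bencode(item)); the signature travels
    # outside the dictionary and is never part of the hash
    assert "sig" not in item
    return hashlib.sha1(bencode(item)).digest()

item = {"key": b"\x01" * 64, "next": b"\x00" * 20, "ih": b"\x02" * 20}
target = item_target(item)
print(len(target))  # 20 byte DHT ID
```

A storage node would recompute this hash on an incoming announce and drop the
message if it does not match the announced target.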
Each item holds a bencoded dictionary with arbitrary keys, except two mandatory keys:
``next`` and ``key``. The signature ``sig`` is transferred outside of this dictionary
and is the signature of all of it. An implementation should store any arbitrary keys that
are announced to an item, within reasonable restrictions such as nesting, size and numeric
range of integers.
skip lists
----------
The ``next`` key stored in the list head and the items is a string of at least 20
bytes; it may be any length divisible by 20. Each 20 bytes are the ID of the next
item in the list, the item 2 hops away, 4 hops away, 8 hops away, and so on. For
simplicity, only the first ID (1 hop) in the ``next`` field is illustrated above.
A publisher of an item SHOULD include as many IDs in the ``next`` field as the remaining
size of the list warrants, within reason.
These skip lists allow for parallelized lookups of items and also make it more efficient
to search for specific items. They also mitigate breaking the list when some items are missing.
Figure of the skip list in the first list item::
n Item0 Item1 Item2 Item3 Item4 Item5 Item6 Item7 Item8 Item9 Item10
0 O----->
20 O------------>
40 O-------------------------->
60 O------------------------------------------------------>
*n* refers to the byte offset into the ``next`` field.
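Decoding the packed ``next`` field is straightforward. A minimal sketch, assuming
the layout above (the helper name is hypothetical):

```python
def parse_next(next_field: bytes) -> dict:
    # the field is a concatenation of 20 byte IDs; entry k points to the
    # item 2**k hops ahead in the list
    assert len(next_field) % 20 == 0 and next_field
    ids = [next_field[i:i + 20] for i in range(0, len(next_field), 20)]
    return {2 ** k: node_id for k, node_id in enumerate(ids)}

next_field = bytes(range(20)) + bytes(range(20, 40)) + bytes(range(40, 60))
hops = parse_next(next_field)
print(sorted(hops))  # [1, 2, 4]
```

A subscriber can issue parallel lookups for several of these IDs at once, and
fall back to the 1-hop pointer when a longer hop is missing.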
list-head
---------
The target, i.e. the ID in the DHT key space feeds are announced to, MUST always
be SHA-1(*feed_name* + *public_key*). Any request where this condition is not met,
MUST be dropped.
The list head item is special in that it can be updated, without changing its
DHT key. This is required to prepend new items to the linked list. To authenticate
that only the original publisher can update the head, the whole linked list head
is signed. In order to prevent a malicious node from overwriting the list head with an old
version, the sequence number ``seq`` must be monotonically increasing for each update,
and a node hosting the list head MUST NOT downgrade a list head from a higher sequence
number to a lower one, only upgrade.
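The acceptance rule for head updates can be sketched as below. The field name
``seq`` comes from this spec; strictly-greater comparison matches the storage
node behavior in the accompanying C++ change (equal sequence numbers do not
replace the stored head):

```python
def accept_head(stored, incoming) -> bool:
    # a storage node keeps a head only if the update strictly increases
    # the sequence number; anything else is a potential replay
    if stored is None:
        return True
    return incoming["seq"] > stored["seq"]

old = {"seq": 4}
print(accept_head(old, {"seq": 3}))  # False
print(accept_head(old, {"seq": 5}))  # True
```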
Using the feed name as part of the target means a feed publisher only needs one
public-private keypair for any number of feeds, as long as the feeds have different
names.
The list head's DHT key (which it is announced to) MUST be the SHA1 hash of the name
(``n``) and ``key`` fields concatenated.
Any node MUST reject any list head which is announced under any other ID.
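The head's DHT key derivation is a plain concatenation hash, sketched here
(function name hypothetical):

```python
import hashlib

def head_target(name: str, public_key: bytes) -> bytes:
    # list head DHT key: SHA-1 of the UTF-8 name concatenated with the
    # 64 byte public key
    assert len(public_key) == 64
    return hashlib.sha1(name.encode("utf-8") + public_key).digest()

target = head_target("my stuff", b"\x07" * 64)
print(len(target))  # 20
```

Because the name participates in the hash, two feeds under the same keypair
but with different names land on different storage nodes.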
messages
--------
These are the proposed new message formats.
These are the messages to deal with linked lists.
The ``id`` field in these messages has the same semantics as the standard DHT messages,
i.e. the node ID of the node sending the message, to maintain the structure of the DHT
network.
The ``token`` field also has the same semantics as the standard DHT message ``get_peers``
and ``announce_peer``, when requesting an item and to write an item respectively.
``nodes`` and ``nodes6`` have the same semantics as in the ``get_peers`` response.
requesting items
................
This message can be used to request both a list head and a list item. When requesting
a list head, the ``n`` (name) field MUST be specified. When requesting a list item the
``n`` field is not required.
.. parsed-literal::
{
"a":
{
"filter": *<variable size bloom-filter>*,
"id": *<20 byte id of origin node>*,
"key": *<64 byte public curve25519 key for this feed>*,
"n": *<feed-name>*
"target": *<target-id as derived from public key>*
},
"q": "get_item",
"t": *<transaction-id>*,
"y": "q",
"a":
{
"id": *<20 byte ID of sending node>*,
"key": *<64 byte public curve25519 key for this list>*,
"n": *<list name>*
"target": *<target-id for 'head' or 'item'>*
},
"q": "get_item",
"t": *<transaction-id>*,
"y": "q",
}
The ``target`` MUST always be SHA-1(*feed_name* + *public_key*). Any request where
this condition is not met, MUST be dropped.
The ``n`` field is the name of this feed. It MUST be a UTF-8 encoded string and it
MUST match the name of the feed at the receiving node.
The bloom filter argument (``filter``) in the ``get_item`` requests is optional.
If included in a request, it represents info-hashes that should be excluded from
the response. In this case, the response should be a random subset of the non-excluded
items, or all of the non-excluded items if they all fit within a packet size.
If the bloom filter is specified, its size MUST be an even multiple of 8 bits. The size
is implied by the length of the string. There are no hash functions for the bloom filter.
Since the info-hash is already a hash digest, pairs of its bytes, starting with the first
bytes (MSB), are used as the results of the imaginary hash functions. k is 3 in this bloom
filter, which means the first 6 bytes of the info-hash are used to set 3 bits in the bloom
filter. Each pair of bytes pulled out of the info-hash is interpreted as a big-endian
16 bit value.
Bits are indexed in bytes from left to right, and within bytes from LSB to MSB, i.e. to
set bit 12: ``bitfield[12 / 8] |= 1 << (12 % 8)``.
Example:
To indicate that you are not interested in knowing about the info-hash that
starts with 0x4f7d25a... and you choose a bloom filter of size 80 bits, set bit
(0x4f7d % 80) = 29 in the bloom filter bitmask, and do the same for the next two
big-endian 16 bit pairs of the info-hash.
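The filter construction above can be sketched as follows (helper names are
hypothetical; the bit numbering and the k = 3 big-endian 16 bit pairs follow
this spec):

```python
import struct

def filter_set(bitfield: bytearray, info_hash: bytes) -> None:
    # no real hash functions: the first three big-endian 16 bit pairs of
    # the (already hashed) info-hash pick the bits to set
    nbits = len(bitfield) * 8
    for k in range(3):
        (idx,) = struct.unpack_from(">H", info_hash, k * 2)
        idx %= nbits
        bitfield[idx // 8] |= 1 << (idx % 8)  # LSB-first within each byte

def filter_test(bitfield: bytes, info_hash: bytes) -> bool:
    # True means "probably excluded"; False means definitely not in the filter
    nbits = len(bitfield) * 8
    for k in range(3):
        (idx,) = struct.unpack_from(">H", info_hash, k * 2)
        idx %= nbits
        if not bitfield[idx // 8] & 1 << (idx % 8):
            return False
    return True

bf = bytearray(10)  # an 80 bit filter
ih = bytes.fromhex("4f7d25a8") + b"\x00" * 16
filter_set(bf, ih)
print(filter_test(bf, ih))  # True
```

A responding node would run ``filter_test`` on each stored info-hash and only
include the ones that test False.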
When requesting a list-head the ``target`` MUST always be SHA-1(*feed_name* + *public_key*).
``target`` is the target node ID the item was written to.
The ``n`` field is the name of the list. If specified, it MUST be a UTF-8 encoded string
and it MUST match the name of the feed at the receiving node.
request item response
.....................
This is the format of a response of a list head:
.. parsed-literal::
{
"r":
{
"ih":
[
*<n * 20 byte(s) info-hash>*,
...
],
"sig":
[
*<64 byte curve25519 signature of info-hash>*,
...
],
"id": *<20 byte id of origin node>*,
"token": *<write-token>*
"nodes": *<n * compact IPv4-port pair>*
"nodes6": *<n * compact IPv6-port pair>*
},
"t": *<transaction-id>*,
"y": "r",
"r":
{
"head":
{
"key": *<64 byte public curve25519 key for this list>*,
"next": *<20 bytes item ID>*,
"n": *<name of the linked list>*,
"seq": *<monotonically increasing sequence number>*
},
"sig": *<curve25519 signature of 'head' entry (in bencoded form)>*,
"id": *<20 byte id of sending node>*,
"token": *<write-token>*,
"nodes": *<n * compact IPv4-port pair>*,
"nodes6": *<n * compact IPv6-port pair>*
},
"t": *<transaction-id>*,
"y": "r",
}
Since the data that's being signed by the public key already is a hash (i.e.
an info-hash), the signature of each hash-entry is simply the hash encrypted
by the feed's private key.
The ``ih`` and ``sig`` lists MUST have an equal number of items. Each item in ``sig``
is the signature of the full string in the corresponding item in the ``ih`` list.
Each item in the ``ih`` list may contain any positive number of 20 byte info-hashes.
This is the format of a response of a list item:
.. parsed-literal::
{
"r":
{
"item":
{
"key": *<64 byte public curve25519 key for this list>*,
"next": *<20 bytes item ID>*,
...
},
"sig": *<curve25519 signature of 'item' entry (in bencoded form)>*,
"id": *<20 byte id of sending node>*,
"token": *<write-token>*,
"nodes": *<n * compact IPv4-port pair>*,
"nodes6": *<n * compact IPv6-port pair>*
},
"t": *<transaction-id>*,
"y": "r",
}
The rationale behind using lists of strings where the strings contain multiple
info-hashes is to allow the publisher of a feed to sign multiple info-hashes
together, thus saving space in the UDP packets and allowing nodes to transfer more
info-hashes per packet. Original publishers of a feed MAY re-announce items lumped
together over time to make the feed more efficient.
A client receiving a ``get_item`` response MUST verify the signature in the ``sig``
field against the bencoded representation of the ``item`` field, using the ``key`` as
the public key. The ``key`` MUST match the public key of the feed.
A client receiving a ``get_item`` response MUST verify each signature in the ``sig``
list against each corresponding item in the ``ih`` list using the feed's public key.
Any item whose signature does not verify MUST be ignored.
``nodes`` and ``nodes6`` are optional and have the same semantics as the standard
``get_peers`` request. The intention is to be able to use this ``get_item`` request
in the same way, searching for the nodes responsible for the feed.
The ``item`` dictionary MAY contain arbitrary keys, and all keys MUST be stored for
items.
announcing items
................
The message format for announcing a list head:
.. parsed-literal::
{
"a":
{
"ih":
[
*<n * 20 byte info-hash(es)>*,
...
],
"sig":
[
*<64 byte curve25519 signature of info-hash(es)>*,
...
],
"id": *<20 byte node-id of origin node>*,
"key": *<64 byte public curve25519 key for this feed>*,
"n": *<feed name>*
"target": *<target-id as derived from public key>*,
"token": *<write-token as obtained by previous req.>*
},
"y": "q",
"q": "announce_item",
"t": *<transaction-id>*
"a":
{
"head":
{
"key": *<64 byte public curve25519 key for this list>*,
"next": *<20 bytes item ID>*,
"n": *<name of the linked list>*,
"seq": *<monotonically increasing sequence number>*
},
"sig": *<curve25519 signature of 'head' entry (in bencoded form)>*,
"id": *<20 byte node-id of origin node>*,
"target": *<target-id as derived from public key and name>*,
"token": *<write-token as obtained by previous request>*
},
"y": "q",
"q": "announce_item",
"t": *<transaction-id>*
}
An announce can include any number of items, as long as they fit in a packet.
The message format for announcing a list item:
.. parsed-literal::
{
"a":
{
"item":
{
"key": *<64 byte public curve25519 key for this list>*,
"next": *<20 bytes item ID>*,
...
},
"sig": *<curve25519 signature of 'item' entry (in bencoded form)>*,
"id": *<20 byte node-id of origin node>*,
"target": *<target-id as derived from item dict>*,
"token": *<write-token as obtained by previous request>*
},
"y": "q",
"q": "announce_item",
"t": *<transaction-id>*
}
A storage node MAY reject items and heads whose bencoded representation is
greater than 1024 bytes.
re-announcing
-------------
In order to keep feeds alive, subscriber nodes SHOULD help out in announcing
items they have downloaded to the DHT.
Every subscriber node SHOULD store items in long term storage, across sessions,
in order to keep items alive for as long as possible, with as few sources as possible.
Subscribers to a feed SHOULD also announce items that they know of, to the feed.
In order to make the repository of torrents as reliable as possible, subscribers
SHOULD announce random items from their local repository of items. When re-announcing
items, a random subset of all known items should be announced, randomized
independently for each node it's announced to. This makes it somewhat harder
to determine the IP address an item originated from, since one would have to
see the first announce and know that the item wasn't announced anywhere else
first.
Since a feed may have many subscribers and many items, subscribers should re-announce
items according to the following algorithm.
Any subscriber and publisher SHOULD re-announce items every 30 minutes. If
a feed does not receive any announced items in 60 minutes, a peer MAY time
it out and remove it.
Subscribers and publishers SHOULD announce random items.
.. parsed-literal::
1. pick one random item (*i*) from the local repository (except
items already announced this round)
2. If all items in the local repository have been announced
2.1 terminate
3. look up item *i* in the DHT
4. If fewer than 8 nodes returned the item
4.1 announce *i* to the DHT
4.2 goto 1
This ensures a balanced load on the DHT while still keeping items alive.
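One round of this re-announce loop might look like the sketch below.
``lookup`` and ``announce`` stand in for the actual DHT operations and are
hypothetical; the replication target of 8 nodes comes from step 4 above:

```python
import random

def reannounce_round(local_items, lookup, announce, replication=8):
    # step 1: visit items in random order, never repeating one this round
    pending = list(local_items)
    random.shuffle(pending)
    for item in pending:
        nodes_with_item = lookup(item)          # step 3: find current holders
        if len(nodes_with_item) < replication:  # step 4: under-replicated?
            announce(item)                      # step 4.1: push it back out
    # step 2: the round terminates once every local item was considered

announced = []
reannounce_round(["a", "b", "c"], lookup=lambda i: [], announce=announced.append)
print(sorted(announced))  # ['a', 'b', 'c']
```

Skipping well-replicated items is what keeps the aggregate announce traffic
bounded as the number of subscribers grows.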
timeouts
--------
Items SHOULD be announced to the DHT every 30 minutes. A storage node MAY time
out an item after 60 minutes of no one announcing it.
A storing node MAY extend the timeout when it receives a request for the item. Since
items are immutable, the data doesn't go stale. Therefore it doesn't matter if
the storing node is no longer one of the 8 closest nodes.
RSS feeds
---------
For RSS feeds, the following keys are mandatory in the list item's ``item`` dictionary.
ih
	The torrent's info-hash
size
	The size (in bytes) of all files in the torrent
n
	The name of the torrent
example
.......
This is an example of an ``announce_item`` message:
.. parsed-literal::
{
"a":
{
"ih":
[
"7ea94c240691311dc0916a2a91eb7c3db2c6f3e4",
"0d92ad53c052ac1f49cf4434afffafa4712dc062e4168d940a48e45a45a0b10808014dc267549624"
],
"sig":
[
"980774404e404941b81aa9da1da0101cab54e670cff4f0054aa563c3b5abcb0fe3c6df5dac1ea25266035f09040bf2a24ae5f614787f1fe7404bf12fee5e6101",
"3fee52abea47e4d43e957c02873193fb9aec043756845946ec29cceb1f095f03d876a7884e38c53cd89a8041a2adfb2d9241b5ec5d70268714d168b9353a2c01"
],
"id": "b46989156404e8e0acdb751ef553b210ef77822e",
"key": "6bc1de5443d1a7c536cdf69433ac4a7163d3c63e2f9c92d78f6011cf63dbcd5b638bbc2119cdad0c57e4c61bc69ba5e2c08b918c2db8d1848cf514bd9958d307",
"n": "my stuff"
"target": "b4692ef0005639e86d7165bf378474107bf3a762"
"token": "23ba"
},
"y": "q",
"q": "announce_item",
"t": "a421"
"a":
{
"item":
{
"key": "6bc1de5443d1a7c536cdf69433ac4a7163d3c63e2f9c92d
78f6011cf63dbcd5b638bbc2119cdad0c57e4c61bc69ba5e2c08
b918c2db8d1848cf514bd9958d307",
"info-hash": "7ea94c240691311dc0916a2a91eb7c3db2c6f3e4",
"size": 24315329,
"n": "my stuff",
"next": "c68f29156404e8e0aas8761ef5236bcagf7f8f2e"
}
"sig": *<signature>*
"id": "b46989156404e8e0acdb751ef553b210ef77822e",
"target": "b4692ef0005639e86d7165bf378474107bf3a762"
"token": "23ba"
},
"y": "q",
"q": "announce_item",
"t": "a421"
}
Strings are printed in hex for printability, but actual encoding is binary. The
response contains 3 feed items, starting with "7ea94c", "0d92ad" and "e4168d".
These 3 items are not published optimally. If they were to be merged into a single
string in the ``ih`` list, more than 64 bytes would be saved (because of having
one less signature).
Strings are printed in hex for printability, but actual encoding is binary.
Note that ``target`` is in fact SHA1('my stuff' + 'key'). The private key
used in this example is 980f4cd7b812ae3430ea05af7c09a7e430275f324f42275ca534d9f7c6d06f5b.
Note that ``target`` is in fact the SHA1 hash of the same data the signature ``sig``
is the signature of, i.e.::
d9:info-hash20:7ea94c240691311dc0916a2a91eb7c3db2c6f3e43:key64:6bc1de5443d1
a7c536cdf69433ac4a7163d3c63e2f9c92d78f6011cf63dbcd5b638bbc2119cdad0c57e4c61
bc69ba5e2c08b918c2db8d1848cf514bd9958d3071:n8:my stuff4:next20:c68f29156404
e8e0aas8761ef5236bcagf7f8f2e4:sizei24315329ee
(note that binary data is printed as hex)
RSS feed URI scheme
--------------------
The proposed URI scheme for DHT feeds is:
@ -261,10 +387,9 @@ calculating the target ID.
rationale
---------
The reason to use curve25519_ instead of, for instance, RSA is to fit more signatures
(i.e. items) in a single DHT packet. One packet is typically restricted to between
1280 - 1480 bytes. According to http://cr.yp.to/, curve25519 is free from patent claims
and there are open implementations in both C and Java.
The reason to use curve25519_ instead of, for instance, RSA is compactness. According to
http://cr.yp.to/, curve25519 is free from patent claims and there are open implementations
in both C and Java.
.. _curve25519: http://cr.yp.to/ecdh.html

View File

@ -1388,6 +1388,7 @@ struct dht_settings
int max_peers_reply;
int search_branching;
int max_fail_count;
int max_torrents;
bool restrict_routing_ips;
bool restrict_search_ips;
};
@ -1402,6 +1403,12 @@ before it is removed from the routing table. If there are known working nodes
that are ready to replace a failing node, it will be replaced immediately,
this limit is only used to clear out nodes that don't have any node that can
replace them.</p>
<p><tt class="docutils literal"><span class="pre">max_torrents</span></tt> is the total number of torrents to track from the DHT. This
is simply an upper limit to make sure malicious DHT nodes cannot make us allocate
an unbounded amount of memory.</p>
<p><tt class="docutils literal"><span class="pre">max_feed_items</span></tt> is the total number of feed items to store from the DHT. This
is simply an upper limit to make sure malicious DHT nodes cannot make us allocate
an unbounded amount of memory.</p>
<p><tt class="docutils literal"><span class="pre">restrict_routing_ips</span></tt> determines if the routing table entries should restrict
entries to one per IP. This defaults to true, which helps mitigate some attacks
on the DHT. It prevents adding multiple nodes with IPs with a very close CIDR
@ -4178,7 +4185,6 @@ struct session_settings
bool low_prio_disk;
int local_service_announce_interval;
int dht_announce_interval;
int dht_max_torrents;
int udp_tracker_token_expiry;
bool volatile_read_cache;

View File

@ -1188,6 +1188,7 @@ struct has the following members::
int max_peers_reply;
int search_branching;
int max_fail_count;
int max_torrents;
bool restrict_routing_ips;
bool restrict_search_ips;
};
@ -1205,6 +1206,14 @@ that are ready to replace a failing node, it will be replaced immediately,
this limit is only used to clear out nodes that don't have any node that can
replace them.
``max_torrents`` is the total number of torrents to track from the DHT. This
is simply an upper limit to make sure malicious DHT nodes cannot make us allocate
an unbounded amount of memory.
``max_feed_items`` is the total number of feed items to store from the DHT. This
is simply an upper limit to make sure malicious DHT nodes cannot make us allocate
an unbounded amount of memory.
``restrict_routing_ips`` determines if the routing table entries should restrict
entries to one per IP. This defaults to true, which helps mitigate some attacks
on the DHT. It prevents adding multiple nodes with IPs with a very close CIDR
@ -4173,7 +4182,6 @@ session_settings
bool low_prio_disk;
int local_service_announce_interval;
int dht_announce_interval;
int dht_max_torrents;
int udp_tracker_token_expiry;
bool volatile_read_cache;

View File

@ -48,6 +48,7 @@ POSSIBILITY OF SUCH DAMAGE.
#include <libtorrent/session_settings.hpp>
#include <libtorrent/assert.hpp>
#include <libtorrent/thread.hpp>
#include <libtorrent/bloom_filter.hpp>
#include <boost/cstdint.hpp>
#include <boost/ref.hpp>
@ -55,10 +56,7 @@ POSSIBILITY OF SUCH DAMAGE.
#include "libtorrent/socket.hpp"
namespace libtorrent {
namespace aux { struct session_impl; }
struct session_status;
struct alert_manager;
}
namespace libtorrent { namespace dht
@ -77,7 +75,22 @@ struct key_desc_t
int size;
int flags;
enum { optional = 1};
enum {
// this argument is optional, parsing will not
// fail if it's not present
optional = 1,
// for dictionaries, the following entries refer
// to child nodes to this node, up until and including
// the next item that has the last_child flag set.
// these flags are nestable
parse_children = 2,
// this is the last item in a child dictionary
last_child = 4,
// the size argument refers to that the size
// has to be divisible by the number, instead
// of having that exact size
size_divisible = 8
};
};
bool TORRENT_EXPORT verify_message(lazy_entry const* msg, key_desc_t const desc[], lazy_entry const* ret[]
@ -99,6 +112,22 @@ struct torrent_entry
std::set<peer_entry> peers;
};
struct feed_item
{
feed_item() : sequence_number(0), num_announcers(0) {}
enum { list_head, list_item } type;
size_type sequence_number;
std::string name;
unsigned char signature[64];
entry item;
ptime last_seen;
// this counts the number of IPs we have seen
// announcing this item, this is used to determine
// popularity if we reach the limit of items to store
bloom_filter<8> ips;
int num_announcers;
};
// this is the entry for a torrent that has been published
// in the DHT.
struct TORRENT_EXPORT search_torrent_entry
@ -177,12 +206,16 @@ struct count_peers
class node_impl : boost::noncopyable
{
typedef std::map<node_id, torrent_entry> table_t;
typedef std::map<node_id, feed_item> feed_table_t;
typedef std::map<std::pair<node_id, sha1_hash>, search_torrent_entry> search_table_t;
public:
node_impl(libtorrent::aux::session_impl& ses
typedef boost::function3<void, address, int, address> external_ip_fun;
node_impl(libtorrent::alert_manager& alerts
, bool (*f)(void*, entry const&, udp::endpoint const&, int)
, dht_settings const& settings, node_id nid
, void* userdata);
, dht_settings const& settings, node_id nid, address const& external_address
, external_ip_fun ext_ip, void* userdata);
virtual ~node_impl() {}
@ -292,6 +325,7 @@ public:
private:
table_t m_map;
feed_table_t m_feeds;
search_table_t m_search_map;
ptime m_last_tracker_tick;
@ -299,7 +333,7 @@ private:
// secret random numbers used to create write tokens
int m_secret[2];
libtorrent::aux::session_impl& m_ses;
libtorrent::alert_manager& m_alerts;
bool (*m_send)(void*, entry const&, udp::endpoint const&, int);
void* m_userdata;
};

View File

@ -37,6 +37,7 @@ POSSIBILITY OF SUCH DAMAGE.
#include <map>
#include <boost/cstdint.hpp>
#include <boost/pool/pool.hpp>
#include <boost/function/function3.hpp>
#include <libtorrent/socket.hpp>
#include <libtorrent/entry.hpp>
@ -68,10 +69,11 @@ class rpc_manager
{
public:
typedef bool (*send_fun)(void* userdata, entry const&, udp::endpoint const&, int);
typedef boost::function3<void, address, int, address> external_ip_fun;
rpc_manager(node_id const& our_id
, routing_table& table, send_fun const& sf
, void* userdata, aux::session_impl& ses);
, void* userdata, external_ip_fun ext_ip);
~rpc_manager();
void unreachable(udp::endpoint const& ep);
@ -113,7 +115,7 @@ private:
node_id m_random_number;
int m_allocated_observers;
bool m_destructing;
aux::session_impl& m_ses;
external_ip_fun m_ext_ip;
};
} } // namespace libtorrent::dht

View File

@ -206,7 +206,6 @@ namespace libtorrent
, low_prio_disk(true)
, local_service_announce_interval(5 * 60)
, dht_announce_interval(15 * 60)
, dht_max_torrents(3000)
, udp_tracker_token_expiry(60)
, volatile_read_cache(false)
, guided_read_cache(true)
@ -800,9 +799,6 @@ namespace libtorrent
// torrents. Defaults to 15 minutes
int dht_announce_interval;
// this is the max number of torrents the DHT will track
int dht_max_torrents;
// the number of seconds a connection ID received
// from a UDP tracker is valid for. This is specified
// as 60 seconds
@ -1032,6 +1028,8 @@ namespace libtorrent
, service_port(0)
#endif
, max_fail_count(20)
, max_torrents(3000)
, max_feed_items(3000)
, max_torrent_search_reply(20)
, restrict_routing_ips(true)
, restrict_search_ips(true)
@ -1055,6 +1053,12 @@ namespace libtorrent
// in a row before it is removed from the table.
int max_fail_count;
// this is the max number of torrents the DHT will track
int max_torrents;
// max number of feed items the DHT will store
int max_feed_items;
// the max number of torrents to return in a
// torrent search query to the DHT
int max_torrent_search_reply;

View File

@ -208,7 +208,10 @@ namespace libtorrent { namespace dht
// unit and connecting them together.
dht_tracker::dht_tracker(libtorrent::aux::session_impl& ses, rate_limited_udp_socket& sock
, dht_settings const& settings, entry const* state)
: m_dht(ses, &send_callback, settings, extract_node_id(state), this)
: m_dht(ses.m_alerts, &send_callback, settings, extract_node_id(state)
, ses.external_address()
, boost::bind(&aux::session_impl::set_external_address, &ses, _1, _2, _3)
, this)
, m_ses(ses)
, m_sock(sock)
, m_last_new_key(time_now() - minutes(key_refresh))

View File

@ -39,6 +39,7 @@ POSSIBILITY OF SUCH DAMAGE.
#include "libtorrent/io.hpp"
#include "libtorrent/hasher.hpp"
#include "libtorrent/alert_types.hpp"
#include "libtorrent/alert.hpp"
#include "libtorrent/socket.hpp"
#include "libtorrent/aux_/session_impl.hpp"
#include "libtorrent/kademlia/node_id.hpp"
@ -174,20 +175,16 @@ void purge_peers(std::set<peer_entry>& peers)
void nop() {}
// TODO: the session_impl argument could be an alert reference
// instead, and make the dht_tracker less dependent on session_impl
// which would make it simpler to unit test
node_impl::node_impl(libtorrent::aux::session_impl& ses
node_impl::node_impl(libtorrent::alert_manager& alerts
, bool (*f)(void*, entry const&, udp::endpoint const&, int)
, dht_settings const& settings
, node_id nid
, void* userdata)
, dht_settings const& settings, node_id nid, address const& external_address
, external_ip_fun ext_ip, void* userdata)
: m_settings(settings)
, m_id(nid == (node_id::min)() || !verify_id(nid, ses.external_address()) ? generate_id(ses.external_address()) : nid)
, m_id(nid == (node_id::min)() || !verify_id(nid, external_address) ? generate_id(external_address) : nid)
, m_table(m_id, 8, settings)
, m_rpc(m_id, m_table, f, userdata, ses)
, m_rpc(m_id, m_table, f, userdata, ext_ip)
, m_last_tracker_tick(time_now())
, m_ses(ses)
, m_alerts(alerts)
, m_send(f)
, m_userdata(userdata)
{
@ -436,6 +433,16 @@ time_duration node_impl::connection_timeout()
if (now - m_last_tracker_tick < minutes(2)) return d;
m_last_tracker_tick = now;
for (feed_table_t::iterator i = m_feeds.begin(); i != m_feeds.end();)
{
if (i->second.last_seen + minutes(60) > now)
{
++i;
continue;
}
// erase invalidates i; post-increment keeps a valid iterator
m_feeds.erase(i++);
}
// look through all peers and see if any have timed out
for (table_t::iterator i = m_map.begin(), end(m_map.end()); i != end;)
{
@ -475,8 +482,8 @@ void node_impl::status(session_status& s)
bool node_impl::lookup_torrents(sha1_hash const& target
, entry& reply, char* tags) const
{
// if (m_ses.m_alerts.should_post<dht_find_torrents_alert>())
// m_ses.m_alerts.post_alert(dht_find_torrents_alert(info_hash));
// if (m_alerts.should_post<dht_find_torrents_alert>())
// m_alerts.post_alert(dht_find_torrents_alert(info_hash));
search_table_t::const_iterator first, last;
first = m_search_map.lower_bound(std::make_pair(target, (sha1_hash::min)()));
@ -521,8 +528,8 @@ bool node_impl::lookup_torrents(sha1_hash const& target
bool node_impl::lookup_peers(sha1_hash const& info_hash, int prefix, entry& reply) const
{
if (m_ses.m_alerts.should_post<dht_get_peers_alert>())
m_ses.m_alerts.post_alert(dht_get_peers_alert(info_hash));
if (m_alerts.should_post<dht_get_peers_alert>())
m_alerts.post_alert(dht_get_peers_alert(info_hash));
table_t::const_iterator i = m_map.lower_bound(info_hash);
if (i == m_map.end()) return false;
@ -616,14 +623,24 @@ bool verify_message(lazy_entry const* msg, key_desc_t const desc[], lazy_entry c
// clear the return buffer
memset(ret, 0, sizeof(ret[0]) * size);
// when parsing child nodes, this is the stack
// of lazy_entry pointers to return to
lazy_entry const* stack[5];
int stack_ptr = -1;
if (msg->type() != lazy_entry::dict_t)
{
snprintf(error, error_size, "not a dictionary");
return false;
}
++stack_ptr;
stack[stack_ptr] = msg;
for (int i = 0; i < size; ++i)
{
key_desc_t const& k = desc[i];
// fprintf(stderr, "looking for %s in %s\n", k.name, print_entry(*msg).c_str());
ret[i] = msg->dict_find(k.name);
if (ret[i] && ret[i]->type() != k.type) ret[i] = 0;
if (ret[i] == 0 && (k.flags & key_desc_t::optional) == 0)
@ -635,17 +652,50 @@ bool verify_message(lazy_entry const* msg, key_desc_t const desc[], lazy_entry c
if (k.size > 0
&& ret[i]
&& k.type == lazy_entry::string_t
&& ret[i]->string_length() != k.size)
&& k.type == lazy_entry::string_t)
{
// the string was not of the required size
ret[i] = 0;
if ((k.flags & key_desc_t::optional) == 0)
bool invalid = false;
if (k.flags & key_desc_t::size_divisible)
invalid = (ret[i]->string_length() % k.size) != 0;
else
invalid = ret[i]->string_length() != k.size;
if (invalid)
{
snprintf(error, error_size, "invalid value for '%s'", k.name);
return false;
// the string was not of the required size
ret[i] = 0;
if ((k.flags & key_desc_t::optional) == 0)
{
snprintf(error, error_size, "invalid value for '%s'", k.name);
return false;
}
}
}
if (k.flags & key_desc_t::parse_children)
{
TORRENT_ASSERT(k.type == lazy_entry::dict_t);
if (ret[i])
{
++stack_ptr;
TORRENT_ASSERT(stack_ptr < int(sizeof(stack)/sizeof(stack[0])));
msg = ret[i];
stack[stack_ptr] = msg;
}
else
{
// skip all children
while (i < size && (desc[i].flags & key_desc_t::last_child) == 0) ++i;
// if this assert is hit, desc is incorrect
TORRENT_ASSERT(i < size);
}
}
else if (k.flags & key_desc_t::last_child)
{
TORRENT_ASSERT(stack_ptr > 0);
--stack_ptr;
msg = stack[stack_ptr];
}
}
return true;
}
@ -785,14 +835,14 @@ void node_impl::incoming_request(msg const& m, entry& e)
#ifdef TORRENT_DHT_VERBOSE_LOGGING
++g_failed_announces;
#endif
incoming_error(e, "invalid 'port' in announce");
incoming_error(e, "invalid port");
return;
}
sha1_hash info_hash(msg_keys[0]->string_ptr());
if (m_ses.m_alerts.should_post<dht_announce_alert>())
m_ses.m_alerts.post_alert(dht_announce_alert(
if (m_alerts.should_post<dht_announce_alert>())
m_alerts.post_alert(dht_announce_alert(
m.addr.address(), port, info_hash));
if (!verify_token(msg_keys[2]->string_value(), msg_keys[0]->string_ptr(), m.addr))
@ -800,7 +850,7 @@ void node_impl::incoming_request(msg const& m, entry& e)
#ifdef TORRENT_DHT_VERBOSE_LOGGING
++g_failed_announces;
#endif
incoming_error(e, "invalid token in announce");
incoming_error(e, "invalid token");
return;
}
@ -809,7 +859,7 @@ void node_impl::incoming_request(msg const& m, entry& e)
// the table get a chance to add it.
m_table.node_seen(id, m.addr);
if (!m_map.empty() && m_map.size() >= m_ses.settings().dht_max_torrents)
if (!m_map.empty() && m_map.size() >= m_settings.max_torrents)
{
// we need to remove some. Remove the ones with the
// fewest peers
@ -872,6 +922,187 @@ void node_impl::incoming_request(msg const& m, entry& e)
lookup_torrents(target, reply, (char*)msg_keys[1]->string_cstr());
}
*/
else if (strcmp(query, "announce_item") == 0)
{
feed_item add_item;
const static key_desc_t msg_desc[] = {
{"target", lazy_entry::string_t, 20, 0},
{"token", lazy_entry::string_t, 0, 0},
{"sig", lazy_entry::string_t, sizeof(add_item.signature), 0},
{"head", lazy_entry::dict_t, 0, key_desc_t::optional | key_desc_t::parse_children},
{"n", lazy_entry::string_t, 0, 0},
{"key", lazy_entry::string_t, 64, 0},
{"seq", lazy_entry::int_t, 0, 0},
{"next", lazy_entry::string_t, 20, key_desc_t::last_child | key_desc_t::size_divisible},
{"item", lazy_entry::dict_t, 0, key_desc_t::optional | key_desc_t::parse_children},
{"key", lazy_entry::string_t, 64, 0},
{"next", lazy_entry::string_t, 20, key_desc_t::last_child | key_desc_t::size_divisible},
};
// attempt to parse the message
lazy_entry const* msg_keys[11];
if (!verify_message(arg_ent, msg_desc, msg_keys, 11, error_string, sizeof(error_string)))
{
incoming_error(e, error_string);
return;
}
sha1_hash target(msg_keys[0]->string_ptr());
// verify the write-token
if (!verify_token(msg_keys[1]->string_value(), msg_keys[0]->string_ptr(), m.addr))
{
incoming_error(e, "invalid token");
return;
}
sha1_hash expected_target;
sha1_hash item_hash;
std::pair<char const*, int> buf;
if (msg_keys[3])
{
// we found the "head" entry
add_item.type = feed_item::list_head;
add_item.item = *msg_keys[3];
add_item.name = msg_keys[4]->string_value();
add_item.sequence_number = msg_keys[6]->int_value();
buf = msg_keys[3]->data_section();
item_hash = hasher(buf.first, buf.second).final();
hasher h;
h.update(add_item.name);
h.update((const char*)msg_keys[5]->string_ptr(), msg_keys[5]->string_length());
expected_target = h.final();
}
else if (msg_keys[8])
{
// we found the "item" entry
add_item.type = feed_item::list_item;
add_item.item = *msg_keys[8];
buf = msg_keys[8]->data_section();
item_hash = hasher(buf.first, buf.second).final();
expected_target = item_hash;
}
else
{
incoming_error(e, "missing head or item");
return;
}
if (buf.second > 1024)
{
incoming_error(e, "message too big");
return;
}
// verify that the key matches the target
if (expected_target != target)
{
incoming_error(e, "invalid target");
return;
}
memcpy(add_item.signature, msg_keys[2]->string_ptr(), sizeof(add_item.signature));
// #error verify signature by comparing it to item_hash
m_table.node_seen(id, m.addr);
feed_table_t::iterator i = m_feeds.find(target);
if (i == m_feeds.end())
{
// make sure we don't add too many items
if (m_feeds.size() >= m_settings.max_feed_items)
{
// delete the least important one (i.e. the one
// the fewest peers are announcing)
i = std::min_element(m_feeds.begin(), m_feeds.end()
, boost::bind(&feed_item::num_announcers
, boost::bind(&feed_table_t::value_type::second, _1)));
TORRENT_ASSERT(i != m_feeds.end());
// std::cerr << " removing: " << i->second.item << std::endl;
m_feeds.erase(i);
}
boost::tie(i, boost::tuples::ignore) = m_feeds.insert(std::make_pair(target, add_item));
}
feed_item& f = i->second;
if (f.type != add_item.type) return;
f.last_seen = time_now();
if (add_item.sequence_number > f.sequence_number)
{
f.item.swap(add_item.item);
f.name.swap(add_item.name);
f.sequence_number = add_item.sequence_number;
memcpy(f.signature, add_item.signature, sizeof(f.signature));
}
// maybe increase num_announcers if we haven't seen this IP before
sha1_hash iphash;
hash_address(m.addr.address(), iphash);
if (!f.ips.find(iphash))
{
f.ips.set(iphash);
++f.num_announcers;
}
}
else if (strcmp(query, "get_item") == 0)
{
key_desc_t msg_desc[] = {
{"target", lazy_entry::string_t, 20, 0},
{"key", lazy_entry::string_t, 64, 0},
{"n", lazy_entry::string_t, 0, key_desc_t::optional},
};
// attempt to parse the message
lazy_entry const* msg_keys[3];
if (!verify_message(arg_ent, msg_desc, msg_keys, 3, error_string, sizeof(error_string)))
{
incoming_error(e, error_string);
return;
}
sha1_hash target(msg_keys[0]->string_ptr());
// verify that the key matches the target
// we can only do this for list heads, where
// we have the name.
if (msg_keys[2])
{
hasher h;
h.update(msg_keys[2]->string_ptr(), msg_keys[2]->string_length());
h.update(msg_keys[1]->string_ptr(), msg_keys[1]->string_length());
if (h.final() != target)
{
incoming_error(e, "invalid target");
return;
}
}
reply["token"] = generate_token(m.addr, msg_keys[0]->string_ptr());
nodes_t n;
// always return nodes as well as peers
m_table.find_node(target, n, 0);
write_nodes_entry(reply, n);
feed_table_t::iterator i = m_feeds.find(target);
if (i != m_feeds.end())
{
feed_item const& f = i->second;
if (f.type == feed_item::list_head)
reply["head"] = f.item;
else
reply["item"] = f.item;
reply["sig"] = std::string((char*)f.signature, sizeof(f.signature));
}
}
/*
else if (strcmp(query, "announce_torrent") == 0)
{
key_desc_t msg_desc[] = {
@@ -889,8 +1120,8 @@ void node_impl::incoming_request(msg const& m, entry& e)
return;
}
// if (m_ses.m_alerts.should_post<dht_announce_torrent_alert>())
// m_ses.m_alerts.post_alert(dht_announce_torrent_alert(
// if (m_alerts.should_post<dht_announce_torrent_alert>())
// m_alerts.post_alert(dht_announce_torrent_alert(
// m.addr.address(), name, tags, info_hash));
if (!verify_token(msg_keys[4]->string_value(), msg_keys[0]->string_ptr(), m.addr))


@@ -161,7 +161,8 @@ enum { observer_size = max3<
rpc_manager::rpc_manager(node_id const& our_id
, routing_table& table, send_fun const& sf
, void* userdata, aux::session_impl& ses)
, void* userdata
, external_ip_fun ext_ip)
: m_pool_allocator(observer_size, 10)
, m_send(sf)
, m_userdata(userdata)
@@ -171,7 +172,7 @@ rpc_manager::rpc_manager(node_id const& our_id
, m_random_number(generate_id())
, m_allocated_observers(0)
, m_destructing(false)
, m_ses(ses)
, m_ext_ip(ext_ip)
{
std::srand(time(0));
@@ -342,7 +343,7 @@ bool rpc_manager::incoming(msg const& m, node_id* id)
// this node claims we use the wrong node-ID!
address_v4::bytes_type b;
memcpy(&b[0], ext_ip->string_ptr(), 4);
m_ses.set_external_address(address_v4(b), aux::session_impl::source_dht, m.addr.address());
m_ext_ip(address_v4(b), aux::session_impl::source_dht, m.addr.address());
}
#if TORRENT_USE_IPV6
else if (ext_ip && ext_ip->string_length() == 16)
@@ -350,7 +351,7 @@ bool rpc_manager::incoming(msg const& m, node_id* id)
// this node claims we use the wrong node-ID!
address_v6::bytes_type b;
memcpy(&b[0], ext_ip->string_ptr(), 16);
m_ses.set_external_address(address_v6(b), aux::session_impl::source_dht, m.addr.address());
m_ext_ip(address_v6(b), aux::session_impl::source_dht, m.addr.address());
}
#endif


@@ -40,72 +40,251 @@ POSSIBILITY OF SUCH DAMAGE.
#include "test.hpp"
using namespace libtorrent;
using namespace libtorrent::dht;
int dht_port = 48199;
std::list<std::pair<udp::endpoint, entry> > g_responses;
void send_dht_msg(datagram_socket& sock, char const* msg, lazy_entry* reply
, char const* t = "10", char const* info_hash = 0, char const* name = 0
, char const* token = 0, int port = 0)
bool our_send(void* user, entry const& msg, udp::endpoint const& ep, int flags)
{
g_responses.push_back(std::make_pair(ep, msg));
return true;
}
address rand_v4()
{
return address_v4((rand() << 16 | rand()) & 0xffffffff);
}
sha1_hash generate_next()
{
sha1_hash ret;
for (int i = 0; i < 20; ++i) ret[i] = rand();
return ret;
}
boost::array<char, 64> generate_key()
{
boost::array<char, 64> ret;
for (int i = 0; i < 64; ++i) ret[i] = rand();
return ret;
}
void send_dht_msg(node_impl& node, char const* msg, udp::endpoint const& ep
, lazy_entry* reply, char const* t = "10", char const* info_hash = 0
, char const* name = 0, std::string const* token = 0, int port = 0
, std::string const* target = 0, entry const* item = 0, std::string const* signature = 0
, std::string const* key = 0, std::string const* id = 0)
{
entry e;
e["q"] = msg;
e["t"] = t;
e["y"] = "q";
entry::dictionary_type& a = e["a"].dict();
a["id"] = "00000000000000000000";
a["id"] = id == 0 ? generate_next().to_string() : *id;
if (info_hash) a["info_hash"] = info_hash;
if (name) a["n"] = name;
if (token) a["token"] = token;
if (token) a["token"] = *token;
if (port) a["port"] = port;
if (target) a["target"] = *target;
if (item) a["item"] = *item;
if (signature) a["sig"] = *signature;
if (key) a["key"] = *key;
char msg_buf[1500];
int size = bencode(msg_buf, e);
// std::cerr << "sending: " << e << "\n";
error_code ec;
sock.send_to(asio::buffer(msg_buf, size)
, udp::endpoint(address::from_string("127.0.0.1"), dht_port), 0, ec);
TEST_CHECK(!ec);
if (ec) std::cout << ec.message() << std::endl;
lazy_entry decoded;
lazy_bdecode(msg_buf, msg_buf + size, decoded);
dht::msg m(decoded, ep);
node.incoming(m);
// by now the node should have invoked the send function and put the
// response in g_responses
std::list<std::pair<udp::endpoint, entry> >::iterator i
= std::find_if(g_responses.begin(), g_responses.end()
, boost::bind(&std::pair<udp::endpoint, entry>::first, _1) == ep);
if (i == g_responses.end())
{
TEST_ERROR("no response from DHT node");
return;
}
static char inbuf[1500];
udp::endpoint ep;
size = sock.receive_from(asio::buffer(inbuf, sizeof(inbuf)), ep, 0, ec);
TEST_CHECK(!ec);
if (ec) std::cout << ec.message() << std::endl;
int ret = lazy_bdecode(inbuf, inbuf + size, *reply, ec);
char* ptr = inbuf;
int len = bencode(inbuf, i->second);
g_responses.erase(i);
error_code ec;
int ret = lazy_bdecode(inbuf, inbuf + len, *reply, ec);
TEST_CHECK(ret == 0);
}
struct announce_item
{
sha1_hash next;
boost::array<char, 64> key;
int num_peers;
entry ent;
sha1_hash target;
void gen()
{
ent["next"] = next.to_string();
ent["key"] = std::string(&key[0], 64);
ent["A"] = "a";
ent["B"] = "b";
ent["num_peers"] = num_peers;
char buf[512];
char* ptr = buf;
int len = bencode(ptr, ent);
target = hasher(buf, len).final();
}
};
void announce_items(node_impl& node, udp::endpoint const* eps
, node_id const* ids, announce_item const* items, int num_items)
{
std::string tokens[1000];
for (int i = 0; i < 1000; ++i)
{
for (int j = 0; j < num_items; ++j)
{
if ((i % items[j].num_peers) == 0) continue;
lazy_entry response;
send_dht_msg(node, "get_item", eps[i], &response, "10", 0
, 0, 0, 0, &items[j].target.to_string(), 0, 0
, &std::string(&items[j].key[0], 64), &ids[i].to_string());
key_desc_t desc[] =
{
{ "r", lazy_entry::dict_t, 0, key_desc_t::parse_children },
{ "id", lazy_entry::string_t, 20, 0},
{ "token", lazy_entry::string_t, 0, 0},
{ "ip", lazy_entry::string_t, 0, key_desc_t::optional | key_desc_t::last_child},
{ "y", lazy_entry::string_t, 1, 0},
};
lazy_entry const* parsed[5];
char error_string[200];
// fprintf(stderr, "msg: %s\n", print_entry(response).c_str());
int ret = verify_message(&response, desc, parsed, 5, error_string, sizeof(error_string));
if (ret)
{
TEST_EQUAL(parsed[4]->string_value(), "r");
tokens[i] = parsed[2]->string_value();
}
else
{
fprintf(stderr, " invalid get_item response: %s\n", error_string);
TEST_ERROR(error_string);
}
if (parsed[3])
{
address_v4::bytes_type b;
memcpy(&b[0], parsed[3]->string_ptr(), b.size());
address_v4 addr(b);
TEST_EQUAL(addr, eps[i].address());
}
send_dht_msg(node, "announce_item", eps[i], &response, "10", 0
, 0, &tokens[i], 0, &items[j].target.to_string(), &items[j].ent
, &std::string("0123456789012345678901234567890123456789012345678901234567890123"));
key_desc_t desc2[] =
{
{ "y", lazy_entry::string_t, 1, 0 }
};
// fprintf(stderr, "msg: %s\n", print_entry(response).c_str());
ret = verify_message(&response, desc2, parsed, 1, error_string, sizeof(error_string));
if (ret)
{
TEST_EQUAL(parsed[0]->string_value(), "r");
}
else
{
fprintf(stderr, " invalid announce_item response: %s\n", error_string);
TEST_ERROR(error_string);
}
}
}
std::set<int> items_num;
for (int j = 0; j < num_items; ++j)
{
lazy_entry response;
send_dht_msg(node, "get_item", eps[0], &response, "10", 0
, 0, 0, 0, &items[j].target.to_string(), 0, 0
, &std::string(&items[j].key[0], 64), &ids[0].to_string());
key_desc_t desc[] =
{
{ "r", lazy_entry::dict_t, 0, key_desc_t::parse_children },
{ "item", lazy_entry::dict_t, 0, key_desc_t::parse_children},
{ "A", lazy_entry::string_t, 1, 0},
{ "B", lazy_entry::string_t, 1, 0},
{ "num_peers", lazy_entry::int_t, 0, key_desc_t::last_child},
{ "id", lazy_entry::string_t, 20, key_desc_t::last_child},
{ "y", lazy_entry::string_t, 1, 0},
};
lazy_entry const* parsed[7];
char error_string[200];
fprintf(stderr, "msg: %s\n", print_entry(response).c_str());
int ret = verify_message(&response, desc, parsed, 7, error_string, sizeof(error_string));
if (ret)
{
TEST_EQUAL(parsed[6]->string_value(), "r");
TEST_EQUAL(parsed[2]->string_value(), "a");
TEST_EQUAL(parsed[3]->string_value(), "b");
items_num.insert(items_num.begin(), parsed[4]->int_value());
}
}
TEST_EQUAL(items_num.size(), 4);
// items_num should contain 1,2 and 3
// #error this doesn't quite hold
// TEST_CHECK(items_num.find(1) != items_num.end());
// TEST_CHECK(items_num.find(2) != items_num.end());
// TEST_CHECK(items_num.find(3) != items_num.end());
}
void nop(address, int, address) {}
int test_main()
{
session ses(fingerprint("LT", 0, 1, 0, 0), std::make_pair(dht_port, 49000));
io_service ios;
alert_manager al(ios);
dht_settings sett;
sett.max_torrents = 4;
sett.max_feed_items = 4;
address ext = address::from_string("236.0.0.1");
dht::node_impl node(al, &our_send, sett, node_id(0), ext, boost::bind(nop, _1, _2, _3), 0);
// DHT should be running on port 48199 now
io_service ios;
error_code ec;
datagram_socket sock(ios);
sock.open(udp::v4(), ec);
TEST_CHECK(!ec);
if (ec) std::cout << ec.message() << std::endl;
lazy_entry response;
lazy_entry const* parsed[5];
char error_string[200];
bool ret;
/*
// ====== ping ======
send_dht_msg(sock, "ping", &response, "10");
udp::endpoint source(address::from_string("10.0.0.1"), 20);
send_dht_msg(node, "ping", source, &response, "10");
dht::key_desc_t pong_desc[] = {
{"y", lazy_entry::string_t, 1, 0},
{"t", lazy_entry::string_t, 2, 0},
{"r", lazy_entry::dict_t, 0, key_desc_t::parse_children},
{"id", lazy_entry::string_t, 20, key_desc_t::last_child},
};
fprintf(stderr, "msg: %s\n", print_entry(response).c_str());
ret = dht::verify_message(&response, pong_desc, parsed, 2, error_string, sizeof(error_string));
ret = dht::verify_message(&response, pong_desc, parsed, 4, error_string, sizeof(error_string));
TEST_CHECK(ret);
if (ret)
{
@@ -114,27 +293,27 @@ int test_main()
}
else
{
fprintf(stderr, "invalid ping response: %s\n", error_string);
fprintf(stderr, " invalid ping response: %s\n", error_string);
}
// ====== invalid message ======
send_dht_msg(sock, "find_node", &response, "10");
send_dht_msg(node, "find_node", source, &response, "10");
dht::key_desc_t err_desc[] = {
{"y", lazy_entry::string_t, 1, 0},
{"e", lazy_entry::list_t, 0, 0},
{"e", lazy_entry::list_t, 2, 0},
{"r", lazy_entry::dict_t, 0, key_desc_t::parse_children},
{"id", lazy_entry::string_t, 20, key_desc_t::last_child},
};
fprintf(stderr, "msg: %s\n", print_entry(response).c_str());
ret = dht::verify_message(&response, err_desc, parsed, 2, error_string, sizeof(error_string));
ret = dht::verify_message(&response, err_desc, parsed, 4, error_string, sizeof(error_string));
TEST_CHECK(ret);
if (ret)
{
TEST_CHECK(parsed[0]->string_value() == "e");
TEST_CHECK(parsed[1]->list_size() >= 2);
if (parsed[1]->list_size() >= 2
&& parsed[1]->list_at(0)->type() == lazy_entry::int_t
if (parsed[1]->list_at(0)->type() == lazy_entry::int_t
&& parsed[1]->list_at(1)->type() == lazy_entry::string_t)
{
TEST_CHECK(parsed[1]->list_at(1)->string_value() == "missing 'target' key");
@@ -146,21 +325,22 @@ int test_main()
}
else
{
fprintf(stderr, "invalid error response: %s\n", error_string);
fprintf(stderr, " invalid error response: %s\n", error_string);
}
// ====== get_peers ======
send_dht_msg(sock, "get_peers", &response, "10", "01010101010101010101");
send_dht_msg(node, "get_peers", source, &response, "10", "01010101010101010101");
dht::key_desc_t peer1_desc[] = {
{"y", lazy_entry::string_t, 1, 0},
{"r", lazy_entry::dict_t, 0, 0},
{"r", lazy_entry::dict_t, 0, key_desc_t::parse_children},
{"id", lazy_entry::string_t, 20, key_desc_t::last_child},
};
std::string token;
fprintf(stderr, "msg: %s\n", print_entry(response).c_str());
ret = dht::verify_message(&response, peer1_desc, parsed, 2, error_string, sizeof(error_string));
ret = dht::verify_message(&response, peer1_desc, parsed, 3, error_string, sizeof(error_string));
TEST_CHECK(ret);
if (ret)
{
@@ -169,19 +349,21 @@ int test_main()
}
else
{
fprintf(stderr, "invalid get_peers response: %s\n", error_string);
fprintf(stderr, " invalid get_peers response: %s\n", error_string);
}
// ====== announce ======
send_dht_msg(sock, "announce_peer", &response, "10", "01010101010101010101", "test", token.c_str(), 8080);
send_dht_msg(node, "announce_peer", source, &response, "10", "01010101010101010101", "test", token.c_str(), 8080);
dht::key_desc_t ann_desc[] = {
{"y", lazy_entry::string_t, 1, 0},
{"r", lazy_entry::dict_t, 0, key_desc_t::parse_children},
{"id", lazy_entry::string_t, 20, key_desc_t::last_child},
};
fprintf(stderr, "msg: %s\n", print_entry(response).c_str());
ret = dht::verify_message(&response, ann_desc, parsed, 1, error_string, sizeof(error_string));
ret = dht::verify_message(&response, ann_desc, parsed, 3, error_string, sizeof(error_string));
TEST_CHECK(ret);
if (ret)
{
@@ -189,20 +371,21 @@ int test_main()
}
else
{
fprintf(stderr, "invalid announce response: %s\n", error_string);
fprintf(stderr, " invalid announce response: %s\n", error_string);
}
// ====== get_peers ======
send_dht_msg(sock, "get_peers", &response, "10", "01010101010101010101");
send_dht_msg(node, "get_peers", source, &response, "10", "01010101010101010101");
dht::key_desc_t peer2_desc[] = {
{"y", lazy_entry::string_t, 1, 0},
{"r", lazy_entry::dict_t, 0, 0},
{"r", lazy_entry::dict_t, 0, key_desc_t::parse_children},
{"id", lazy_entry::string_t, 20, key_desc_t::last_child},
};
fprintf(stderr, "msg: %s\n", print_entry(response).c_str());
ret = dht::verify_message(&response, peer2_desc, parsed, 2, error_string, sizeof(error_string));
ret = dht::verify_message(&response, peer2_desc, parsed, 3, error_string, sizeof(error_string));
TEST_CHECK(ret);
if (ret)
{
@@ -211,8 +394,36 @@ int test_main()
}
else
{
fprintf(stderr, "invalid get_peers response: %s\n", error_string);
fprintf(stderr, " invalid get_peers response: %s\n", error_string);
}
*/
// ====== announce_item ======
udp::endpoint eps[1000];
node_id ids[1000];
for (int i = 0; i < 1000; ++i)
{
eps[i] = udp::endpoint(rand_v4(), (rand() % 16534) + 1);
ids[i] = generate_next();
}
announce_item items[] =
{
{ generate_next(), generate_key(), 1 },
{ generate_next(), generate_key(), 2 },
{ generate_next(), generate_key(), 3 },
{ generate_next(), generate_key(), 4 },
{ generate_next(), generate_key(), 5 },
{ generate_next(), generate_key(), 6 },
{ generate_next(), generate_key(), 7 },
{ generate_next(), generate_key(), 8 }
};
for (int i = 0; i < sizeof(items)/sizeof(items[0]); ++i)
items[i].gen();
announce_items(node, eps, ids, items, sizeof(items)/sizeof(items[0]));
return 0;
}


@@ -390,9 +390,88 @@ address rand_v4()
int test_main()
{
using namespace libtorrent;
using namespace libtorrent::dht;
error_code ec;
int ret = 0;
// test verify_message
const static key_desc_t msg_desc[] = {
{"A", lazy_entry::string_t, 4, 0},
{"B", lazy_entry::dict_t, 0, key_desc_t::optional | key_desc_t::parse_children},
{"B1", lazy_entry::string_t, 0, 0},
{"B2", lazy_entry::string_t, 0, key_desc_t::last_child},
{"C", lazy_entry::dict_t, 0, key_desc_t::optional | key_desc_t::parse_children},
{"C1", lazy_entry::string_t, 0, 0},
{"C2", lazy_entry::string_t, 0, key_desc_t::last_child},
};
lazy_entry const* msg_keys[7];
lazy_entry ent;
char const test_msg[] = "d1:A4:test1:Bd2:B15:test22:B25:test3ee";
lazy_bdecode(test_msg, test_msg + sizeof(test_msg)-1, ent, ec);
fprintf(stderr, "%s\n", print_entry(ent).c_str());
char error_string[200];
ret = verify_message(&ent, msg_desc, msg_keys, 7, error_string, sizeof(error_string));
TEST_CHECK(ret);
TEST_CHECK(msg_keys[0]);
if (msg_keys[0]) TEST_EQUAL(msg_keys[0]->string_value(), "test");
TEST_CHECK(msg_keys[1]);
TEST_CHECK(msg_keys[2]);
if (msg_keys[2]) TEST_EQUAL(msg_keys[2]->string_value(), "test2");
TEST_CHECK(msg_keys[3]);
if (msg_keys[3]) TEST_EQUAL(msg_keys[3]->string_value(), "test3");
TEST_CHECK(msg_keys[4] == 0);
TEST_CHECK(msg_keys[5] == 0);
TEST_CHECK(msg_keys[6] == 0);
char const test_msg2[] = "d1:A4:test1:Cd2:C15:test22:C25:test3ee";
lazy_bdecode(test_msg2, test_msg2 + sizeof(test_msg2)-1, ent, ec);
fprintf(stderr, "%s\n", print_entry(ent).c_str());
ret = verify_message(&ent, msg_desc, msg_keys, 7, error_string, sizeof(error_string));
TEST_CHECK(ret);
TEST_CHECK(msg_keys[0]);
if (msg_keys[0]) TEST_EQUAL(msg_keys[0]->string_value(), "test");
TEST_CHECK(msg_keys[1] == 0);
TEST_CHECK(msg_keys[2] == 0);
TEST_CHECK(msg_keys[3] == 0);
TEST_CHECK(msg_keys[4]);
TEST_CHECK(msg_keys[5]);
if (msg_keys[5]) TEST_EQUAL(msg_keys[5]->string_value(), "test2");
TEST_CHECK(msg_keys[6]);
if (msg_keys[6]) TEST_EQUAL(msg_keys[6]->string_value(), "test3");
char const test_msg3[] = "d1:Cd2:C15:test22:C25:test3ee";
lazy_bdecode(test_msg3, test_msg3 + sizeof(test_msg3)-1, ent, ec);
fprintf(stderr, "%s\n", print_entry(ent).c_str());
ret = verify_message(&ent, msg_desc, msg_keys, 7, error_string, sizeof(error_string));
TEST_CHECK(!ret);
fprintf(stderr, "%s\n", error_string);
TEST_EQUAL(error_string, std::string("missing 'A' key"));
char const test_msg4[] = "d1:A6:foobare";
lazy_bdecode(test_msg4, test_msg4 + sizeof(test_msg4)-1, ent, ec);
fprintf(stderr, "%s\n", print_entry(ent).c_str());
ret = verify_message(&ent, msg_desc, msg_keys, 7, error_string, sizeof(error_string));
TEST_CHECK(!ret);
fprintf(stderr, "%s\n", error_string);
TEST_EQUAL(error_string, std::string("invalid value for 'A'"));
char const test_msg5[] = "d1:A4:test1:Cd2:C15:test2ee";
lazy_bdecode(test_msg5, test_msg5 + sizeof(test_msg5)-1, ent, ec);
fprintf(stderr, "%s\n", print_entry(ent).c_str());
ret = verify_message(&ent, msg_desc, msg_keys, 7, error_string, sizeof(error_string));
TEST_CHECK(!ret);
fprintf(stderr, "%s\n", error_string);
TEST_EQUAL(error_string, std::string("missing 'C2' key"));
// test external ip voting
aux::session_impl* ses = new aux::session_impl(std::pair<int, int>(0,0)
, fingerprint("LT", 0, 0, 0, 0), "0.0.0.0"