<p>This is a proposal for an extension to the BitTorrent DHT to allow
for decentralized, RSS-feed-like functionality.</p>
<p>The intention is to allow the creation of repositories of torrents
where only a single identity has the authority to add new content. Such a
repository should be robust against network failures and resilient
to attacks at the source.</p>
<p>The target ID under which the repository is stored in the DHT is the
SHA-1 hash of a feed name and the 512 bit public key. The private key
in this pair MUST be used to sign every item stored in the repository.
Every message that contains signed items MUST also include the public key, to
allow the receiver to verify the key itself against the target ID as well
as the validity of the signatures of the items. Every recipient of a
message with feed items in it MUST verify both the validity of the public
key against the target ID it is stored under and the validity of
the signature of each individual item.</p>
<p>Any peer who is subscribing to a DHT feed SHOULD also participate in
regularly re-announcing items that it knows about. Every participant
SHOULD store items in long term storage, across sessions, in order to
keep items alive for as long as possible, with as few sources as possible.</p>
<p>As with normal DHT announces, the write-token mechanism is used to
prevent spoof attacks.</p>
<p>There are two new proposed messages, <tt class="docutils literal"><span class="pre">announce_item</span></tt> and <tt class="docutils literal"><span class="pre">get_item</span></tt>.
Every valid item that is announced should be stored. In a request to get items,
as many items as can fit in a normal UDP packet size should be returned. If
there are more items than can fit, a random sub-set should be returned.</p>
<p><em>Is there a better heuristic here? Should there be a bias towards newer items?
If so, there needs to be a signed timestamp as well, which might get messy</em></p>
<div class="section" id="target-id">
<h1>target ID</h1>
<p>The target, i.e. the ID in the DHT key space feeds are announced to, MUST always
be SHA-1(<em>feed_name</em> + <em>public_key</em>). Any request where this condition is not met,
MUST be dropped.</p>
<p>Using the feed name as part of the target means a feed publisher only needs one
public-private keypair for any number of feeds, as long as the feeds have different
names.</p>
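<p>The derivation above can be sketched in a few lines of Python. This is a minimal
sketch, assuming that "+" means plain byte concatenation and that the feed name is
hashed in its UTF-8 encoding; the function names and the all-zero placeholder key are
hypothetical and only serve the illustration.</p>

```python
import hashlib


def feed_target(feed_name: str, public_key: bytes) -> bytes:
    """Return SHA-1(feed_name + public_key) as a 20-byte target ID.

    Assumption: the feed name is UTF-8 encoded and simply concatenated
    with the raw public key bytes before hashing.
    """
    return hashlib.sha1(feed_name.encode("utf-8") + public_key).digest()


def is_valid_target(target: bytes, feed_name: str, public_key: bytes) -> bool:
    """Check the condition a receiving node must enforce before accepting a request."""
    return target == feed_target(feed_name, public_key)


# One key pair can serve several feeds, as long as the names differ.
placeholder_key = bytes(64)  # hypothetical stand-in, not a real public key
news_target = feed_target("news", placeholder_key)
releases_target = feed_target("releases", placeholder_key)
```

<p>A node that receives a request would call <tt class="docutils literal"><span class="pre">is_valid_target</span></tt> with the
values from the message and drop the request when it returns false.</p>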
</div>
<div class="section" id="messages">
<h1>messages</h1>
<p>These are the proposed new message formats.</p>
<pre class="literal-block">
{
   "a":
   {
      "id": <em>&lt;20 byte id of origin node&gt;</em>,
      "key": <em>&lt;64 byte public curve25519 key for this feed&gt;</em>,
      "n": <em>&lt;feed-name&gt;</em>,
      "target": <em>&lt;target-id as derived from public key&gt;</em>
   },
   "q": "get_item",
   "t": <em>&lt;transaction-id&gt;</em>,
   "y": "q",
}
</pre>
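<p>To make the wire format concrete, the following sketch bencodes a
<tt class="docutils literal"><span class="pre">get_item</span></tt> query. It assumes the arguments live in an <tt class="docutils literal"><span class="pre">a</span></tt> dictionary, as in
standard DHT queries, and it omits the optional bloom filter. The helper names
<tt class="docutils literal"><span class="pre">bencode</span></tt> and <tt class="docutils literal"><span class="pre">build_get_item</span></tt> are hypothetical.</p>

```python
def bencode(value) -> bytes:
    """Minimal bencoder for ints, byte strings, text strings, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Bencoded dictionaries must have their keys sorted as raw byte strings.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v) for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def build_get_item(node_id: bytes, public_key: bytes, feed_name: str,
                   target: bytes, transaction_id: bytes) -> bytes:
    """Build a get_item query without the optional bloom filter."""
    return bencode({
        "a": {"id": node_id, "key": public_key, "n": feed_name, "target": target},
        "q": "get_item",
        "t": transaction_id,
        "y": "q",
    })


# Placeholder values only; a real query uses the node's ID and the feed's key.
query = build_get_item(b"A" * 20, b"K" * 64, "news", b"T" * 20, b"aa")
```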
<p>The <tt class="docutils literal"><span class="pre">target</span></tt> MUST always be SHA-1(<em>feed_name</em> + <em>public_key</em>). Any request where
this condition is not met MUST be dropped.</p>
<p>The <tt class="docutils literal"><span class="pre">n</span></tt> field is the name of this feed. It MUST be a UTF-8 encoded string and it
MUST match the name of the feed in the receiving node.</p>
<p>The bloom filter argument (<tt class="docutils literal"><span class="pre">filter</span></tt>) in <tt class="docutils literal"><span class="pre">get_item</span></tt> requests is optional.
If included in a request, it represents info-hashes that should be excluded from
the response. In this case, the response should be a random subset of the non-excluded
items, or all of the non-excluded items if they all fit within a packet size.</p>
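<p>One way a responding node might pick the items to return is to shuffle the
non-excluded candidates and fill the packet greedily. This is only a sketch: the
byte budget, the <tt class="docutils literal"><span class="pre">is_excluded</span></tt> callback (standing in for the bloom filter test) and
the choice to count only the item length, ignoring signature and encoding overhead,
are all assumptions made for illustration.</p>

```python
import random


def select_items(items, max_bytes, is_excluded=lambda item: False, rng=random):
    """Return a random subset of the non-excluded items that fits in max_bytes.

    If every non-excluded item fits, all of them are returned.
    """
    candidates = [item for item in items if not is_excluded(item)]
    rng.shuffle(candidates)  # random order gives a random subset when truncated

    chosen, used = [], 0
    for item in candidates:
        if used + len(item) > max_bytes:
            continue  # this item does not fit; a smaller one still might
        chosen.append(item)
        used += len(item)
    return chosen


# Ten placeholder 20-byte items; with a 60-byte budget only three can be returned.
stored = [bytes([i]) * 20 for i in range(10)]
subset = select_items(stored, 60)
```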
<p>If the bloom filter is specified, its size MUST be a whole multiple of 8 bits. The size
is implied by the length of the string. For each info-hash to exclude from the response,
the requester sets bits in the filter as described below.</p>
<p>There are no hash functions for the bloom filter. Since the info-hash is already a
hash digest, each pair of bytes, starting with the first bytes (MSB), is used as the
result of an imaginary hash function for the bloom filter. k is 3 in this bloom
filter. This means the first 6 bytes of the info-hash are used to set 3 bits in the bloom
filter. Each pair of bytes pulled out from the info-hash is interpreted as a big-endian
16 bit value.</p>
<p>Bits are indexed in bytes from left to right, and within bytes from LSB to MSB, i.e., to
set bit 12: <tt class="docutils literal"><span class="pre">bitfield[12/8]</span> <span class="pre">|=</span> <span class="pre">(1</span> <span class="pre">&lt;&lt;</span> <span class="pre">(12</span> <span class="pre">%</span> <span class="pre">8))</span></tt>.</p>
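<p>The bit-setting rules can be sketched as follows. The sketch follows the prose
description: three big-endian 16 bit values taken from the first 6 bytes of the
info-hash, each reduced modulo the filter size in bits, with a single bit set through
a shifted mask. The function names are hypothetical, and the sample info-hash is a
zero-padded placeholder.</p>

```python
def bloom_bit_indices(info_hash: bytes, filter_bits: int) -> list:
    """Return the k = 3 bit positions for an info-hash.

    Each of the first three byte pairs is read as a big-endian 16 bit
    value and reduced modulo the filter size in bits.
    """
    return [int.from_bytes(info_hash[i:i + 2], "big") % filter_bits for i in (0, 2, 4)]


def bloom_add(bitfield: bytearray, info_hash: bytes) -> None:
    """Mark an info-hash as excluded by setting its bits in the filter."""
    for bit in bloom_bit_indices(info_hash, len(bitfield) * 8):
        # Bytes are indexed left to right, bits within a byte from LSB to MSB.
        bitfield[bit // 8] |= 1 << (bit % 8)


def bloom_contains(bitfield: bytes, info_hash: bytes) -> bool:
    """Return True if all bits for the info-hash are set (false positives possible)."""
    return all(
        bitfield[bit // 8] & (1 << (bit % 8))
        for bit in bloom_bit_indices(info_hash, len(bitfield) * 8)
    )


# An 80 bit filter is a 10 byte string; its length implies the filter size.
bloom = bytearray(10)
sample_hash = bytes.fromhex("4f7d25a0" + "00" * 16)  # placeholder 20-byte info-hash
bloom_add(bloom, sample_hash)
```

<p>A responding node would run <tt class="docutils literal"><span class="pre">bloom_contains</span></tt> on each stored info-hash and leave
out those for which it returns true.</p>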
<dl class="docutils">
<dt>Example:</dt>
<dd>To indicate that you are not interested in the info-hash that
starts with 0x4f7d25a..., using a bloom filter of size 80 bits, set bits
(0x4f % 80), (0x7d % 80) and (0x25 % 80) in the bloom filter bitmask.</dd>
</dl>
<p>Since the data that is being signed with the private key is already a hash (i.e.
an info-hash), the signature of each hash-entry is simply the hash encrypted
with the feed's private key.</p>
<p>The <tt class="docutils literal"><span class="pre">ih</span></tt> and <tt class="docutils literal"><span class="pre">sig</span></tt> lists MUST have an equal number of items. Each item in <tt class="docutils literal"><span class="pre">sig</span></tt>
is the signature of the full string in the corresponding item in the <tt class="docutils literal"><span class="pre">ih</span></tt> list.</p>
<p>Each item in the <tt class="docutils literal"><span class="pre">ih</span></tt> list may contain any positive number of 20 byte info-hashes.</p>
<p>The rationale behind using lists of strings where the strings contain multiple
info-hashes is to allow the publisher of a feed to sign multiple info-hashes
together, thus saving space in the UDP packets and allowing nodes to transfer more
info-hashes per packet. Original publishers of a feed MAY re-announce items lumped
together over time to make the feed more efficient.</p>
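<p>The pairing of <tt class="docutils literal"><span class="pre">ih</span></tt> and <tt class="docutils literal"><span class="pre">sig</span></tt> entries can be sketched as below. The sketch only
checks the structural rules (equal list lengths, each <tt class="docutils literal"><span class="pre">ih</span></tt> string a positive
multiple of 20 bytes) and splits the strings into info-hashes; the signature check
itself is left out, and the function names are hypothetical.</p>

```python
INFO_HASH_LEN = 20  # bytes per info-hash


def split_info_hashes(ih_item: bytes) -> list:
    """Split one ih string into its 20 byte info-hashes."""
    if len(ih_item) == 0 or len(ih_item) % INFO_HASH_LEN != 0:
        raise ValueError("ih item must hold a positive number of 20 byte info-hashes")
    return [ih_item[i:i + INFO_HASH_LEN] for i in range(0, len(ih_item), INFO_HASH_LEN)]


def pair_items(ih_list: list, sig_list: list) -> list:
    """Pair each ih string with its signature.

    Each signature covers the full ih string, so the split info-hashes
    are returned together with the signature that covers all of them.
    """
    if len(ih_list) != len(sig_list):
        raise ValueError("ih and sig lists must have an equal number of items")
    return [(split_info_hashes(ih), sig) for ih, sig in zip(ih_list, sig_list)]


# Placeholder data: one item with two info-hashes signed together, one with a single hash.
h1, h2, h3 = b"a" * 20, b"b" * 20, b"c" * 20
pairs = pair_items([h1 + h2, h3], [b"sig-1", b"sig-2"])
```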
<p>A client receiving a <tt class="docutils literal"><span class="pre">get_item</span></tt> response MUST verify each signature in the <tt class="docutils literal"><span class="pre">sig</span></tt>
list against each corresponding item in the <tt class="docutils literal"><span class="pre">ih</span></tt> list using the feed's public key.
Any item whose signature does not verify MUST be ignored.</p>
<p><tt class="docutils literal"><span class="pre">nodes</span></tt> and <tt class="docutils literal"><span class="pre">nodes6</span></tt> are optional and have the same semantics as in the standard
<tt class="docutils literal"><span class="pre">get_peers</span></tt> request. The intention is to be able to use this <tt class="docutils literal"><span class="pre">get_item</span></tt> request
in the same way, searching for the nodes responsible for the feed.</p>
<p>Note that, unlike regular magnet links to torrents, which use <strong>btih</strong>,
feed links use <strong>btfd</strong>.</p>
<p>The <em>feed name</em> is mandatory since it is used in the request and when
calculating the target ID.</p>
</div>
<div class="section" id="rationale">
<h1>rationale</h1>
<p>The reason to use <a class="reference external" href="http://cr.yp.to/ecdh.html">curve25519</a> instead of, for instance, RSA is to fit more signatures
(i.e. items) in a single DHT packet. One packet is typically restricted to between
1280 and 1480 bytes. According to <a class="reference external" href="http://cr.yp.to/">http://cr.yp.to/</a>, curve25519 is free from patent claims
and there are open implementations in both C and Java.</p>