forked from premiere/premiere-libtorrent
245 lines
6.5 KiB
ReStructuredText
245 lines
6.5 KiB
ReStructuredText
|
============================================
|
||
|
BitTorrent extension for arbitrary DHT store
|
||
|
============================================
|
||
|
|
||
|
:Author: Arvid Norberg, arvid@rasterbar.com
|
||
|
:Version: Draft
|
||
|
|
||
|
.. contents:: Table of contents
|
||
|
:depth: 2
|
||
|
:backlinks: none
|
||
|
|
||
|
This is a proposal for an extension to the BitTorrent DHT to allow
|
||
|
storing and retrieving of arbitrary data.
|
||
|
|
||
|
It supports both storing *immutable* items, where the key is
|
||
|
the SHA-1 hash of the data itself, and *mutable* items, where
|
||
|
the key is the public key of the key pair used to sign the data.
|
||
|
|
||
|
There are two new proposed messages, ``put`` and ``get``.
|
||
|
|
||
|
terminology
|
||
|
-----------
|
||
|
|
||
|
In this document, a *storage node* refers to the node in the DHT to which
|
||
|
an item is being announced and stored on. A *subscribing node* refers to
|
||
|
a node which makes look-ups in the DHT to find the storage nodes, to
|
||
|
request items from them, and possibly re-announce those items to keep them
|
||
|
alive.
|
||
|
|
||
|
messages
|
||
|
--------
|
||
|
|
||
|
The proposed new messages ``get`` and ``put`` are similar to the existing ``get_peers``
|
||
|
and ``announce_peer``.
|
||
|
|
||
|
Responses to ``get`` should always include ``nodes`` and ``nodes6`` has the same
|
||
|
semantics as in its ``get_peers`` response. It should also include a write token,
|
||
|
``token``, with the same semantics as ``get_peers``.
|
||
|
|
||
|
The ``id`` field in these messages has the same semantics as the standard DHT messages,
|
||
|
i.e. the node ID of the node sending the message, to maintain the structure of the DHT
|
||
|
network.
|
||
|
|
||
|
The ``token`` field also has the same semantics as the standard DHT message ``get_peers``
|
||
|
and ``announce_peer``, when requesting an item and to write an item respectively.
|
||
|
|
||
|
The distinction between storing mutable and immutable items is the inclusion
|
||
|
of a public key, a sequence number and signature (``k``, ``seq`` and ``sig``).
|
||
|
The distinction betwewn retrieving a mutable and immutable item is the inclusion of
|
||
|
the public key spill-over (``k``) in the ``get`` request.
|
||
|
|
||
|
The ``v`` key is the *value* to be stored. It is allowed to be any bencoded type (list,
|
||
|
dict, string or integer). When it's being hashed (for verifying its signature or to calculate
|
||
|
its key), its flattened, bencoded, form is used.
|
||
|
|
||
|
Storing nodes are SHOULD reject ``put`` requests where the bencoded form of ``v`` is longer
|
||
|
than 767 bytes.
|
||
|
|
||
|
immutable items
|
||
|
---------------
|
||
|
|
||
|
Immutable items are stored under their SHA-1 hash, and since they cannot be modified,
|
||
|
there is no need to authenticate the origin of them. This makes immutable items simple.
|
||
|
|
||
|
put message
|
||
|
...........
|
||
|
|
||
|
Request:
|
||
|
|
||
|
.. parsed-literal::
|
||
|
|
||
|
{
|
||
|
"a":
|
||
|
{
|
||
|
"id": *<20 byte id of sending node (string)>*,
|
||
|
"v": *<any bencoded type, whose encoded size < 768>*
|
||
|
},
|
||
|
"t": *<transaction-id (string)>*,
|
||
|
"y": "q",
|
||
|
"q": "put"
|
||
|
}
|
||
|
|
||
|
Response:
|
||
|
|
||
|
.. parsed-literal::
|
||
|
|
||
|
{
|
||
|
"r": { "id": *<20 byte id of sending node (string)>* },
|
||
|
"t": *<transaction-id (string)>*,
|
||
|
"y": "r",
|
||
|
}
|
||
|
|
||
|
get message
|
||
|
...........
|
||
|
|
||
|
Request:
|
||
|
|
||
|
.. parsed-literal::
|
||
|
|
||
|
{
|
||
|
"a":
|
||
|
{
|
||
|
"id": *<20 byte id of sending node (string)>*,
|
||
|
"target": *<SHA-1 hash of item (string)>*,
|
||
|
},
|
||
|
"t": *<transaction-id (string)>*,
|
||
|
"y": "q",
|
||
|
"q": "get"
|
||
|
}
|
||
|
|
||
|
Response:
|
||
|
|
||
|
.. parsed-literal::
|
||
|
|
||
|
{
|
||
|
"r":
|
||
|
{
|
||
|
"id": *<20 byte id of sending node (string)>*,
|
||
|
"token": *<write token (string)>*,
|
||
|
"v": *<any bencoded type whose SHA-1 hash matches 'target'>*,
|
||
|
"nodes": *<IPv4 nodes close to 'target'>*
|
||
|
"nodes6": *<IPv6 nodes close to 'target'>*
|
||
|
},
|
||
|
"t": *<transaction-id>*,
|
||
|
"y": "r",
|
||
|
}
|
||
|
|
||
|
|
||
|
mutable items
|
||
|
-------------
|
||
|
|
||
|
Mutable items can be updated, without changing their DHT keys. To authenticate
|
||
|
that only the original publisher can update an item, it is signed by a private key
|
||
|
generated by the original publisher.
|
||
|
|
||
|
In order to avoid a malicious node to overwrite the list head with an old
|
||
|
version, the sequence number ``seq`` must be monotonically increasing for each update,
|
||
|
and a node hosting the list node MUST not downgrade a list head from a higher sequence
|
||
|
number to a lower one, only upgrade.
|
||
|
|
||
|
The signature is a 2048 bit RSA signature of the SHA-1 hash of the bencoded sequence
|
||
|
number and ``v`` key. e.g. something like this:: ``3:seqi4e1:v12:Hello world!``.
|
||
|
|
||
|
put message
|
||
|
...........
|
||
|
|
||
|
Request:
|
||
|
|
||
|
.. parsed-literal::
|
||
|
|
||
|
{
|
||
|
"a":
|
||
|
{
|
||
|
"id": *<20 byte id of sending node (string)>*,
|
||
|
"k": *<RSA-2048 public key (268 bytes string)>*,
|
||
|
"seq": *<monotonically increasing sequence number (integer)>*,
|
||
|
"sig": *<RSA-2048 signature (256 bytes string)>*,
|
||
|
"token": *<write-token (string)>*,
|
||
|
"v": *<any bencoded type, whose encoded size < 768>*
|
||
|
},
|
||
|
"t": *<transaction-id (string)>*,
|
||
|
"y": "q",
|
||
|
"q": "put"
|
||
|
}
|
||
|
|
||
|
Storing nodes receiving a ``put`` request where ``seq`` is lower than what's already
|
||
|
stored on the node, MUST reject the request.
|
||
|
|
||
|
Response:
|
||
|
|
||
|
.. parsed-literal::
|
||
|
|
||
|
{
|
||
|
"r": { "id": *<20 byte id of sending node (string)>* },
|
||
|
"t": *<transaction-id (string)>*,
|
||
|
"y": "r",
|
||
|
}
|
||
|
|
||
|
get message
|
||
|
...........
|
||
|
|
||
|
Request:
|
||
|
|
||
|
.. parsed-literal::
|
||
|
|
||
|
{
|
||
|
"r":
|
||
|
{
|
||
|
"id": *<20 byte id of sending node (string)>*,
|
||
|
"target:" *<first 20 bytes of public key (string)>*,
|
||
|
"k": *<remaining 248 bytes of public key (string)>*
|
||
|
},
|
||
|
"t": *<transaction-id (string)>*,
|
||
|
"y": "r",
|
||
|
"q": "get"
|
||
|
}
|
||
|
|
||
|
Response:
|
||
|
|
||
|
.. parsed-literal::
|
||
|
|
||
|
{
|
||
|
"r":
|
||
|
{
|
||
|
"id": *<20 byte id of sending node (string)>*,
|
||
|
"k": *<RSA-2048 public key (268 bytes string)>*,
|
||
|
"seq": *<monotonically increasing sequence number (integer)>*,
|
||
|
"sig": *<RSA-2048 signature (256 bytes string)>*,
|
||
|
"token": *<write-token (string)>*,
|
||
|
"v": *<any bencoded type, whose encoded size < 768>*
|
||
|
|
||
|
},
|
||
|
"t": *<transaction-id (string)>*,
|
||
|
"y": "r",
|
||
|
}
|
||
|
|
||
|
signature verification
|
||
|
----------------------
|
||
|
|
||
|
In order to make it maximally difficult to attack the bencoding parser, signing and verification of the
|
||
|
value and sequence number should be done as follows:
|
||
|
|
||
|
1. encode value and sequence number separately
|
||
|
2. concatenate "3:seqi" ``seq`` "e1:v" and the encoded value.
|
||
|
sequence number 1 of value "Hello World!" would be converted to: 3:seqi1e1:v12:Hello World!
|
||
|
In this way it is not possible to convince a node that part of the length is actually part of the
|
||
|
sequence number even if the parser contains certain bugs. Furthermore it is not possible to have a
|
||
|
verification failure if a bencoding serializer alters the order of entries in the dictionary.
|
||
|
3. hash the concatenated string with SHA-1
|
||
|
4. sign or verify the hash digest.
|
||
|
|
||
|
expiration
|
||
|
----------
|
||
|
|
||
|
Without re-announcement, these items MAY expire in 2 hours. In order
|
||
|
to keep items alive, they SHOULD be re-announced once an hour.
|
||
|
|
||
|
Subscriber nodes MAY help out in announcing items the are interested in to the DHT,
|
||
|
to keep them alive.
|
||
|
|
||
|
test vectors
|
||
|
------------
|
||
|
|
||
|
|