2006-04-25 23:04:48 +02:00

:Author: Arvid Norberg, arvid@rasterbar.com
         Ludvig Strigeus, ludde@utorrent.com

extension protocol for bittorrent
=================================

The intention of this protocol is to provide a simple and thin transport
for extensions to the bittorrent protocol. Supporting this protocol makes
it easy to add new extensions without interfering with the standard
bittorrent protocol or clients that don't support this extension or the
one you want to add.

To advertise to other clients that you support this protocol, one bit from
the reserved bytes of the bittorrent handshake is used.

Right now, two bits have known usages.

* ``reserved_byte[7] & 0x01`` is used by Mainline for DHT support
* ``reserved_byte[7] & 0x02`` is used by XBT client for peer-exchange support

The bit selected for the extension protocol is bit 20 from the right (counting
starts at 0). So ``(reserved_byte[5] & 0x10)`` is the expression to use for
checking if the client supports extended messaging.

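The bit arithmetic above can be sketched in a few lines of Python. This is only an illustration, not part of the specification: the helper names ``supports_extension_protocol`` and ``set_extension_bit`` are hypothetical, and the sketch assumes the reserved field is passed in as the 8 raw bytes from the bittorrent handshake.

```python
EXTENSION_BYTE = 5     # index into the reserved bytes
EXTENSION_MASK = 0x10  # bit 20 from the right: 20 = 2 * 8 + 4, i.e. bit 4 of byte 5


def supports_extension_protocol(reserved: bytes) -> bool:
    """Return True if the reserved bytes advertise the extension protocol."""
    if len(reserved) != 8:
        # Assumption: the caller hands over exactly the 8 reserved bytes.
        raise ValueError("expected 8 reserved bytes")
    return bool(reserved[EXTENSION_BYTE] & EXTENSION_MASK)


def set_extension_bit(reserved: bytes) -> bytes:
    """Return a copy of the reserved bytes with the extension bit set."""
    out = bytearray(reserved)
    out[EXTENSION_BYTE] |= EXTENSION_MASK
    return bytes(out)
```

Reading the 8 bytes as one big-endian integer and testing ``(value >> 20) & 1`` gives the same answer, which is where "bit 20 from the right" comes from.
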
Once support for the protocol is established, the client is supposed to
support 1 new message:

+------------------------+----+
|name                    | id |
+========================+====+
|``extended``            | 20 |
+------------------------+----+

This message is sent as any other bittorrent message, with a 4 byte length
prefix and a single byte identifying the message (the single byte being 20
in this case). At the start of the payload of the message is a single byte
message identifier. This identifier can refer to different extension messages,
and only one ID is specified: 0. If the ID is 0, the message is a handshake
message, which is described below. The layout of a general ``extended`` message
follows (including the message headers used by the bittorrent protocol):

+----------+---------------------------------------------------------+
| size     | description                                             |
+==========+=========================================================+
| uint32_t | length prefix. Specifies the number of bytes in the     |
|          | rest of the message, not counting the prefix itself.    |
|          | (Big endian)                                            |
+----------+---------------------------------------------------------+
| uint8_t  | bittorrent message ID, = 20                             |
+----------+---------------------------------------------------------+
| uint8_t  | extended message ID. 0 = handshake, >0 = extended       |
|          | message as specified by the handshake.                  |
+----------+---------------------------------------------------------+

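As a rough illustration of the layout in the table above, the following Python sketch frames and parses an ``extended`` message. The function names ``build_extended`` and ``parse_extended`` are hypothetical, and the sketch assumes the usual bittorrent convention that the length prefix counts the bytes that follow it.

```python
import struct

EXTENDED_MSG_ID = 20  # bittorrent message ID of ``extended``


def build_extended(ext_id: int, payload: bytes) -> bytes:
    """Frame a payload as an ``extended`` message with the given extended ID."""
    # The prefix covers the two ID bytes plus the payload.
    header = struct.pack(">IBB", 2 + len(payload), EXTENDED_MSG_ID, ext_id)
    return header + payload


def parse_extended(message: bytes):
    """Split one complete ``extended`` message into (extended ID, payload)."""
    length, msg_id, ext_id = struct.unpack_from(">IBB", message)
    if msg_id != EXTENDED_MSG_ID:
        raise ValueError("not an extended message")
    if length != len(message) - 4:
        raise ValueError("length prefix does not match message size")
    return ext_id, message[6:]
```

For example, ``build_extended(0, payload)`` would frame a handshake whose ``payload`` is the bencoded dictionary described in the next section.
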
handshake message
-----------------

The payload of the handshake message is a bencoded dictionary. All items
in the dictionary are optional. Any unknown names should be ignored
by the client. All parts of the dictionary are case sensitive.
This is the defined item in the dictionary:

+-------+-----------------------------------------------------------+
| name  | description                                               |
+=======+===========================================================+
| m     | Dictionary of supported extension messages which maps     |
|       | names of extensions to identification numbers of each     |
|       | extension. The only requirement on the identification     |
|       | numbers is that no two extensions share the same          |
|       | number. Setting an extension number to zero means that    |
|       | the extension is not supported/disabled. The client       |
|       | should ignore any extension names it doesn't recognize.   |
+-------+-----------------------------------------------------------+

Here are three other items that an implementation may choose to support:

+-------+-----------------------------------------------------------+
| name  | description                                               |
+=======+===========================================================+
| p     | Local TCP listen port. Allows each side to learn about    |
|       | the TCP port number of the other side. Note that there is |
|       | no need for the receiving side of the connection to send  |
|       | this extension message, since its port number is already  |
|       | known.                                                    |
+-------+-----------------------------------------------------------+
| v     | Client name and version (as a UTF-8 string).              |
|       | This is a much more reliable way of identifying the       |
|       | client than relying on the peer id encoding.              |
+-------+-----------------------------------------------------------+
| reqq  | An integer, the number of outstanding request messages    |
|       | this client supports without dropping any. The default    |
|       | in libtorrent is 250.                                     |
+-------+-----------------------------------------------------------+

The handshake dictionary could also include extended handshake
information, such as support for encrypted headers or anything
imaginable.

An example of what the payload of a handshake message could look like:

+------------------------------------------------------+
| Dictionary                                           |
+===================+==================================+
| ``m``             | +--------------------------+     |
|                   | | Dictionary               |     |
|                   | +======================+===+     |
|                   | | ``LT_metadata``      | 1 |     |
|                   | +----------------------+---+     |
|                   | | ``µT_PEX``           | 2 |     |
|                   | +----------------------+---+     |
|                   |                                  |
+-------------------+----------------------------------+
| ``p``             | 6881                             |
+-------------------+----------------------------------+
| ``v``             | "µTorrent 1.2"                   |
+-------------------+----------------------------------+

and in the encoded form:

``d1:md11:LT_metadatai1e6:µT_PEXi2ee1:pi6881e1:v13:\xc2\xb5Torrent 1.2e``

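To make the encoding step concrete, here is a minimal bencoding sketch in Python that covers only what a handshake dictionary needs (integers, strings and dictionaries). The ``bencode`` function is a hypothetical helper, not a libtorrent API. It assumes that text strings are encoded as UTF-8 and that dictionary keys are emitted in sorted byte order, as bencoding requires. The client name ``Example 1.0`` is made up for the illustration.

```python
def bencode(value) -> bytes:
    """Bencode an int, str, bytes or dict (lists are omitted for brevity)."""
    if isinstance(value, bool):
        raise TypeError("booleans are not bencodable")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")  # assumption: text is sent as UTF-8
    if isinstance(value, bytes):
        # Strings are prefixed with their length in bytes, not characters.
        return b"%d:%s" % (len(value), value)
    if isinstance(value, dict):
        items = [
            (k.encode("utf-8") if isinstance(k, str) else k, v)
            for k, v in value.items()
        ]
        items.sort(key=lambda kv: kv[0])  # keys must appear in sorted order
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


handshake_payload = bencode(
    {"m": {"LT_metadata": 1}, "p": 6881, "v": "Example 1.0"}
)
```

Because the length prefix counts bytes, a non-ASCII client name such as the one in the example above takes one more byte than it has characters when encoded as UTF-8.
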
To make sure the extension names do not collide by mistake, they should be
prefixed with the two (or one) character code that is used to identify the
client that introduced the extension. This applies to both the names of
extension messages and any additional information put inside the
top-level dictionary. All one and two byte identifiers are invalid to use
unless defined by this specification.

This message should be sent immediately after the standard bittorrent handshake
to any peer that supports this extension protocol. It is valid to send the
handshake message more than once during the lifetime of a connection; the
sending client should not be disconnected for doing so. An implementation may
choose to ignore the subsequent handshake messages (or parts of them).

Subsequent handshake messages can be used to enable/disable extensions
without restarting the connection. If a peer supports changing extensions
at run time, it should note that the ``m`` dictionary is additive.
It's enough that it contains the actual *CHANGES* to the extension list.
To disable the support for ``LT_metadata`` at run-time, without affecting
any other extensions, this message should be sent:
``d11:LT_metadatai0ee``.
As specified above, the value 0 is used to turn off an extension.

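One possible way to handle the additive ``m`` dictionary on the receiving side is sketched below in Python. The function name ``apply_m`` and the extension name ``XX_example`` are hypothetical. The sketch assumes the ``m`` dictionary has already been bdecoded into a plain ``dict`` and that the IDs are kept in one table per peer.

```python
def apply_m(peer_ids: dict, m: dict) -> None:
    """Merge a received ``m`` dictionary into the table kept for one peer."""
    for name, ext_id in m.items():
        if ext_id == 0:
            # 0 turns the extension off; an unknown name is simply ignored.
            peer_ids.pop(name, None)
        else:
            # Anything else adds the extension or changes its ID.
            peer_ids[name] = ext_id


peer_ids = {}
apply_m(peer_ids, {"LT_metadata": 1, "XX_example": 2})  # initial handshake
apply_m(peer_ids, {"LT_metadata": 0})                   # later handshake
```

After the second call only ``XX_example`` remains in ``peer_ids``, since entries not mentioned in a later handshake are left untouched.
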
The extension IDs must be stored for every peer, because every peer may have
different IDs for the same extension.

This specification deliberately does not specify any extensions such as
peer-exchange or metadata exchange. This protocol is merely a transport
for the actual extensions to the bittorrent protocol, and the extensions
named in the example above (such as ``p``) are just examples of possible
extensions.

rationale
---------

The reason the extension message IDs are defined in the handshake is to
avoid having a global registry somewhere that assigns global identifiers.
Instead, the extensions are identified by unique names.

If the client supporting the extensions can decide which numbers the messages
it receives will have, those numbers are constants within that client, i.e.
they can be used in ``switch`` statements. It's easy for the other end to
store an array with the IDs we expect for each message and use that for
lookups each time it sends an extension message.

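The two sides of this argument can be sketched as follows. Python is used for illustration, so an ``if``/``elif`` chain on constants stands in for the ``switch``. All names here (``MSG_LT_METADATA``, ``dispatch``, ``Peer``) are hypothetical, and the ID values are arbitrary examples.

```python
import struct

# IDs this client advertises in its own handshake. They never change, so
# incoming messages can be dispatched on plain constants.
MSG_HANDSHAKE = 0
MSG_LT_METADATA = 1


def dispatch(ext_id: int, payload: bytes) -> str:
    """Receiving side: the equivalent of a ``switch`` on our own constants."""
    if ext_id == MSG_HANDSHAKE:
        return "handshake"
    elif ext_id == MSG_LT_METADATA:
        return "LT_metadata"
    return "unknown"


class Peer:
    """Sending side: remembers the IDs the remote peer asked for."""

    def __init__(self):
        self.remote_ids = {}  # extension name -> ID taken from the peer's ``m``

    def frame(self, name: str, payload: bytes) -> bytes:
        ext_id = self.remote_ids.get(name, 0)
        if ext_id == 0:
            raise KeyError("peer does not support " + name)
        # 20 is the bittorrent message ID of ``extended``.
        return struct.pack(">IBB", 2 + len(payload), 20, ext_id) + payload
```

The lookup table lives on the ``Peer`` object because the same extension name may map to a different ID for every connection.
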
The reason for having a dictionary instead of an array (using implicitly
assigned index numbers for the extensions) is that if a client wants to
disable some extensions, the ID numbers would change, and it wouldn't
be able to use constants (and hence, not use them in a ``switch``). If the
message IDs mapped directly to bittorrent message IDs, it would also be
possible to map extensions in the handshake to existing extensions with
fixed message IDs.

The reasoning behind having a single byte as extended message identifier is
to follow the bittorrent spec. with its single byte message identifiers.
It is also considered to be enough. It won't limit the total number of
extensions, only the number of extensions used simultaneously.

The reason for using single byte identifiers for the standardized handshake
identifiers is that 1) the mainline DHT uses single byte identifiers, and
2) it saves bandwidth. The only advantage of longer identifiers is that they
make the protocol more readable for a human, but the BT protocol wasn't
designed to be a human readable protocol, so why bother.