diff --git a/docs/index.rst b/docs/index.rst index ab9e99a4c..1adee935a 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -27,6 +27,7 @@ Extensions * `uTP`_ * `extensions protocol`_ * `plugin interface`_ +* `streaming`_ * `DHT extensions`_ * `DHT security extension`_ * `DHT store extension`_ @@ -68,6 +69,7 @@ libtorrent .. _`uTP`: utp.html .. _`extensions protocol`: extension_protocol.html .. _`plugin interface`: reference-Plugins.html +.. _`streaming`: streaming.html .. _`DHT extensions`: dht_extensions.html .. _`DHT security extension`: dht_sec.html .. _`DHT store extension`: dht_store.html diff --git a/docs/makefile b/docs/makefile index 3147fc895..e5432888f 100644 --- a/docs/makefile +++ b/docs/makefile @@ -39,6 +39,7 @@ TARGETS = index \ utp \ tuning \ hacking \ + streaming \ $(REFERENCE_TARGETS) FIGURES = read_disk_buffers write_disk_buffers troubleshooting diff --git a/docs/streaming.rst b/docs/streaming.rst new file mode 100644 index 000000000..2222507d3 --- /dev/null +++ b/docs/streaming.rst @@ -0,0 +1,126 @@ +Streaming implementation +======================== + +This documents describes the algorithm libtorrent uses to satisfy time critical +piece requests, i.e. streaming. + +piece picking +------------- + +The standard bittorrent piece picker is peer-centric. A peer unchokes us or we +complte a block from a peer and we want to make another request to that peer. +The piece picker answers the question: which block should we request from this +peer. + +When streaming, we have a number of *time critical* pieces, the ones the video +or audio player will need next to keep up with the stream. To keep the deadlines +of these pieces, we need a mechanism to answer the question: I want to request +blocks from this piece, which peer is the most likely to be able to deliver it +to me the soonest. + +This question is answered by ``torrent::request_time_critical_pieces()`` in +libtorrent. + +At a high level, this algorithm keeps a list of peers, sorted by the estimated +download queue time. That is, the estimated time for a new request to this +peer to be received. The bottom 10th percentile of the peers (the 10% slowest +peers) are ignored and not included in the peer list. Peers that have choked +us, are not interesting, is on parole, disconnecting, have too many outstanding +block requests or is snubbed are also excluded from the peer list. + +The time critical pieces are also kept sorted by their deadline. Pieces with +an earlier deadline first. This list of pieces is iterated, starting at the +top, and blocks are requested from a piece until we cannot make any more +requests from it. We then move on to the next piece and request blocks from it +until we cannot make any more. The peer each request is sent to is the one +with the lowest `download queue time`_. Each time a request is made, this +estimate is updated and the peer is resorted in this list. + +Any peer that doesn't have the piece is ignored until we move on to the next +piece. + +If the top peer's download queue time is more than 2 seconds, the loop is +terminated. This is to not over-request. ``request_time_critical_pieces()`` +is called once per second, so this will keep the queue full with margin. + +download queue time +------------------- + +Each peer maintains the number of bytes that have been requested from it but +not yet been received. This is referred to as ``outstanding_bytes``. This number +is incremented by the size of each outgoing request and decremented for each +*payload* byte received. + +This counter is divided by an estimated download rate from the peer to form +the estimated *download queue time*. That is, the estimated time it will take +any new request to this peer to begin being received. + +The estimated download rate of a peer is not trivial. There may not be any +outstanding requests to the peer, in which case the payload download rate +will be zero. That would not be a reasonable estimate of the rate we would see +once we make a request. + +If we have not received any payload from a peer in the last 30 seconds, we +must use an alternative estimate of the download rate. If we have received +payload from this peer previously, we can use the peak download rate. + +If we have received less than 2 blocks (32 kiB) and we have been unchoked for +less than 5 seconds ago, use the average download rate of all peers (that have +outstanding requests). + +timeouts +-------- + +An observation that is useful to keep in mind when streaming is that your +download capacity is likely to be saturated by your peers. In this case, if the +swarm is well seeded, most peers will send data to you at close to the same +rate. This makes it important to support streaming from many slow peers. For +instance, this means you can't make assumptions about the download time of a +block being less than some absolute time. You may be downloading at well above +the bitrate of the video, but each individual peer only transfers at 5 kiB/s. + +In this state, your download rate is a zero-sum-game. Any block you request +that is not urgent, will take away from the bandwidth you get for peers that +are urgent. Make sure to limit requests to useful blocks only. + +Some requests will stall. It appears to be very hard to have enough accuracy in +the prediction of download queue time such that all requests come back within a +reasonable amount of time. + +To support adaptive timeuts, each torrent maintains a running average of how +long it takes to complete a piece. There is also a running average of the +deviation from the mean download time. + +This download time is used as the benchmark to determine when blocks have +timed out, and should be re-requested from another peer. + +If any time-critical piece has taken more than the average piece download +time + a half average deviation form that, the piece is considered to have +timed out. This means we are allowed to double-request blocks. Subsequent +passes over this piece will make sure that any blocks we don't already have +are requested one more time. + +In fact, this scales to multiple time-outs. The time since a download was +started is divided by average download time + average deviation time / 2. +The resulting integer is the number if *times* the piece has timed out. + +Each time a piece times out, another *busy request* is allowed to try to make +it complete sooner. A busy request is where a block is requested from a peer +even though it has already been requested from another peer. + +This has the effect of getting more and more aggressive in requesting blocks +the longer it takes to complete the piece. If this mechanism is too aggressive, +a significant amount of bandwidht may be lost in redundant download (keep in +mind the zero-sum game). + +It never makes sense to request a block twice from the same peer. There is logic +in place to prevent this. + +optimizations +------------- + +One optimization is to buffer all piece requests while looping over the time- +critical pieces and not send them until one round is complete. This increases +the chances that the request messages are coalesced into the same packet. +This in turn lowers the number of system calls and network overhead. +