add documentation on streaming implementation

2014-06-07 16:43:14 +00:00 · 2014-06-07 16:43:14 +00:00 · aa85d3c35c
parent dd2e605796
commit aa85d3c35c
3 changed files with 129 additions and 0 deletions
--- a/docs/index.rst
+++ b/docs/index.rst
@ -27,6 +27,7 @@ Extensions
 * `uTP`_
 * `extensions protocol`_
 * `plugin interface`_
+* `streaming`_
 * `DHT extensions`_
 * `DHT security extension`_
 * `DHT store extension`_
@ -68,6 +69,7 @@ libtorrent
 .. _`uTP`: utp.html
 .. _`extensions protocol`: extension_protocol.html
 .. _`plugin interface`: reference-Plugins.html
+.. _`streaming`: streaming.html
 .. _`DHT extensions`: dht_extensions.html
 .. _`DHT security extension`: dht_sec.html
 .. _`DHT store extension`: dht_store.html
--- a/docs/makefile
+++ b/docs/makefile
@ -39,6 +39,7 @@ TARGETS = index \
 	utp \
 	tuning \
 	hacking \
+	streaming \
 	$(REFERENCE_TARGETS)

 FIGURES = read_disk_buffers write_disk_buffers troubleshooting
--- a/docs/streaming.rst
+++ b/docs/streaming.rst
@ -0,0 +1,126 @@
+Streaming implementation
+========================
+
+This document describes the algorithm libtorrent uses to satisfy time critical
+piece requests, i.e. streaming.
+
+piece picking
+-------------
+
+The standard bittorrent piece picker is peer-centric. A peer unchokes us or we
+complete a block from a peer and we want to make another request to that peer.
+The piece picker answers the question: which block should we request from this
+peer?
+
+When streaming, we have a number of *time critical* pieces, the ones the video
+or audio player will need next to keep up with the stream. To meet the deadlines
+of these pieces, we need a mechanism to answer the opposite question: I want to
+request blocks from this piece, which peer is most likely to deliver them to
+me the soonest?
+
+This question is answered by ``torrent::request_time_critical_pieces()`` in
+libtorrent.
+
+At a high level, this algorithm keeps a list of peers, sorted by their estimated
+download queue time, that is, the estimated time before a new request to the
+peer starts being received. The slowest 10% of the peers are ignored and not
+included in the peer list. Peers that have choked us, are not interesting, are
+on parole, are disconnecting, have too many outstanding block requests or are
+snubbed are also excluded from the peer list.
+
+The time critical pieces are also kept sorted by deadline, with the earliest
+deadline first. This list of pieces is iterated, starting at the top, and
+blocks are requested from a piece until we cannot make any more requests from
+it. We then move on to the next piece and request blocks from it until we
+cannot make any more. The peer each request is sent to is the one with the
+lowest `download queue time`_. Each time a request is made, this estimate is
+updated and the peer is re-sorted in this list.
+
+Any peer that doesn't have the piece is ignored until we move on to the next
+piece.
+
+If the top peer's download queue time is more than 2 seconds, the loop is
+terminated. This avoids over-requesting. ``request_time_critical_pieces()``
+is called once per second, so this will keep the queue full with margin.
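
The loop described above can be sketched roughly as follows. This is a simplified illustration and not libtorrent's actual code: the ``peer`` and ``critical_piece`` structs, their fields and the function ``request_time_critical`` are hypothetical names. It assumes the peer list has already been filtered (choked, snubbed and the slowest peers removed) and that each request adds a fixed, per-peer amount to the queue time estimate.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified peer state (not libtorrent's peer_connection).
struct peer
{
    int id;
    double queue_time;           // estimated download queue time, seconds
    double seconds_per_block;    // assumed cost added by each new request
    std::vector<bool> has_piece; // indexed by piece index
};

// Hypothetical time critical piece entry.
struct critical_piece
{
    int index;
    double deadline;   // earlier deadlines are served first
    int blocks_left;   // blocks not yet requested
};

// Returns the (peer id, piece index) pairs that would be requested.
std::vector<std::pair<int, int>> request_time_critical(
    std::vector<peer>& peers, std::vector<critical_piece>& pieces,
    double max_queue_time = 2.0)
{
    std::vector<std::pair<int, int>> requests;
    auto by_queue_time = [](peer const& a, peer const& b)
    { return a.queue_time < b.queue_time; };

    std::stable_sort(pieces.begin(), pieces.end(),
        [](critical_piece const& a, critical_piece const& b)
        { return a.deadline < b.deadline; });
    std::stable_sort(peers.begin(), peers.end(), by_queue_time);

    for (critical_piece& piece : pieces)
    {
        assert(piece.index >= 0);
        while (piece.blocks_left > 0)
        {
            // Peers without the piece are skipped for this piece only.
            auto it = std::find_if(peers.begin(), peers.end(),
                [&piece](peer const& p)
                { return p.has_piece.at(static_cast<std::size_t>(piece.index)); });
            if (it == peers.end()) break; // nobody has it, try the next piece

            // The best candidate is already too busy: stop to avoid
            // over-requesting.
            if (it->queue_time > max_queue_time) return requests;

            requests.emplace_back(it->id, piece.index);
            --piece.blocks_left;

            // Update the estimate and restore the sort order.
            it->queue_time += it->seconds_per_block;
            std::stable_sort(peers.begin(), peers.end(), by_queue_time);
        }
    }
    return requests;
}
```

Re-sorting the whole list after every request keeps the sketch short. Since only one peer's estimate changes per request, a real implementation would more likely move just that one entry.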
+
+download queue time
+-------------------
+
+Each peer maintains the number of bytes that have been requested from it but
+not yet received. This is referred to as ``outstanding_bytes``. This number
+is incremented by the size of each outgoing request and decremented for each
+*payload* byte received.
+
+This counter is divided by an estimated download rate from the peer to form
+the estimated *download queue time*. That is, the estimated time it will take
+any new request to this peer to begin being received.
+
+Estimating the download rate of a peer is not trivial. There may not be any
+outstanding requests to the peer, in which case the payload download rate
+will be zero. That would not be a reasonable estimate of the rate we would see
+once we make a request.
+
+If we have not received any payload from a peer in the last 30 seconds, we
+must use an alternative estimate of the download rate. If we have received
+payload from this peer previously, we can use the peak download rate.
+
+If we have received fewer than 2 blocks (32 kiB) and we were unchoked less
+than 5 seconds ago, we use the average download rate of all peers (that have
+outstanding requests).
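
As a rough illustration of the estimate described in this section, the sketch below divides the outstanding bytes by a rate chosen with the fallbacks above. The ``peer_stats`` struct and both function names are hypothetical and are not libtorrent's API. The order in which the fallbacks are checked, and returning infinity when no usable rate is available, are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical per-peer statistics; field names are illustrative only.
struct peer_stats
{
    std::int64_t outstanding_bytes; // requested but not yet received
    double payload_rate;            // current payload rate, bytes/s
    double peak_rate;               // highest payload rate seen, bytes/s
    std::int64_t payload_received;  // total payload bytes received so far
    double seconds_since_payload;   // time since the last payload byte
    double seconds_since_unchoke;   // time since the peer unchoked us
};

// Picks the download rate to use for the queue time estimate.
double estimated_rate(peer_stats const& p, double swarm_average_rate)
{
    std::int64_t const block_size = 16 * 1024;

    // Recently unchoked peer with (almost) no history: fall back to the
    // average rate of all peers with outstanding requests.
    if (p.payload_received < 2 * block_size && p.seconds_since_unchoke < 5.0)
        return swarm_average_rate;

    // Idle peer that has sent payload before: use its peak rate.
    if (p.seconds_since_payload > 30.0 && p.payload_received > 0)
        return p.peak_rate;

    return p.payload_rate;
}

// Estimated time, in seconds, before a new request starts being received.
double download_queue_time(peer_stats const& p, double swarm_average_rate)
{
    assert(p.outstanding_bytes >= 0);
    double const rate = estimated_rate(p, swarm_average_rate);
    // Assumption: without a usable rate, treat the peer as infinitely slow.
    if (rate <= 0.0) return std::numeric_limits<double>::infinity();
    return static_cast<double>(p.outstanding_bytes) / rate;
}
```

For example, a peer with 32 kiB outstanding that is currently delivering 16 kiB/s gets an estimated queue time of 2 seconds.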
+
+timeouts
+--------
+
+An observation that is useful to keep in mind when streaming is that your
+download capacity is likely to be saturated by your peers. In this case, if the
+swarm is well seeded, most peers will send data to you at close to the same
+rate. This makes it important to support streaming from many slow peers. For
+instance, this means you can't make assumptions about the download time of a
+block being less than some absolute time. You may be downloading at well above
+the bitrate of the video, but each individual peer only transfers at 5 kiB/s.
+
+In this state, your download rate is a zero-sum game. Any block you request
+that is not urgent will take away from the bandwidth you get for blocks that
+are urgent. Make sure to limit requests to useful blocks only.
+
+Some requests will stall. It appears to be very hard to have enough accuracy in
+the prediction of download queue time such that all requests come back within a
+reasonable amount of time.
+
+To support adaptive timeouts, each torrent maintains a running average of how
+long it takes to complete a piece. There is also a running average of the
+deviation from the mean download time.
+
+This download time is used as the benchmark to determine when blocks have
+timed out, and should be re-requested from another peer.
+
+If any time-critical piece has taken more than the average piece download
+time plus half the average deviation from it, the piece is considered to have
+timed out. This means we are allowed to double-request blocks. Subsequent
+passes over this piece will make sure that any blocks we don't already have
+are requested one more time.
+
+In fact, this scales to multiple time-outs. The time since a download was
+started is divided by (average download time + average deviation / 2).
+The integer part of the result is the number of *times* the piece has timed out.
+
+Each time a piece times out, another *busy request* is allowed to try to make
+it complete sooner. A busy request is where a block is requested from a peer
+even though it has already been requested from another peer.
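
The time-out count can be written as a small function. This is a minimal sketch with hypothetical names, assuming all times are given in seconds. Here the count is taken to be the number of busy requests a block of the piece is allowed.

```cpp
#include <cassert>

// Number of times a piece has timed out, given the time since its download
// started and the torrent's running averages (all in seconds).
int num_timeouts(double elapsed, double avg_piece_time, double avg_deviation)
{
    assert(elapsed >= 0.0);
    double const timeout = avg_piece_time + avg_deviation / 2.0;
    assert(timeout > 0.0);
    // Truncation gives the integer part of the quotient.
    return static_cast<int>(elapsed / timeout);
}
```

With an average download time of 8 seconds and an average deviation of 4 seconds, the time-out is 10 seconds. A piece that has been downloading for 25 seconds has then timed out twice.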
+
+This has the effect of getting more and more aggressive in requesting blocks
+the longer it takes to complete the piece. If this mechanism is too aggressive,
+a significant amount of bandwidth may be lost to redundant downloads (keep in
+mind the zero-sum game).
+
+It never makes sense to request a block twice from the same peer. There is logic
+in place to prevent this.
+
+optimizations
+-------------
+
+One optimization is to buffer all piece requests while looping over the
+time-critical pieces and not send them until one round is complete. This
+increases the chances that the request messages are coalesced into the same
+packet, which in turn lowers the number of system calls and the network
+overhead.
+