add documentation on streaming implementation

2014-06-07 16:43:14 +00:00 · 2014-06-07 16:43:14 +00:00 · aa85d3c35c
parent dd2e605796
commit aa85d3c35c
3 changed files with 129 additions and 0 deletions
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -27,6 +27,7 @@ Extensions
 * `uTP`_
 * `extensions protocol`_
 * `plugin interface`_
 * `streaming`_
 * `DHT extensions`_
 * `DHT security extension`_
 * `DHT store extension`_
@@ -68,6 +69,7 @@ libtorrent
 .. _`uTP`: utp.html
 .. _`extensions protocol`: extension_protocol.html
 .. _`plugin interface`: reference-Plugins.html
 .. _`streaming`: streaming.html
 .. _`DHT extensions`: dht_extensions.html
 .. _`DHT security extension`: dht_sec.html
 .. _`DHT store extension`: dht_store.html
--- a/docs/makefile
+++ b/docs/makefile
@@ -39,6 +39,7 @@ TARGETS = index \
 	utp \
 	tuning \
 	hacking \
 	streaming \
 	$(REFERENCE_TARGETS)
 FIGURES = read_disk_buffers write_disk_buffers troubleshooting
--- a/docs/streaming.rst
+++ b/docs/streaming.rst
@ -0,0 +1,126 @@
 Streaming implementation
 ========================
 This document describes the algorithm libtorrent uses to satisfy time critical
 piece requests, i.e. streaming.
 piece picking
 -------------
 The standard bittorrent piece picker is peer-centric. A peer unchokes us or we
 complete a block from a peer and we want to make another request to that peer.
 The piece picker answers the question: which block should we request from this
 peer?
 When streaming, we have a number of *time critical* pieces, the ones the video
 or audio player will need next to keep up with the stream. To keep the deadlines
 of these pieces, we need a mechanism to answer the question: I want to request
 blocks from this piece, which peer is the most likely to be able to deliver them
 to me the soonest?
 This question is answered by ``torrent::request_time_critical_pieces()`` in
 libtorrent.
 At a high level, this algorithm keeps a list of peers, sorted by their estimated
 download queue time. That is, the estimated time before the response to a new
 request to this peer starts arriving. The bottom 10th percentile of the peers
 (the 10% slowest peers) is ignored and not included in the peer list. Peers that
 have choked us, are not interesting, are on parole, are disconnecting, have too
 many outstanding block requests or are snubbed are also excluded from the peer list.
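
 The filtering described above can be sketched as follows. This is only an
 illustration: ``candidate_peer``, ``is_eligible()`` and ``build_peer_list()``
 are hypothetical names, not libtorrent's API. It also assumes that "slowest"
 means the highest download queue time and that the 10% cut is applied after
 the other exclusions, which the text does not spell out.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical, simplified view of a peer; not libtorrent's peer_connection.
struct candidate_peer
{
    bool choked_us;          // the peer has choked us
    bool interesting;        // the peer has pieces we want
    bool on_parole;
    bool disconnecting;
    bool snubbed;
    bool request_queue_full; // too many outstanding block requests
    double queue_time;       // estimated download queue time, in seconds
};

// True if the peer may be asked for time critical blocks.
bool is_eligible(candidate_peer const& p)
{
    return !p.choked_us && p.interesting && !p.on_parole
        && !p.disconnecting && !p.snubbed && !p.request_queue_full;
}

// Returns the eligible peers sorted by ascending queue time, with the
// slowest 10% (rounded down, an assumption) removed.
std::vector<candidate_peer> build_peer_list(std::vector<candidate_peer> const& all)
{
    std::vector<candidate_peer> peers;
    std::copy_if(all.begin(), all.end(), std::back_inserter(peers), is_eligible);
    std::sort(peers.begin(), peers.end(),
        [](candidate_peer const& a, candidate_peer const& b)
        { return a.queue_time < b.queue_time; });
    std::size_t const drop = peers.size() / 10;
    peers.erase(peers.end() - static_cast<std::ptrdiff_t>(drop), peers.end());
    return peers;
}
```

 With ten eligible peers, the one with the longest queue time is dropped and
 the remaining nine are returned fastest first.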
 The time critical pieces are also kept sorted by their deadline, with the
 earliest deadline first. This list of pieces is iterated, starting at the
 top, and blocks are requested from a piece until we cannot make any more
 requests from it. We then move on to the next piece and request blocks from it
 until we cannot make any more. The peer each request is sent to is the one
 with the lowest `download queue time`_. Each time a request is made, this
 estimate is updated and the peer is re-sorted in this list.
 Any peer that doesn't have the piece is ignored until we move on to the next
 piece.
 If the top peer's download queue time is more than 2 seconds, the loop is
 terminated. This avoids over-requesting. ``request_time_critical_pieces()``
 is called once per second, so a 2 second limit keeps the queue full with margin.
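
 One pass of this loop might look like the sketch below. It is not the body of
 ``torrent::request_time_critical_pieces()``; all types and names are
 hypothetical. For brevity, a linear scan for the lowest queue time stands in
 for keeping the peer list sorted, and the whole pass is assumed to stop once
 the best peer exceeds the 2 second limit.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

constexpr int block_size = 16 * 1024;   // bytes per block request
constexpr double max_queue_time = 2.0;  // seconds

// Hypothetical, simplified peer.
struct stream_peer
{
    int id;
    double rate;               // estimated download rate in bytes/s, must be > 0
    std::int64_t outstanding;  // bytes requested but not yet received
    std::vector<bool> have;    // have[i] is true if the peer has piece i
    double queue_time() const { return static_cast<double>(outstanding) / rate; }
};

// Hypothetical time critical piece.
struct critical_piece
{
    int index;
    double deadline;   // seconds from now
    int blocks_left;   // blocks not yet requested
};

// Returns one (piece index, peer id) pair per request issued in this pass.
std::vector<std::pair<int, int>> request_pass(
    std::vector<critical_piece>& pieces, std::vector<stream_peer>& peers)
{
    std::vector<std::pair<int, int>> requests;
    std::sort(pieces.begin(), pieces.end(),
        [](critical_piece const& a, critical_piece const& b)
        { return a.deadline < b.deadline; });

    for (critical_piece& pc : pieces)
    {
        while (pc.blocks_left > 0)
        {
            // find the peer with the lowest queue time that has this piece
            stream_peer* best = nullptr;
            for (stream_peer& p : peers)
            {
                if (!p.have[static_cast<std::size_t>(pc.index)]) continue;
                if (best == nullptr || p.queue_time() < best->queue_time()) best = &p;
            }
            if (best == nullptr) break;  // nobody has it, move to the next piece
            if (best->queue_time() > max_queue_time) return requests;

            // "send" the request; this also updates the peer's queue time
            best->outstanding += block_size;
            --pc.blocks_left;
            requests.emplace_back(pc.index, best->id);
        }
    }
    return requests;
}
```

 Because ``outstanding`` grows with every request, a fast peer is used until
 its queue time catches up with that of slower peers, at which point the
 requests spread out.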
 download queue time
 -------------------
 Each peer maintains the number of bytes that have been requested from it but
 not yet received. This is referred to as ``outstanding_bytes``. This number
 is incremented by the size of each outgoing request and decremented for each
 *payload* byte received.
 This counter is divided by an estimated download rate from the peer to form
 the estimated *download queue time*. That is, the estimated time it will take
 any new request to this peer to begin being received.
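
 As a minimal sketch of that division (the function name, signature and the
 guard against a zero rate are assumptions made for illustration, not
 libtorrent's actual interface):

```cpp
#include <chrono>
#include <cstdint>

// Estimated time before a new request to this peer starts being received.
// outstanding_bytes: bytes requested from the peer but not yet received.
// rate: estimated download rate from the peer in bytes per second.
std::chrono::milliseconds download_queue_time(
    std::int64_t outstanding_bytes, std::int64_t rate)
{
    if (rate < 1) rate = 1; // hypothetical guard against division by zero
    return std::chrono::milliseconds(outstanding_bytes * 1000 / rate);
}
```

 For example, two outstanding 16 kiB blocks on a peer estimated at 16 kiB/s
 give a queue time of 2 seconds.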
 Estimating the download rate of a peer is not trivial. There may not be any
 outstanding requests to the peer, in which case the payload download rate
 will be zero. That would not be a reasonable estimate of the rate we would see
 once we make a request.
 If we have not received any payload from a peer in the last 30 seconds, we
 must use an alternative estimate of the download rate. If we have received
 payload from this peer previously, we can use the peak download rate.
 If we have received fewer than 2 blocks (32 kiB) from the peer and we were
 unchoked less than 5 seconds ago, we use the average download rate of all peers
 (that have outstanding requests).
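
 The fallbacks can be combined roughly as below. The struct, its fields and the
 order in which the rules are checked are assumptions for the sake of the
 example; the text above does not state which rule takes precedence.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical per-peer statistics.
struct peer_stats
{
    std::chrono::seconds since_last_payload; // time since payload was last received
    std::chrono::seconds since_unchoked;     // time since the peer unchoked us
    std::int64_t payload_received;           // total payload bytes received
    int current_rate;                        // current payload download rate, bytes/s
    int peak_rate;                           // highest rate seen so far, 0 if none
};

// Returns the rate (bytes/s) to use when estimating the download queue time.
// swarm_average_rate is the average rate of peers with outstanding requests.
int estimate_rate(peer_stats const& p, int swarm_average_rate)
{
    using std::chrono::seconds;

    // no recent payload: the current rate says nothing, use the peak
    if (p.since_last_payload > seconds(30) && p.peak_rate > 0)
        return p.peak_rate;

    // freshly unchoked with fewer than 2 blocks received: assume the average
    if (p.payload_received < 2 * 16 * 1024 && p.since_unchoked < seconds(5))
        return swarm_average_rate;

    return p.current_rate;
}
```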
 timeouts
 --------
 An observation that is useful to keep in mind when streaming is that your
 download capacity is likely to be saturated by your peers. In this case, if the
 swarm is well seeded, most peers will send data to you at close to the same
 rate. This makes it important to support streaming from many slow peers. For
 instance, this means you can't make assumptions about the download time of a
 block being less than some absolute time. You may be downloading at well above
 the bitrate of the video, but each individual peer only transfers at 5 kiB/s.
 In this state, your download rate is a zero-sum game. Any block you request
 that is not urgent will take bandwidth away from the blocks that
 are urgent. Make sure to limit requests to useful blocks only.
 Some requests will stall. It appears to be very hard to have enough accuracy in
 the prediction of download queue time such that all requests come back within a
 reasonable amount of time.
 To support adaptive timeouts, each torrent maintains a running average of how
 long it takes to complete a piece. There is also a running average of the
 deviation from the mean download time.
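
 One way to maintain such a pair of running averages is sketched here. The
 struct is hypothetical, and the 1/10 weight given to each new sample is an
 assumption chosen for illustration; the text does not specify the smoothing
 factor.

```cpp
#include <cstdlib>

// Hypothetical per-torrent statistics, in milliseconds.
struct piece_time_stats
{
    int average = 0;    // running average of the piece download time
    int deviation = 0;  // running average of |sample - average|

    void add_sample(int download_time)
    {
        if (average == 0)
        {
            average = download_time; // the first sample seeds the average
            return;
        }
        int const diff = std::abs(download_time - average);
        // the first non-zero deviation seeds it, later samples are weighted 1/10
        deviation = (deviation == 0) ? diff : (deviation * 9 + diff) / 10;
        average = (average * 9 + download_time) / 10;
    }
};
```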
 This download time is used as the benchmark to determine when blocks have
 timed out, and should be re-requested from another peer.
 If any time-critical piece has taken more than the average piece download
 time plus half the average deviation from it, the piece is considered to have
 timed out. This means we are allowed to double-request blocks. Subsequent
 passes over this piece will make sure that any blocks we don't already have
 are requested one more time.
 In fact, this scales to multiple time-outs. The time since a download was
 started is divided by average download time + average deviation time / 2.
 The resulting integer is the number of *times* the piece has timed out.
 Each time a piece times out, another *busy request* is allowed to try to make
 it complete sooner. A busy request is where a block is requested from a peer
 even though it has already been requested from another peer.
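
 The timeout count, and with it the number of busy requests allowed, follows
 directly from that division. A minimal sketch, where the function name and
 the handling of missing statistics are assumptions:

```cpp
#include <chrono>

// Number of times a piece has timed out, which is also the number of busy
// requests allowed for its blocks.
int num_timeouts(std::chrono::milliseconds elapsed,
    std::chrono::milliseconds average_piece_time,
    std::chrono::milliseconds average_deviation)
{
    std::chrono::milliseconds const timeout
        = average_piece_time + average_deviation / 2;
    if (timeout.count() <= 0) return 0; // no statistics yet (assumption)
    // dividing two durations yields a plain, truncated integer
    return static_cast<int>(elapsed / timeout);
}
```

 With an average piece time of 1000 ms and an average deviation of 400 ms, the
 timeout is 1200 ms, so a piece that has been downloading for 2500 ms has timed
 out twice.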
 This has the effect of getting more and more aggressive in requesting blocks
 the longer it takes to complete the piece. If this mechanism is too aggressive,
 a significant amount of bandwidth may be lost in redundant download (keep in
 mind the zero-sum game).
 It never makes sense to request a block twice from the same peer. There is logic
 in place to prevent this.
 optimizations
 -------------
 One optimization is to buffer all piece requests while looping over the
 time-critical pieces and not send them until one round is complete. This increases
 the chances that the request messages are coalesced into the same packet.
 This in turn lowers the number of system calls and network overhead.