flodgatt/src/response/redis/manager.rs

//! Receives data from Redis, sorts it by `ClientAgent`, and stores it until
//! polled by the correct `ClientAgent`. Also manages subscriptions and
//! unsubscriptions to/from Redis.
mod err;
pub use err::Error;
use super::msg::{RedisParseErr, RedisParseOutput};
use super::{Event, RedisCmd, RedisConn};
use crate::config;
use crate::request::{Subscription, Timeline};
pub(self) use super::EventErr;
use futures::{Async, Poll, Stream};
use hashbrown::{HashMap, HashSet};
use lru::LruCache;
use std::convert::{TryFrom, TryInto};
use std::str;
use std::sync::{Arc, Mutex, MutexGuard, PoisonError};
use std::time::{Duration, Instant};
use tokio::sync::mpsc::Sender;
type Result<T> = std::result::Result<T, Error>;
type EventChannel = Sender<Arc<Event>>;
/// The item that streams from Redis and is polled by the `ClientAgent`
pub struct Manager {
    pub redis_conn: RedisConn,
    timelines: HashMap<Timeline, HashMap<u32, EventChannel>>,
    ping_time: Instant,
    channel_id: u32,
    pub unread_idx: (usize, usize),
    tag_id_cache: LruCache<String, i64>,
}

impl Stream for Manager {
    type Item = (Timeline, Arc<Event>);
    type Error = Error;

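    /// Parse the next complete item from the unread region of the Redis input buffer,
    /// advancing `unread_idx` past whatever was consumed. Yields a `(Timeline, Event)`
    /// pair for each message in the configured namespace; returns `NotReady` when the
    /// buffer is empty or ends in a partial message.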
    fn poll(&mut self) -> Poll<Option<Self::Item>, Error> {
        let input = &self.redis_conn.input[self.unread_idx.0..self.unread_idx.1];
        let (valid, invalid) = str::from_utf8(input)
            .map(|v| (v, &b""[..]))
            .unwrap_or_else(|e| {
                // NOTE - this bounds check occurs more often than necessary; it could occur
                // only when polling Redis. However, benchmarking with Criterion shows it to be
                // *very* inexpensive (<1 us) and thus not worth removing (doing so would
                // require `unsafe`).
                let (valid, invalid) = input.split_at(e.valid_up_to());
                (str::from_utf8(valid).expect("split_at"), invalid)
            });

        if !valid.is_empty() {
            use RedisParseOutput::*;
            match RedisParseOutput::try_from(valid) {
                Ok(Msg(msg)) => {
                    // If we get a message and it matches the redis_namespace, get the msg's
                    // Event and send it to all channels matching the msg's Timeline
                    if let Some(tl) = msg.timeline_matching_ns(&self.redis_conn.namespace) {
                        self.unread_idx.0 =
                            self.unread_idx.1 - msg.leftover_input.len() - invalid.len();
                        let tl = Timeline::from_redis_text(tl, &mut self.tag_id_cache)?;
                        let event: Arc<Event> = Arc::new(msg.event_txt.try_into()?);
                        Ok(Async::Ready(Some((tl, event))))
                    } else {
                        Ok(Async::Ready(None))
                    }
                }
                Ok(NonMsg(leftover_input)) => {
                    self.unread_idx.0 = self.unread_idx.1 - leftover_input.len();
                    Ok(Async::Ready(None))
                }
                Err(RedisParseErr::Incomplete) => {
                    self.copy_partial_msg();
                    Ok(Async::NotReady)
                }
                Err(e) => Err(Error::RedisParseErr(e, valid.to_string()))?,
            }
        } else {
            self.unread_idx = (0, 0);
            Ok(Async::NotReady)
        }
    }
}

impl Manager {
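    /// Send a `Ping` to every channel if one is due, then drain Redis, forwarding each
    /// parsed event to every channel subscribed to its timeline. If a channel is full,
    /// rewind to the start of the current message and return `NotReady` so clients can
    /// empty their channels before polling resumes.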
    // untested
    pub fn send_msgs(&mut self) -> Poll<(), Error> {
        if self.ping_time.elapsed() > Duration::from_secs(30) {
            self.send_pings()?
        }

        while let Ok(Async::Ready(Some(msg_len))) = self.redis_conn.poll_redis(self.unread_idx.1) {
            self.unread_idx.1 += msg_len;
            while let Ok(Async::Ready(msg)) = self.poll() {
                if let Some((tl, event)) = msg {
                    for channel in self.timelines.entry(tl).or_default().values_mut() {
                        if let Ok(Async::NotReady) = channel.poll_ready() {
                            log::warn!("{:?} channel full\ncan't send:{:?}", tl, event);
                            self.rewind_to_prev_msg();
                            return Ok(Async::NotReady);
                        }
                        let _ = channel.try_send(event.clone()); // err just means channel will be closed
                    }
                }
            }
        }
        Ok(Async::Ready(()))
    }

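    /// Move `unread_idx.0` back to the start of the most recently parsed message (the
    /// last `\r\n*` boundary) so that message can be re-sent once the full channel has
    /// drained.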
    fn rewind_to_prev_msg(&mut self) {
        self.unread_idx.0 = loop {
            let input = &self.redis_conn.input[..self.unread_idx.0];
            let input = str::from_utf8(input).unwrap_or_else(|e| {
                str::from_utf8(input.split_at(e.valid_up_to()).0).expect("guaranteed by `split_at`")
            });
            let index = if let Some(i) = input.rfind("\r\n*") {
                i + "\r\n".len()
            } else {
                0
            };
            self.unread_idx.0 = index;
            if let Ok(Async::Ready(Some(_))) = self.poll() {
                break index;
            }
        };
    }

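    /// Copy a partial message from the end of the input buffer to its beginning so the
    /// remainder of the message can be appended by the next read from Redis.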
    fn copy_partial_msg(&mut self) {
        if self.unread_idx.0 == 0 {
            // msg already first; no copying needed
        } else if self.unread_idx.0 >= (self.unread_idx.1 - self.unread_idx.0) {
            let (read, unread) =
                self.redis_conn.input[..self.unread_idx.1].split_at_mut(self.unread_idx.0);
            for (i, b) in unread.iter().enumerate() {
                read[i] = *b;
            }
        } else {
            // Less efficient, but should never occur in production
            log::warn!("Moving partial input requires heap allocation");
            self.redis_conn.input = self.redis_conn.input[self.unread_idx.0..].into();
        }
        self.unread_idx = (0, self.unread_idx.1 - self.unread_idx.0);
    }

    /// Create a new `Manager`, with its own Redis connections (but no active subscriptions).
    pub fn try_from(redis_cfg: &config::Redis) -> Result<Self> {
        Ok(Self {
            redis_conn: RedisConn::new(redis_cfg)?,
            timelines: HashMap::new(),
            ping_time: Instant::now(),
            channel_id: 0,
            unread_idx: (0, 0),
            tag_id_cache: LruCache::new(1000),
        })
    }

    pub fn into_arc(self) -> Arc<Mutex<Self>> {
        Arc::new(Mutex::new(self))
    }

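    /// Register `channel` under the `Subscription`'s timeline, sending a Redis
    /// `SUBSCRIBE` command if this is the first channel for that timeline.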
    pub fn subscribe(&mut self, subscription: &Subscription, channel: EventChannel) {
        let (tag, tl) = (subscription.hashtag_name.clone(), subscription.timeline);
        if let (Some(hashtag), Some(id)) = (tag, tl.tag()) {
            self.tag_id_cache.put(hashtag.clone(), id);
            self.redis_conn.tag_name_cache.put(id, hashtag);
        };

        let channels = self.timelines.entry(tl).or_default();
        channels.insert(self.channel_id, channel);
        self.channel_id += 1;

        if channels.len() == 1 {
            self.redis_conn
                .send_cmd(RedisCmd::Subscribe, &[tl])
                .unwrap_or_else(|e| log::error!("Could not subscribe to the Redis channel: {}", e));
            log::info!("Subscribed to {:?}", tl);
        };
    }

    fn send_pings(&mut self) -> Result<()> {
        // NOTE: this takes two cycles to close a connection after the client times out: on
        // the first cycle, this successfully sends the Event to the response::Ws thread but
        // that thread fatally errors sending to the client. On the *second* cycle, this
        // gets the error. This isn't ideal, but is harmless.
        self.ping_time = Instant::now();
        let mut subscriptions_to_close = HashSet::new();
        self.timelines.retain(|tl, channels| {
            channels.retain(|_, chan| chan.try_send(Arc::new(Event::Ping)).is_ok());
            if channels.is_empty() {
                subscriptions_to_close.insert(*tl);
                false
            } else {
                true
            }
        });

        if !subscriptions_to_close.is_empty() {
            let timelines: Vec<_> = subscriptions_to_close.into_iter().collect();
            self.redis_conn
                .send_cmd(RedisCmd::Unsubscribe, &timelines[..])?;
            log::info!("Unsubscribed from {:?}", timelines);
        }
        Ok(())
    }

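    /// Recover from a poisoned `Mutex` by logging the error and returning the inner guard.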
    pub fn recover(poisoned: PoisonError<MutexGuard<Self>>) -> MutexGuard<Self> {
        log::error!("{}", &poisoned);
        poisoned.into_inner()
    }

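    /// A human-readable count of current connections (one per client channel).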
    pub fn count(&self) -> String {
        format!(
            "Current connections: {}",
            self.timelines.values().map(HashMap::len).sum::<usize>()
        )
    }

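    /// The size of the unread portion of the Redis input buffer, in KiB, as displayed
    /// by the `/status/backpressure` endpoint when compiled with the `stub_status` feature.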
    pub fn backpresure(&self) -> String {
        format!(
            "Input buffer size: {} KiB",
            (self.unread_idx.1 - self.unread_idx.0) / 1024
        )
    }

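    /// A human-readable list of subscribed timelines and the number of client channels
    /// attached to each.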
    pub fn list(&self) -> String {
        let max_len = self
            .timelines
            .keys()
            .fold(0, |acc, el| acc.max(format!("{:?}:", el).len()));

        self.timelines
            .iter()
            .map(|(tl, channel_map)| {
                let tl_txt = format!("{:?}:", tl);
                format!("{:>1$} {2}\n", tl_txt, max_len, channel_map.len())
            })
            .chain(std::iter::once(
                "\n*may include recently disconnected clients".to_string(),
            ))
            .collect()
    }
}

#[cfg(test)]
mod test;