flodgatt/src/lib.rs

//! Streaming server for Mastodon
//!
//! This server provides live, streaming updates for Mastodon clients. Specifically, when a
//! Mastodon server is running this server, Mastodon clients can use either Server Sent Events
//! or WebSockets to connect to it with the API described [in Mastodon's public API
//! documentation](https://docs.joinmastodon.org/api/streaming/).
//!
//! # Data Flow
//! * **Parsing the client request**: When the client request first comes in, it is
//! parsed based on the endpoint it targets (for Server Sent Events), its query parameters,
//! and its headers (for WebSocket). Based on this data, we authenticate the user, retrieve
//! relevant user data from Postgres, and determine the timeline targeted by the request.
//! Successfully parsing the client request results in generating a `User` corresponding to
//! the request. If a request is invalid or not authorized, we reject it in this stage.
//! * **Streaming updates from Redis to the client**: After the user request is parsed, we pass
//! the `User` data on to the `ClientAgent`. The `ClientAgent` is responsible for
//! communicating the user's request to the `Receiver`, polling the `Receiver` for any
//! updates, and then forwarding those updates on to the client. The `Receiver`, in turn, is
//! responsible for managing the Redis subscriptions, periodically polling Redis, and sorting
//! the replies from Redis into queues for when it is polled by the `ClientAgent` (a rough
//! sketch follows this list).
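//!
//! A rough sketch of one connection's lifecycle, with hypothetical names standing in for
//! the crate's real types and functions (see the `request` and `response` modules):
//!
//! ```ignore
//! // Illustrative outline only — not the crate's exact API.
//! let user = parse_and_authenticate(client_request)?; // invalid requests rejected here
//! let mut client_agent = ClientAgent::new(shared_receiver, &user);
//! loop {
//!     // The ClientAgent polls the Receiver, which manages the Redis subscriptions
//!     // and sorts Redis replies into per-timeline queues.
//!     if let Some(update) = client_agent.poll() {
//!         send_to_client(&user, update); // via SSE or WebSocket
//!     }
//! }
//! ```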
//!
//! # Concurrency
//! The `Receiver` is created when the server is first initialized, and there is only one
//! `Receiver`. Thus, the `Receiver` is a potential bottleneck. On the other hand, each
//! client request results in a new green thread, which spawns its own `ClientAgent`. Thus,
//! there will be many `ClientAgent`s polling a single `Receiver`. Accordingly, it is very
//! important that polling the `Receiver` remain as fast as possible. It is less important
//! that the `Receiver`'s poll of Redis be fast, since there will only ever be one
//! `Receiver`.
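//!
//! One plausible shape for that sharing, assuming the single `Receiver` lives behind an
//! `Arc<Mutex<_>>` (an illustration, not necessarily the crate's exact mechanism):
//!
//! ```ignore
//! // Illustrative only.
//! let receiver = Arc::new(Mutex::new(Receiver::new(redis_config)));
//!
//! // Each client connection gets its own green thread and its own ClientAgent, but all
//! // ClientAgents poll the one shared Receiver — so each poll must lock briefly and
//! // return quickly, ideally just draining this client's queue.
//! let mut client_agent = ClientAgent::new(Arc::clone(&receiver), &user);
//! let update = client_agent.poll();
//! ```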
//!
//! # Configuration
//! By default, the server uses config values from the `config.rs` module;
//! these values can be overwritten with environment variables or in the `.env` file. The
//! most important settings for performance control the frequency with which the `ClientAgent`
//! polls the `Receiver` and the frequency with which the `Receiver` polls Redis.
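//!
//! For instance, a polling interval could be overridden through an environment variable
//! roughly like this (the variable name below is illustrative; the real setting names are
//! defined in the `config` module):
//!
//! ```
//! // Hypothetical setting name, shown only to illustrate the override mechanism.
//! let redis_poll_ms: u64 = std::env::var("REDIS_POLL_INTERVAL")
//!     .ok()
//!     .and_then(|s| s.parse().ok())
//!     .unwrap_or(100); // fall back to the compiled-in default
//! assert!(redis_poll_ms >= 1);
//! ```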
//!
#![warn(clippy::pedantic)]
#![allow(clippy::try_err, clippy::match_bool)]
#![allow(clippy::large_enum_variant)]
pub use err::Error;
pub mod config;
mod err;
pub mod request;
pub mod response;
/// A user ID.
///
/// Internally, Mastodon IDs are i64s, but are sent to clients as strings because
/// JavaScript numbers don't support i64s. This newtype serializes to/from a string, but
/// keeps the i64 as the "true" value for internal use.
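///
/// One common way to implement the string serialization side with `serde` is sketched
/// below (illustrative; not necessarily this crate's exact implementation):
///
/// ```ignore
/// use serde::{Serialize, Serializer};
///
/// impl Serialize for Id {
///     fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
///         // Clients receive the i64 as a string, since JavaScript numbers cannot
///         // represent every i64 exactly.
///         serializer.collect_str(&self.0)
///     }
/// }
/// ```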
#[derive(Debug, Copy, Clone, PartialEq, Eq, Hash)]
#[doc(hidden)]
pub struct Id(pub i64);