flodgatt/src/messages/event/dynamic_event.rs

use crate::parse_client_request::Blocks;
use hashbrown::HashSet;
use serde::{Deserialize, Serialize};
use serde_json::Value;
#[derive(Deserialize, Serialize, Debug, Clone, PartialEq)]
pub struct DynamicEvent {
    pub event: String,
    pub payload: Value,
    queued_at: Option<i64>,
}

impl DynamicEvent {
    /// Returns `true` if the status is filtered out based on its language.
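    ///
    /// A sketch of the intended use (illustrative only; the allow-list shown here is
    /// hypothetical, not taken from a real client request):
    ///
    /// ```ignore
    /// let allowed_langs: HashSet<String> = ["en".to_string()].iter().cloned().collect();
    /// if event.language_not(&allowed_langs) {
    ///     // The event is filtered out and never sent to this client.
    /// }
    /// ```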
    pub fn language_not(&self, allowed_langs: &HashSet<String>) -> bool {
        const ALLOW: bool = false;
        const REJECT: bool = true;

        if allowed_langs.is_empty() {
            return ALLOW; // Listing no allowed_langs allows all languages
        }

        match self.payload["language"].as_str() {
            Some(toot_language) if allowed_langs.contains(toot_language) => ALLOW,
            None => ALLOW, // If the toot's language is unknown, the toot is always allowed
            Some(empty) if empty.is_empty() => ALLOW, // An empty language tag counts as unknown
            Some(_toot_language) => REJECT,
        }
    }
    /// Returns `true` if the toot contained in this Event originated from a blocked domain,
    /// is from an account that has blocked the current user, or if the user's list of
    /// blocked/muted users includes a user involved in the toot.
    ///
    /// A user is involved in the toot if they:
    /// * Are mentioned in this toot
    /// * Wrote this toot
    /// * Wrote a toot that this toot is replying to (if any)
    /// * Wrote the toot that this toot is boosting (if any)
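    ///
    /// A sketch of the intended call site (illustrative only; how a `Blocks` value is
    /// built depends on `parse_client_request` and is not shown here):
    ///
    /// ```ignore
    /// if event.involves_any(&blocks) {
    ///     // The client thread drops this event instead of forwarding it.
    /// }
    /// ```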
    pub fn involves_any(&self, blocks: &Blocks) -> bool {
        const ALLOW: bool = false;
        const REJECT: bool = true;

        let Blocks {
            blocked_users,
            blocking_users,
            blocked_domains,
        } = blocks;
        let id = self.payload["account"]["id"].as_str().expect("TODO");
        let username = self.payload["account"]["acct"].as_str().expect("TODO");
        if self.involves(blocked_users) || blocking_users.contains(&id.parse().expect("TODO")) {
            REJECT
        } else {
            match username.split('@').nth(1) {
                Some(originating_domain) if blocked_domains.contains(originating_domain) => REJECT,
                Some(_) | None => ALLOW, // None means the local instance, which can't be blocked
            }
        }
    }
    // involved_users = mentioned users + author + replied-to user + boosted user
    fn involves(&self, blocked_users: &HashSet<i64>) -> bool {
        // Mentioned users
        let mentions = self.payload["mentions"].as_array().expect("TODO");
        let mut involved_users: HashSet<i64> = mentions
            .iter()
            .map(|mention| mention["id"].as_str().expect("TODO").parse().expect("TODO"))
            .collect();

        // Author
        let author_id = self.payload["account"]["id"].as_str().expect("TODO");
        involved_users.insert(author_id.parse::<i64>().expect("TODO"));

        // Replied-to user (if any)
        let replied_to_user = self.payload["in_reply_to_account_id"].as_str();
        if let Some(user_id) = replied_to_user {
            involved_users.insert(user_id.parse().expect("TODO"));
        }

        // Boosted user (if any). `payload["reblog"]` is null for toots that are not
        // boosts, so a missing id must not panic here.
        if let Some(id_of_boosted_user) = self.payload["reblog"]["account"]["id"].as_str() {
            involved_users.insert(id_of_boosted_user.parse().expect("TODO"));
        }
        !involved_users.is_disjoint(blocked_users)
    }
}
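
// A minimal sketch of the language filter's expected behavior. The JSON below is
// illustrative: it contains only the fields `language_not` actually reads, not a
// full Mastodon status payload.
#[cfg(test)]
mod test {
    use super::DynamicEvent;
    use hashbrown::HashSet;
    use serde_json::json;

    #[test]
    fn language_not_filters_on_allowed_langs() {
        let event: DynamicEvent = serde_json::from_value(json!({
            "event": "update",
            "payload": { "language": "en" },
            "queued_at": null,
        }))
        .expect("valid test JSON");

        // An empty allow-list lets every language through
        assert!(!event.language_not(&HashSet::new()));

        // A toot in a language outside the allow-list is filtered out
        let mut allowed: HashSet<String> = HashSet::new();
        allowed.insert("de".to_string());
        assert!(event.language_not(&allowed));

        // Once the toot's language is in the allow-list, the toot is allowed again
        allowed.insert("en".to_string());
        assert!(!event.language_not(&allowed));
    }
}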