flodgatt/Cargo.toml

[package]
name = "flodgatt"
description = "A blazingly fast drop-in replacement for the Mastodon streaming api server"

Improve handling of large Redis input (#143)

* Implement faster buffered input

This commit implements a modified ring buffer for input from Redis. Specifically, Flodgatt now limits the amount of data it fetches from Redis in one syscall to 8 KiB (two pages on most systems). Flodgatt will process all complete messages it receives from Redis and then re-use the same buffer the next time it retrieves data. If Flodgatt received a partial message, it will copy the partial message to the beginning of the buffer before its next read (see the sketch after this message).

This change has little effect on Flodgatt under light load (because it was rare for Redis to have more than 8 KiB of messages available at any one time). However, my hope is that this will significantly reduce memory use on the largest instances.

* Improve handling of backpressure

This commit alters how Flodgatt behaves if it receives enough messages for a single client to fill that client's channel. (Because the clients regularly send their messages, this should only occur if a single client receives a large number of messages nearly simultaneously; this is rare, but could happen, especially on large instances.)

Previously, Flodgatt would drop messages in the rare case when the client's channel was full. Now, Flodgatt will pause the current Redis poll and yield control back to the client streams, allowing the clients to empty their channels; Flodgatt will then resume polling Redis and sending the messages it previously received. With this approach, Flodgatt will never drop messages.

However, the risk of this approach is that, by never dropping messages, Flodgatt has no way to reduce the amount of work it needs to do when under heavy load: it delays the work slightly, but does not reduce it. This means it is *theoretically* possible for Flodgatt to fall increasingly behind if it continuously receives more messages than it can process. Given how quickly Flodgatt can process messages, though, I suspect this would only come up if an admin were running Flodgatt in a *significantly* resource-constrained environment, but I wanted to mention it for the sake of completeness.

This commit also adds a new /status/backpressure endpoint that displays the current length of the Redis input buffer (which should typically be low or 0). Like the other /status endpoints, this endpoint is only enabled when Flodgatt is compiled with the `stub_status` feature.
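
Below is a rough sketch of the buffering strategy described above, for illustration only. The 8 KiB limit comes from the commit message, but everything else here is an assumption: the `poll_redis` and `parse_complete` names are made up, the input is framed by newlines rather than parsed as the Redis protocol, and the blocking `std::io::Read` interface stands in for Flodgatt's actual async I/O.

```rust
use std::io::Read;

const BUFFER_SIZE: usize = 8192; // 8 KiB: two pages on most systems

// Illustrative only: reads into a fixed buffer, handles complete messages,
// and carries any partial message over to the next read.
fn poll_redis(conn: &mut impl Read) -> std::io::Result<()> {
    let mut buffer = [0u8; BUFFER_SIZE];
    let mut carried = 0; // bytes of a partial message left from the last read

    loop {
        // Fetch at most enough bytes to fill the rest of the 8 KiB buffer.
        // (A single message larger than the buffer would need extra
        // handling, which this sketch omits.)
        let n = conn.read(&mut buffer[carried..])?;
        if n == 0 {
            return Ok(()); // connection closed
        }
        let filled = carried + n;

        // Handle every complete message currently in the buffer.
        let consumed = parse_complete(&buffer[..filled]);

        // Copy any trailing partial message to the front of the buffer so
        // the next read appends to it instead of overwriting it.
        buffer.copy_within(consumed..filled, 0);
        carried = filled - consumed;
    }
}

// Placeholder parser: treats input as newline-delimited frames and returns
// how many bytes formed complete frames. The real code would route each
// complete message to the appropriate clients here.
fn parse_complete(input: &[u8]) -> usize {
    input
        .iter()
        .rposition(|&b| b == b'\n')
        .map_or(0, |last_newline| last_newline + 1)
}

fn main() -> std::io::Result<()> {
    let mut fake_redis = std::io::Cursor::new(b"msg one\nmsg two\npartial".to_vec());
    poll_redis(&mut fake_redis)
}
```

The key property of this sketch is that the buffer is allocated once and re-used for every read, so its memory use stays bounded at roughly 8 KiB regardless of how much data Redis has queued.
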
version = "0.9.6"
authors = ["Daniel Long Sockwell <daniel@codesections.com>", "Julian Laubstein <contact@julianlaubstein.de>"]
edition = "2018"

[dependencies]
log = { version = "0.4.6", features = ["release_max_level_info"] }
futures = "0.1.26"
tokio = "0.1.19"
warp = { git = "https://github.com/seanmonstar/warp.git"}
serde = { version = "1.0.105", features = ["derive"] }
serde_json = "1.0.50"
serde_derive = "1.0.90"
pretty_env_logger = "0.3.0"
postgres = "0.17.0"
dotenv = "0.15.0"
postgres-openssl = { git = "https://github.com/sfackler/rust-postgres.git"}
url = "2.1.0"
strum = "0.16.0"
strum_macros = "0.16.0"
r2d2_postgres = "0.16.0"
r2d2 = "0.8.8"
lru = "0.4.3"
urlencoding = "1.0.0"
hashbrown = "0.7.1"

[dev-dependencies]
criterion = "0.3"

[[bench]]
name = "parse_redis"
harness = false

[features]
default = [ "production" ]
bench = []
stub_status = []
production = []

[profile.release]
lto = "fat"
panic = "abort"
codegen-units = 1

Stream events via a watch channel (#128)

This squashed commit makes a fairly significant structural change in order to reduce Flodgatt's CPU usage. Flodgatt connects to Redis in a single (green) thread, and then creates a new thread to handle each WebSocket/SSE connection. Previously, each client thread was responsible for polling the Redis thread to determine whether it had a message relevant to the connected client. I initially selected this structure both because it was simple and because it minimized memory overhead: no messages are sent to a particular thread unless they are relevant to the client connected to that thread. However, I recently ran some load tests showing that this approach has unacceptable CPU costs when 300+ clients are connected simultaneously.

Accordingly, Flodgatt now uses a different structure: the main Redis thread announces each incoming message via a watch channel connected to every client thread, and each client thread filters out irrelevant messages (see the sketch below). In theory, this could lead to slightly higher memory use, but the tests I have run so far have not found a measurable increase. On the other hand, Flodgatt's CPU use is now an order of magnitude lower in the tests I've run.

This approach does run a (very slight) risk of dropping messages under extremely heavy load: because a watch channel only stores the most recent message transmitted, if Flodgatt adds a second message before a thread can read the first message, the first message will be overwritten and never transmitted. This seems unlikely to happen in practice, and we can avoid the issue entirely by changing to a broadcast channel when we upgrade to the most recent Tokio version (see #75).
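
A minimal sketch of this fan-out pattern follows. It is written against the current tokio 1.x `watch` API purely for brevity (this manifest pins tokio 0.1, so Flodgatt's real code necessarily differs), and the `Event` type and the timeline-based filter are illustrative placeholders rather than Flodgatt's actual types.

```rust
use std::time::Duration;
use tokio::sync::watch;

// Illustrative stand-in for whatever Flodgatt actually sends per message.
#[derive(Clone, Debug, Default)]
struct Event {
    timeline: String, // which timeline the message belongs to
    payload: String,  // the JSON payload to forward
}

#[tokio::main] // assumes tokio's "full" feature set for this sketch
async fn main() {
    // The single Redis poller holds the Sender; every client task holds a
    // clone of the Receiver.
    let (tx, rx) = watch::channel(Event::default());

    // One task per connected client: each sees every announced event and
    // filters out the ones that are not relevant to it.
    for client_timeline in ["public", "user:1"] {
        let mut rx = rx.clone();
        tokio::spawn(async move {
            while rx.changed().await.is_ok() {
                let event = rx.borrow().clone();
                if event.timeline == client_timeline {
                    println!("{client_timeline} <- {}", event.payload);
                }
            }
        });
    }

    // The Redis poller announces each incoming message exactly once.
    // Because a watch channel stores only the latest value, a slow client
    // can miss an event that is overwritten before it reads it; that is
    // the trade-off described in the commit message above.
    tx.send(Event {
        timeline: "public".into(),
        payload: "{\"id\":1}".into(),
    })
    .expect("at least one receiver is still alive");

    tokio::time::sleep(Duration::from_millis(50)).await; // let client tasks run
}
```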