A blazingly fast drop-in replacement for the Mastodon streaming API server


Daniel Sockwell 1657113c58 Stream events via a watch channel (#128 ) This squashed commit makes a fairly significant structural change to significantly reduce Flodgatt's CPU usage. Flodgatt connects to Redis in a single (green) thread, and then creates a new thread to handle each WebSocket/SSE connection. Previously, each thread was responsible for polling the Redis thread to determine whether it had a message relevant to the connected client. I initially selected this structure both because it was simple and because it minimized memory overhead – no messages are sent to a particular thread unless they are relevant to the client connected to the thread. However, I recently ran some load tests that show this approach to have unacceptable CPU costs when 300+ clients are simultaneously connected. Accordingly, Flodgatt now uses a different structure: the main Redis thread now announces each incoming message via a watch channel connected to every client thread, and each client thread filters out irrelevant messages. In theory, this could lead to slightly higher memory use, but tests I have run so far have not found a measurable increase. On the other hand, Flodgatt's CPU use is now an order of magnitude lower in tests I've run. This approach does run a (very slight) risk of dropping messages under extremely heavy load: because a watch channel only stores the most recent message transmitted, if Flodgatt adds a second message before the thread can read the first message, the first message will be overwritten and never transmitted. This seems unlikely to happen in practice, and we can avoid the issue entirely by changing to a broadcast channel when we upgrade to the most recent Tokio version (see #75).		2020-04-09 13:32:36 -04:00
.github	Create FUNDING.yml	2019-07-10 23:21:18 +02:00
benches	Handle non conforment events (#117 )	2020-04-03 12:41:53 -04:00
src	Stream events via a watch channel (#128 )	2020-04-09 13:32:36 -04:00
.env	Add aditional Postgres config options	2019-08-27 18:31:56 -04:00
.gitignore	Initial project files	2019-02-11 09:45:14 +01:00
.travis.yml	Add Travis CI	2019-07-10 23:17:40 +02:00
Cargo.lock	Stream events via a watch channel (#128 )	2020-04-09 13:32:36 -04:00
Cargo.toml	Stream events via a watch channel (#128 )	2020-04-09 13:32:36 -04:00
LICENSE	Initial commit	2019-02-08 10:35:26 +01:00
README.md	Fix typo and reword a sentence (#126 )	2020-04-06 17:51:21 -04:00
old	Handle non conforment events (#117 )	2020-04-03 12:41:53 -04:00
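
The structure described in the latest commit message above (one Redis thread announcing each message, every client thread filtering for itself, and only the most recent message being stored) can be modelled with the standard library alone. The sketch below is not Flodgatt's code and does not use Tokio's watch channel; `Event`, `Sender`, `Receiver`, `channel`, `subscribe`, and `poll` are hypothetical names chosen for illustration. It only aims to show why a second message sent before a client reads the first one overwrites it.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical event type: the timeline an update belongs to, plus its payload.
#[derive(Clone, Debug, PartialEq)]
pub struct Event {
    pub timeline: String,
    pub payload: String,
}

/// Shared slot that stores only the most recent event and a version counter.
struct Slot {
    version: u64,
    latest: Option<Event>,
}

/// Sending half, held by the single thread that reads from Redis.
pub struct Sender {
    slot: Arc<Mutex<Slot>>,
}

/// Receiving half, one per connected client.
pub struct Receiver {
    slot: Arc<Mutex<Slot>>,
    seen: u64,
    timeline: String,
}

/// Create the sending half of an empty channel.
pub fn channel() -> Sender {
    Sender {
        slot: Arc::new(Mutex::new(Slot { version: 0, latest: None })),
    }
}

impl Sender {
    /// Announce an event to every receiver. The previous event is
    /// overwritten, whether or not every receiver has read it.
    pub fn send(&self, event: Event) {
        let mut slot = self.slot.lock().unwrap();
        slot.version += 1;
        slot.latest = Some(event);
    }

    /// Create a receiver for a client subscribed to `timeline`.
    /// Events sent before this call are not delivered to it.
    pub fn subscribe(&self, timeline: &str) -> Receiver {
        let seen = self.slot.lock().unwrap().version;
        Receiver {
            slot: Arc::clone(&self.slot),
            seen,
            timeline: timeline.to_string(),
        }
    }
}

impl Receiver {
    /// Return the stored event if it is new to this client and relevant to
    /// its timeline. Irrelevant events are marked as seen and discarded.
    pub fn poll(&mut self) -> Option<Event> {
        let slot = self.slot.lock().unwrap();
        if slot.version == self.seen {
            return None;
        }
        self.seen = slot.version;
        slot.latest.clone().filter(|e| e.timeline == self.timeline)
    }
}
```

In this model, calling `send` twice before a receiver calls `poll` delivers only the second event to that receiver, which is the (unlikely) message-loss case the commit message describes. A broadcast-style channel would instead queue both events for each receiver.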

README.md

Flóðgátt

A blazingly fast drop-in replacement for the Mastodon streaming API server.

Current status: This server is a work in progress. It is now testable and, if configured properly, could in theory be used in production, though production use is not advisable until we have completed further testing. I would greatly appreciate any testing, bug reports, or other feedback you could provide.

Installation

Starting from version 0.3, Flóðgátt can be installed for Linux by installing the pre-built binaries released on GitHub. Simply download the binary (extracting it if necessary), set it to executable (chmod +x) and run it. Note that you will likely need to configure the Postgres connection before you can successfully connect.

Configuration Examples

If you are running Mastodon with its standard Development settings, then you should be able to run flodgatt without any configuration. (You will, of course, need to ensure that the Node streaming server is not running at the same time as Flodgatt. If you normally run the development servers with foreman start, you should edit the Procfile.dev file to remove the line that starts the Node server. To run flodgatt with a production instance of Mastodon, you should ensure that the mastodon-streaming systemd service is not running.)

You will likely wish to use the environmental variable RUST_LOG=warn to enable debugging warnings.

If you are running Mastodon with its standard Production settings and connect to Postgres with the Ident authentication method, then you can use the following procedure to launch Flodgatt.

Change to the user that satisfies the Ident requirement (typically "mastodon" with default settings). For example: su mastodon
Use environmental variables to set the user, database, and host names. For example: DB_NAME="mastodon_production" DB_USER="mastodon" DB_HOST="/var/run/postgresql" RUST_LOG=warn flodgatt

If you have any difficulty connecting, note that when run with RUST_LOG=warn, Flodgatt prints both the environmental variables it received and the configuration it parsed from them. You can use this output to debug the connection.
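
To make the role of the three variables in the example above concrete, here is a minimal sketch of how a program might read DB_USER, DB_NAME, and DB_HOST and turn them into a libpq-style connection string. It is an illustration only, not the logic in /src/config.rs: `PgSettings`, `read_settings`, `from_process_env`, and `to_conn_string` are hypothetical names, and unlike Flodgatt (which has default values) this sketch treats a missing variable as an error.

```rust
/// Hypothetical holder for the three settings used in the example above.
#[derive(Debug, PartialEq)]
pub struct PgSettings {
    pub user: String,
    pub dbname: String,
    pub host: String,
}

/// Build the settings from any key/value lookup, so the same logic can be
/// driven by the real process environment or by fixed values.
pub fn read_settings<F>(lookup: F) -> Result<PgSettings, String>
where
    F: Fn(&str) -> Option<String>,
{
    // Assumption of this sketch: every variable is required.
    let get = |key: &str| lookup(key).ok_or_else(|| format!("{} is not set", key));
    Ok(PgSettings {
        user: get("DB_USER")?,
        dbname: get("DB_NAME")?,
        host: get("DB_HOST")?,
    })
}

/// Read the settings from the process environment.
pub fn from_process_env() -> Result<PgSettings, String> {
    read_settings(|key| std::env::var(key).ok())
}

impl PgSettings {
    /// Render a libpq-style keyword/value connection string. In libpq, a
    /// host value that begins with `/` names a Unix socket directory.
    pub fn to_conn_string(&self) -> String {
        format!("host={} user={} dbname={}", self.host, self.user, self.dbname)
    }
}
```

With the values from the example, `to_conn_string` yields `host=/var/run/postgresql user=mastodon dbname=mastodon_production`. Comparing a string like this with what Flodgatt prints under RUST_LOG=warn is one way to check that the variables were picked up as intended.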

Flóðgátt is tested against the default Mastodon nginx config and treats that as the known-good configuration.

Advanced Configuration

The streaming server will eventually use the same environment variables as the rest of Mastodon, and currently uses a subset of those variables. Supported variables are listed in /src/config.rs. You can provide any supported environmental variable to Flóðgátt at runtime or through a .env file.

Note that the default values for the Postgres connection do not correspond to those typically used in production. Thus, you will need to configure the connection with either environmental variables or a .env file if you intend to connect Flóðgátt to a production database.

If you set the SOCKET environmental variable, you must set the nginx proxy_pass directive to the same socket (with the file path prefixed by http://unix:).

Additionally, note that connecting Flóðgátt to Postgres with the ident method requires running Flóðgátt as the user who owns the mastodon database (typically mastodon).

Building from source

Installing from source requires the Rust toolchain. Clone this repository and run cargo build (to build the server), or cargo build --release (to build the server with release optimizations).

Running the built server

You can run the server with cargo run. Alternatively, if you built the server using cargo build or cargo build --release, you can run the executable produced in the target/debug folder or the target/release folder, respectively.

Building documentation

Build documentation with cargo doc --open, which will generate the HTML docs from the code's documentation comments and open them in your browser. Please consult those docs for a detailed description of the code structure/organization. The documentation also contains additional notes about data flow and options for configuration.

Testing

You can run basic unit tests with cargo test.

Manual testing

Once the streaming server is running, you can also test it manually, either with a browser connected to the relevant Mastodon development server or with a standalone client. You can test the SSE endpoints with curl, Postman, or any other HTTP client. Similarly, you can test the WebSocket endpoints with websocat or any other WebSocket client.
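
When testing the SSE endpoints by hand, it helps to know how to read the raw stream. The sketch below is a simplified parser for the standard server-sent events wire format (field lines, comment lines that start with a colon, and a blank line that ends each event). It is not part of Flodgatt: `SseEvent` and `parse_sse` are hypothetical names, only the `event` and `data` fields are handled, and the sample input in the usage note is made up rather than captured from a real server.

```rust
/// One parsed server-sent event (only the fields this sketch handles).
#[derive(Debug, PartialEq)]
pub struct SseEvent {
    pub event: String,
    pub data: String,
}

/// Parse a complete SSE response body into events.
/// An event is emitted only once its terminating blank line is seen.
pub fn parse_sse(input: &str) -> Vec<SseEvent> {
    let mut events = Vec::new();
    let mut event = String::new();
    let mut data: Vec<&str> = Vec::new();

    for line in input.lines() {
        if line.is_empty() {
            // A blank line dispatches the pending event, if it has data.
            if !data.is_empty() {
                let name = if event.is_empty() { "message".to_string() } else { event.clone() };
                events.push(SseEvent { event: name, data: data.join("\n") });
            }
            event.clear();
            data.clear();
            continue;
        }
        if line.starts_with(':') {
            // Comment line, commonly used as a keep-alive; ignore it.
            continue;
        }
        // Split "field: value"; a line without a colon is a bare field name.
        let (field, value) = match line.find(':') {
            Some(i) => (&line[..i], &line[i + 1..]),
            None => (line, ""),
        };
        // A single leading space after the colon is not part of the value.
        let value = if value.starts_with(' ') { &value[1..] } else { value };
        match field {
            "event" => event = value.to_string(),
            "data" => data.push(value),
            _ => {} // `id` and `retry` are skipped in this sketch.
        }
    }
    events
}
```

For example, passing the made-up body `":keep-alive\nevent: update\ndata: {\"id\":\"1\"}\n\n"` to `parse_sse` returns a single event named `update` whose data is the JSON text. An event whose blank-line terminator has not arrived yet is not returned.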

Memory/CPU usage

Note that memory usage is higher when running the development version of the streaming server (the one generated with cargo run or cargo build). If you are interested in measuring RAM or CPU usage, you should likely run cargo build --release and test the release version of the executable.

Load testing

I have not yet found a good way to test the streaming server under load. I have experimented with using artillery or other load-testing utilities. However, every utility I am familiar with or have found is built around either HTTP requests or WebSocket connections in which the client sends messages. I have not found a good solution to test receiving SSEs or WebSocket connections where the client does not transmit data after establishing the connection. If you are aware of a good way to do load testing, please let me know.

Contributing

Issues and pull requests are welcome. Flóðgátt is governed by the same Code of Conduct as Mastodon as a whole.