Rework documentation for removal of streaming clustering (#1207)

2025-04-11 22:56:17 +02:00 · 2023-04-27 11:03:18 +02:00 · 2023-04-27 11:03:18 +02:00 · fe555118d6
commit fe555118d6
parent 3630f2de3b
3 changed files with 162 additions and 31 deletions
--- a/content/en/admin/config.md
+++ b/content/en/admin/config.md
@ -205,12 +205,13 @@ Determines the amount of logs generated by Mastodon. Defaults to `info`, which g
 Tells the Mastodon web and streaming processes which IPs act as your trusted reverse proxy (e.g. nginx, Cloudflare). It affects how Mastodon determines the source IP of each request, which is used for important rate limits and security functions. If the value is set incorrectly then Mastodon could use the IP of the reverse proxy instead of the actual source.

 By default the loopback and private network address ranges are trusted. Specifically:
- * `127.0.0.1/8`
- * `::1/128`
- * `10.0.0.0/8`
- * `172.16.0.0/12`
- * `192.168.0.0/16`
- * `fc00::/7`
+
+- `127.0.0.1/8`
+- `::1/128`
+- `10.0.0.0/8`
+- `172.16.0.0/12`
+- `192.168.0.0/16`
+- `fc00::/7`

 If you're using a single reverse proxy and it runs on the same machine or is in the same private network as your Mastodon web and streaming processes then you most likely don't need to modify this setting and can use the default. Or if you're using multiple reverse proxy servers and they're all in the same private network as your Mastodon web and streaming processes then, again, the default should be fine. However, if you're using a reverse proxy server that reaches your Mastodon web and streaming servers via a public IP address (for example if you're using Cloudflare or a similar proxy) then you'll need to set this variable. It should be the IPs of all reverse proxies in use, as a comma-separated list of IPs or IP ranges using [CIDR notation](https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#CIDR_notation). Note that when this variable is set the default ranges (mentioned above) will no longer be trusted, so if you have both an external reverse proxy _and_ a proxy on localhost then you must include the IPs (or IP ranges) of both.

@ -276,7 +277,11 @@ The streaming API can be deployed to a different domain/subdomain. This may impr

 Example value: `wss://streaming.example.com`

-#### `STREAMING_CLUSTER_NUM`
+#### `STREAMING_CLUSTER_NUM` (deprecated) {#streaming_cluster_num}
+
+{{< hint style="danger" >}}
+Deprecated: The streaming server now runs as a single Node.js process. To scale it further, follow the documentation in the [scaling guide](/admin/scaling#streaming).
+{{< /hint >}}

 Specific to the streaming API, this variable determines how many different processes the streaming API forks into. Defaults to the number of CPU cores minus one.

@ -403,16 +408,27 @@ If set, all StatsD keys will be prefixed with this. Defaults to `Mastodon.produc
 ### SMTP email delivery {#smtp}

 #### `SMTP_SERVER`
+
 #### `SMTP_PORT`
+
 #### `SMTP_LOGIN`
+
 #### `SMTP_PASSWORD`
+
 #### `SMTP_FROM_ADDRESS`
+
 #### `SMTP_DOMAIN`
+
 #### `SMTP_DELIVERY_METHOD`
+
 #### `SMTP_AUTH_METHOD`
+
 #### `SMTP_CA_FILE`
+
 #### `SMTP_OPENSSL_VERIFY_MODE`
+
 #### `SMTP_ENABLE_STARTTLS_AUTO`
+
 #### `SMTP_ENABLE_STARTTLS`

 Set to `auto` (default), `always`, or `never`.
@ -421,6 +437,7 @@ Set to `auto` (default), `always`, or `never`.
 4.0.0 - added

 #### `SMTP_TLS`
+
 #### `SMTP_SSL`

 ## File storage {#files}
@ -439,9 +456,9 @@ You must serve the files with CORS headers, otherwise some functions of Mastodon

 #### `S3_ALIAS_HOST`

-Similar to `CDN_HOST`, you may serve *user-uploaded* files from a separate host. In fact, if you are using external storage like Amazon S3, Minio or Google Cloud, you will by default be serving files from those services' URLs.
+Similar to `CDN_HOST`, you may serve _user-uploaded_ files from a separate host. In fact, if you are using external storage like Amazon S3, Minio or Google Cloud, you will by default be serving files from those services' URLs.

-It is *extremely recommended* to use your own host instead, for a few reasons:
+It is _extremely recommended_ to use your own host instead, for a few reasons:

 1. Bandwidth on external storage providers is metered and expensive
 2. You may want to switch to a different provider later without breaking old links
@ -457,36 +474,59 @@ You must serve the files with CORS headers, otherwise some functions of Mastodon
 ### Local file storage {#paperclip}

 #### `PAPERCLIP_ROOT_PATH`
+
 #### `PAPERCLIP_ROOT_URL`

 ### Amazon S3 and compatible {#s3}

 #### `S3_ENABLED`
+
 #### `S3_BUCKET`
+
 #### `AWS_ACCESS_KEY_ID`
+
 #### `AWS_SECRET_ACCESS_KEY`
+
 #### `S3_REGION`
+
 #### `S3_PROTOCOL`
+
 #### `S3_HOSTNAME`
+
 #### `S3_ENDPOINT`
+
 #### `S3_SIGNATURE_VERSION`
+
 #### `S3_OVERRIDE_PATH_STYLE`
+
 #### `S3_OPEN_TIMEOUT`
+
 #### `S3_READ_TIMEOUT`
+
 #### `S3_FORCE_SINGLE_REQUEST`

 ### Swift {#swift}

 #### `SWIFT_ENABLED`
+
 #### `SWIFT_USERNAME`
+
 #### `SWIFT_TENANT`
+
 #### `SWIFT_PASSWORD`
+
 #### `SWIFT_PROJECT_ID`
+
 #### `SWIFT_AUTH_URL`
+
 #### `SWIFT_CONTAINER`
+
 #### `SWIFT_OBJECT_URL`
+
 #### `SWIFT_REGION`
+
 #### `SWIFT_DOMAIN_NAME`
+
 #### `SWIFT_CACHE_TTL`

 ## External authentication {#external-authentication}
@ -498,71 +538,125 @@ You must serve the files with CORS headers, otherwise some functions of Mastodon
 ### LDAP {#ldap}

 #### `LDAP_ENABLED`
+
 #### `LDAP_HOST`
+
 #### `LDAP_PORT`
+
 #### `LDAP_METHOD`
+
 #### `LDAP_BASE`
+
 #### `LDAP_BIND_DN`
+
 #### `LDAP_PASSWORD`
+
 #### `LDAP_UID`
+
 #### `LDAP_SEARCH_FILTER`
+
 #### `LDAP_MAIL`
+
 #### `LDAP_UID_CONVERSION_ENABLED`

 ### PAM {#pam}

 #### `PAM_ENABLED`
+
 #### `PAM_EMAIL_DOMAIN`
+
 #### `PAM_DEFAULT_SERVICE`
+
 #### `PAM_CONTROLLED_SERVICE`

 ### CAS {#cas}

 #### `CAS_ENABLED`
+
 #### `CAS_DISPLAY_NAME`
+
 #### `CAS_URL`
+
 #### `CAS_HOST`
+
 #### `CAS_PORT`
+
 #### `CAS_SSL`
+
 #### `CAS_VALIDATE_URL`
+
 #### `CAS_CALLBACK_URL`
+
 #### `CAS_LOGOUT_URL`
+
 #### `CAS_LOGIN_URL`
+
 #### `CAS_UID_FIELD`
+
 #### `CAS_CA_PATH`
+
 #### `CAS_DISABLE_SSL_VERIFICATION`
+
 #### `CAS_UID_KEY`
+
 #### `CAS_NAME_KEY`
+
 #### `CAS_EMAIL_KEY`
+
 #### `CAS_NICKNAME_KEY`
+
 #### `CAS_FIRST_NAME_KEY`
+
 #### `CAS_LAST_NAME_KEY`
+
 #### `CAS_LOCATION_KEY`
+
 #### `CAS_IMAGE_KEY`
+
 #### `CAS_PHONE_KEY`
+
 #### `CAS_SECURITY_ASSUME_EMAIL_IS_VERIFIED`

 ### SAML {#saml}

 #### `SAML_ENABLED`
+
 #### `SAML_ACS_URL`
+
 #### `SAML_ISSUER`
+
 #### `SAML_IDP_SSO_TARGET_URL`
+
 #### `SAML_IDP_CERT`
+
 #### `SAML_IDP_CERT_FINGERPRINT`
+
 #### `SAML_NAME_IDENTIFIER_FORMAT`
+
 #### `SAML_CERT`
+
 #### `SAML_PRIVATE_KEY`
+
 #### `SAML_SECURITY_WANT_ASSERTION_SIGNED`
+
 #### `SAML_SECURITY_WANT_ASSERTION_ENCRYPTED`
+
 #### `SAML_SECURITY_ASSUME_EMAIL_IS_VERIFIED`
+
 #### `SAML_ATTRIBUTES_STATEMENTS_UID`
+
 #### `SAML_ATTRIBUTES_STATEMENTS_EMAIL`
+
 #### `SAML_ATTRIBUTES_STATEMENTS_FULL_NAME`
+
 #### `SAML_ATTRIBUTES_STATEMENTS_FIRST_NAME`
+
 #### `SAML_ATTRIBUTES_STATEMENTS_LAST_NAME`
+
 #### `SAML_UID_ATTRIBUTE`
+
 #### `SAML_ATTRIBUTES_STATEMENTS_VERIFIED`
+
 #### `SAML_ATTRIBUTES_STATEMENTS_VERIFIED_EMAIL`

 ## Hidden services {#hidden-services}
@ -572,7 +666,9 @@ You must serve the files with CORS headers, otherwise some functions of Mastodon
 {{< page-ref page="admin/optional/tor" >}}

 #### `http_proxy`
+
 #### `http_hidden_proxy`
+
 #### `ALLOW_ACCESS_TO_HIDDEN_SERVICE`

 ## Limits {#limits}
@ -620,13 +716,21 @@ This variable only has any effect when running `rake db:migrate` and it is extre
 ### Uncategorized or unsorted

 #### `BUNDLE_GEMFILE`
+
 #### `DEEPL_API_KEY`
+
 #### `DEEPL_PLAN`
+
 #### `LIBRE_TRANSLATE_ENDPOINT`
+
 #### `LIBRE_TRANSLATE_API_KEY`
+
 #### `CACHE_BUSTER_ENABLED`
+
 #### `CACHE_BUSTER_SECRET_HEADER`
+
 #### `CACHE_BUSTER_SECRET`
+
 #### `GITHUB_REPOSITORY`

 Defaults to `mastodon/mastodon`
@ -636,8 +740,11 @@ Defaults to `mastodon/mastodon`
 Defaults to `https://github.com/$GITHUB_REPOSITORY`

 #### `FFMPEG_BINARY`
+
 #### `LOCAL_HTTPS`
+
 #### `PATH`
+
 #### `MAX_FOLLOWS_THRESHOLD`

 Defaults to `7500`
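
The `TRUSTED_PROXY_IP` paragraph in the first hunk of this file describes a comma-separated list of IPs or CIDR ranges that replaces the default trusted ranges once set. The Python sketch below only illustrates that rule; it is not Mastodon's implementation. The helper names `parse_trusted_proxies` and `is_trusted` are hypothetical, and `203.0.113.0/24` is a placeholder documentation range.

```python
import ipaddress

# Default trusted ranges, copied from the list in the documentation.
DEFAULT_TRUSTED = [
    "127.0.0.1/8", "::1/128", "10.0.0.0/8",
    "172.16.0.0/12", "192.168.0.0/16", "fc00::/7",
]


def parse_trusted_proxies(value=None):
    """Parse a comma-separated list of IPs/CIDR ranges.

    Falls back to the defaults only when no value is given; an explicit
    value replaces the defaults entirely, as the documentation states.
    """
    if value:
        entries = [e.strip() for e in value.split(",") if e.strip()]
    else:
        entries = DEFAULT_TRUSTED
    # strict=False tolerates entries with host bits set, e.g. "127.0.0.1/8".
    return [ipaddress.ip_network(e, strict=False) for e in entries]


def is_trusted(ip, networks):
    """Return True if the address falls inside any trusted network."""
    addr = ipaddress.ip_address(ip)
    return any(addr.version == net.version and addr in net for net in networks)


defaults = parse_trusted_proxies()
custom = parse_trusted_proxies("203.0.113.0/24, 127.0.0.1")

print(is_trusted("10.1.2.3", defaults))   # private range is trusted by default
print(is_trusted("10.1.2.3", custom))     # no longer trusted once the variable is set
print(is_trusted("127.0.0.1", custom))    # localhost must be listed explicitly
```

The second and third checks show why an external proxy and a proxy on localhost both need to appear in the list.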
--- a/content/en/admin/scaling.md
+++ b/content/en/admin/scaling.md
@ -11,16 +11,16 @@ menu:

 Mastodon has three types of processes:

-* Web (Puma)
-* Streaming API
-* Background processing (Sidekiq)
+- Web (Puma)
+- Streaming API
+- Background processing (Sidekiq)

 ### Web (Puma) {#web}

 The web process serves short-lived HTTP requests for most of the application. The following environment variables control it:

-* `WEB_CONCURRENCY` controls the number of worker processes
-* `MAX_THREADS` controls the number of threads per process
+- `WEB_CONCURRENCY` controls the number of worker processes
+- `MAX_THREADS` controls the number of threads per process

 Threads share the memory of their parent process. Different processes allocate their own memory, though they share some memory via copy-on-write. A larger number of threads maxes out your CPU first, a larger number of processes maxes out your RAM first.

@ -32,10 +32,36 @@ In terms of throughput, more processes are better than more threads.

 The streaming API handles long-lived HTTP and WebSockets connections, through which clients receive real-time updates. The following environment variables control it:

-* `STREAMING_CLUSTER_NUM` controls the number of worker processes
-* `STREAMING_API_BASE_URL` controls the base URL of the streaming API
+- `STREAMING_API_BASE_URL` controls the base URL of the streaming API
+- `PORT` controls the port the streaming server listens on (`4000` by default). The `BIND` and `SOCKET` environment variables can also be used.
+- Additionally, the shared [database](/admin/config#postgresql) and [Redis](/admin/config#redis) environment variables are used.

-One process can handle a reasonably high number of connections. The streaming API can be hosted on a different subdomain if you want to e.g. avoid the overhead of nginx proxying the connections.
+The streaming API can use a different subdomain if you set `STREAMING_API_BASE_URL`. This allows you to have one load balancer for streaming and another for web/API requests.
+
+{{< hint style="warning" >}}
+Previous versions of Mastodon had a `STREAMING_CLUSTER_NUM` environment variable that made the streaming server use clustering, which started multiple processes (workers) and used Node.js to load balance them.
+
+This interacted with the other settings in ways that made capacity planning difficult, especially for database connections and CPU resources. By default, the streaming server would consume resources on all available CPUs, which could cause contention with other software running on that server. Another common issue was that misconfiguring `STREAMING_CLUSTER_NUM` would exhaust your database connections by opening a connection pool per cluster worker process, so a `STREAMING_CLUSTER_NUM` of `5` and a `DB_POOL` of `10` could consume up to 50 database connections.
+
+Now a single streaming server process uses at most `DB_POOL` PostgreSQL connections, and scaling is handled by running more instances of the streaming server.
+{{< /hint >}}
+
+One process can handle a reasonably high number of connections and throughput, but if you find that a single streaming server process isn't handling your instance's load, you can run multiple processes, each with a different `PORT`, and then use nginx to load balance traffic across those instances.
+
+{{< hint style="info" >}}
+The more streaming server processes that you run, the more database connections will be consumed on PostgreSQL, so you'll likely want to use PgBouncer, as documented below.
+{{< /hint >}}
+
+An example nginx configuration to route traffic to three different processes on `PORT` 4000, 4001, and 4002 is as follows:
+
+```nginx
+upstream streaming {
+    least_conn;
+    server 127.0.0.1:4000 fail_timeout=0;
+    server 127.0.0.1:4001 fail_timeout=0;
+    server 127.0.0.1:4002 fail_timeout=0;
+}
+```

 ### Background processing (Sidekiq) {#sidekiq}

@ -57,14 +83,14 @@ Would start the sidekiq process with 15 threads. Please mind that each threads n

 Sidekiq uses different queues for tasks of varying importance, where importance is defined by how much it would impact the user experience of your server’s local users if the queue wasn’t working, in order of descending importance:

-| Queue | Significance |
-| :--- | :--- |
-| `default` | All tasks that affect local users |
-| `push` | Delivery of payloads to other servers |
-| `mailers` | Delivery of e-mails |
-| `pull` | Lower priority tasks such as handling imports, backups, resolving threads, deleting users, forwarding replies |
-| `scheduler` | Doing cron jobs like refreshing trending hashtags and cleaning up logs |
-| `ingress` | Incoming remote activities. Lower priority than the default queue so local users still see their posts when the server is under load |
+| Queue       | Significance                                                                                                                         |
+| :---------- | :----------------------------------------------------------------------------------------------------------------------------------- |
+| `default`   | All tasks that affect local users                                                                                                    |
+| `push`      | Delivery of payloads to other servers                                                                                                |
+| `mailers`   | Delivery of e-mails                                                                                                                  |
+| `pull`      | Lower priority tasks such as handling imports, backups, resolving threads, deleting users, forwarding replies                        |
+| `scheduler` | Doing cron jobs like refreshing trending hashtags and cleaning up logs                                                               |
+| `ingress`   | Incoming remote activities. Lower priority than the default queue so local users still see their posts when the server is under load |

 The default queues and their priorities are stored in [config/sidekiq.yml](https://github.com/mastodon/mastodon/blob/main/config/sidekiq.yml), but can be overridden by the command-line invocation of Sidekiq, e.g.:

@ -80,7 +106,6 @@ As a solution, it is possible to start different Sidekiq processes for the queue

 **Make sure you only have one `scheduler` queue running!!**

-
 ## Transaction pooling with pgBouncer {#pgbouncer}

 ### Why you might need PgBouncer {#pgbouncer-why}
@ -258,8 +283,8 @@ As far as configuring the Redis database goes, basically you can get rid of back

 To reduce the load on your Postgresql server, you may wish to setup hot streaming replication (read replica). [See this guide for an example](https://cloud.google.com/community/tutorials/setting-up-postgres-hot-standby). You can make use of the replica in Mastodon in these ways:

-* The streaming API server does not issue writes at all, so you can connect it straight to the replica. But it’s not querying the database very often anyway so the impact of this is little.
-* Use the Makara driver in the web and sidekiq processes, so that writes go to the primary database, while reads go to the replica. Let’s talk about that.
+- The streaming API server does not issue writes at all, so you can connect it straight to the replica. But it’s not querying the database very often anyway so the impact of this is little.
+- Use the Makara driver in the web and sidekiq processes, so that writes go to the primary database, while reads go to the replica. Let’s talk about that.

 You will have to edit the `config/database.yml` file and replace the `production` section as follows:

@ -284,4 +309,3 @@ Make sure the URLs point to wherever your PostgreSQL servers are. You can add mu
 {{< hint style="warning" >}}
 Sidekiq cannot reliably use read-replicas because even the tiniest replication lag leads to failing jobs due to queued up records not being found.
 {{< /hint >}}
-
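
The connection arithmetic in the streaming hunk above (a `STREAMING_CLUSTER_NUM` of `5` with a `DB_POOL` of `10` giving up to 50 connections) applies equally to the new model of running several streaming processes on different ports. The Python sketch below shows only that arithmetic; `worst_case_connections` and `streaming_ports` are hypothetical helper names, and the consecutive port numbering starting at 4000 is an assumption taken from the nginx example.

```python
def worst_case_connections(processes, db_pool):
    """Upper bound on PostgreSQL connections: every process fills its own pool."""
    return processes * db_pool


def streaming_ports(instances, base_port=4000):
    """Consecutive PORT values, one per streaming server process (assumed layout)."""
    return [base_port + i for i in range(instances)]


# Old clustered setup from the text: 5 workers, DB_POOL of 10.
print(worst_case_connections(5, 10))

# New setup matching the nginx upstream: three processes, DB_POOL of 10.
print(streaming_ports(3))
print(worst_case_connections(3, 10))
```

Comparing the bound against the number of connections your PostgreSQL server accepts gives a quick indication of when PgBouncer becomes worthwhile.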
--- a/content/en/admin/upgrading.md
+++ b/content/en/admin/upgrading.md
@ -59,14 +59,14 @@ systemctl reload mastodon-web
 The `reload` operation is a zero-downtime restart, also called "phased restart". As such, Mastodon upgrades usually do not require any advance notice to users about planned downtime. In rare cases, you can use the `restart` operation instead, but there will be a (short) felt interruption of service for your users.
 {{< /hint >}}

-Rarely, the **streaming API** server is also updated and requires a restart:
+The **streaming API** server is also updated and requires a restart. Doing so disconnects all connected clients, which can increase load on your server:

 ```bash
 systemctl restart mastodon-streaming
 ```

 {{< hint style="danger" >}}
-The streaming API server is updated very rarely, and in most releases, does *not* require a restart. Restarting the streaming API leads to an increased load on your server as disconnected clients attempt to reconnect or poll the REST API instead, so avoid it whenever you can.
+Restarting the streaming API leads to an increased load on your server as disconnected clients attempt to reconnect or poll the REST API instead, so avoid it whenever you can.
 {{< /hint >}}

 {{< hint style="success" >}}