Typos and language improvements. (#1362)

* Typos and language improvements.
* Table -> defn list per suggestion.
* clarify priority of queues

Signed-off-by: Andy Piper <andypiper@users.noreply.github.com>
Co-authored-by: Michael Stanclift <mx@vmstan.com>
Andy Piper 2024-01-11 13:57:12 +00:00 committed by GitHub
parent fe380facfa
commit 09ac07584a
1 changed file with 61 additions and 47 deletions


In terms of throughput, more processes are better than more threads.
### Streaming API {#streaming}
The streaming API handles long-lived HTTP and WebSocket connections, through which clients receive real-time updates. The following environment variables control it:
- `STREAMING_API_BASE_URL` controls the base URL of the streaming API
- `PORT` controls the port the streaming server will listen on (4000 by default). The `BIND` and `SOCKET` environment variables can also be used.
- Additionally, the shared [database](/admin/config#postgresql) and [Redis](/admin/config#redis) environment variables are used.
The streaming API can use a different subdomain if you want to by setting `STREAMING_API_BASE_URL`. This allows you to have one load balancer for streaming and one for web/API requests. However, this also requires applications to correctly request the streaming URL from the [instance endpoint](/methods/instance/#v2), instead of assuming that it's hosted on the same host as the Web API.
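For illustration, a hypothetical `.env.production` entry for a dedicated streaming subdomain might look like this (the subdomain and port are placeholders for your own setup):

```bash
STREAMING_API_BASE_URL=wss://streaming.example.com
PORT=4000
```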
One process of the streaming server can handle a reasonably high number of connections and throughput, but if you find that a single process isn't handling your instance's load, you can run multiple processes by varying the `PORT` number of each, and then using nginx to load balance traffic across them. For example, for a community of about 50,000 accounts with 10,000-20,000 monthly active accounts, you'll typically see an average concurrent load of about 800-1,200 streaming connections.
The streaming server also exposes a [Prometheus](https://prometheus.io/) endpoint on `/metrics` with many metrics to help you understand the current load on your Mastodon streaming server. Some key metrics are listed below, with an example scrape configuration after the list:
- `mastodon_streaming_connected_clients`: This is the number of connected clients, tagged by client type (websocket or eventsource)
- `mastodon_streaming_connected_channels`: This is the number of "channels" that are currently subscribed (note that this is much higher than connected clients due to how our internal "system" channels currently work)
- `mastodon_streaming_messages_sent_total`: This is the total number of messages sent to clients since last restart.
- `mastodon_streaming_redis_messages_received_total`: This is the number of messages received from Redis pubsub, and intended to complement [monitoring Redis directly](https://sysdig.com/blog/redis-prometheus/).
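As an illustration, a minimal Prometheus scrape job for a single streaming process might look like this (the job name and target are placeholders for your own setup):

```yaml
scrape_configs:
  - job_name: "mastodon-streaming"
    metrics_path: "/metrics"
    static_configs:
      - targets: ["127.0.0.1:4000"]
```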
{{< hint style="info" >}}
The more streaming server processes that you run, the more database connections will be consumed on PostgreSQL, so you'll likely want to use PgBouncer, as documented below.
{{< /hint >}}
An example nginx configuration to route traffic to three different processes on `PORT` 4000, 4001, and 4002 is as follows:
```text
upstream streaming {
least_conn;
server 127.0.0.1:4000 fail_timeout=0;
server 127.0.0.1:4001 fail_timeout=0;
server 127.0.0.1:4002 fail_timeout=0;
}
```
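To route traffic to this upstream, your nginx server block also needs a matching `location`; a minimal sketch, assuming the default `/api/v1/streaming` path and standard nginx WebSocket proxy headers:

```text
location /api/v1/streaming {
    proxy_pass http://streaming;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
}
```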
If you're using the distributed systemd files, then you can start up multiple streaming servers with the following commands:
```bash
$ sudo systemctl start mastodon-streaming@4000.service
$ sudo systemctl start mastodon-streaming@4001.service
$ sudo systemctl start mastodon-streaming@4002.service
```
By default, `sudo systemctl start mastodon-streaming` starts just one process on port 4000, equivalent to running `sudo systemctl start mastodon-streaming@4000.service`.
{{< hint style="warning" >}}
Previous versions of Mastodon had a `STREAMING_CLUSTER_NUM` environment variable that made the streaming server use clustering, which started multiple worker processes and used node.js to load balance them.
This interacted with the other settings in ways which made capacity planning difficult, especially when it comes to database connections and CPU resources. By default, the streaming server would consume resources on all available CPUs which could cause contention with other software running on that server. Another common issue was that misconfiguring the `STREAMING_CLUSTER_NUM` would exhaust your database connections by opening up a connection pool per cluster worker process, so a `STREAMING_CLUSTER_NUM` of `5` and `DB_POOL` of `10` would potentially consume 50 database connections.
Now a single streaming server process will use at most `DB_POOL` PostgreSQL connections, and scaling is handled by running more instances of the streaming server.
{{< /hint >}}
### Sidekiq {#sidekiq}

Many tasks in Mastodon are delegated to background processing to ensure that HTTP requests remain fast.
#### Number of threads {#sidekiq-threads}
While the number of threads in the web process affects the responsiveness of the Mastodon instance to the end-user, the number of threads allocated to background processing affects how quickly posts can be delivered from the author to anyone else, how soon e-mails are sent out, etc.
The number of threads is not regulated by an environment variable, but rather through a command line argument when invoking Sidekiq, as shown in the following example:
```bash
bundle exec sidekiq -c 15
```
This would initiate the Sidekiq process with 15 threads. It is important to note that each thread requires a database connection, so a sufficiently large database pool is needed. The size of this pool is managed by the `DB_POOL` environment variable, which should be set to a value at least equal to the number of threads.
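For example, the two settings might be paired like this (the values are illustrative):

```bash
# 15 Sidekiq threads need at least 15 database connections
DB_POOL=15 bundle exec sidekiq -c 15
```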
#### Queues {#sidekiq-queues}
Sidekiq uses different queues for tasks of varying importance, where importance is defined by how much it would impact the user experience of your servers local users if the queue wasnt working. The queues are listed here, in order of descending importance:
`default`
: All tasks that affect local users.

`push`
: Delivery of payloads to other servers.

`ingress`
: Incoming remote activities. Lower priority than the default queue, so that local users still see their posts when the server is under load.

`mailers`
: Delivery of e-mails.

`pull`
: Lower priority tasks, such as handling imports, backups, resolving threads, deleting users, forwarding replies.

`scheduler`
: Handling cron jobs, such as refreshing trending hashtags and cleaning up logs.
The default queues and their priorities are stored in [config/sidekiq.yml](https://github.com/mastodon/mastodon/blob/main/config/sidekiq.yml), but can be overridden by the command-line invocation of Sidekiq, e.g.:
```bash
bundle exec sidekiq -q default
```
This command will run just the `default` queue.
Sidekiq processes queues by first checking for tasks in the first queue, and if it finds none, it then checks the subsequent queue. Therefore, if the first queue is overfilled, tasks in the other queues may experience delays.
It is possible to start different Sidekiq processes for the queues to ensure truly parallel execution, for example by creating multiple systemd services for Sidekiq with different arguments, as sketched after the note below.
{{< hint style="warning" >}}
You may run as many Sidekiq processes with as many threads as necessary to efficiently process running jobs; however, the `scheduler` queue should never be run in more than one Sidekiq process at a time.
{{< /hint >}}
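As a sketch, assuming you create your own systemd units (the queue groupings and thread counts here are illustrative, not the stock configuration), the per-process invocations could look like this:

```bash
# Illustrative commands, one per dedicated Sidekiq process
bundle exec sidekiq -c 25 -q default -q ingress   # user-facing and inbound federation work
bundle exec sidekiq -c 15 -q push -q pull -q mailers
bundle exec sidekiq -c 5 -q scheduler             # run exactly one scheduler process
```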
## Transaction pooling with PgBouncer {#pgbouncer}
### Why you might need PgBouncer {#pgbouncer-why}
If you start running out of available PostgreSQL connections (the default is 100) then you may find PgBouncer to be a good solution. This document describes some common gotchas, as well as good configuration defaults for Mastodon.
User roles with `DevOps` permissions in Mastodon can monitor the current usage of PostgreSQL connections through the PgHero link in the Administration view. Generally, the number of connections open is equal to the total threads in Puma, Sidekiq, and the streaming API combined.
#### Installing PgBouncer {#pgbouncer-install}

On Debian and Ubuntu, PgBouncer can be installed with:

```bash
sudo apt install pgbouncer
```
#### Setting a password {#pgbouncer-password}
Firstly, if your `mastodon` user in PostgreSQL is set up without a password, you will need to set a password.
Heres how you might reset the password:
Then `\q` to quit.
Edit `/etc/pgbouncer/userlist.txt`
As long as you specify a user/password in `pgbouncer.ini` later, the values in `userlist.txt` do _not_ have to correspond to real PostgreSQL roles. You can arbitrarily define users and passwords, but you can reuse the “real” credentials for simplicitys sake. Add the `mastodon` user to the `userlist.txt`:
```text
"mastodon" "md5d75bb2be2d7086c6148944261a00f605"
```
Here were using the md5 scheme, where the md5 password is just the md5sum of `password + username` with the string `md5` prepended. For instance, to derive the hash for user `mastodon` with password `password`, you can do:
Here were using the md5 scheme, where the md5 password is just the md5sum of `password + username` with the string `md5` prepended. For instance, to derive the hash for user `mastodon` with password `password`:
```bash
# ubuntu, debian, etc.
echo -n "passwordmastodon" | md5sum
```

Edit `/etc/pgbouncer/pgbouncer.ini`
Add a line under `[databases]` listing the PostgreSQL databases you want to connect to. Here well just have PgBouncer use the same username/password and database name to connect to the underlying PostgreSQL database:
```ini
[databases]
mastodon_production = host=127.0.0.1 port=5432 dbname=mastodon_production user=mastodon password=password
```
The `listen_addr` and `listen_port` settings tell PgBouncer which address and port to accept connections on. The defaults are fine:
```ini
listen_addr = 127.0.0.1
listen_port = 6432
```
Make sure the `pgbouncer` user is an admin:

```ini
admin_users = pgbouncer
```
Mastodon requires a different pooling mode than the default session-based one. Specifically, it needs a transaction-based pooling mode. This means that a PostgreSQL connection is established at the start of a transaction and terminated upon its completion. Therefore, it is essential to change the `pool_mode` setting from `session` to `transaction`:
```ini
pool_mode = transaction
```

Next up, `max_client_conn` defines how many connections PgBouncer itself will accept, and `default_pool_size` limits how many PostgreSQL connections will be opened under the hood.
The defaults are fine to start, and you can always increase them later:
```ini
max_client_conn = 100
default_pool_size = 20
```
To make Mastodon connect through PgBouncer instead of directly to PostgreSQL, set the database port in your environment (`.env.production`) to PgBouncer's `listen_port`:

```bash
DB_PORT=6432
```
{{< hint style="warning" >}}
You cannot use PgBouncer to perform `db:migrate` tasks, but this is easy to work around. If your PostgreSQL and PgBouncer are on the same host, it can be as simple as defining `DB_PORT=5432` together with `RAILS_ENV=production` when calling the task, for example: `RAILS_ENV=production DB_PORT=5432 bundle exec rails db:migrate` (you can specify `DB_HOST` too if its different, etc)
{{< /hint >}}
#### Administering PgBouncer {#pgbouncer-admin}
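A minimal sketch of connecting to PgBouncer's admin console, assuming the `admin_users` and `listen_port` values configured above:

```bash
# connect to PgBouncer's built-in admin database
psql -p 6432 -U pgbouncer pgbouncer
```

From there, commands such as `SHOW POOLS;` or `RELOAD;` are available. Use `\q` to quit.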
## Separate Redis for cache {#redis}
Redis is used widely throughout the application, but some uses are more important than others. Home feeds, list feeds, and Sidekiq queues, as well as the streaming API, are backed by Redis, and that's important data you wouldn't want to lose (even though the loss can be survived, unlike the loss of the PostgreSQL database - never lose that!).
Additionally, Redis is used for volatile caching. If you are scaling up and you are concerned about Redis's capacity to handle the load, you can allocate a separate Redis database specifically for caching. To do this, set `CACHE_REDIS_URL` in the environment, or define individual components such as `CACHE_REDIS_HOST`, `CACHE_REDIS_PORT`, etc.
Unspecified components will default to their values without the cache prefix.
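For example (the port here only illustrates a second Redis instance; adjust it to your setup):

```bash
CACHE_REDIS_HOST=127.0.0.1
CACHE_REDIS_PORT=6480
```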
When configuring the Redis database for caching, it is possible to disable background saving to disk, as data loss on restart is not critical in this context, and this can save some disk I/O. Additionally, consider setting a maximum memory limit and implementing a key eviction policy. For more details on these configurations, refer to this guide: [Using Redis as an LRU cache](https://redis.io/topics/lru-cache).
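For instance, a cache-only Redis configuration might include settings along these lines (the memory limit is illustrative):

```bash
maxmemory 256mb
maxmemory-policy allkeys-lru
save ""
```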
## Separate Redis for Sidekiq {#redis-sidekiq}
Redis is used in Sidekiq to keep track of its locks and queue. Although in general the performance gain is not that big, some instances may benefit from having a separate Redis instance for Sidekiq.
In the environment file, you can specify `SIDEKIQ_REDIS_URL` or individual parts like `SIDEKIQ_REDIS_HOST`, `SIDEKIQ_REDIS_PORT`, etc. Unspecified parts fall back to the same values as without the `SIDEKIQ_` prefix.
Creating a separate Redis instance for Sidekiq is relatively simple:
Start by making a copy of the default Redis systemd service:
```bash
cp /etc/systemd/system/redis.service /etc/systemd/system/redis-sidekiq.service
```
In the `redis-sidekiq.service` file, change the following values:
```bash
ExecStart=/usr/bin/redis-server /etc/redis/redis-sidekiq.conf --supervised systemd --daemonize no
PIDFile=/run/redis/redis-server-sidekiq.pid
ReadWritePaths=-/var/lib/redis-sidekiq
Alias=redis-sidekiq.service
```
Make a copy of the Redis configuration file for the new Sidekiq Redis instance:
```bash
cp /etc/redis/redis.conf /etc/redis/redis-sidekiq.conf
```
In this `redis-sidekiq.conf` file, change the following values:
```bash
port 6479
pidfile /var/run/redis/redis-server-sidekiq.pid
```

Start the new Redis instance:

```bash
systemctl enable --now redis-sidekiq
```
Update your environment, and add the following line:
```bash
SIDEKIQ_REDIS_URL=redis://127.0.0.1:6479/
```
Restart Mastodon to use the new Redis instance. Ensure that you restart both web and Sidekiq (otherwise, one of them will still be working from the wrong instance):
```bash
systemctl restart mastodon-web.service
systemctl restart mastodon-sidekiq.service
```

## Read replicas {#read-replicas}
For Mastodon versions before 4.2, you can make use of the replica in Mastodon in these ways:
- The streaming API server does not issue writes at all, so you can connect it straight to the replica (it is not querying the database very often anyway, so the impact of this is small).
- Use the Makara driver in the web and Sidekiq processes, so that writes go to the master database, while reads go to the replica. Lets talk about that.
{{< hint style="warning" >}}
Read replicas are currently not supported for the Sidekiq processes, and using them will lead to failing jobs and data loss.
{{< /hint >}}
You will have to use a separate `config/database.yml` file for the web processes, and edit it to replace the `production` section as follows:
```yaml
production:
  <<: *default
  adapter: postgresql_makara
  prepared_statements: false
  makara:
    id: postgres
    sticky: true
    connections:
      - role: master
        url: postgresql://db_user:db_password@db_host:db_port/db_name
      - role: slave
        url: postgresql://db_user:db_password@db_host:db_port/db_name
```
Make sure that the URLs point to the correct locations for your PostgreSQL servers. You can add multiple replicas. You could have a locally-installed PgBouncer with a configuration to connect to two different servers based on the database name, e.g. “mastodon” going to the primary, “mastodon_replica” going to the replica, so in the file above both URLs would point to the local PgBouncer with the same user, password, host and port, but different database name. There are many possibilities for how this could be set up. For more information on Makara, [see their documentation](https://github.com/taskrabbit/makara#databaseyml).
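For example, such a PgBouncer `[databases]` section could look like this (the hostnames are placeholders):

```ini
[databases]
mastodon_production = host=primary.example.com port=5432 dbname=mastodon_production
mastodon_replica = host=replica.example.com port=5432 dbname=mastodon_production
```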
{{< hint style="warning" >}}
Make sure the Sidekiq processes run with the stock `config/database.yml` to avoid failing jobs and data loss!
{{< /hint >}}