---
title: Scaling up your server
description: Optimizations that can be done to serve more users.
menu:
  docs:
    weight: 100
    parent: admin
---

## Managing concurrency

Mastodon has three types of processes:

- Web (Puma)
- Streaming API
- Background processing (Sidekiq)

### Web (Puma)

The web process serves short-lived HTTP requests for most of the application. The following environment variables control it:

- `WEB_CONCURRENCY` controls the number of worker processes
- `MAX_THREADS` controls the number of threads per process

Threads share the memory of their parent process. Different processes allocate their own memory, though they share some memory via copy-on-write. A larger number of threads maxes out your CPU first, and a larger number of processes maxes out your RAM first.

These values affect how many HTTP requests can be served at the same time.

In terms of throughput, more processes are better than more threads.
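For example, a small instance on a machine with a few CPU cores might start from values like these in `.env.production` (illustrative assumptions, not recommendations — tune them against your own CPU and RAM):

```bash
# 3 worker processes × 5 threads each = up to 15 concurrent web requests
WEB_CONCURRENCY=3
MAX_THREADS=5
```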

### Streaming API

The streaming API handles long-lived HTTP and WebSockets connections, through which clients receive real-time updates. The following environment variables control it:

- `STREAMING_API_BASE_URL` controls the base URL of the streaming API
- `PORT` controls the port the streaming server will listen on, 4000 by default. The `BIND` and `SOCKET` environment variables can also be used.
- Additionally, the shared database and Redis environment variables are used.

The streaming API can be served from a different subdomain by setting `STREAMING_API_BASE_URL`. This allows you to have one load balancer for streaming and one for web/API requests.
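For example, to serve the streaming API from its own hostname (the domain below is a placeholder; the value goes in `.env.production`):

```bash
STREAMING_API_BASE_URL=wss://streaming.example.com
```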

{{< hint style="warning" >}} Previous versions of Mastodon had a `STREAMING_CLUSTER_NUM` environment variable that made the streaming server use clustering, which started multiple processes (workers) and used Node.js to load balance them.

This interacted with the other settings in ways that made capacity planning difficult, especially when it came to database connections and CPU resources. By default, the streaming server would consume resources on all available CPUs, which could cause contention with other software running on that server. Another common issue was that misconfiguring `STREAMING_CLUSTER_NUM` would exhaust your database connections by opening a connection pool per cluster worker process, so a `STREAMING_CLUSTER_NUM` of 5 and a `DB_POOL` of 10 would potentially consume 50 database connections.

Now a single streaming server process will use at most `DB_POOL` PostgreSQL connections, and scaling is handled by running more instances of the streaming server. {{< /hint >}}

One process can handle a reasonably high number of connections and throughput, but if you find that a single streaming server process isn't handling your instance's load, you can run multiple processes by varying the PORT number of each and then using nginx to load balance traffic to each of those instances.

{{< hint style="info" >}} The more streaming server processes that you run, the more database connections will be consumed on PostgreSQL, so you'll likely want to use PgBouncer, as documented below. {{< /hint >}}

An example nginx configuration to route traffic to three different processes listening on ports 4000, 4001, and 4002 is as follows:

```nginx
upstream streaming {
    least_conn;
    server 127.0.0.1:4000 fail_timeout=0;
    server 127.0.0.1:4001 fail_timeout=0;
    server 127.0.0.1:4002 fail_timeout=0;
}
```
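One way to run those three processes is with a templated systemd unit whose instance name carries the port. This is a rough sketch modelled on the stock `mastodon-streaming.service` from a non-Docker install; the paths and the `ExecStart` line are assumptions to adapt to how your installation launches the streaming server:

```ini
# /etc/systemd/system/mastodon-streaming@.service (sketch; paths are assumptions)
[Unit]
Description=mastodon-streaming on port %i
After=network.target

[Service]
Type=simple
User=mastodon
WorkingDirectory=/home/mastodon/live
Environment="NODE_ENV=production"
# The instance name becomes the port, e.g. mastodon-streaming@4000
Environment="PORT=%i"
ExecStart=/usr/bin/node ./streaming
Restart=always

[Install]
WantedBy=multi-user.target
```

You would then enable one instance per port, e.g. `systemctl enable --now mastodon-streaming@4000 mastodon-streaming@4001 mastodon-streaming@4002`.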

### Background processing (Sidekiq)

Many tasks in Mastodon are delegated to background processing to ensure the HTTP requests are fast, and to prevent HTTP request aborts from affecting the execution of those tasks. Sidekiq is a single process, with a configurable number of threads.

#### Number of threads

While the number of threads in the web process affects the responsiveness of the Mastodon instance to the end-user, the number of threads allocated to background processing affects how quickly posts can be delivered from the author to anyone else, how soon e-mails are sent out, etc.

The number of threads is not regulated by an environment variable, but rather through a command line argument when invoking Sidekiq, as shown in the following example:

```bash
bundle exec sidekiq -c 15
```

This would initiate the Sidekiq process with 15 threads. It's important to note that each thread requires a database connection, necessitating a sufficiently large database pool. The size of this pool is managed by the `DB_POOL` environment variable, which should be set to a value at least equal to the number of threads.
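For example, to pair the invocation above with a matching pool size (an illustrative `.env.production` entry):

```bash
# Each Sidekiq thread needs its own database connection, so keep DB_POOL >= -c
DB_POOL=15
```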

#### Queues

Sidekiq uses different queues for tasks of varying importance, where importance is defined by how much a non-working queue would impact the user experience of your server's local users. In order of descending importance, the queues are:

| Queue | Significance |
| :--- | :--- |
| `default` | All tasks that affect local users |
| `push` | Delivery of payloads to other servers |
| `mailers` | Delivery of e-mails |
| `pull` | Lower priority tasks such as handling imports, backups, resolving threads, deleting users, forwarding replies |
| `scheduler` | Doing cron jobs like refreshing trending hashtags and cleaning up logs |
| `ingress` | Incoming remote activities. Lower priority than the default queue so local users still see their posts when the server is under load |

The default queues and their priorities are stored in `config/sidekiq.yml`, but can be overridden by the command-line invocation of Sidekiq, e.g. to run just the `default` queue:

```bash
bundle exec sidekiq -q default
```

Sidekiq processes queues by first checking for tasks in the first queue, and if it finds none, it then checks the subsequent queue. Consequently, if the first queue is overfilled, tasks in the other queues may experience delays.

As a solution, it is possible to start different Sidekiq processes for the queues to ensure truly parallel execution, by e.g. creating multiple systemd services for Sidekiq with different arguments.

Make sure you only have one scheduler queue running!!
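For instance, dedicated workers for the less time-critical queues can be added as extra systemd units. The sketch below is modelled on the stock `mastodon-sidekiq.service`; the file name, paths, thread count and queue assignment are assumptions to adapt:

```ini
# /etc/systemd/system/mastodon-sidekiq-ingress-pull.service (sketch)
[Unit]
Description=mastodon-sidekiq (ingress and pull queues)
After=network.target

[Service]
Type=simple
User=mastodon
WorkingDirectory=/home/mastodon/live
Environment="RAILS_ENV=production"
# Keep DB_POOL at least as large as the thread count below
Environment="DB_POOL=10"
# Note: the scheduler queue is deliberately NOT listed here
ExecStart=/home/mastodon/.rbenv/shims/bundle exec sidekiq -c 10 -q ingress -q pull
Restart=always

[Install]
WantedBy=multi-user.target
```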

## Transaction pooling with pgBouncer

### Why you might need PgBouncer

If you start running out of available Postgres connections (the default is 100) then you may find PgBouncer to be a good solution. This document describes some common gotchas as well as good configuration defaults for Mastodon.

Users whose role includes DevOps permissions in Mastodon can monitor the current usage of Postgres connections through the PgHero link in the Administration view. Generally, the number of open connections is equal to the total number of threads in Puma, Sidekiq, and the streaming API combined.
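As a rough worked example (all numbers are assumptions): with `WEB_CONCURRENCY=3`, `MAX_THREADS=5`, one Sidekiq process started with `-c 25`, and one streaming server process with `DB_POOL=10`, you would expect about:

```
3 × 5  = 15 connections from Puma
1 × 25 = 25 connections from Sidekiq
1 × 10 = 10 connections from the streaming API
-----------------------------------------------
total ≈ 50 Postgres connections at peak
```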

### Installing PgBouncer

On Debian and Ubuntu:

```bash
sudo apt install pgbouncer
```

### Configuring PgBouncer

#### Setting a password

First off, if your `mastodon` user in Postgres is set up without a password, you will need to set a password.

Here's how you might reset the password:

```bash
psql -p 5432 -U mastodon mastodon_production -w
```

Then (obviously, use a different password than the word “password”):

```sql
ALTER USER mastodon WITH PASSWORD 'password';
```

Then `\q` to quit.

#### Configuring userlist.txt

Edit `/etc/pgbouncer/userlist.txt`.

As long as you specify a user/password in `pgbouncer.ini` later, the values in `userlist.txt` do not have to correspond to real PostgreSQL roles. You can arbitrarily define users and passwords, but you can reuse the “real” credentials for simplicity's sake. Add the `mastodon` user to `userlist.txt`:

```
"mastodon" "md5d75bb2be2d7086c6148944261a00f605"
```

Here we're using the md5 scheme, where the md5 password is just the MD5 hash of the password concatenated with the username, with the string `md5` prepended. For instance, to derive the hash for the user `mastodon` with the password `password`, you can do:

```bash
# Ubuntu, Debian, etc.
echo -n "passwordmastodon" | md5sum
# macOS, OpenBSD, etc.
md5 -s "passwordmastodon"
```

Then just add `md5` to the beginning of that.

You'll also want to create a `pgbouncer` admin user to log in to the PgBouncer admin database. So here's a sample `userlist.txt`:

```
"mastodon" "md5d75bb2be2d7086c6148944261a00f605"
"pgbouncer" "md5a45753afaca0db833a6f7c7b2864b9d9"
```

In both cases, the password is just `password`.

#### Configuring pgbouncer.ini

Edit `/etc/pgbouncer/pgbouncer.ini`.

Add a line under `[databases]` listing the Postgres databases you want to connect to. Here we'll just have PgBouncer use the same username/password and database name to connect to the underlying Postgres database:

```ini
[databases]
mastodon_production = host=127.0.0.1 port=5432 dbname=mastodon_production user=mastodon password=password
```

The `listen_addr` and `listen_port` settings tell PgBouncer which address and port to accept connections on. The defaults are fine:

```ini
listen_addr = 127.0.0.1
listen_port = 6432
```

Put `md5` as the `auth_type` (assuming you're using the md5 format in `userlist.txt`):

```ini
auth_type = md5
```

Make sure the `pgbouncer` user is an admin:

```ini
admin_users = pgbouncer
```

Mastodon requires a different pooling mode than the default session-based one. Specifically, it needs transaction-based pooling, in which a server connection is assigned to a client only for the duration of a transaction and released back to the pool once it completes. Therefore, change the `pool_mode` setting from `session` to `transaction`:

```ini
pool_mode = transaction
```

Next up, `max_client_conn` defines how many connections PgBouncer itself will accept, and `default_pool_size` puts a limit on how many Postgres connections will be opened under the hood. (In PgHero the number of connections reported will correspond to `default_pool_size`, because it has no knowledge of PgBouncer.)

The defaults are fine to start, and you can always increase them later:

```ini
max_client_conn = 100
default_pool_size = 20
```

Don't forget to reload or restart pgbouncer after making your changes:

```bash
sudo systemctl reload pgbouncer
```

### Debugging that it all works

You should be able to connect to PgBouncer just like you would with Postgres:

```bash
psql -p 6432 -U mastodon mastodon_production
```

Then use your password to log in.

You can also check the PgBouncer logs like so:

```bash
tail -f /var/log/postgresql/pgbouncer.log
```

### Configuring Mastodon to talk to PgBouncer

In your `.env.production` file, first off make sure that this is set:

```bash
PREPARED_STATEMENTS=false
```

Since we're using transaction-based pooling, we can't use prepared statements.

Next up, configure Mastodon to use port 6432 (PgBouncer) instead of 5432 (Postgres) and you should be good to go:

```bash
DB_HOST=localhost
DB_USER=mastodon
DB_NAME=mastodon_production
DB_PASS=password
DB_PORT=6432
```

{{< hint style="warning" >}} You cannot use pgBouncer to perform `db:migrate` tasks, but this is easy to work around. If your PostgreSQL and PgBouncer are on the same host, it can be as simple as defining `DB_PORT=5432` together with `RAILS_ENV=production` when calling the task, for example: `RAILS_ENV=production DB_PORT=5432 bundle exec rails db:migrate` (you can specify `DB_HOST` too if it's different, etc.) {{< /hint >}}

### Administering PgBouncer

The easiest way to reboot is:

```bash
sudo systemctl restart pgbouncer
```

But if you've set up a PgBouncer admin user, you can also connect as the admin:

```bash
psql -p 6432 -U pgbouncer pgbouncer
```

And then do:

```sql
RELOAD;
```

Then use `\q` to quit.

## Separate Redis for cache

Redis plays a vital role in Mastodon, but some uses are more critical than others. Key features like home feeds, list feeds, Sidekiq queues, and the streaming API rely on Redis for important data storage, which you should strive to protect, though its loss is less catastrophic compared to losing the PostgreSQL database.

Additionally, Redis is used for volatile caching. If you're scaling up and concerned about Redis's capacity to handle the load, you can allocate a separate Redis database specifically for caching. To do this, set `CACHE_REDIS_URL` in the environment, or define individual components such as `CACHE_REDIS_HOST`, `CACHE_REDIS_PORT`, etc.

Unspecified components fall back to the corresponding values without the `CACHE_` prefix (for example, `CACHE_REDIS_HOST` falls back to `REDIS_HOST`).
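For example, pointing the cache at a second Redis instance might look like this in `.env.production` (the host, port and database number below are placeholders):

```bash
CACHE_REDIS_URL=redis://127.0.0.1:6380/0
```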

When configuring the Redis database used for caching, it's possible to disable background saving to disk, since data loss on restart is not critical in this context, and doing so saves some disk I/O. Additionally, consider setting a maximum memory limit and a key eviction policy. For more details on these configurations, refer to this guide: Using Redis as an LRU cache.
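A minimal `redis.conf` sketch for such a cache-only instance could look like the following; the memory limit is an assumption you should size to your own hardware:

```
# Cache-only Redis: no persistence, bounded memory, evict least-recently-used keys
save ""
maxmemory 512mb
maxmemory-policy allkeys-lru
```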

## Read-replicas

To reduce the load on your PostgreSQL server, you may wish to set up hot streaming replication (read replica). See this guide for an example. You can make use of the replica in Mastodon in these ways:

- The streaming API server does not issue writes at all, so you can connect it straight to the replica (see the sketch after this list). But it doesn't query the database very often anyway, so the impact of this is small.
- Use the Makara driver in the web processes, so that writes go to the primary database while reads go to the replica. This is described below.
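For the streaming server, one way to do this is a systemd drop-in that overrides the database host for that service only. This is a sketch assuming the stock `mastodon-streaming.service` and a replica reachable at a placeholder address:

```ini
# /etc/systemd/system/mastodon-streaming.service.d/replica.conf (sketch)
[Service]
# Placeholder host; point it at your read replica (or a local pgBouncer in front of it)
Environment="DB_HOST=replica.example.internal"
Environment="DB_PORT=5432"
```

After adding it, run `sudo systemctl daemon-reload && sudo systemctl restart mastodon-streaming`.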

{{< hint style="warning" >}} Read replicas are currently not supported for the Sidekiq processes, and using them will lead to failing jobs and data loss. {{< /hint >}}

You will have to use a separate `config/database.yml` file for the web processes and edit it to replace the `production` section as follows:

```yaml
production:
  <<: *default
  adapter: postgresql_makara
  prepared_statements: false
  makara:
    id: postgres
    sticky: true
    connections:
      - role: master
        blacklist_duration: 0
        url: postgresql://db_user:db_password@db_host:db_port/db_name
      - role: slave
        url: postgresql://db_user:db_password@db_host:db_port/db_name
```

Make sure the URLs point to wherever your PostgreSQL servers are. You can add multiple replicas. You could also have a locally installed pgBouncer with a configuration that connects to two different servers based on the database name, e.g. “mastodon” going to the primary and “mastodon_replica” going to the replica; in that case, both URLs in the file above would point to the local pgBouncer with the same user, password, host and port, but different database names. There are many ways this could be set up! For more information on Makara, see their documentation.

{{< hint style="warning" >}} Make sure the Sidekiq processes run with the stock `config/database.yml` to avoid failing jobs and data loss! {{< /hint >}}