Update documentation for read-only replica (#1384)

Renaud Chaput 2024-01-09 14:38:50 +01:00 committed by GitHub
parent 4eb8473e62
commit a6528ba511
2 changed files with 62 additions and 9 deletions


@ -327,6 +327,38 @@ If provided, takes precedence over `DB_HOST`, `DB_USER`, `DB_NAME`, `DB_PASS` an
Example value: `postgresql://user:password@localhost:5432`
### PostgreSQL (read-only replica) {#postgresql-replica}
{{< hint style="info" >}}
If you want to use a read-only database replica, see [the read-replicas section of the scaling page](../scaling/#read-replicas) for more details.
{{</ hint >}}
#### `REPLICA_DB_HOST`
No default.
#### `REPLICA_DB_PORT`
No default.
#### `REPLICA_DB_NAME`
No default.
#### `REPLICA_DB_USER`
No default.
#### `REPLICA_DB_PASS`
No default.
#### `REPLICA_DATABASE_URL`
If provided, takes precedence over `REPLICA_DB_HOST`, `REPLICA_DB_PORT`, `REPLICA_DB_NAME`, `REPLICA_DB_USER` and `REPLICA_DB_PASS`.
No default.
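As an illustration, a single-variable replica configuration might look like the sketch below. The host name, user, password, and database name are hypothetical placeholders, and the URL follows the standard PostgreSQL connection URI shape (`postgresql://user:password@host:port/dbname`).

```shell
# Hypothetical values for illustration only; replace them with your replica's details.
# When this variable is set, it takes precedence over the individual REPLICA_DB_* variables.
REPLICA_DATABASE_URL=postgresql://mastodon:change-me@replica.example.internal:5432/mastodon_production
```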
### Redis {#redis}
{{< hint style="info" >}}
@ -467,9 +499,9 @@ E-mail configuration is based on the *action_mailer* component of the *Ruby on R
* `SMTP_SERVER`: Specify the server to use. For example `sub.domain.tld`.
* `SMTP_PORT`: By default, the value is `25` (the usual port for SMTP). If StartTLS is detected, it may be switched to port 587.
* `SMTP_DOMAIN`: Only required if a HELO domain is needed. Will be set to the `SMTP_SERVER` domain by default.
* `SMTP_FROM_ADDRESS`: Specify a sender address.
* `SMTP_DELIVERY_METHOD`: By default, the value is `smtp` (can also be `sendmail`).
### Authentication for the SMTP server {#smtpauthentication}
* `SMTP_LOGIN`: Login for the SMTP user.
@ -480,12 +512,12 @@ E-mail configuration is based on the *action_mailer* component of the *Ruby on R
By default, a StartTLS connection will be attempted to the specified SMTP server.
* `SMTP_ENABLE_STARTTLS_AUTO`: Default `true`.
* `SMTP_CA_FILE`: A value may be specified, but on many Linux distros (e.g. Debian-based) this will be `/etc/ssl/certs/ca-certificates.crt`.
* `SMTP_OPENSSL_VERIFY_MODE`: `none` or `peer`. When using TLS, it may be useful to accept connections with a self-signed certificate.
* `SMTP_TLS`: `true` or `false` (default `false`)
* `SMTP_SSL`: `true` or `false` (default `false`)
Note that `TLSv1.3` and `TLSv1.2` are the only SSL/TLS protocols currently considered to be secure.
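For reference, a StartTLS setup that combines the settings above might look like the following sketch. The server name, login, and sender address are hypothetical, and port 587 is assumed here because it is the usual submission port for StartTLS.

```shell
# Hypothetical values for illustration only.
SMTP_SERVER=mail.example.com
SMTP_PORT=587
SMTP_LOGIN=mastodon@example.com
SMTP_FROM_ADDRESS=notifications@example.com
SMTP_DELIVERY_METHOD=smtp
# Attempt StartTLS (the default) and verify the server certificate.
SMTP_ENABLE_STARTTLS_AUTO=true
SMTP_OPENSSL_VERIFY_MODE=peer
# CA bundle path found on many Debian-based distributions.
SMTP_CA_FILE=/etc/ssl/certs/ca-certificates.crt
```

Setting `SMTP_OPENSSL_VERIFY_MODE` to `none` instead would accept a self-signed certificate, at the cost of no longer verifying the server's identity.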
## File storage {#files}
@ -846,4 +878,3 @@ Defaults to `512`.
#### `GITHUB_API_TOKEN`
Used in a rake task for generating AUTHORS.md from GitHub commit history.


@ -306,7 +306,7 @@ Then use `\q` to quit.
Redis is used widely throughout the application, but some uses are more important than others. Home feeds, list feeds, and Sidekiq queues as well as the streaming API are backed by Redis, and that's important data you wouldn't want to lose (even though the loss can be survived, unlike the loss of the PostgreSQL database - never lose that!). However, Redis is also used for volatile cache. If you are at a stage of scaling up where you are worried about whether your Redis can handle everything, you can use a different Redis database for the cache. In the environment, you can specify `CACHE_REDIS_URL` or individual parts like `CACHE_REDIS_HOST`, `CACHE_REDIS_PORT` etc. Unspecified parts fall back to the same values as without the cache prefix.
Additionally, Redis is used for volatile caching. If you're scaling up and concerned about Redis's capacity to handle the load, you can allocate a separate Redis database specifically for caching. To do this, set `CACHE_REDIS_URL` in the environment, or define individual components such as `CACHE_REDIS_HOST`, `CACHE_REDIS_PORT`, etc.
Unspecified components will default to their values without the cache prefix.
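A minimal sketch of a dedicated cache instance, assuming a second Redis server reachable at a hypothetical internal host name:

```shell
# Hypothetical host for illustration only.
# Option 1: configure the cache Redis with a single URL.
CACHE_REDIS_URL=redis://cache.example.internal:6379/0

# Option 2: set individual parts instead; anything left unset uses the
# corresponding value configured without the CACHE_ prefix.
# CACHE_REDIS_HOST=cache.example.internal
# CACHE_REDIS_PORT=6379
```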
@ -375,7 +375,29 @@ systemctl restart redis-sidekiq.service
## Read-replicas {#read-replicas}
To reduce the load on your PostgreSQL server, you may wish to set up hot streaming replication (read replica). [See this guide for an example](https://cloud.google.com/community/tutorials/setting-up-postgres-hot-standby).
### Mastodon >= 4.2
Mastodon has built-in replica support starting with version 4.2. You can use the same configuration for every service (Sidekiq included), and some queries will be directed to your read-only replica when possible, using Rails's built-in replica support. If your replica lags behind by more than a few seconds, the app will stop sending it queries until it catches up.
To configure it, use the following environment variables:
```
REPLICA_DB_HOST
REPLICA_DB_PORT
REPLICA_DB_NAME
REPLICA_DB_USER
REPLICA_DB_PASS
```
Alternatively, you can use `REPLICA_DATABASE_URL` to configure them all with a single variable.
Once this is done, you should start seeing queries against your replica server.
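As an illustration, a filled-in set of the individual replica variables might look like this. Every value is a hypothetical placeholder; the user needs at least read access to the replicated database.

```shell
# Hypothetical values for illustration only; replace them with your replica's details.
REPLICA_DB_HOST=replica.example.internal
REPLICA_DB_PORT=5432
REPLICA_DB_NAME=mastodon_production
REPLICA_DB_USER=mastodon
REPLICA_DB_PASS=change-me
```

Because the same configuration can be shared by every service, these lines can sit next to the primary `DB_*` settings in one environment file.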
### Mastodon <= 4.1
For Mastodon versions before 4.2, you can make use of the replica in Mastodon in these ways:
* The streaming API server does not issue writes at all, so you can connect it straight to the replica (it is not querying the database very often anyway, so the impact of this is small).
* Use the Makara driver in the web and Sidekiq processes, so that writes go to the master database, while reads go to the replica. Let's talk about that.
@ -423,4 +445,4 @@ These endpoints should both return an HTTP status code of 200, and the text `OK`
{{< hint style="info" >}}
You can also use these endpoints for health checks with a third-party monitoring/alerting utility.
{{< /hint >}}