mastodon/lib
Calvin Lee 9eb30dfb1c Sanitize MathML in post content
Summary:
-------
This commit correctly sanitizes incoming MathML according to [FEP-dc88].
Instead of completely removing MathML nodes, it replaces them with their
LaTeX or plain-text representation, so that the mathematics can be read
in some form by Mastodon users.
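
For illustration only, the fallback order described above (LaTeX first,
plain text otherwise) can be sketched in standalone Ruby. This is not the
code in `lib/sanitize_ext/sanitize_config.rb`: it uses REXML instead of a
Sanitize transformer, and `math_to_text`, `plain_text` and `TEX_ENCODING`
are hypothetical names. It assumes the LaTeX source is carried in an
`<annotation encoding="application/x-tex">` element.

```ruby
require 'rexml/document'

# Assumption: LaTeX source is stored in an annotation with this encoding.
TEX_ENCODING = 'application/x-tex'

# Hypothetical helper: concatenates the text of all descendants,
# skipping annotation elements so alternate encodings do not leak in.
def plain_text(element)
  element.children.map do |child|
    case child
    when REXML::Text
      child.value
    when REXML::Element
      %w[annotation annotation-xml].include?(child.name) ? '' : plain_text(child)
    else
      ''
    end
  end.join
end

# Hypothetical helper: returns a readable replacement for one <math> node.
# Prefers the LaTeX annotation and falls back to the node's plain text.
def math_to_text(math_xml)
  root = REXML::Document.new(math_xml).root

  annotation = root.find_first_recursive do |el|
    el.name == 'annotation' && el.attributes['encoding'] == TEX_ENCODING
  end
  return annotation.text.to_s.strip if annotation

  plain_text(root).gsub(/\s+/, ' ').strip
end
```

With this sketch, a `<math>` node that carries an `application/x-tex`
annotation yields the annotation's LaTeX source, and a node without one
yields the concatenated text of its children. How the result is delimited
or wrapped in the final post is left out here.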

Test Plan:
----------
```
$ RAILS_ENV=test bundle exec rspec spec/lib/sanitize_config_spec.rb -f d
Run options: exclude {:type=>#<Proc: ./spec/rails_helper.rb:79>}

Randomized with seed 58854

Sanitize::Config
  ::MASTODON_STRICT
    sanitizes math blocks to LaTeX
    converts h1 to p strong
    removes "translate" attribute with invalid value
    removes a without href
    removes a without href and only keeps text content
    math sanitizer falls back to plaintext
    keeps ul
    prefers latex
    removes a with unparsable href
    keeps start and reversed attributes of ol
    removes a with unsupported scheme in href
    keeps a with translate="no"
    keeps a with href
    keeps a with supported scheme and no host
    does not re-interpret HTML when removing unsupported links
    sanitizes math to LaTeX

Finished in 0.17323 seconds (files took 3.28 seconds to load)
16 examples, 0 failures

Randomized with seed 58854

```

Observed 100% code coverage of `lib/sanitize_ext/sanitize_config.rb`.

Ran Mastodon locally, fetched the [reference post][nyancat], and observed
that its math was converted to plain-text form rather than dropped.

[FEP-dc88]: https://codeberg.org/fediverse/fep/src/branch/main/fep/dc88/fep-dc88.md
[tracking]: https://codeberg.org/fediverse/fep/issues/161
[socialhub]: https://socialhub.activitypub.rocks/t/fep-dc88-formatting-mathematics/3564
[nyancat]: https://nyan.network/notice/Aa4IvnBVHysWswRX1s

Related Discussion:
-------------------

Please see [FEP-dc88], the [FEP tracking issue][tracking] and
[FEP forum discussion][socialhub] for more information.

Fixes mastodon/mastodon#26943
2024-04-25 10:16:39 +00:00
| Name | Last commit | Date |
| --- | --- | --- |
| active_record | Rails 7.1 update (#25963) | 2023-10-23 17:58:29 +00:00 |
| assets | Upgrade to Stylelint 15 with Prettier (#23558) | 2023-02-13 04:57:03 +01:00 |
| chewy | Add `ES_PRESET` option to customize numbers of shards and replicas (#26483) | 2023-08-14 17:46:16 +02:00 |
| devise/strategies | Move lib/devise/* to lib/devise/strategies/* (#27638) | 2023-11-29 10:10:21 +00:00 |
| generators/post_deployment_migration | Clean up the post deployment migration generator (#24233) | 2023-04-11 11:25:29 +02:00 |
| linter | Consistently use middle dot (·) instead of bullet (•) to separate items (#25248) | 2023-06-02 19:58:18 +02:00 |
| mastodon | Add reusable duplicate ID finder methods in maintenance CLI (#28910) | 2024-04-17 09:00:08 +00:00 |
| paperclip | Fixed crash when supplying FFMPEG_BINARY environment variable (#30022) | 2024-04-22 09:00:24 +00:00 |
| rails | Enable Rubocop Style/FrozenStringLiteralComment (#23793) | 2023-07-12 09:47:08 +02:00 |
| redis | Update rubocop and rubocop-rspec (#26329) | 2023-08-22 09:31:40 +02:00 |
| sanitize_ext | Sanitize MathML in post content | 2024-04-25 10:16:39 +00:00 |
| simple_navigation | Add customizable user roles (#18641) | 2022-07-05 02:41:40 +02:00 |
| tasks | Fix Rubocop `Rails/UniqueValidationWithoutIndex` cop (#27461) | 2024-04-22 08:04:05 +00:00 |
| templates/haml/scaffold | Use `tt` extension for form scaffold template (#29676) | 2024-04-10 09:20:21 +00:00 |
| terrapin | Autofix Rubocop Style/HashSyntax (#23754) | 2023-05-04 05:54:26 +02:00 |
| webpacker | Fix `Style/SingleArgumentDig` cop in webpacker/manifest_extensions (#29929) | 2024-04-15 09:15:32 +00:00 |
| exceptions.rb | Fix error when passing unknown filter param in REST API (#20626) | 2022-11-14 08:06:06 +01:00 |
| premailer_bundled_asset_strategy.rb | Rename `PremailerWebpackStrategy` -> `PremailerBundledAssetStrategy` (#29934) | 2024-04-15 09:16:59 +00:00 |
| public_file_server_middleware.rb | Add hardened headers to user-uploaded files (#25756) | 2023-07-06 14:31:37 +02:00 |