Improving Mastodon’s disk usage

Mastodon's built-in CLI gives you the ability to clean attachments and previews from remote accounts, purging the disk cache. This is fantastic, and you couldn't possibly survive without it.

My current crontab (for the mastodon user, not root), which runs every 3 hours:

0 */3 * * * /bin/bash /home/mastodon/purge-media.sh

As of Mastodon 4.1.0, new commands are available. Here's the content of my purge-media.sh script:

#!/bin/bash

# Prune remote accounts that never interacted with a local user
RAILS_ENV=production /home/mastodon/live/bin/tootctl accounts prune;

# Remove remote statuses that local users never interacted with older than 4 days
RAILS_ENV=production /home/mastodon/live/bin/tootctl statuses remove --days 4;

# Remove media attachments older than 4 days
RAILS_ENV=production /home/mastodon/live/bin/tootctl media remove --days 4;

# Remove all headers (including people I follow)
RAILS_ENV=production /home/mastodon/live/bin/tootctl media remove --remove-headers --include-follows --days 0;

# Remove link previews older than 4 days
RAILS_ENV=production /home/mastodon/live/bin/tootctl preview_cards remove --days 4;

# Remove files not linked to any post
RAILS_ENV=production /home/mastodon/live/bin/tootctl media remove-orphans;
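Cron starts jobs with a minimal PATH, so tootctl may fail with a "ruby: No such file or directory" error when the script runs unattended. A minimal sketch of a preamble that could go near the top of purge-media.sh, assuming Ruby was installed with rbenv under the mastodon user's home directory (the `RBENV_ROOT` default below is an assumption; adjust it to wherever your Ruby actually lives):

```shell
#!/bin/bash

# Assumption: rbenv lives in ~/.rbenv for the user running the script.
RBENV_ROOT="${RBENV_ROOT:-$HOME/.rbenv}"

# Put the rbenv shims first so "/usr/bin/env ruby" resolves under cron too.
export PATH="$RBENV_ROOT/shims:$RBENV_ROOT/bin:$PATH"

# Print a readable hint if ruby still cannot be found.
if ! command -v ruby >/dev/null 2>&1; then
  echo "ruby not found in PATH=$PATH" >&2
fi
```

Setting PATH inside the script keeps the crontab line unchanged; setting it in the crontab itself works just as well.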

⚠️ If you've never run these commands before, I'd suggest running them one by one (not in a cron job), as each might take several hours (or days) to run. The size of the cached media and database will depend on how many people you follow, how many are on your instance, how many relays you have added to your instance, etc.
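Since a single command can outlast the 3-hour cron interval, it may help to stop a new run from starting while the previous one is still going, and to record how long each step takes. A rough sketch under stated assumptions: the helper names (`acquire_lock`, `release_lock`, `run_step`) and the lock path are hypothetical, not part of Mastodon or tootctl.

```shell
#!/bin/bash

# Hypothetical lock location; any path writable by the mastodon user works.
LOCK_DIR="${LOCK_DIR:-/tmp/purge-media.lock}"

# mkdir either creates the directory or fails, so only one run gets the lock.
acquire_lock() {
  mkdir "$LOCK_DIR" 2>/dev/null
}

release_lock() {
  rmdir "$LOCK_DIR" 2>/dev/null
}

# Run one command and print its exit code and elapsed time in seconds.
run_step() {
  local label="$1"
  shift
  local start=$SECONDS
  "$@"
  local rc=$?
  echo "$label: exit $rc after $((SECONDS - start))s"
  return $rc
}

# Usage sketch inside purge-media.sh:
#   acquire_lock || { echo "previous run still active"; exit 0; }
#   trap release_lock EXIT
#   run_step "accounts prune" env RAILS_ENV=production /home/mastodon/live/bin/tootctl accounts prune
```

If a run is killed before the trap fires, the lock directory stays behind and has to be removed by hand; that is the trade-off of this simple approach.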

With about 10 relays added to my single-user instance and the bash script above, I'm at around 7 GB in my S3 object storage.

Do you have any other tips on how to keep a Mastodon instance lean?


👋 Don't miss the follow-up post: Scaling Mastodon: moving media assets to Object Storage

Comments

  1. If you’re running this under the mastodon user in a cron job, make sure the path to ruby is included in the cron line or in the script.

    I was seeing an “/usr/bin/env: 'ruby': No such file or directory” error that was fixed by putting the full PATH of the mastodon user before the script in the crontab.

  2. @ricard Yes, my host has the media cache clear after some time – which is great; I was just worried I was going to fill it up before it started kicking in 😆 I’m on a freshly set up instance (just over a week), so it always worries me to see the usage ramp up, this time quicker than last because of the relays I decided to add. Think I’ve got it in a sweet spot now.

  3. @mondanzo It’s different. The cache retention deletes everything (posts you’ve liked, bookmarked, etc). There’s a PR to amend the text so it’s easier to understand. The CLI commands are way more refined and only delete stuff you haven’t interacted with, etc.

  4. @ricard @anlomedad I didn’t see a command in there to remove orphans ( https://docs.joinmastodon.org/admin/tootctl/#media-remove-orphans ): tootctl media remove-orphans. It goes through the media files and removes those that aren’t connected to anything anymore and are just taking up space. It takes a while, so you might not want to run it every night, and some providers charge for the API calls needed to check all the files.

  5. @ricard I do not have any cronjobs; I have delayed it for so long, obviously too long. I actually have that page open and am going to set up a cronjob as soon as I get the initial prunes done (need some space before making it automatic). Hetzner 1TB storage costs € 3.20 per month. It’s the cheapest I know, and really reliable too.

  6. @wild1145 I think it could have been something with IPv6. I’ve compared my other domains, and .dev had it set up at the Linode Domain dashboard for some reason. I’ve removed the extra configuration (which I don’t remember creating) for IPv6 and now they look the same; we’ll see if that solves it after propagation 🤞 Thank you again for the feedback 💜
