#mastodon #cleanup @Gargron

My hard disk is slowly filling up, especially

public/system/accounts
public/system/preview

Must be lots of federated stuff. Is there a way to clean up these folders? It's over 15GB currently and I would love to reduce it or hold it at some level. We do not need any old federated stuff that nobody shared, faved, or otherwise used.

@kmj @Gargron I clean out previews using a simple find on the filesystem. Doesn't work so well for accounts data, since Mastodon doesn't automatically fetch missing avatars and header images for known accounts (though you can force an update using tootctl accounts refresh --all).

When using the local filesystem, media expiry tends to result in having tons of empty directories, and I regularly delete those too.

@kmj @Gargron So that's like two cronjobs:
@weekly cd /home/mastodon/live && find /home/mastodon/live/public/system/preview_cards -type f -atime +7 -delete
(watch out for "noatime" mounts)
@weekly cd /home/mastodon/live/public/system && find . -type d -empty -print0 | xargs -0 -n 1 rmdir
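Since the first job depends on access times, it may be worth checking the mount options and doing a dry run before letting cron delete anything. A minimal sketch, assuming the preview_cards path from the cronjob above and that `findmnt` (util-linux) is installed; the helper name `has_noatime` is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a comma-separated mount option string
# contains the exact option "noatime".
has_noatime() {
    case ",$1," in
        *,noatime,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumed path, taken from the cronjob above.
CARDS=/home/mastodon/live/public/system/preview_cards

if [ -d "$CARDS" ]; then
    # Mount options of the filesystem that holds the preview cards.
    opts=$(findmnt -no OPTIONS -T "$CARDS")
    if has_noatime "$opts"; then
        echo "noatime is set: access times are not updated on reads" >&2
    else
        # Dry run: list what the cronjob would delete, without -delete.
        find "$CARDS" -type f -atime +7 -print
    fi
fi
```

Once the listed files look right, swapping `-print` for `-delete` gives the same command as the cronjob.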

@galaxis @Gargron

For the preview cards this works really well. Thank you very much.

If I am right, even if I remove the accounts stuff (>9GB), a "tootctl accounts refresh --all" would get all the data back again, because there is currently no way to delete old toots?

@kmj @Gargron There is "tootctl accounts cull" to delete accounts that are not known by remote instances anymore (only works when the instance is still online though), and "tootctl statuses remove" to expire old remote toots that are not referenced by any local activity.
Neither of them had any immediate impact on my database size, but they should also remove media associated with the deleted accounts and posts.
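Both commands could be wrapped in one small script for cron. A rough sketch, assuming the install lives in /home/mastodon/live as in the cronjobs earlier in the thread and that tootctl sits at bin/tootctl and wants RAILS_ENV=production (typical for a source install, but an assumption here); the `TOOTCTL` variable and the `run_cleanup` name are made up:

```shell
#!/bin/sh
# Assumed location of tootctl; override through the environment if needed.
TOOTCTL=${TOOTCTL:-/home/mastodon/live/bin/tootctl}

# Assumed Rails environment for a production install.
RAILS_ENV=${RAILS_ENV:-production}
export RAILS_ENV

run_cleanup() {
    # Delete remote accounts their home instance no longer knows about.
    "$TOOTCTL" accounts cull || return 1
    # Expire old remote toots not referenced by any local activity.
    "$TOOTCTL" statuses remove || return 1
}

# Only run when tootctl is actually present at the assumed path.
if [ -x "$TOOTCTL" ]; then
    run_cleanup
fi
```

Both steps can take a long time, so a weekly or monthly schedule is probably more sensible than a daily one.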

@galaxis @Gargron

this one is not listed in the help of my tootctl:

tootctl statuses remove

@galaxis @Gargron Definitely, now these are here too. I'll build a script to run all of this stuff and will check later. Looks like these need some time to run.

Then a crontab script to VACUUM the db and do periodic maintenance should be fine. Thank you!


@kmj @galaxis Just don't do VACUUM FULL unless you know specifically that you need to. Postgres can already reuse its space after a normal VACUUM, the FULL takes longer because it has to defragment the entire database to let the filesystem reclaim the space.
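A plain VACUUM from cron could look roughly like the entry below, in the same style as the cronjobs earlier in the thread. This is only a sketch: the database name mastodon_production and a crontab user that can connect to it without a password are assumptions, so check the database settings of your own install first.

```
# Assumed database name; plain VACUUM with ANALYZE, not VACUUM FULL
@weekly psql -d mastodon_production -c 'VACUUM (VERBOSE, ANALYZE);'
```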


@Gargron @galaxis

- Cleaning out preview_cards removed GBs

- accounts cull runs very long, but the result is not much

Removed 383 accounts. 412 servers skipped
The following servers were not available during the check:
(there were a lot of servers not online)

- statuses remove

has not freed much on the disk. PG VACUUM will follow later.

- There is still this huge accounts folder in public, which is over 9 GB here.

@kmj @Gargron Well, you could use tootctl domains purge <domain> to selectively delete any data you have from the unreachable instances. I usually ask redis, which has information about instances your system has given up on, using something like this:
redis-cli --csv smembers unavailable_inboxes | tr ',' '\n' | sed 's/"//g' | sed '/^$/d' | awk -F'/' '{ print $3 }'

...and then use domains purge on that list.
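Put together, the lookup and the purge could be sketched like this. It assumes the same redis set name as the one-liner above and a tootctl at /home/mastodon/live/bin/tootctl (an assumption); `extract_domains` is a made-up helper name. The loop only prints the purge commands, so nothing is deleted until the `echo` is removed:

```shell
#!/bin/sh
# Hypothetical helper: turn redis-cli --csv output (quoted inbox URLs,
# comma-separated) into a unique, sorted list of host names.
extract_domains() {
    tr ',' '\n' | sed 's/"//g' | sed '/^$/d' | awk -F'/' '{ print $3 }' | sort -u
}

# Assumed location of tootctl; override through the environment if needed.
TOOTCTL=${TOOTCTL:-/home/mastodon/live/bin/tootctl}

if command -v redis-cli >/dev/null 2>&1 && [ -x "$TOOTCTL" ]; then
    redis-cli --csv smembers unavailable_inboxes | extract_domains |
    while read -r domain; do
        # Print instead of purging; drop the "echo" once the list looks right.
        echo "$TOOTCTL" domains purge "$domain"
    done
fi
```

The `sort -u` is there because several inboxes on the same instance would otherwise produce the same domain more than once.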

@kmj @Gargron I'm not currently sure if avatars and header images for local accounts are stored in the accounts folder too (probably yes, same as local media that's in the general media pool), so just cleaning out everything from public/system/accounts is probably dangerous.
I have run the find .. -delete on my accounts folder too, and haven't lost my own data, but I'm on a single-user instance, so it wouldn't have mattered much. It's an option, and remote data can be fixed using refresh --all.

@galaxis @Gargron Asking redis results in 12 domains, whereas cull reports 412 skipped servers. I am running domains purge on these 12 right now.

Even though I am on a very small instance too, I do not want to remove all the accounts stuff without being sure that I do not delete local users' stuff.
