I wanted to find out if using IPFS as a storage backend would give file deduplication "for free", but unfortunately ImageMagick operations on the same input file are not deterministic, so you still end up with different hashes when the same file is uploaded more than once.
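The non-determinism matters because IPFS is content-addressed: the identifier is derived from the output bytes, so any varying byte (for example, the creation-timestamp text entries ImageMagick writes into its output) produces a different hash. A minimal sketch of the effect, using SHA-256 as a stand-in for an IPFS CID; the payload and timestamp values are made up for illustration:

```python
import hashlib

def content_id(data: bytes) -> str:
    # Stand-in for an IPFS CID: content-addressing hashes the raw bytes.
    return hashlib.sha256(data).hexdigest()

pixels = b"<identical pixel data>"  # placeholder payload

# Two conversions of the same input, but ImageMagick-style timestamp
# metadata baked into the output makes the bytes differ.
thumb_a = pixels + b"date:create=2022-11-07T10:00:00"
thumb_b = pixels + b"date:create=2022-11-07T10:00:05"

assert content_id(pixels) == content_id(pixels)    # identical bytes deduplicate
assert content_id(thumb_a) != content_id(thumb_b)  # one differing byte breaks it
```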

Of course it sounds nice in theory: the server wouldn't need to download a remote file, it could simply save the given hash and put it into an IPFS gateway URL. In practice, though, I believe that's not sufficient. There is no guarantee that the software creating a post uses the same thumbnail dimensions Mastodon needs, so the file still has to be downloaded to the server, converted, and re-uploaded to IPFS, and if that process is non-deterministic, it's not worth it...

@Gargron IPFS should surely give you deduplication for free. Do you strip image metadata before calling ImageMagick? You might want to use something like MAT2 for cleaning uploaded files: 0xacab.org/jvoisin/mat2
(I observed that Twitter removes metadata from JPEG uploads.)
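To illustrate why stripping metadata would help with deduplication, here is a toy sketch (not a real image parser; the `date:create` field name mimics the timestamp entries ImageMagick adds, and `strip_timestamps` is a hypothetical stand-in for a cleaner like mat2 or ImageMagick's `-strip`):

```python
import hashlib
import re

def strip_timestamps(data: bytes) -> bytes:
    # Hypothetical stand-in for a metadata cleaner: remove
    # ImageMagick-style creation/modification timestamp entries.
    return re.sub(rb"date:(create|modify)=[0-9T:+\-]+", b"", data)

pixels = b"<identical pixel data>"
thumb_a = pixels + b"date:create=2022-11-07T10:00:00"
thumb_b = pixels + b"date:create=2022-11-07T10:00:05"

# After cleaning, both conversions collapse to the same bytes and
# therefore to the same content hash, so they deduplicate.
h_a = hashlib.sha256(strip_timestamps(thumb_a)).hexdigest()
h_b = hashlib.sha256(strip_timestamps(thumb_b)).hexdigest()
assert h_a == h_b
```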


@chpietsch If I upload the identical file twice, the thumbnails generated for it (same settings, same dimensions) get different hashes.

@Gargron Maybe it's time to trash ImageMagick then. I am surprised that you use it for heavy lifting. It also has a bad security record.
