I wanted to find out if using IPFS as a storage backend would give file deduplication "for free", but unfortunately it looks like ImageMagick operations on the same input file are not deterministic, so you still end up with different hashes when the same file is uploaded more than once.
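For illustration, a minimal Ruby sketch of the check involved, assuming ImageMagick's `convert` is on the PATH; the file names and thumbnail size are placeholders, not what Mastodon actually uses:

```ruby
require "digest"

# Convert an upload to a thumbnail and return the SHA-256 of the result.
def converted_digest(input, output)
  system("convert", input, "-resize", "400x400", output) or raise "convert failed"
  Digest::SHA256.file(output).hexdigest
end

a = converted_digest("upload.jpg", "thumb_a.jpg")
b = converted_digest("upload.jpg", "thumb_b.jpg")

# With timestamps embedded in the output, these digests usually differ,
# so the same upload would map to different IPFS content hashes.
puts a == b ? "identical digests" : "digests differ"
```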

Of course it sounds nice in theory: your server wouldn't need to download a remote file at all, it would simply save a given hash and put it into an IPFS gateway URL. In practice, though, I believe that's not sufficient. There's no guarantee that the software that creates a post uses the same thumbnail dimensions that Mastodon needs, so the file still needs to be downloaded to the server, converted, then re-uploaded to IPFS, and if that process is non-deterministic, it's not worth it...
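As a sketch of what "put it into an IPFS gateway URL" would mean, assuming the standard public gateway path; the gateway host and the content hash below are placeholders:

```ruby
# Build a public gateway URL from a content hash (CID).
def gateway_url(cid, gateway: "https://ipfs.io")
  "#{gateway}/ipfs/#{cid}"
end

puts gateway_url("QmExampleHashValue")  # placeholder CID
```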

Yeah, seems like it's down to timestamps in the metadata, but ImageMagick refuses to accept any options that are supposed to unset those. Plus it looks like all IPFS-related libraries in Ruby are both incomplete and unmaintained, so...
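For context, a hedged sketch of the kind of options commonly suggested for dropping those timestamps (file names are placeholders); as noted above, this did not produce stable hashes in practice:

```ruby
require "digest"

# Attempt a "clean" conversion: drop profiles/comments and clear the date
# properties before writing the thumbnail, then hash the result.
def strip_and_digest(input, output)
  system("convert", input,
         "-strip",               # remove profiles and comments
         "+set", "date:create",  # try to unset the creation timestamp
         "+set", "date:modify",  # try to unset the modification timestamp
         "-resize", "400x400",
         output) or raise "convert failed"
  Digest::SHA256.file(output).hexdigest
end
```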


@Gargron Have you ever heard of exiftool? I think it uses Perl. But privacy-conscious people generally use it to wipe EXIF data from photos and PDFs before uploading to sites.

@TheOuterLinux @Gargron so apparently this is not happening by default, already? jeezus...

GOOD TO KNOW!

@DJ_Pure_Applesauce @Gargron I haven't actually checked how Mastodon handles EXIF data in images, but a lot of sites will convert images to JPEG in order to embed their own data, which can aid in tracking the images and PDFs (the ones that use JPEG). And most conversions to formats like PNG, whether to save data or because it's a standard (JPEG is debatably not a free format), still retain that metadata. I just assume the worst and wipe before uploading, regardless of the site. Exiftool has many uses.
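A minimal sketch of that pre-upload wipe, assuming exiftool is installed; the file name is a placeholder, and exiftool keeps a *_original backup by default:

```ruby
# Strip all writable metadata from a file with exiftool before uploading.
def wipe_metadata(path)
  system("exiftool", "-all=", path) or raise "exiftool failed"
end

wipe_metadata("photo.jpg")
```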
