found in a project gutenberg file: "Updateŕs note: This file has been recoded to UTF8." (yes, that is LATIN SMALL LETTER R WITH ACUTE)

I can't tell if this is supposed to be a joke or not

@darius wouldn't that imply that the text "Updater's note: This file has been recoded to UTF8" was in the *physical* copy?

@aparrish My thought was:

physical -> some other encoding via OCR -> UTF8, and the error happened in first conversion
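(A minimal sketch of how a wrong conversion step can yield "ŕ", not a claim about what actually happened to this file. The starting byte 0xE0 and the helper name `misread` are assumptions made up for illustration. The same byte reads as "à" in Latin-1 and as "ŕ" in ISO-8859-2, so decoding with the wrong table and then saving as UTF-8 would bake the wrong letter in.)

```python
# Hypothetical illustration: the byte 0xE0 is an assumed starting point,
# not something recovered from the Gutenberg file.
def misread(raw: bytes, wrong_encoding: str) -> str:
    """Decode bytes with a (possibly wrong) single-byte encoding."""
    return raw.decode(wrong_encoding)

raw = bytes([0xE0])

as_latin1 = misread(raw, "latin-1")     # U+00E0, a with grave
as_latin2 = misread(raw, "iso-8859-2")  # U+0155, r with acute

# Re-encoding the misread character as UTF-8 makes the mistake permanent:
# the file is now valid UTF-8 that faithfully stores the wrong letter.
as_utf8 = as_latin2.encode("utf-8")

print(repr(as_latin1), repr(as_latin2), as_utf8)
```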

@aparrish although that wouldn't account for it either

maybe it is just a clever joke

@aparrish oh maybe it went

physical -> OCR to ASCII -> UTF-8 -> physical -> OCR to UTF-8

@darius that would be really weird in this case, I think. This is the file it came from: gutenberg.org/ebooks/2090

@aparrish oh yeah now I'm just imagining extremely implausible reasons that could account for it to amuse myself

@darius @aparrish I’m expecting a random re-encoding path bot for common memes any minute now.

@darius @aparrish my guess is that it’s visible proof of the Unicode being correct
