
Imagine your backup software encounters an unreadable file (permissions, I/O problem, whatever) during the backup process.

What should be the default behavior?

@fribbledom halt, catch fire, burn all previous backups, then corrupt the main drive.

@fribbledom the thing that freaks me out about any backup solution is that a corrupted file looks exactly the same as a purposeful modification, and will happily copy over the top of all previous versions silently.

@fribbledom so, an actually unreadable file during a routine backup? things have most likely been fucked long enough for all your backups to be worthless.

@fribbledom what would be good is a backup utility that knows, for instance, DCM_0456.jpeg is not a jpeg file this time.

@zensaiyuki @fribbledom That’s why at least one of your backups needs to save snapshot versions.

@zensaiyuki @fribbledom That's why you should use file systems that detect corruption :)

@KopfKrieg @fribbledom yeah, great. none of those are available for mac, windows, or linux.
i mean, i know about btrfs and zfs but i don’t have a spare $2000 lying around for a NAS that isn’t a worthless waste of space

@KopfKrieg @fribbledom and no i am not interested in using “experimental wrappers” or “unofficial drivers” for my backups. wake me when there’s an actual option.

@zensaiyuki @fribbledom

I'm not really sure what you're talking about. Personally I use btrfs, and you can use it like any other file system. Not sure why you'd need a NAS for that?

Also, corruption of your backups must be detected by your backup software, not by the filesystem.
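As a rough illustration of what I mean (a Python sketch, assuming the backup tool keeps a per-file hash and mtime index from its previous run; none of these names come from any real tool): content that changed while the timestamp didn't is probably bit rot, not an edit.

import hashlib
import os

def sha256_of(path):
    # Stream the file so large files don't blow up memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def classify(path, index):
    # index maps path -> {"hash": ..., "mtime": ...} recorded at the last backup.
    mtime = os.stat(path).st_mtime
    digest = sha256_of(path)
    previous = index.get(path)
    if previous and previous["hash"] != digest and previous["mtime"] == mtime:
        # Content changed but the timestamp didn't: nothing legitimately
        # wrote to this file, so flag it instead of overwriting old versions.
        return "suspected-corruption"
    index[path] = {"hash": digest, "mtime": mtime}
    return "ok"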

@KopfKrieg @fribbledom right, so not here to read, understand anything written or offer useful insight or advice. just here for the smug swipes.

@zensaiyuki @fribbledom

I'm having serious trouble understanding what you want. Yes, if you're using Windows then you're out of luck, but if you use Linux then format your partition to btrfs, or any other checksumming file system. If you're using an Apple product then switch to APFS, which also supports file integrity checks through checksums.

And as a backup drive, use some external HDD if you want, or cheap online storage.

For the software part, I recommend borg backup.

@KopfKrieg i don’t use windows primarily, i use mac. interestingly, apfs does not do *any* of the things you think it does.
if i were to use linux to back up *to*, what would be the point of not setting it up as a NAS?

and your suggestion of a cheap external HDD has me wondering why you think you have the right to speak to me, let alone be smug about it.

@KopfKrieg with a shitty backup strategy like this, no wonder you forgot *the very first thing you tooted at me*

@zensaiyuki It's a simple backup solution, that's all. If you want a better backup strategy then use something like 3-2-1 with at least one off-site backup.

@KopfKrieg a simple backup strategy that would only make sense to mention if you hadn’t read a single thing i have written.

@zensaiyuki Whoops, you're right, APFS uses checksums for metadata only.

A NAS is still not required for backups, though.

Also, I never said "cheap external HDD".

Oh, and regarding the "right to speak to you": This is the Fediverse. If you don't want people to communicate with you, then make your profile private and don't comment on other's posts.

@KopfKrieg all i ask is that if you’re gonna come at me with smug snark, the least you can do is pass the turing test.

@zensaiyuki @fribbledom

Sidenote: Backups are never a "worthless waste of space".

@KopfKrieg @fribbledom yeah that swimming pool of faded unreadable floppies is totally worth it.

@zensaiyuki @fribbledom

Anything <10TB of data is rather easy to back up. Not sure what your problem is here.

@zensaiyuki
Not sure what you back up to, or with what software, but most of the ones I know support versioning internally: Backintime (Linux), Duplicati (Linux, Windows, probably macOS), Time Machine (Mac only). So if something gets corrupted, you can retrieve older versions of the same file.

If you're okay to pay for Cloud services, Spideroak seems to be fairly secure, and they too support versioning.
@KopfKrieg @fribbledom

@Mr_Teatime @KopfKrieg @fribbledom well, nothing really, because nothing works right. it’s all built by smug boring people who i guess just edit a couple ms word documents a year, so things like time machine are good enough.

@Mr_Teatime @KopfKrieg @fribbledom they’ve never had to deal with multiple slightly different copies of 4gb packed git repos partially synced over dropbox, full of 300mb duplicate video files encoded at slightly different bit rates.

@Mr_Teatime @KopfKrieg @fribbledom or, my favourite, recent adventure, discovering my backup only contains icloud stub files, not real copies of the files.

@zensaiyuki
Isn't your backup versioned?
Duplicati, Backintime, that Synology thing, Time Machine, even syncthing (not "real" backup software) provide this. Heck, even Windows has this "previous versions" thing.

@fribbledom

@Mr_Teatime @fribbledom i guess i must be weird, because i have never been satisfied with any of those backup solutions. time machine has never finished a backup in less than a week, and neither it nor anything else really handles the issue of single very large files changing slightly over time gracefully, which is a thing i run into a lot what with running VMs and editing photoshop files.

@Mr_Teatime @fribbledom the other thing nothing seems to handle gracefully is deciding to reorganise my files. rename a single 300gb folder and suddenly my backup drive runs out of space. damn.

@zensaiyuki@mastodon.social @Mr_Teatime@social.tchncs.de @fribbledom@mastodon.social i'm still in dire need of a proper backup solution that isn't me cherry-picking stuff and rsyncing it to offline storage once in a blue moon.. i run into the same issues you are :(

@purple @fribbledom @Mr_Teatime i long ago came to the conclusion that photos, videos, music, large blobs that change slightly over time, code, and documents that get reorganised regularly each need a vastly different solution.
i don’t have any clue how “normal” people seem to be satisfied with existing solutions.

@purple @fribbledom @Mr_Teatime plus i tend to accumulate stuff over time at a rate regularly exceeding typical backup drive size, so there’s a ton of stuff on old backups that i’d like to have available to look at but it’s on one of the 20 or so external drives i have somewhere, which may or may not be corrupted by now.

@purple @fribbledom @Mr_Teatime and honestly getting sick of people acting like backups are easy and they really don’t understand what the problem is. what are they backing up? 6 or 7 text files?

@fribbledom Of course it has to drop all of the progress and shit itself, duh. What kinda HDD do you even have that it doesn't let you get to every file, you... you basic bitch uuuuuuuuuuuuuuuuuuuuuuuuuussssseeeeeeeeeeeeeeeeerrrrr?

@fribbledom report, wait for the user to acknowledge (popup box/press any key), then continue

@fribbledom@mastodon.social I think I'd prefer to be shown a dialog informing me about the error and then get to decide whether I want to continue or not, rather than either of the two happening immediately.

@fribbledom report but continue: partial backup is still better than no backup.

BUT the "report" must be very clear and prominent (even annoying) so ppl know some file was NOT backed up.

This is the default behaviour of DejaDup, btw.

@fribbledom Backups are mostly run on a timer and without someone looking over them. If the erroneous file was the second file looked at and the backup then aborts, you have a very empty backup. Better to try to save as much as possible than be pedantic about it being a complete backup.

@fribbledom I voted what I know my software (rsync) does, continue and report.

@trini @fribbledom Excellent choice. Pipe errors into a logfile for scripts.

I really prefer a GUI for browsing errors, which is why I've been loving SyncThing. It will display error status even _after_ the backup has run, and highlight them with a pretty icon.

@fribbledom

*BUT* ensure that the report is prominent:

1. List and log the failing file(s) and the reasons at the *END* of the output; otherwise, they will scroll away and be ignored.

2. Print and log the number of failures at the end of the output.

3. On *nix-like environments, exit with a non-zero status that is different from the one caused by fatal-and-can't-continue type errors.
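A rough sketch of all three in Python (function names and exit codes invented for illustration; the actual copy step is left abstract):

import sys

EXIT_OK = 0
EXIT_FATAL = 1      # can't-continue errors (e.g. destination missing)
EXIT_PARTIAL = 2    # backup completed, but some files failed

def backup_all(paths, copy_one):
    failures = []
    for path in paths:
        try:
            copy_one(path)  # whatever actually copies one file
        except OSError as exc:
            failures.append((path, str(exc)))  # report, but continue
    # Points 1 and 2: list the failures and the count at the END,
    # so they don't scroll away and get ignored.
    for path, reason in failures:
        print(f"FAILED: {path}: {reason}", file=sys.stderr)
    print(f"{len(failures)} file(s) were NOT backed up", file=sys.stderr)
    # Point 3: a distinct non-zero status for partial success.
    return EXIT_PARTIAL if failures else EXIT_OK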

@fribbledom When I'm using a backup, I'd much rather it back up whatever it can. You're already going to have an inconsistent state from what has already been copied.

Also, nothing is as infuriating as starting an hours-long process only to come back and discover that it stopped 15 minutes in.

@fribbledom report, continue, but abort if there are hundreds of them, or if more than $threshold% fail?

@fribbledom (i.e. differentiate between "this file failed to back up" and "this backup failed")
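Something like this, maybe (a Python sketch; the thresholds are pulled out of thin air):

# "This file failed to back up" vs "this backup failed": scattered
# errors are tolerated and reported, but past an absolute count or a
# failure rate the whole run aborts.
MAX_FAILURES = 100
MAX_FAILURE_RATE = 0.05  # 5%
MIN_SAMPLE = 20          # don't judge the rate on a handful of files

def should_abort(failed, attempted):
    if failed > MAX_FAILURES:
        return True
    if attempted >= MIN_SAMPLE and failed / attempted > MAX_FAILURE_RATE:
        return True
    return False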

@derwinmcgeary @fribbledom Only makes sense if attempting to back up the rest is likely to damage the source drive or have other unwanted side effects. But most of those are backing-up-from-a-previous-backup scenarios anyway.

@fribbledom Send a cryptic note to an unconfigured mail alias, then disable automatic backups until an undocumented lock file somewhere deep inside /var is deleted.

@renatoram @fribbledom Not that exact scenario, but the general gist of it, sure.

@mansr @fribbledom I'm also a big fan (not) of config options like "dontdisablesettingX" with no documented default. Maybe the value must be int. Who knows.

One time I had to dig into the sources and discover a handful of undocumented (but working and useful) CLI options...

@fribbledom report and suspend, so operator can do whatever and then resume or abort at their discretion

@fribbledom I think there should be some good retry logic to handle temporary I/O problems or whatever
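For instance, in Python (attempt counts and delays are arbitrary):

import time

def read_with_retry(path, attempts=3, base_delay=1.0):
    # Retry transient I/O errors with exponential backoff before
    # giving up and letting the caller record the file as failed.
    for attempt in range(attempts):
        try:
            with open(path, "rb") as f:
                return f.read()
        except OSError:
            if attempt == attempts - 1:
                raise  # still failing: treat as a real error
            time.sleep(base_delay * 2 ** attempt)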

@fribbledom
Ideally it should be a configurable option; default to abort if it's an interactive session (using a gui or attached to a tty), and report if not (e.g. a timer). The logic being that interactive might be a config issue on a first run, and noninteractive could be a failing filesystem, so save what you can.
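Roughly this, as a Python sketch (the policy names are just illustrative):

import sys

def default_error_policy():
    # A human at a terminal probably just misconfigured something on a
    # first run: abort loudly. A timer/cron run with no one watching
    # should save whatever it can and report afterwards.
    interactive = sys.stdin.isatty() and sys.stderr.isatty()
    return "abort" if interactive else "report-and-continue"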

@fribbledom Interesting poll! Makes me think of Postgres' fsyncgate a few years back.

@fribbledom skip and maybe retry later, adding the error to a log then exit nonzero. If this happens with cron, mail the log to the user for further investigation.

@fribbledom if i don't catch it in time (i.e. before my previous backups expire), i'd be much more distressed by having no backups at all rather than being unable to restore one file.

@fribbledom report and continue, with an optional flag to abort instead for anyone whose backup needs to be all-or-nothing.

@fribbledom Report, but continue … but it's absolutely essential that the user knows what happened - i.e., a report buried on page 4 of 10 pages of routine gibberish really doesn't count.

@fribbledom Mark the file as needing attention, continue, wait for user intervention once everything else is done.
