friendica.eskimo.com

Broken SSD - Disaster or not

This week, it finally happened. I think it’s the first time in 20 years that a hard drive has died on me without warning. And it was also the first time I was using an NVMe drive, but that could be a coincidence.

The drive was still under warranty (barely a year and a half old). I even had a spare lying around. But the true cost of restoration is, of course, my own labor. My planning had not been perfect, since I had judged such a failure to be a remote possibility. However, it was easy enough. I simply installed NixOS from a USB installer and downloaded my configuration from the backup on my NAS (daily rsync jobs to the rescue). I also downloaded all the important files for my home directory. Then it was simply a matter of adjusting a few things in the configuration file, rebuilding the system, and voilà. Well, except for a few things that didn’t work quite right for some reason and had to be fixed manually, but nothing major.

However, next time I want this to be even easier. It’s probably overkill to install a RAID controller and have multiple drives running in RAID1 or RAID5, but the restoration process is still too much manual work. I was thinking of regularly backing up my main drive on the block device level, so I would just have to swap out the drive and restore the delta from the backup. I’m not quite sure if that’s feasible or a good idea. For my personal system, I have to balance the investment of preparing for a disaster with the likelihood and impact of such an event. This seems like a good trade-off, but I would be curious to hear how other people prepare for drive failure.
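To make the "restore the delta" idea a bit more concrete, here is a minimal Python sketch of how changed blocks between two images could be found. It is only an illustration under assumptions: the function names, the 4 MiB block size, and the idea of comparing per-block SHA-256 hashes are my own choices for the example, not how any particular backup tool works.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MiB; an arbitrary choice for this sketch


def block_hashes(path, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of a file or device."""
    digests = []
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            digests.append(hashlib.sha256(block).digest())
    return digests


def changed_blocks(old_hashes, new_hashes):
    """Indices of blocks that differ, or that exist in only one of the lists."""
    longest = max(len(old_hashes), len(new_hashes))
    changed = []
    for i in range(longest):
        old = old_hashes[i] if i < len(old_hashes) else None
        new = new_hashes[i] if i < len(new_hashes) else None
        if old != new:
            changed.append(i)
    return changed
```

Only the blocks listed by `changed_blocks` would need to be transferred. Reading a live, mounted drive this way would not give a consistent image, so a real setup would presumably have to work from a snapshot or an unmounted device.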


I have successfully restored a whole disk to a working OS from a borgbackup, which takes much less space on the backup storage due to borg's extremely efficient compression and deduplication.

So that's what I would recommend.

I back up all my computers and servers with borgmatic.

If you need any help with setting it up, let me know.

Thanks, that looks interesting! I wonder how that compares to something like btrfs snapshots. How easy is it to restore a whole disk as opposed to files and directories?

I have btrfs snapshots with snapper on my desktop. It keeps the last 20 snapshots. Sending them to a second drive would require as much space as is used on the main drive, which is ~850 GB of 1 TB.

But the borg backup of the same data takes only ~450 GB and also keeps the last 20 versions.

So I use the btrfs snapshots to roll back file changes (for example, after a bad system update).

With borg it is easier to set up a central server for all my devices, because the backups take much less space. So I use that for the case where the drive fails. To restore, I set up the same partition layout as before and then throw the borg backup at it. It seems pretty easy so far.
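As a rough illustration of the "throw the borg backup at it" step, here is a small Python sketch that wraps `borg extract`. It assumes Borg 1.x, where an archive is addressed as `REPO::ARCHIVE` and `borg extract` writes into the current working directory. The function names, the repository path, and the archive name are hypothetical placeholders, and the new partitions are assumed to be created and mounted already.

```python
import subprocess


def borg_extract_command(repo, archive):
    """Build the argv for `borg extract` using the Borg 1.x REPO::ARCHIVE syntax."""
    return ["borg", "extract", f"{repo}::{archive}"]


def restore_into(mount_point, repo, archive):
    """Extract the archive into mount_point.

    `borg extract` writes into the current working directory, so the
    command is run with cwd set to the freshly mounted target filesystem.
    """
    subprocess.run(borg_extract_command(repo, archive), cwd=mount_point, check=True)
```

A call could look like `restore_into("/mnt", "/path/to/repo", "desktop-latest")`, with all three values replaced by whatever matches the actual setup.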

You don't need a RAID controller. I have dual NVMe drives set up as RAID1 and boot off the RAID1 partition. The only partition I can't put in the RAID is the EFI partition, because the BIOS doesn't know about the RAID. I simply duplicate that one by hand on both drives using dd. Since it only gets updated at kernel updates, this just adds a dd to the kernel upgrade process.
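For anyone who would rather script that copy step, here is a minimal Python stand-in for the plain `dd` copy, with a check that the copied bytes match. This is a sketch under assumptions, not the exact procedure described above: the function names are made up for the example, it would need root to touch real partitions, and device paths such as `/dev/nvme0n1p1` and `/dev/nvme1n1p1` are hypothetical and must be checked carefully before writing to anything.

```python
import hashlib
import os

CHUNK = 1024 * 1024  # 1 MiB per read; an arbitrary choice


def copy_partition(src, dst):
    """Copy every byte of src to dst, like `dd if=src of=dst`. Returns bytes copied."""
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(CHUNK)
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
        fout.flush()
        os.fsync(fout.fileno())  # make sure the data has reached the device
    return copied


def prefix_digest(path, length):
    """SHA-256 hex digest of the first `length` bytes of path."""
    digest = hashlib.sha256()
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(CHUNK, remaining))
            if not chunk:
                break
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst and confirm that the copied range reads back identically."""
    n = copy_partition(src, dst)
    return prefix_digest(src, n) == prefix_digest(dst, n)
```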
Yeah, I assume you don't. But with my mainboard I would take a big performance hit, since it does not offer full speed if both M.2 slots are occupied. How do you manage the RAID, by the way? Is that all handled by the BIOS?