That Time I Nuked the Disklabel and Recovered the Disk
Aaron Poffenberger
dd if=install5?.iso of=/dev/rsd2c bs=1m
I knew hardcoding the output device was dangerous. Fixing the scripts was on my todo list. On that day it was on my too-late-todo list. I forgot that the last time I booted there was an SD card in the machine which caused softraid(4) to assemble the crypto partition on sd2 when it unlocked the encrypted boot partition. The boot partition was now on sd2, right where my script was hardcoded to copy to. The intended target, a thumb drive, was on sd1.
The dd(1) process had barely started copying when I realized what was going on. I hit ctrl-c as quickly as I could. Too late.
The boot partition was overwritten.
The first natural thing to do when you do something really destructive on the computer is to panic. If you keep a cool head and don't start smacking keys, you sometimes think of a solution. I kept cool and thought:
Wait, I haven't rebooted yet. The kernel is loaded into memory. There's still time to act.
Experienced OpenBSD users will know what I knew: OpenBSD has sane defaults. One of those sane defaults is the daily security(8) script. During each run it makes a backup of all the disklabels (and a lot of /etc). As long as your system has run overnight at least once since you created or changed the disklabel(5), there's a backup in /var/backups.
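Those backups live on the same disk they describe, so it's worth keeping a copy somewhere else. A minimal sketch, assuming the backup files have names starting with disklabel. (as in /var/backups/disklabel.sd1); save_disklabels and the destination path are hypothetical names for illustration:

```shell
#!/bin/sh
# Copy the disklabel backups kept under a backup directory to another location.
# Assumption: the files are named disklabel.<something>, e.g. disklabel.sd1.
# save_disklabels is a hypothetical helper, not an OpenBSD tool.
save_disklabels() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    found=0
    for f in "$src"/disklabel.*; do
        [ -f "$f" ] || continue          # the glob matched nothing
        cp -p "$f" "$dest"/ || return 1
        found=$((found + 1))
    done
    echo "$found"                        # number of files copied
}

# Usage (run as root; the destination is an assumption):
#   save_disklabels /var/backups /mnt/usb/disklabels
```

The function prints how many files it copied, so a result of 0 is a hint that the nightly run hasn't happened yet or that the files live somewhere else.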
All I needed to do was relax, copy the disklabel(5) to a thumbdrive, restore it and console myself over a few losses and berate my stupidity:
Yeah, sure, I've probably lost the root partition but /home, /usr/ports/mystuff and a few other important directories are OK, right?
Not so fast. Nuking the disklabel(5) is actually worse than deleting the kernel file. The kernel will run since it's loaded in memory. Overwriting the disk to the tune of ~280 MB while the system is running is really bad. While I was reassuring myself that I could recover the disklabel(5) the system hung, leaving me to contemplate my backup system.
I have a backup system. It's the best backup system. Everybody says so. Every night, my computers back themselves up to a FreeNAS server. ZFS snapshots FTW.
Surely there'd be a copy of /var/backups/disklabel.sd1. Of course there would...unless someone had recently deleted the backup dir and not created a new backup point.
Time to panic.
Backups are:
- rarely needed when you have them
- rarely needed when you don't
But you'll miss them sorely in the latter situation.
There's not much to do when you've nuked the disklabel, the kernel is hung and you don't have a backup. Panic, rage or even cry. At some point it's over. It's time to go gentle into that good night....
Or is it?
What if you run an operating system that's so insanely sane that it's well documented? What if it's so well documented that reading the man(1) pages is interesting? What if your self-proclaimed super power is reading man pages during lunch?
In other words, what if you know your OS well enough that a glimmer of hope appears?
About now, experienced OpenBSD users are jumping up and down screaming "scan_ffs(8)!"
It's OK to ask for help. But read the man pages first. Practice your search-fu.
Whining on misc@ will not help...whatever your complaint is.
It's not the OS's fault if you shoot yourself in the foot. Running scripts that do dangerous things sometimes leads to disaster. In other words, if you break it you get to keep both halves. Still, the original BSD geeks made the filesystem resilient. That resilience starts with the process of creating the filesystem.
Unless you always go for coffee while newfs(8) is at work, you may have noticed it prints a series of numbers of increasing value while creating the filesystem:
$ doas newfs sd2a
/dev/rsd2a: 477.4MB in 977664 sectors of 512 bytes
4 cylinder groups of 119.34MB, 7638 blocks, 15360 inodes each
super-block backups (for fsck -b #) at:
 32, 244448, 488864, 733280,
That little message is pure gold. You don't have to memorize it (unless you messed with the block size...hint, don't). Just knowing it exists is enough.
Those numbers are the offsets where newfs(8) created extra superblocks. Those superblocks have information about the partition size and offset. With that information we can recreate the disklabel(8).
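The spacing of those numbers isn't random. In the newfs(8) output above, each backup sits the same distance from the previous one, and that distance equals the partition size divided by the number of cylinder groups. A minimal sketch of the arithmetic, using only the numbers from that output. It reproduces this one example and is not a statement of how newfs(8) chooses offsets in general; the variable and function names are made up for the sketch:

```shell
#!/bin/sh
# Numbers copied from the newfs(8) output above.
total_sectors=977664   # size of sd2a in 512-byte sectors
groups=4               # "4 cylinder groups"
first_backup=32        # first offset in the "super-block backups" list

# Sectors per cylinder group in this example: 977664 / 4 = 244416
group_sectors=$((total_sectors / groups))

# Print one backup offset per cylinder group.
backup_offsets() {
    i=0
    while [ "$i" -lt "$groups" ]; do
        echo $((first_backup + i * group_sectors))
        i=$((i + 1))
    done
}

backup_offsets
```

Run as-is, it prints 32, 244448, 488864 and 733280, one per line, matching the list newfs(8) printed.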
What's a superblock?
A file system is described by its super-block, which in turn describes the cylinder groups. The super-block is critical data and is replicated in each cylinder group to protect against catastrophic loss. This is done at file system creation time and the critical super-block data does not change, so the copies need not be referenced further unless disaster strikes.
scan_ffs(8) describes itself as "life-saver of typos"...and that it is. It saves butts and lives:
This little program will take a raw disk device (which you might have to create) that covers the whole disk, and finds all probable UFS/FFS partitions on the disk.
scan_ffs(8) does its magic by scanning the disk looking for superblocks...and extracting those handy bits of info regarding the partition size and offset. Now you see why superblocks matter.
Also, you may now understand why newfs(8) writes lots of superblocks to the disk. One backup is nice, but multiple copies make it more likely scan_ffs(8) will find one on a damaged disk.
Using scan_ffs(8) is often very easy:
$ doas scan_ffs device
For damaged disks you may have to provide it with a begin and end range in which to work. But in a case like mine, scanning the device from the beginning was enough.
Imagine a disk that looks like this after partitioning:
$ doas disklabel sd2
# /dev/rsd2c:
type: SCSI
disk: SCSI disk
label: USB DISK 3.0
duid: de794248e8848224
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 255
sectors/cylinder: 16065
cylinders: 1927
total sectors: 30965760
boundstart: 64
boundend: 30957255
drivedata: 0

16 partitions:
#                size           offset  fstype [fsize bsize   cpg]
  a:           977664               64  4.2BSD   2048 16384  7638
  b:          1791550           977728    swap
  c:         30965760                0  unused
  d:          1547904          2769280  4.2BSD   2048 16384 12093
  e:          2279840          4317184  4.2BSD   2048 16384 12958
  f:          2657024          6597024  4.2BSD   2048 16384 12958
  g:          1536864          9254048  4.2BSD   2048 16384 11959
  h:          5821984         10790912  4.2BSD   2048 16384 12958
  i:          2422688         16612896  4.2BSD   2048 16384 12958
  j:          3313472         19035584  4.2BSD   2048 16384 12958
  k:          8608000         22349056  4.2BSD   2048 16384 12958
The results of scan_ffs(8) would look like this:
$ doas scan_ffs -s -l sd2c
X: 977664 64 4.2BSD 2048 16384 1 #
X: 1547904 2769280 4.2BSD 2048 16384 1 #
X: 2279840 4317184 4.2BSD 2048 16384 1 #
X: 2657024 6597024 4.2BSD 2048 16384 1 #
X: 1536864 9254048 4.2BSD 2048 16384 1 #
X: 5821984 10790912 4.2BSD 2048 16384 1 #
X: 2422688 16612896 4.2BSD 2048 16384 1 #
X: 3313472 19035584 4.2BSD 2048 16384 1 #
X: 8608000 22349056 4.2BSD 2048 16384 1 #
Look at the second column of both examples: those are the sizes. The third column holds the offsets. You might notice that, other than the "X:" and the column labels, the output above looks like the output from disklabel(8) above it. The -l parameter tells scan_ffs(8) to output in disklabel(5) format.
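When comparing scan_ffs(8) results against a label, it helps to know where each candidate ends as well as where it starts. A minimal sketch, assuming the "X: size offset ..." line format shown above; summarize_candidates is a hypothetical name, not an OpenBSD tool:

```shell
#!/bin/sh
# Read `scan_ffs -l` lines on stdin and print, for each candidate:
#   offset  size  first-sector-after-the-partition
# Assumes field 2 is the size and field 3 the offset, as in the output above.
summarize_candidates() {
    awk '$1 == "X:" { printf "%d %d %d\n", $3, $2, $2 + $3 }'
}

# Usage on OpenBSD (not executed here):
#   doas scan_ffs -s -l sd2c | summarize_candidates
```

Fed the first line of the output above, it prints 64 977664 977728, and 977728 is exactly where the swap partition b starts in the disklabel. Swap carries no FFS superblock, which is why it never shows up in the scan_ffs(8) results.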
disklabel(8) can install a new label from an ASCII text file with the -R option:
$ doas disklabel -R sd1 /tmp/disklabel.sd1
I'm omitting in all these examples the steps I took to assemble the softraid(4) crypto partition itself.
It sounds easy, but it wasn't. In the first place, my laptop wouldn't boot off the internal drive to effect the repair. I first had to install OpenBSD to a thumbdrive to have a working system.
Once I had a bootable thumbdrive I had to run scan_ffs(8) and apply the recovered disklabel(8) to the drive. Unfortunately, I changed the size of a couple of partitions after the initial newfs(8) meaning I had some old superblocks lying around.
The -s parameter tells scan_ffs(8) to skip "intelligently" to the next partition. Knowing I had old superblocks on disk meant that to find /var I had to do a full disk scan (no -s) to find all the superblocks (fortunately the laptop had an SSD).
Then I had to guess which size + offset matched the location of my /var, create an ASCII disklabel(8), apply it, mount the partition and see what it found.
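When stale superblocks are in the mix, one quick sanity check is that two filesystems can't occupy the same sectors. A minimal sketch that flags candidates starting inside an earlier one, assuming the "X: size offset ..." format of scan_ffs -l; find_overlaps is a hypothetical helper, and it only narrows down the guessing, it doesn't say which of the overlapping candidates is the current one:

```shell
#!/bin/sh
# Read `scan_ffs -l` lines on stdin, sort the candidates by offset, and report
# every candidate that begins before an earlier candidate has ended.
find_overlaps() {
    awk '$1 == "X:" { print $3, $2 }' | sort -n | awk '
        NR > 1 && $1 < prev_end {
            printf "overlap: offset %d starts before %d\n", $1, prev_end
        }
        {
            end = $1 + $2
            if (NR == 1 || end > prev_end) prev_end = end
        }
    '
}

# Usage on OpenBSD (not executed here):
#   doas scan_ffs -l sd1c | find_overlaps
```

For a clean layout like the sd2 example, where each partition starts exactly where the previous one ends, it prints nothing. Any line it does print points at a pair where at least one candidate is probably a leftover.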
Once I found the real /var mounted I was able to get the full disklabel(8) from /var/backups and apply it.
All's well that ends well. Just reboot, right?
Had I kept the automatic disk allocation suggested by disklabel(8) I could've used -s and the process would have gone faster. Fortunately I kept the partitions in the order suggested by disklabel(8). I at least knew their relative positions to one another.
With the help of scan_ffs(8) I recovered the disklabel(5) but the computer still wouldn't boot. Why? Remember how this began.
dd(1) didn't just annihilate the disklabel(5); it ran long enough to nuke the file table. The root partition was a total loss.
Still, most of the data I care about was easily recovered once the disklabel was restored.
While the disk didn't have a standard OpenBSD layout, it did have multiple partitions. Each partition has its own file table. So while / was a total loss, the other partitions were intact. No loss of data on them!
Did you know security(8) only backs up files listed in /etc/changelist? Read changelist(5) to learn the format for adding or excluding files from the daily backup.
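A minimal sketch of adding a file to that list without creating duplicates, assuming the simplest form of an entry, one path per line (check changelist(5) for the rest of the format). add_to_changelist and the example path are hypothetical:

```shell
#!/bin/sh
# Append a path to a changelist-style file unless that exact line is
# already present. add_to_changelist is a hypothetical helper.
add_to_changelist() {
    list=$1
    path=$2
    if ! grep -qxF -- "$path" "$list" 2>/dev/null; then
        printf '%s\n' "$path" >> "$list"
    fi
}

# Usage (as root; the path being added is only an example):
#   add_to_changelist /etc/changelist /etc/example.conf
```

Matching the whole line with fixed strings means a path that merely contains another entry as a substring is still added.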
If I had done something crazy, like creating a single root partition for the whole drive, I would have been playing with data recovery utilities trying to recover files in a tedious, semi-manual process, restoring the names based on reading the contents of each file.
I've heard people ask why they shouldn't use a single, unified partition, especially now that we have SSDs. While overwriting the disklabel(5) is not recommended and should be rare, it happens.
Separate partitions saved my bacon and my data.
Ah, you caught that? Yes, the partition was encrypted with softraid(4). Fortunately, softraid(4) details are stored in the parent partition. My script only overwrote the disklabel(5) of the inner, decrypted partition.
$ doas bioctl -i softraid0
Volume      Status               Size Device
softraid0 0 Online       512105629696 sd1     CRYPTO 0% done
          0 Online       512105629696 0:0.0   noencl <sd0a>
Had I nuked the disklabel(5) and softraid(4) details on sd0, the disk would've been a total loss. Unless you've done something really unique (like backing up the softraid(4) info) there's no recovering the data.
Don't nuke the disklabel(5) and don't lose the password!
RAIDframe(4) was removed from OpenBSD in 5.2 in favor of softraid(4).
You may have heard about that guy who recovered his OpenBSD crypto password. It's a true story (which you should read if you like technical spelunking).
The key details are:
- He didn't overwrite the disklabel(5)
- He didn't crack the crypto, he brute-forced decryption because he remembered a significant part of the password
What have we learned from this tale?
- backups, Backups, BACKUPS!
- Be careful with automation
- Use an OS with Sane Defaults
- When bad things happen, panic, but don't freak out
- Know your OS and its tools
- Read the base OS documentation
- backups, Backups, BACKUPS!
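On the "be careful with automation" point: device numbers move around, so a script can check what it is about to overwrite instead of trusting a hardcoded name. A minimal sketch, not the actual thumbdrive script, that only proceeds when the target's disklabel carries the expected "label:" string as printed by disklabel(8). label_matches, the expected label and the image variable are assumptions for illustration:

```shell
#!/bin/sh
# Read disklabel(8) output on stdin; succeed only if its "label:" line
# equals the expected string exactly. label_matches is a hypothetical helper.
label_matches() {
    expected=$1
    actual=$(sed -n 's/^label:[[:space:]]*//p' | sed 's/[[:space:]]*$//' | head -n 1)
    [ "$actual" = "$expected" ]
}

# Usage on OpenBSD (not executed here; "$image" and the label are examples):
#   if doas disklabel sd1 | label_matches "USB DISK 3.0"; then
#       doas dd if="$image" of=/dev/rsd1c bs=1m
#   else
#       echo "sd1 does not look like the thumb drive; refusing" >&2
#       exit 1
#   fi
```

A label check is only as good as the label is distinctive, so treat it as one guard among several; checking the duid line the same way would be a stricter variant.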
Due to using a well designed and implemented OS which I have taken the time to study, I was able to restore my data. Overwriting the disklabel(5) could have ended far worse than the loss of a few config files, some scripts and time.
There's some truth to the suggestion that these kinds of near-miss disasters are valuable because they build skill and confidence. I wrote a blog post based on my experience that may help others in a similar situation.
Still, I'd rather have had a good backup to go to.
Yeah, I fixed the thumbdrive script and updated the backup system.
You can find the script and related Makefile on GitHub as OpenBSD Thumbdrive Utilities.
The images, heading links and most of the quotes and allusions above refer to the lyrics page at the OpenBSD site. Since version 3.0 the OpenBSD developers have released one or more songs with each system release, typically May and November.
Take a listen while you download the latest version.
During these hostile and trying times and what-not
OpenBSD may be your family's only line of defense