Tuesday, September 30, 2014

btrfs raid1 as root file system - the immortal life of lil turbo

Sometimes, you just need to reformat.  Instead of trying to convert my existing system from extX to btrfs raid, I reinstalled.  And then converted my brand-new system from ext to btrfs.  Because why do things the easy way when you could do them the hard way?  (If you want to actually convert an existing system, just skip to step 3.  It should work, but this is all cowboy-style, so don't blame me if everything explodes.)  Here are the basic steps.  Be warned, this is all from memory that's a few weeks old:

(If you want to read about how I got here, check this out.  If you just want a guide to do this, the backstory doesn't matter, so just keep reading this page.)
  1. Install Wheezy.  However you normally do; all the defaults are fine.  If you feel like it, you can halve the size of swap and add that back into the system partition.  Or you can do this later with gparted, or you can leave it and have twice as much disk devoted to swap as the Debian installer thinks you'll need.
  2. Upgrade to Jessie.  Also in the normal way.
  3. root@serv$ vi /etc/apt/sources.list
    :%s/wheezy/jessie/g
    :%s/stable/testing/g
    :%s/^deb-src/#deb-src/g
    :wq
    root@serv$ apt-get update
    root@serv$ apt-get dist-upgrade -y
    
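    And a quick, optional sanity check that the dist-upgrade actually landed you on Jessie before you go any further - /etc/debian_version is a safe thing to peek at:
    root@serv$ cat /etc/debian_version # should say something jessie-flavored, like "jessie/sid"
    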
  4. Boot into an alternate Jessie environment.  Or at least something with recent btrfs-tools.  Ubuntu may work, but I made a custom Jessie iso on the Debian live-systems build interface.  This tool is really cool - someone's dedicating a lot of server time to make it happen, and I think it's awesome.  On the downside, you'll probably have to wait a few days before you make it to the top of the queue, and once your "build finished" email goes out, you'll have to download the iso within 24 hours, before they delete it.
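    Whatever image you end up with, writing it to a USB stick is the usual dd affair.  The filename and target device below are made up - substitute your own and triple-check the of= target before you hit enter:
    root@serv$ dd if=custom-jessie-live.iso of=/dev/sdUSB bs=4M && sync
    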
  5. Install btrfs-tools in the live boot.  Once booted into the new environment, we'll need btrfs-tools, of course.  Run btrfs --version to make sure you've got something sane - if you're stuck on the ancient Wheezy 0.19 release, the rest of this may not work.  The correct version should sound like a kernel version number; mine is currently π.
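    If the live image doesn't already ship btrfs-tools, grabbing and checking it goes something like this (the exact version string is whatever you get - as long as it's 3.x-ish and not 0.19, you're good):
    root@serv$ apt-get update && apt-get install -y btrfs-tools
    root@serv$ btrfs --version # prints something like "Btrfs v3.14.x"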

  6. :)
  7. Convert the just-installed ext root to btrfs.  I got most of my instructions on this step from the occasionally wonderful btrfs wiki.  It doesn't matter if your root is ext3 or ext4 - in fact, these steps may even work with ext2, how should I know.  The steps go something like this:
    root@serv$ # run fdisk -l as root to make sure you're using the hard disk's root filesystem partition for these next steps
    root@serv$ fsck -f /dev/sdX1
    root@serv$ btrfs-convert /dev/sdX1
    
    Use the following optional but prudent steps to make sure your data survived:
    root@serv$ mkdir /btrfs && mount -t btrfs /dev/sdX1 /btrfs
    root@serv$ btrfs subvol list /btrfs
    root@serv$ # find the name of the saved subvolume, something like extX_saved
    root@serv$ mkdir /ext_saved && mount -t btrfs -o subvol=extX_saved /dev/sdX1 /ext_saved
    root@serv$ mkdir /orig && mount -o loop,ro /ext_saved/image /orig
    
    Yep, that's a triple mount.  The contents of the last mount should be the same as the contents of your root filesystem.  Check anything important or customized - a quick recursive diff works too (sketch below) - and, if you're satisfied and want to set everything in stone, run the cleanup in the next step.
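    Here's roughly what that spot check could look like - /etc is just an example target, and the paths come from the mounts above:
    root@serv$ diff -qr /orig/etc /btrfs/etc # no output means they match
    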
  8. root@serv$ umount /orig /ext_saved # unmount the loop image and the saved subvolume before deleting anything
    root@serv$ btrfs subvol delete /btrfs/extX_saved # this takes the saved image with it
    root@serv$ umount /btrfs
    root@serv$ rmdir /orig /ext_saved /btrfs
    
  9. Modify fstab. Make sure you change fstab or your system isn't going to boot, fool.  And it's the fstab on the converted root that matters here, not the live session's.  Use blkid to get the UUID of the root partition and make sure it matches the entry for / in that fstab (I don't think the UUID will change during the conversion, but I can't remember - check it anyway; there's a sketch at the end of this step).  Then make sure the line looks something like this:
  10. UUID=deadbeef-beef-dead-beef-deadbeefbeef    /    btrfs    noatime,ssd,discard,space_cache    0    0
    
    Yes, it is correct that btrfs roots get a 0 for passno, the last number - this means don't worry about running fsck, since fsck.btrfs is just a feel-good utility anyway.  They only released it to fit in; the whole story's in the manpage, which is a pretty good read, btw.
    root@serv$ man fsck.btrfs
    
    Anyway, back to stuff that matters - don't just blindly copy the mount options from my fstab line.  If you use ssd on a drive that isn't an SSD, you'll probably have a bad time.  I didn't turn on certain options like autodefrag and compress=lzo because this is intended to be a VM server, and also probably because I don't know what I'm doing.  Check this out, the corresponding page on the ever-helpful wiki.  The whole thing is worth a read - make some damn decisions of your own!
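    For the lazy version of that UUID check from earlier in this step - sdX1 being your root partition as before, and remember the fstab you're editing lives on the converted filesystem:
    root@serv$ blkid /dev/sdX1 # compare this UUID against the / line in the fstab below
    root@serv$ mount /dev/sdX1 /mnt # remount the converted root if you already unmounted it in step 8
    root@serv$ vi /mnt/etc/fstab
    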
  11. Pop out the alternate boot media and reboot.  Sometimes, when emerging from deeply nested sessions, chroots, or alternate boot environments, don't you feel like Cobb waking at the end of Inception?  Anyway, you should be booted into the newly buttery root of your recently installed system now.
  12. Verify integrity and clean up.  I know that shit's boring, yo, but we're gonna do it anyway.
  13. root@serv$ btrfs subvol delete /extX_saved # only needed if you skipped the cleanup in step 8 - once booted, the saved subvolume shows up at /extX_saved
    root@serv$ # allegedly you can verify the deletion with btrfs subvol list -d /, but the manpage for the btrfs-tools version π on Jessie didn't have this documented
    root@serv$ btrfs fi defrag -r /
    root@serv$ btrfs balance start /
    
  14. Add the secondary drive and partition.  To get the second drive partitioned properly, I simply popped in the second drive and dd'ed the existing disk to the second one.
  15. root@serv$ dd if=/dev/sdSETUPDRIVE of=/dev/sdNEWDRIVE bs=32M # don't fuck this up, mmk?
    
    This will clone our boot, system and swap partitions to the new drive.  For general applications, I recommend halving each swap.  Even though I never had you touch the swap part of fstab, your existing swap keeps working fine; the catch is that the dd gives the cloned swap partition the same UUID as the original, so out of the box only one of the two will actually be used.  If you want swap on both drives, give the new one its own UUID with mkswap and add a second fstab line (sketch below).
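    If you do want swap active on both drives, the sketch is something like this - the partition number here is hypothetical, check fdisk -l for yours:
    root@serv$ swapon -s # see what's actually in use right now
    root@serv$ mkswap /dev/sdNEWDRIVE5 # gives the cloned swap partition its own UUID
    root@serv$ blkid /dev/sdNEWDRIVE5 # grab that UUID for a second swap line in fstab
    root@serv$ swapon /dev/sdNEWDRIVE5
    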
  16. Convert to raid1 live!  "Fuck it, we're doing it live."  Yeah, computers are pretty cool I guess.  From here.
  17. root@serv$ btrfs fi show # to see which device is mounted as root
    root@serv$ fdisk -l # to see which device will be added to form our raid1 (aka, which one is NOT root)
    root@serv$ # if the next command complains about an existing filesystem left over from the dd, wipefs -a the partition first (newer btrfs-progs also take a -f on the add)
    root@serv$ btrfs device add /dev/sdNOTBOOT1 /
    root@serv$ btrfs balance start -dconvert=raid1 -mconvert=raid1 -sconvert=raid1 -f /
    The last command complains if you also try to convert the system blocks to raid1 (-sconvert=raid1), which is why the -f flag is there - it carries the ominous manpage description "force reducing of metadata integrity".  I couldn't find much information out there about this, and I want to support complete failover, so this is what I'm using and it's working ok for now.
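    To double-check that the conversion actually took, the block group profiles should all read RAID1 afterwards, and both devices should show up:
    root@serv$ btrfs fi df / # Data, Metadata and System lines should all say RAID1
    root@serv$ btrfs fi show # both drives listed, with roughly equal usage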
  18. And we're done!  Isn't it great?  Hypothetically, one of our drives can fail and we'll still be able to boot!  I think we might be screwed if the boot partition gives us trouble, but I'm not really sure yet.
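    A purely speculative footnote on that worry (untested): reinstalling grub's boot code onto the second drive's MBR should at least cover the bootloader side of a failover, though it won't do anything about the dd'ed copy of /boot going stale:
    root@serv$ grub-install /dev/sdNEWDRIVE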
As always, the Arch wiki docs are unparalleled - peruse related info here.  I hope it all worked; drop a line below if something didn't, or if something did!

Tuesday, September 9, 2014

on the mortality of SSDs

One of my servers, lil turbo, was booting from one of those bottom-of-the-barrel ADATA 32GB SSDs.  There are tons of reviews out there saying that these things are little turds, but I was feeling ballsy.  Then, one day, the server wasn't on the network any more.  I went into the closet, where lil turbo lives, to see what was the matter.

One of the non-boot drives was locked in a death grip on the sector it had been reading when it was interrupted, and fractured, seemingly non-Latin characters were bleeding all over the display.  Fuck.

Rebooted, and no dice.  Neither SSD was even seen in POST, not the boot drive and not the one I bought a year ago to mirror the boot drive with.

That was three months ago.

Last week, I decided to take a crack at reviving the comatose lil turbo.  Thinking either the SSD hot swap module or the SATA controller had died, I tried replacing both parts.  Still no dice.

I started working on something else and needed a spare 3.5" HDD to test a bus on a different server (vault 101), so I pulled one of the RAID drives from lil turbo to use.  Then, forgetting that lil turbo was missing a drive, I booted it again, and the SSDs showed up!  However, they didn't boot - the screen came up with "Missing boot drive" or some shit.

I was thinking that the hot swap enclosure must be loose, and the drive was making a connection and then losing it.  But several subsequent boots failed the same way.

Then it hit me.  I grabbed the RAID disk back from vault 101 and inserted it in lil turbo's yawning, empty bay, but not all the way.  Then I went down the front and opened all the hot swap bays for the RAID disks, nine in all, so none of them would be seen or spun up when I next booted lil turbo.

When lil turbo booted, both SSDs were seen, and once it got to grub, I slowly began closing all the RAID drive bays.  Once the system had booted, I issued an mdadm --assemble --verbose /dev/md0 /dev/sd[abcdehijk] and a mount /dev/md0 /mnt/store, and watched the drive lights flicker as my data, marooned for three months, finally came back to me.

* * *

Later I learned that the ADATA was a turd after all - the SMART log showed two critical-looking errors from around the time that the server would have crashed.

Next step: turn the root into a btrfs RAID1 and mirror it across both drives, finally!

(Edit: So I ended up trying various things live and borked the install.  Rather than fixing it or restoring from backup, I decided it was time for a fresh start.  Read about how I reinstalled lil turbo to boot from a raid1 btrfs root here.)