Setting up a software RAID array

There are various forms of RAID: hardware RAID via a dedicated controller, "fake RAID", and software RAID using mdadm, which is Linux-only. These instructions discuss only the last form, and only how to set up a RAID array for arbitrary storage. It is possible to have one's system root /, or /var, or swap, or even one's /boot, on a RAID array. See Setting up disks manually for more details about doing any of that.

RAID Levels

There are several "levels" of RAID to choose between:

RAID0 essentially just glues two devices together, making a larger virtual drive. Reads and writes are "striped" between the drives for speed improvements. (That is, your hardware may read from, or write different data to, multiple devices in parallel.) A "device" here is usually a partition of a hard drive.
RAID1 "mirrors" writes to two devices, for improved safety. Then if one of the devices fails, the data will still be available on the other.
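RAID5 also provides redundancy, but it does so by striping data together with parity information rather than by mirroring. It needs at least three devices; with three, it provides the space of two of them, and the data will be preserved as long as any two of the three devices continue to work.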

There are other RAID levels as well. Here is more explanation of their differences.

Advice

Your /boot partition should either not be on RAID, or else be on a RAID1 array, with no further layers of encryption or LVM. (Alpine's default bootloader extlinux can't handle either. Grub2 can handle /boot being on LVM.) The usual practice is to create a small (32 to 100 MB) partition for /boot. That can be a mirrored (RAID1) volume, but the mirroring is only in effect for post-init access. That way, when you write a new kernel or bootloader config file to /boot, it gets written to multiple physical partitions. During the pre-init bootloader phase, only one of those partitions will be used (and it will be mounted read-only).
You can put swap on a RAID0 volume, but there doesn't seem to be any good reason to do so. The Linux kernel already knows how to stripe several swap partitions, so you can just devote multiple ordinary (not-residing-on-RAID) partitions to swap and get the same effect. The downside of doing either of these things is that when one of your disks fails, the system will go down. For better reliability, you can create a mirrored (RAID1) volume and put swap there. This will let your system keep running even when one of the disks fails.
All partitions in a RAID array should be the same size.
Don't ever mount just one of the devices in a RAID1 array, even though it "has the same data" as the other. If you mount it r/w, then, even if you don't explicitly write anything to the device, it may get out of sync with the unmounted device, for example because the journal on its filesystem has been updated. If you ever subsequently mount the other device, or the two of them together, your data will likely become corrupted. If you have to do this, make sure you mount your device r/o. Better yet, abandon the device you didn't mount: zero out its RAID headers, and tell mdadm that that device has failed. Then, if you like, you can treat it as a new disk and add it as a replacement to your (now degraded) original RAID array.
A redundant RAID array (level 1 or 5) protects you against hardware failure. It doesn't protect against rm -rf /, software errors, exploits, earthquakes, or fire. Don't rely on RAID as a backup strategy.
Running a mirrored RAID only provides one line of defense against drive failures. It doesn't license you to stop thinking about them. If a device in a RAID 1 starts failing and you aren't aware of it, your data will end up just as silently corrupted as it would be if you were running one drive. You have to watch your logs.
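
Watching for a failing member can be partly automated. The following is only a minimal sketch, not a substitute for proper monitoring. It assumes the usual /proc/mdstat layout, in which a healthy two-device array reports [UU] and a missing member shows up as an underscore (for example [U_]), and it assumes arrays are named md0, md1, and so on. The function name degraded_arrays is made up for illustration.

```shell
# Print the name of every md array whose status field shows a missing member.
# $1 = status file to read (defaults to /proc/mdstat)
degraded_arrays() {
    awk '
        # Remember which array the following status lines belong to.
        /^md[0-9]+ :/ { name = $1 }
        # A field such as [U_] or [_U] means at least one member is missing.
        /\[[U_]*_[U_]*\]/ && name != "" { print name; name = "" }
    ' "${1:-/proc/mdstat}"
}
```

Called without arguments, it prints nothing while every array is healthy, so it could be run periodically and its output mailed or logged.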

This document was updated for Alpine 2.4.6.

Loading needed modules

Start by loading the raid1 kernel module:

modprobe raid1

Add it to /etc/modules so it gets loaded at the next boot:

echo raid1 >> /etc/modules
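
Running the echo command more than once leaves duplicate lines in /etc/modules. A small sketch of a variant that can be repeated safely follows. The helper name ensure_module is hypothetical, and the optional second argument exists only so the function can be tried on a scratch file first.

```shell
# Append a module name to a modules file unless that exact line is already there.
# $1 = module name, $2 = file to update (defaults to /etc/modules)
ensure_module() {
    file="${2:-/etc/modules}"
    # -x matches whole lines only, -F treats the name as a fixed string.
    grep -qxF "$1" "$file" 2>/dev/null || echo "$1" >> "$file"
}
```

For this document the call would be ensure_module raid1.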

Creating the partitions

I will use /dev/sda and /dev/sdb in this document but your devices may be different. To find what disks you have available, look in /proc/partitions.

Create the partitions using fdisk.

fdisk /dev/sda

I will create a single partition of type Linux raid autodetect. Use n in fdisk to create the partition and t to set its type. Logical volumes will be created later. My partition table looks like this (use p to print the partition table):

   Device Boot      Start         End      Blocks  Id System
/dev/sda1               1       17753     8388261  fd Linux raid autodetect

Use w to write and quit. Do the same with your second disk.

fdisk /dev/sdb

Mine looks like this:

   Device Boot      Start         End      Blocks  Id System
/dev/sdb1               1       17753     8388261  fd Linux raid autodetect

Alternatively, if your disks are the same size (as they should be, see above), you can copy the partition table from one to the other like this:

apk add sfdisk
sfdisk -d /dev/sda | sfdisk /dev/sdb
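
Before building the array it can be worth confirming that the two partitions really are the same size. The sketch below reads the block counts from /proc/partitions, assuming its usual four-column layout (major, minor, #blocks, name). The function names partition_blocks and same_size are made up for this example.

```shell
# Print the block count listed for one partition.
# $1 = partition name without /dev/ (e.g. sda1)
# $2 = table file to read (defaults to /proc/partitions)
partition_blocks() {
    awk -v name="$1" '$4 == name { print $3 }' "${2:-/proc/partitions}"
}

# Succeed only if both partitions are listed and have the same block count.
# $1, $2 = partition names, $3 = optional table file
same_size() {
    a=$(partition_blocks "$1" "$3")
    b=$(partition_blocks "$2" "$3")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

With the partitions used in this document, the check would be: same_size sda1 sdb1 && echo "sizes match"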

Setting up the RAID array

Install mdadm to set up the arrays.

apk add mdadm

Create the array.

mdadm --create --level=1 --raid-devices=2 /dev/md0 /dev/sda1 /dev/sdb1

Monitoring sync status

You should now be able to watch the array synchronize by looking at the contents of /proc/mdstat.

~ # cat /proc/mdstat 
Personalities : [raid1] 
md0 : active raid1 sdb1[1] sda1[0]
      8388160 blocks [2/2] [UU]
      [=========>...........]  resync = 45.3% (3800064/8388160) finish=0.3min  speed=200003K/sec

unused devices: <none>

You don't need to wait until it is fully synchronized to continue.
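
If you would rather have a script report the progress than read /proc/mdstat by eye, something like the following sketch could be used. It assumes the progress line has the form shown in the sample output (a resync or recovery keyword followed by a percentage) and that arrays are named md0, md1, and so on. The function name sync_progress is hypothetical.

```shell
# Print "<array> <percent>" for every array that is resyncing or recovering.
# $1 = status file to read (defaults to /proc/mdstat)
sync_progress() {
    awk '
        # Remember which array the following lines belong to.
        /^md[0-9]+ :/ { name = $1 }
        # On a progress line, print the field that ends in a percent sign.
        /(resync|recovery) *=/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) print name, $i
        }
    ' "${1:-/proc/mdstat}"
}
```

It prints nothing once no array is syncing any more. If you do want to block until the sync is complete, mdadm --wait /dev/md0 should serve that purpose.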

Saving config

Create the /etc/mdadm.conf file so that mdadm knows how your RAID array is set up:

mdadm --detail --scan > /etc/mdadm.conf

To make sure the RAID devices are started at the next boot, run:

rc-update add mdadm-raid

If you're not running Alpine from a hard disk install, use

lbu commit

as usual to save your configuration changes to your removable media.

The RAID device /dev/md0 is now ready to be used with LVM or mkfs.
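
As an illustration of the mkfs route, the sketch below puts an ext4 filesystem directly on the array and mounts it. Several things here are assumptions rather than part of the original instructions: the choice of ext4, the availability of mkfs.ext4 (on Alpine it is expected to come from the e2fsprogs package), the mount point /mnt/raid, and the helper names run and format_and_mount. Setting DRY_RUN=1 only prints the commands, which is a safe way to review them before anything is written to disk.

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create an ext4 filesystem on the array and mount it.
# $1 = array device (e.g. /dev/md0)
# $2 = mount point (hypothetical example: /mnt/raid)
format_and_mount() {
    run mkfs.ext4 "$1" &&
    run mkdir -p "$2" &&
    run mount "$1" "$2"
}
```

A cautious first call would be DRY_RUN=1 followed by format_and_mount /dev/md0 /mnt/raid. Note that mkfs destroys any data already on the device.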

More Info on RAID

These resources may be helpful: