Raid Administration

From Alpine Linux

Revision as of 08:51, 8 January 2014

Introduction

Whilst there are several articles on RAID installation, this article is designed to provide practical information on RAID administration, regardless of the RAID type used.

This article uses Linux software RAID, which is controlled by the mdadm command.

For the purposes of this example, we will create a RAID 1 array across /dev/sda and /dev/sdb using the setup-alpine script (more specifically the setup-disk script) and then add /dev/sdc to the array after installation. This will add it as a hot spare, which will be used if one of the other drives fails. Alternatively, the drive can immediately be added to the RAID array (as explained in the optional steps). The instructions in this article should work regardless of whether you are using RAID 1 or RAID 5 and whether you have set up your disks manually or with the setup script, unless stated otherwise.

In this example /dev/sda, /dev/sdb and /dev/sdc are all virtual 2GB disks on a VMware machine (it doesn't matter that it's a VM, the same process applies to a real machine with physical disks of larger sizes).

In our example, all disks are available (present) at the time of installation; however, /dev/sdc could be added at a later time. This has no impact on the procedure described other than having to physically add the disk.

Initial setup

Install with setup-alpine and pass the relevant disks to setup-disk (in our case sda sdb) and use installation method sys.

This should create the following disk setup (it will differ in your setup since values of course depend on drive size):

md0 composed of /dev/sda1 and /dev/sdb1 ~100MB mounted as /boot

md1 composed of /dev/sda2 and /dev/sdb2 ~512MB used as swap

md2 composed of /dev/sda3 and /dev/sdb3 ~1400MB mounted as /

As you can see, we have redundancy across the two drives /dev/sda and /dev/sdb.

Review

Run df -h and observe that the RAID arrays are mounted, not the disk partitions as usual.

To see information on the current RAID partitions use the query option:

 mdadm --query /dev/md0

or for more information use the detail option

 mdadm --detail /dev/md1
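For a quick overview of every array at once, the kernel also exposes the state of all md devices in /proc/mdstat. The following is a minimal sketch for listing the arrays found there; the function name list_arrays is made up for illustration, and it assumes arrays are named mdN as in this example.

```shell
# list_arrays: print the device path of every md array found in
# mdstat-format text read from standard input.
# (Hypothetical helper for illustration, not part of mdadm.)
list_arrays() {
    # Array lines start with the array name, e.g. "md0 : active raid1 ..."
    awk '/^md[0-9]+ :/ { print "/dev/" $1 }'
}

# Example (needs a system with md arrays):
#   list_arrays < /proc/mdstat
```

Each path printed can then be passed to mdadm --query or mdadm --detail as shown above.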

After the initial setup, if you haven't added the third drive (/dev/sdc), now is the time to power off and physically add it to the machine.

Add devices to the array

Now, let's add /dev/sdc to the RAID array.

Copy partition table

First, copy the partition table from an existing drive to the new drive. Be very careful with the dd command and ensure you are copying from/to the correct place!

Note: this dd command copies only the first 512-byte sector of the disk, which holds the MBR partition table. For GPT partitioning, which you might have used if you've set up your disks manually, it is unlikely to work, since GPT stores its information differently.
 dd if=/dev/sda of=/dev/sdc bs=512 count=1  

Ensure this worked correctly by comparing the output of sfdisk for both disks. Apart from the device names, the two dumps should be identical:

 sfdisk --dump /dev/sda
 sfdisk --dump /dev/sdc
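Each dump embeds the name of its own disk, so comparing the two by eye (or with diff) is easier once the names are masked out. The following is a rough sketch of such a comparison; same_layout is a hypothetical helper, and the /tmp file names in the example are only placeholders.

```shell
# same_layout: succeed if two sfdisk dumps describe the same partition
# layout once the device names are masked out.
# (Hypothetical helper for illustration.)
#   $1, $2: files holding the output of "sfdisk --dump" for each disk
#   $3, $4: the corresponding disk names, e.g. /dev/sda and /dev/sdc
same_layout() {
    a=$(sed "s|$3|DISK|g" "$1")
    b=$(sed "s|$4|DISK|g" "$2")
    [ "$a" = "$b" ]
}

# Example (run as root on the real system):
#   sfdisk --dump /dev/sda > /tmp/sda.dump
#   sfdisk --dump /dev/sdc > /tmp/sdc.dump
#   same_layout /tmp/sda.dump /tmp/sdc.dump /dev/sda /dev/sdc && echo "layouts match"
```

If the function fails, do not add the new partitions to the arrays until you have found out why the layouts differ.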

Add devices

Now add the partitions of the new disk to the relevant RAID arrays. Be sure to add the correct partitions to the correct arrays!

 mdadm /dev/md0 -a /dev/sdc1
 mdadm /dev/md1 -a /dev/sdc2
 mdadm /dev/md2 -a /dev/sdc3
 

You should see something like "mdadm: added /dev/sdc1" if the command is successful.
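Since it is easy to pair the wrong partition with the wrong array, it can help to print the commands first and review them before running anything. This is a small sketch that only prints; plan_add is a hypothetical name, and it assumes the layout of this example, where partition N of each disk belongs to array md(N-1).

```shell
# plan_add: print (but do not run) the mdadm commands that would add the
# three partitions of a new disk to the arrays of this example.
# Assumes partition N belongs to /dev/md(N-1), as created by setup-disk here.
plan_add() {
    disk=$1
    for n in 1 2 3; do
        echo "mdadm /dev/md$((n - 1)) -a ${disk}${n}"
    done
}

plan_add /dev/sdc
```

For /dev/sdc this prints the same three commands as listed above. If your layout differs, adjust the mapping rather than relying on this sketch.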

Now see how the output of the query command has changed from earlier:

 mdadm --query /dev/md0
 mdadm --query /dev/md1
 mdadm --query /dev/md2

You should see we still have two devices in each array, plus now we have a spare. A spare is an inactive device that is a member of the array; it will only be used if one of the other devices fails. If this is good enough for you, you're done!
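To confirm the spare from a script rather than by reading the output, the "Spare Devices" line of mdadm --detail can be picked out. A minimal sketch, assuming the usual "Spare Devices : N" line in the detail output; spare_count is a made-up helper name.

```shell
# spare_count: read "mdadm --detail" output on standard input and print
# the value of its "Spare Devices" line.
# (Hypothetical helper for illustration.)
spare_count() {
    # Split on ":" and strip the padding around the number
    awk -F: '/Spare Devices/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Example (run as root):
#   mdadm --detail /dev/md0 | spare_count
```

After adding /dev/sdc1 as described, the example command should report one spare for md0.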

Grow the array (optional)

Otherwise you can take the optional step of making the 'spare' device an active part of the array immediately. Since we're using RAID 1 in our example, this effectively gives us another mirrored copy of all data:

 mdadm --grow /dev/md0 -n 2

This should give you something like mdadm: /dev/md0: no change requested. This is because we already have -n 2 set (so we use 2 devices in the array). Let's increase the value and bring the additional device into the array:

 mdadm --grow /dev/md0 -n 3 

You should see something like raid_disks for /dev/md0 set to 3 if successful.

Review the output of mdadm --query /dev/md0 or mdadm --detail /dev/md0 again to confirm it worked. Don't worry if you see something about 'spare rebuilding' - this is normal and will be replaced with a state of 'active sync' once data copying is complete.
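While the new device is being synchronised, /proc/mdstat shows a progress line for the array. The sketch below polls for that line so that a script can wait until the rebuild has finished; the function names are hypothetical, and it assumes the progress line contains the word recovery or resync.

```shell
# rebuild_active: succeed while the given mdstat-format file reports a
# recovery or resync. Defaults to /proc/mdstat when no file is given.
# (Hypothetical helper for illustration.)
rebuild_active() {
    grep -Eq 'recovery|resync' "${1:-/proc/mdstat}"
}

# wait_for_rebuild: poll every 5 seconds until no rebuild is reported.
wait_for_rebuild() {
    while rebuild_active "$@"; do
        sleep 5
    done
}

# Example (on the real system):
#   wait_for_rebuild && echo "all arrays in sync"
```

The 5 second interval is arbitrary; on large physical disks a rebuild can take hours, so a longer interval is just as reasonable.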

Be sure to bring the other devices (partitions) into their arrays as well by increasing the device count for the other arrays (otherwise they will remain as spares and not be immediately utilised):

 mdadm --grow /dev/md1 -n 3 
 mdadm --grow /dev/md2 -n 3 

Remove devices

To remove a failed device use the following; remember you will need to remove all the partitions of the failing drive (devices) from the relevant RAID arrays. In our example, we will mark the partitions of /dev/sdb as failed and remove them from the array:

 mdadm /dev/md0 -f /dev/sdb1 -r /dev/sdb1
 mdadm /dev/md1 -f /dev/sdb2 -r /dev/sdb2
 mdadm /dev/md2 -f /dev/sdb3 -r /dev/sdb3

Check the output of mdadm --detail /dev/md2 and see how the device is marked as 'removed'.
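An array that has lost a member also shows up in /proc/mdstat, where the member map contains an underscore in place of the missing device (for example [U_] instead of [UU]). The following sketch lists such arrays; degraded_arrays is a hypothetical helper that reads mdstat-format text on standard input.

```shell
# degraded_arrays: read mdstat-format text on standard input and print
# the name of each array whose member map shows a missing device.
# (Hypothetical helper for illustration.)
degraded_arrays() {
    awk '
        /^md[0-9]+ :/      { name = $1 }   # remember the current array
        /\[[U_]*_[U_]*\]/  { print name }  # "_" marks a missing member
    '
}

# Example (on the real system):
#   degraded_arrays < /proc/mdstat
```

After the three remove commands above, all of md0, md1 and md2 should be listed until a replacement device has been added and synchronised.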

To add a removed device back in, ensure it's partitioned correctly (replace the drive if necessary and copy over the partition table from a known good drive) and then simply add it back in again:

  mdadm /dev/md2 -a /dev/sdb3

(repeat for other devices as appropriate).

Change device count (optional)

To remove the device from the array entirely (assuming you are not going to add it back later, for instance), amend the device count again. This removes it from the list so that it no longer shows as 'removed', and we are back to two devices in the array:

 mdadm --grow /dev/md0 -n 2 
 mdadm --grow /dev/md1 -n 2 
 mdadm --grow /dev/md2 -n 2 

If you do need to add disks back in again, you need to add them as spares (mdadm /dev/md0 -a /dev/sdb1 etc) and then change the device count if you wish to make the device active, as per the section on adding devices.

General recommendations

When using RAID arrays, best practice is to have one more disk than is required (whether it is added as a spare or as part of the array). This immediately provides some form of redundancy. Remember that a RAID 1 array needs at least 2 disks (it can run on a single disk, known as degraded mode, but this is best avoided) and a RAID 5 array needs at least 3 disks. In short, if you are using RAID, have a spare device configured. Disks cost money, but the data on those disks is often priceless!

It's a good idea to have a test environment to play around with RAID before implementing it in a production environment. Worst case, set up a VirtualBox host, run an Alpine VM and experiment with that before touching a production system.

Further information

man mdadm

RAID on Wikipedia

Linux RAID wiki at kernel.org