How I moved from CentOS with RAID6 to FreeBSD with ZFS: Part 1

I have used my file server for quite a while now. It always bothered me that I didn’t use ZFS for it. How can I migrate from an ext4 RAID to ZFS without having to buy a second file server with the same storage capacity?

Well, I managed to do so with the following approach. I had to buy new hard disks, but I wanted to get rid of the WD Green disks anyway.

Current Situation

  • CentOS 6
  • 6×3 TB hard disks
    • 4×WD Red
    • 2×WD Green
  • RAID 6, 12 TB, ext4
  • 2 available SATA ports
+-md0-(RAID6, 12TB)------------------------------------+
|                                                      |
| +-sda-+  +-sdc-+  +-sdd-+  +-sde-+  +-sdg-+  +-sdh-+ |
| |sda1 |  |sdc1 |  |sdd1 |  |sde1 |  |sdg1 |  |sdh1 | |
| +-----+  +-----+  +-----+  +-----+  +-----+  +-----+ |
|   3TB      3TB      3TB*     3TB*     3TB      3TB   |
+------------------------------------------------------+
                                              * WD Green
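
For reference, this layout can be inspected on the CentOS side with the standard md tools; a minimal sketch, assuming the device names from the diagram above:

# Overview of all md arrays and their member partitions
cat /proc/mdstat

# Detailed view of the RAID6 array (level, size, state of each member)
mdadm --detail /dev/md0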

Goals

  • Use ZFS instead of ext4
  • Use mirrored vdevs instead of RAID6
  • Migrate from CentOS 6 to FreeBSD
  • Replace the 2 existing WD Green disks with WD Red disks
  • Make it possible to extend the storage pool in the future without the need to buy 6 hard disks of the same size

Approach

I will describe the approach at a high level here; the individual commands and steps to follow are described in part 2.

New hard disks

First of all, I bought 2 new hard disks. It’s important that each of these has at least double the capacity of the largest existing hard disk in the RAID6. In my case, these are two 6 TB WD Red hard disks.

Partitioning

One disk gets one big partition spanning the full disk (minus a few sectors for optimal alignment). The other one is split into two partitions, each at least the size of the largest existing hard disk in the RAID6. Keep in mind to use optimal alignment (-a opt) for them.
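
A sketch of this with parted could look as follows. The device names (sdf for the disk with one partition, sdi for the split one) match the diagrams below; the partition names and the 50 % split point are my own assumptions, so double-check that each half ends up at least as large as a 3 TB member partition of the RAID6.

# Disk that gets one big partition (will become vdev3)
parted -s -a opt /dev/sdf mklabel gpt mkpart zfs 0% 100%

# Disk that gets split into two halves of roughly 3 TB each (vdev1 and vdev2)
parted -s -a opt /dev/sdi mklabel gpt mkpart zfs1 0% 50% mkpart zfs2 50% 100%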

Create ZFS pool

The three partitions on the two new disks are striped into a new ZFS pool called zstorage. Each of these single-partition vdevs will later get a mirror attached, resulting in a stripe of 3 mirrors of 2 devices each. The ZFS pool already has the target size (12 TB) but lacks redundancy.

+-md0-(RAID6, 12TB)------------------------------------+
|                                                      |
| +-sda-+  +-sdc-+  +-sdd-+  +-sde-+  +-sdg-+  +-sdh-+ |
| |sda1 |  |sdc1 |  |sdd1 |  |sde1 |  |sdg1 |  |sdh1 | |
| +-----+  +-----+  +-----+  +-----+  +-----+  +-----+ |
|   3TB      3TB      3TB*     3TB*     3TB      3TB   |
+------------------------------------------------------+
                                              * WD Green
+-zstorage-(zpool, 12 TB)-+
|                         |
| +-sdi-+                 |
| |sdi1 |---vdev1         |
| |sdi2 |---vdev2         |
| +-----+                 |
|   6TB                   |
|                         |
| +-sdf-+                 |
| |sdf1 |---vdev3         |
| |     |                 |
| +-----+                 |
|   6TB                   |
+-------------------------+
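
Creating the striped pool from the three partitions is a single command; the zpool syntax is the same under ZFS on Linux and on FreeBSD. A minimal sketch, assuming the partitions from above (on Linux, stable /dev/disk/by-id/... names are usually a better choice than sdX):

# Stripe the three partitions into one pool: 12 TB, but no redundancy yet
zpool create zstorage /dev/sdi1 /dev/sdi2 /dev/sdf1

# Verify the layout: three single-device vdevs, no mirrors so far
zpool status zstorage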

All data is copied from the RAID6 to this ZFS pool. There is no way to convert an existing ext4 file system into ZFS without copying the files. This copy provides some sort of redundancy during the migration: all files now exist both on the RAID6 and on the ZFS pool.
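
The copy itself can be done with any tool that preserves the attributes you care about. A minimal rsync sketch, assuming the RAID6 is mounted at /mnt/raid and the pool at /zstorage (both paths are assumptions):

# Copy everything, preserving permissions, owners, timestamps, hard links,
# ACLs and extended attributes
rsync -aHAX --progress /mnt/raid/ /zstorage/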

Move redundant disks from RAID6 to ZFS pool

RAID6 allows 2 disks to fail without losing data. I manually fail 2 disks to remove them from the RAID6 and add them as mirrors to the ZFS pool.

+-md0-(RAID6, 12TB)------------------------------------+
|                                                      |
| +-sda-+  +-sdc-+  +-sdd-+  +-sde-+  +-sdg-+  +-sdh-+ |
| |sda1 |  |sdc1 |  |sdd1 |  |sde1 |  |sdg1 |  |sdh1 | |
| +-----+  +-----+  +-----+  +-----+  +-----+  +-----+ |
|   3TB      3TB      3TB*     3TB*      |        |    |
+----------------------------------------|--------|----+
                                         |        |
+-zstorage-(zpool, 12 TB)-+              |        |
|                         |              |        |
| +-sdi-+                 |              |        |
| |sdi1 |---vdev1 <----------------------+        |
| |sdi2 |---vdev2 <-------------------------------+
| +-----+                 |
|   6TB                   |
|                         |
| +-sdf-+                 |
| |sdf1 |---vdev3         |
| |     |                 |
| +-----+                 |
|   6TB                   |
+-------------------------+                   * WD Green

The ZFS pool then looks like this.

+-zstorage-(zpool, 12 TB)---+
|                           |
| +-sdi-+           +-sdg-+ |
| |sdi1 |---vdev1---|sdg1 | |
| |sdi2 |---vdev2-+ +-----+ |
| +-----+         |         |
|   6TB           | +-sdh-+ |
|                 +-|sdh1 | |
| +-sdf-+           +-----+ |
| |sdf1 |---vdev3           |
| |     |                   |
| +-----+                   |
|   6TB                     |
+---------------------------+
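
A sketch of this step, using the device names from the diagrams; the RAID6 keeps running in degraded mode with 4 of its 6 disks:

# Mark two disks as failed and remove them from the RAID6
mdadm /dev/md0 --fail /dev/sdg1 --remove /dev/sdg1
mdadm /dev/md0 --fail /dev/sdh1 --remove /dev/sdh1

# Clear the old md metadata so ZFS does not refuse the partitions
mdadm --zero-superblock /dev/sdg1
mdadm --zero-superblock /dev/sdh1

# Attach them as mirrors to the two halves of the split disk
zpool attach zstorage /dev/sdi1 /dev/sdg1
zpool attach zstorage /dev/sdi2 /dev/sdh1

# Each attach triggers a resilver; watch the progress with
zpool status zstorage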

Replace the partitions of the split disk with hard disks

Now that the two partitions on the previously split hard disk each have a mirror, they can be replaced by the smaller disks from the existing RAID6. This frees the large new disk, so it can be added as a mirror for the other large disk.

This takes the RAID6 below the minimum number of disks it needs, so the old array no longer holds a usable copy of the data, and vdev3 still has no mirror. If something goes wrong here, data might get lost.

+-sda-+  +-sdc-+  +-sdd-+  +-sde-+
|sda1 |  |sdc1 |  |sdd1 |  |sde1 |
+-----+  +-----+  +-----+  +-----+
   |        |       3TB*     3TB*
   |        +-------replace----------+
   +---------replace---------------+ |
                                   | |
+-zstorage-(zpool, 12 TB)--------+ | |
|                                | | |
| +--------------------------------+ |
| | +--------------------------------+
| | |                            |
| | |                            |
| | |  +-sdi-+           +-sdg-+ |
| | +->|sdi1 |---vdev1---|sdg1 | |
| +--->|sdi2 |---vdev2-+ +-----+ |
|      +-----+         |         |
|        6TB           | +-sdh-+ |
|                      +-|sdh1 | |
| +-sdf-+                +-----+ |
| |sdf1 |---vdev3                |
| |     |                        |
| +-----+                        |
|   6TB                          |
+--------------------------------+

The ZFS pool then looks like this.

+-zstorage-(zpool, 12 TB)---+
|                           |
| +-sda-+           +-sdg-+ |
| |sda1 |---vdev1---|sdg1 | |
| +-----+           +-----+ |
|                           |
| +-sdc-+           +-sdh-+ |
| |sdc1 |---vdev2---|sdh1 | |
| +-----+           +-----+ |
|                           |
| +-sdf-+                   |
| |sdf1 |---vdev3           |
| |     |                   |
| +-----+                   |
|   6TB                     |
+---------------------------+
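
One way to carry this out (part 2 has the exact steps I used): stop the old array entirely so its members are free, then let zpool replace resilver the data onto the 3 TB partitions. The mount point /mnt/raid is again an assumption.

# Unmount the ext4 file system and stop the old array; its disks are now free
umount /mnt/raid
mdadm --stop /dev/md0

# Replace the two halves of the split 6 TB disk with two freed 3 TB partitions
zpool replace zstorage /dev/sdi1 /dev/sda1
zpool replace zstorage /dev/sdi2 /dev/sdc1

# Each replace triggers a resilver; wait until both have finished
zpool status zstorage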

Re-add the large disk to the ZFS pool

The two remaining hard disks are the WD Green disks, which I planned to replace anyway. They are now free and not used by any pool.

Now, the last step is simply to repartition the large hard disk that was freed in the previous step and attach it as a mirror for the other large disk.

After that, the ZFS pool looks like this.

+-zstorage-(zpool, 12 TB)---+
|                           |
| +-sda-+           +-sdg-+ |
| |sda1 |---vdev1---|sdg1 | |
| +-----+           +-----+ |
|                           |
| +-sdc-+           +-sdh-+ |
| |sdc1 |---vdev2---|sdh1 | |
| +-----+           +-----+ |
|                           |
| +-sdf-+           +-sdi-+ |
| |sdf1 |---vdev3---|sdi1 | |
| |     |           |     | |
| +-----+           +-----+ |
|   6TB                     |
+---------------------------+
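
A sketch of this final step, assuming the freed 6 TB disk still shows up as sdi:

# Replace the two old halves with one partition spanning the whole disk
parted -s -a opt /dev/sdi mklabel gpt mkpart zfs 0% 100%

# Attach it as the mirror for the other 6 TB disk (vdev3)
zpool attach zstorage /dev/sdf1 /dev/sdi1

# One last resilver; afterwards all three vdevs are two-way mirrors
zpool status zstorage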

By replacing the two 3 TB disks of one of the mirrors, I can now grow the ZFS pool as needed. The replacement disks don’t necessarily have to be 6 TB; they can be whatever size is available for a good price in the future.
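
Growing works per mirror vdev: replace both of its disks with larger ones, one after the other, and once the second resilver is done the vdev (and with it the pool) can use the new size. A hypothetical sketch for vdev1; sdj and sdk stand for two future, larger disks and are made-up names:

# Let vdevs grow automatically once all of their disks are bigger
zpool set autoexpand=on zstorage

# Swap one mirror member at a time, letting each resilver finish in between
zpool replace zstorage /dev/sda1 /dev/sdj1
zpool replace zstorage /dev/sdg1 /dev/sdk1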

Conclusion

Copying all files and resilvering several times puts some stress on the hard disks. As I wanted to replace the two WD Green disks anyway, I tried to spare them as much of that stress as possible.

So that is what happened:

  1. copy from RAID6 to ZFS pool
    • read from: sdg, sdc, sda, sde, sdh, sdd
    • write to: sdi, sdf
  2. resilver sdg and sdh with sdi
    • read from: sdi
    • write to: sdg, sdh
  3. resilver sdc with sdi and sdh
    • read from: sdi + sdh (50 % each)
    • write to: sdc
  4. resilver sda with sdi and sdg
    • read from: sdi + sdg (50 % each)
    • write to: sda
  5. resilver sdi with sdf
    • read from: sdf
    • write to: sdi

This adds up to the following sequence of read (r) and write (w) passes per disk:

  • sdd (WD Green, 3 TB): r
  • sde (WD Green, 3 TB): r
  • sda (WD Red, 3 TB): rw
  • sdc (WD Red, 3 TB): rw
  • sdg (WD Red, 3 TB): rwr
  • sdh (WD Red, 3 TB): rwr
  • sdf (WD Red, 6 TB): wr
  • sdi (WD Red, 6 TB): wrrrw

sdd and sde are the WD Green disks. Both have been read just once before being retired. I didn’t want to risk a failure by putting them through more than that, even though S.M.A.R.T. didn’t indicate any problems.

sdg and sdh have been read more often than sda and sdc: one pair had to be read twice, and I chose the two youngest hard disks for that.

sdi was put through the most stress during this operation. Since this hard disk was new, that was acceptable; I could just as well have chosen sdf.

If you are curious how this looks in practice, the detailed description of each step can be found in the second part.