Category Archives: raid

How to set up large partitions (>2TB RAID arrays) in CentOS 6.2 with a Supermicro Blade SBI-7125W-S6

We’re in the process of retiring our non-blade servers to free up space and reduce power usage. This move affects our 1U backup servers, so we have to migrate them to blades as well.

I was setting up a blade server as a replacement for one of our backup servers when I ran into a problem…

But before I get into that, here are the specs of the blade:

  • Supermicro Blade SBI-7125W-S6 (circa 2008)
  • Intel Xeon E5405
  • 8 GB DDR2
  • LSI RAID 1078
  • 6 x 750 GB Seagate Momentus XT (ST750LX003)

The original plan was to set up these drives as a single RAID 5 array, about 3.5+ TB in size. The RAID controller can handle that size, so Rich, my colleague who did the initial setup of the blade and the hard drives, did not encounter a problem.

I was cruising through the remote installation process until I hit a snag at the disk partitioning stage. The installer wouldn’t use the entire RAID array: it would only create partitions up to a total size of 2TB.

I found it unusual because I’ve created bigger arrays before using software RAID and never hit this problem. After a little googling, I found out that it comes from a limitation of the Master Boot Record (MBR): its partition table uses 32-bit sector addresses, which with 512-byte sectors tops out at 2 TiB. The solution is to use the GUID Partition Table (GPT), as advised by this discussion.
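The 2TB ceiling falls out of simple arithmetic. Here is a minimal sketch, assuming 512-byte logical sectors and the 32-bit sector counts that MBR partition entries use. The function name fits_in_mbr and the two sample sizes (nominal decimal gigabytes) are only illustrative.

```shell
#!/bin/sh
# MBR keeps partition start and length as 32-bit sector counts.
SECTOR_BYTES=512            # assumed logical sector size
MAX_SECTORS=4294967296      # 2^32
MBR_LIMIT_BYTES=$((MAX_SECTORS * SECTOR_BYTES))

# fits_in_mbr BYTES: succeeds if a disk of that size is fully addressable.
fits_in_mbr() {
    [ "$1" -le "$MBR_LIMIT_BYTES" ]
}

echo "MBR limit: $MBR_LIMIT_BYTES bytes ($((MBR_LIMIT_BYTES / 1073741824)) GiB)"

# 750 GB and 3750 GB are nominal (decimal) sizes used for illustration.
for size_gb in 750 3750; do
    if fits_in_mbr $((size_gb * 1000000000)); then
        echo "$size_gb GB: fits in MBR"
    else
        echo "$size_gb GB: needs GPT"
    fi
done
```

A single 750GB volume is well under the limit, while anything in the 3.5TB range is not, which matches what the installer did.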

I had two options at this point:

  1. go as originally planned, use GPT, and hope that the SBI-7125W-S6 can boot with it, or…
  2. create 2 arrays, one small (that will use MBR so the server can boot) and one large (that will use GPT  so that the disk space can be used in its entirety)

I tried option #1 and it failed. The blade wouldn’t boot at all, primarily because the server has a BIOS rather than EFI firmware.

And so I’m left with option #2…

The server has six drives. To implement option #2, my plan was to create this setup:

  • 2 drives at RAID 1 – will become /dev/sda, MBR, 750GB, main system drive (/)
  • 4 drives at RAID 5 – will become /dev/sdb, GPT, about 2.25TB, will be mounted later

The LSI RAID 1078 can support this kind of setup, so I was in luck. I decided to use RAID 1 and RAID 5 because redundancy is the primary concern; size is secondary.
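For a rough idea of what each layout yields, here is a small sketch of the usual capacity rules: a mirror gives one drive’s worth of space, and RAID 5 gives up one drive’s worth to parity. The helper names are made up for this example, and sizes are nominal decimal gigabytes, so the operating system will report somewhat less.

```shell
#!/bin/sh
# Usable capacity in GB for N identical drives of SIZE_GB each.
# Usage: raidX_usable_gb N SIZE_GB
raid1_usable_gb() { echo "$2"; }                   # mirror: one drive's worth
raid5_usable_gb() { echo $(( ($1 - 1) * $2 )); }   # one drive's worth goes to parity

echo "RAID 1, 2 x 750 GB: $(raid1_usable_gb 2 750) GB"
echo "RAID 5, 4 x 750 GB: $(raid5_usable_gb 4 750) GB"
echo "RAID 5, 6 x 750 GB: $(raid5_usable_gb 6 750) GB"
```

The last line is the original six-drive plan, shown for comparison with the split setup.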

This is where IPMI shines: I can reconfigure the RAID array remotely using the KVM console of IPMIView as if I were physically at the data center 🙂 With the KVM access, I created 2 disk groups using the Web BIOS of the RAID controller.

With the arrays up, I went through the CentOS 6 installation process again. The installer detected both arrays, so no problem there. I configured /dev/sda with 3 partitions and left /dev/sdb unconfigured (it can easily be configured later, once CentOS is up).

In case you’re wondering, I added a 3.8GB LVM PV to store VM snapshots, since this server will become a node of our ganeti cluster.

The CentOS installation booted successfully this time. With the system working, it was time to configure /dev/sdb.

I installed the EPEL repo first, then parted:

$ wget -c http://download.fedoraproject.org/pub/epel/6/i386/epel-release-6-6.noarch.rpm
$ wget -c https://fedoraproject.org/static/0608B895.txt
$ rpm --import 0608B895.txt
$ rpm -Uvh epel-release-6-6.noarch.rpm
$ yum install parted

Next, I configured /dev/sdb to use GPT and formatted it as ext4:

$ parted /dev/sdb mklabel gpt 
$ parted /dev/sdb 
(parted) mkpart primary ext4 1 -1 
(parted) quit 
$ mkfs.ext4 -L data /dev/sdb
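Note that the mkfs.ext4 command above is pointed at the whole device, /dev/sdb, rather than at the partition parted created. Here is a minimal sketch of a variant that formats the partition instead, assuming it shows up as /dev/sdb1 (check this on your own system). The run and prepare_data_disk helpers are made up for this example. The script only prints the commands unless DRY_RUN=0 is set, because running them for real wipes the disk.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 (as root) to really run them. This destroys data on $DISK.
DRY_RUN="${DRY_RUN:-1}"
DISK="${DISK:-/dev/sdb}"    # the RAID 5 array from the post
PART="${DISK}1"             # assumed name of the first partition

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prepare_data_disk() {
    run parted -s "$DISK" mklabel gpt                       # new GPT label
    run parted -s "$DISK" mkpart primary ext4 1MiB 100%     # one partition, whole disk
    run mkfs.ext4 -L data "$PART"                           # format the partition only
}

prepare_data_disk
```

Starting the partition at 1MiB and ending at 100% is just one common choice; the 1 and -1 values used above do much the same job.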

To mount /dev/sdb, I first needed to find its UUID:

$ ls -lh /dev/disk/by-uuid/ | grep sdb 
lrwxrwxrwx. 1 root root 9 May 12 15:07 858844c3-6fd8-47e9-90a4-0d10c0914eb5 -> ../../sdb

With the right UUID in hand, I added this line to /etc/fstab so that /dev/sdb is mounted at /home/backup/log-dump/:

UUID=858844c3-6fd8-47e9-90a4-0d10c0914eb5 /home/backup/log-dump ext4 noatime,defaults 1 1

The partition is now ready to be mounted and used:

$ useradd backup
$ mkdir -p /home/backup/log-dump
$ mount /home/backup/log-dump
$ chown -R backup:backup /home/backup/log-dump

There, another problem solved. Thanks to the internet and the Linux community 🙂

After a few days of copying files to this new array, this is what it looks like now:

/dev/sdb is almost used up already 🙂

a poke on SSD write endurance, Intel SSD 320 and iostat

The decision to move to KVM-based virtualization as our standard way of deploying servers has been a real success, given the cost savings over the past 2 years. The only downside is the performance hit on workloads with intensive disk IO.

Some disk IO issues were already addressed on the application side (e.g. caching, tmpfs, smaller logs), but it’s apparent that if we want our deployment to be denser, we have to find alternatives to our current storage back-end. Probably not a total replacement but more of a hybrid approach.

Solid-state drives are probably the best option. They are cheaper than a Storage Area Network. I like the idea even more because they are a simple drop-in replacement for our current SAS/SATA drives, with no additional hardware to maintain. Besides, my team does not have the luxury of an “unlimited” budget.

After a lengthy discussion with my MD, he approved some tests to see whether the SSD route is feasible for us. I chose four 120GB Intel SSD 320s. The plan was to set up these 4 drives in a RAID 10 array and see how many virtual machines it could handle.

I chose Intel because its SSDs are the most reliable among the brands on the market today. If performance were the primary requirement, I’d choose an SSD with a SandForce controller (maybe OCZ), but it’s not; reliability is.

The plan was to set up a RAID 10 array of four 320s. But since our supplier could only provide 3 drives at the time we ordered, I decided to go with a RAID 0 array of 2 drives instead. I couldn’t wait for the 4th drive. (It turned out to be a good decision because the 4th drive arrived 2 months later!)

The write endurance of the Intel 320 (160GB version) is rated at 15TB. My premise was that if we write 10GB of data per day, it would take a little over 4 years to reach that limit. And in theory, in a striped RAID array where the writes are spread across the drives, it would take a lot longer than that.
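That premise is easy to check with a few lines of shell. This is only a back-of-the-envelope sketch: it assumes decimal units (1TB = 1000GB), assumes a stripe spreads writes evenly across its drives, and ignores write amplification. The days_to_limit helper is made up for this example.

```shell
#!/bin/sh
# days_to_limit RATED_TB WRITES_GB_PER_DAY [DRIVES_IN_STRIPE]
# Days until the rated write endurance is used up.
days_to_limit() {
    awk -v tb="$1" -v gb="$2" -v n="${3:-1}" \
        'BEGIN { printf "%.0f\n", (tb * 1000 * n) / gb }'
}

days=$(days_to_limit 15 10)
echo "single drive: $days days"
awk -v d="$days" 'BEGIN { printf "about %.1f years\n", d / 365 }'
echo "2-drive stripe: $(days_to_limit 15 10 2) days"
```

Swapping in a measured daily write volume for the 10GB guess shows how sensitive the estimate is to the workload.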

It’s been over a month since I set up the ganeti node with the SSD storage, so I decided to check its total writes.

The ganeti node has been running for 45 days. /dev/sda3 is the LVM volume configured for ganeti to use. The total blocks written is 5,811,473,792, at a rate of 1,468.85 blocks per second. Since 1 block = 512 bytes, this translates to 2,975,474,581,504 bytes (about 2.98TB) at a rate of 752,051.2 bytes per second (752kB/s). The write rate translates to 64,977,223,680 bytes (about 65GB) of total writes per day! Uh oh…
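The conversion from iostat blocks to bytes can be reproduced with a short script. This sketch uses the figures quoted above and the 512-byte block size. The variable and function names are made up for this example.

```shell
#!/bin/sh
BLOCK_BYTES=512              # iostat block size, as stated in the post
BLOCKS_WRITTEN=5811473792    # total blocks written on /dev/sda3
BLOCKS_PER_SEC=1468.85       # average write rate

# Integer arithmetic is enough for the total.
total_bytes=$((BLOCKS_WRITTEN * BLOCK_BYTES))
echo "total written: $total_bytes bytes"

# The rate is fractional, so it goes through awk (sh arithmetic is integer-only).
# bytes_per_day BLOCKS_PER_SEC
bytes_per_day() {
    awk -v r="$1" -v b="$BLOCK_BYTES" 'BEGIN { printf "%.0f\n", r * b * 86400 }'
}
echo "per day: $(bytes_per_day "$BLOCKS_PER_SEC") bytes"
```

The same bytes_per_day function works for any other blocks-per-second figure that iostat reports.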

That is nowhere near my premise of 10GB/day. At this rate, my RAID array will die in less than 2 years!

Uh oh indeed…

It turned out that 2 of the KVM instances that I assigned to this ganeti node are DB servers. We migrated them here a few weeks back to fix a high IO problem, a move that cost the Intel 320s a big percentage of their lifespan.

Roughly 65GB per day seems huge, but apparently it’s typical on our production servers. Here’s the iostat output of one of our web servers:

I’m definitely NOT going to move this server to an SSD array anytime soon.

On the whole, the test ganeti node has been very helpful. I learned a few things that will be a big factor in what hardware we’re going to purchase.

Some points that my team must keep in mind if we pursue the SSD route:

  • IO workload profiling is a must (must monitor this regularly as well)
  • leave write-intensive VMs on HDD arrays, or
  • consider Intel SSD 710 ??? (high write endurance = hefty price tag)

I didn’t leave our SSD array to die that fast, of course. I migrated the DB servers to a different ganeti node and replaced them with some application servers.

That cut the writes to 672.31 blocks/sec (344kB/s), less than half of the previous rate.

Eventually the RAID array will die, of course. How long it will last exactly, I don’t know. More than 2 years? 🙂

Munin plugin – MegaRAID HDD temperature using MegaCLI

Munin Exchange approved my plugin recently. I submitted it so many months ago that I had already forgotten about it. The plugin is written in Bash and graphs the temperatures of HDDs attached to an LSI MegaRAID controller.

It uses the serial numbers of the HDDs as labels:

Most of our servers from 2008 onward use LSI cards, especially our Supermicro blades. So if you’re using LSI cards as well, check it out.

UPDATE: Munin Exchange is down. They’re moving to GitHub, so the links above no longer work.

UPDATE: I moved the code to GitHub. Just follow this link.

Set up a headless “file server”/“file burning station” with CentOS/RHEL 5 using VNC

I’m setting up a file server for my team that must, at a minimum, meet the following requirements:

  • it has to cost NOTHING, parts have to be salvaged from old servers
  • it has to have a disk space of at least 1TB (must be RAID for good read/write performance)
  • it has to be headless (no monitor, keyboard and mouse), with GUI and accessible remotely
  • it has to have the ability to burn files using a USB DVD writer

setting up the disk space

I used four 500GB SATA drives and two 250GB PATA drives (salvaged from our old servers) for this one. Deploying software RAID in CentOS is easy so I won’t elaborate on that.

Suffice it to say my software RAID has this configuration:

going headless using VNC

I installed the GNOME desktop by default and accessed it remotely using VNC. Here’s a good guide [wiki.centos.org] on how to do it. Everything went well (at first…)

I hit a snag when I unplugged the monitor. After a reboot, the X server wouldn’t load anymore. The last few lines of /var/log/Xorg.0.log said “No screens found”. Apparently, installing the proprietary Nvidia driver has a side effect: the driver tries to auto-detect the monitor, and since no monitor is attached, the X server won’t load! (duh! so much for going headless…)

Reverting to the open-source driver fixed the problem. I opened /etc/X11/xorg.conf and changed “nvidia” to “nv”, then restarted the X server.
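The same change can be made with sed instead of an editor. This is only a sketch: it works on a scratch copy in a temporary directory, and the Device section below is an assumed example, not the actual xorg.conf from this server.

```shell
#!/bin/sh
# Work on a scratch copy so the real /etc/X11/xorg.conf is untouched.
tmpdir=$(mktemp -d)
cat > "$tmpdir/xorg.conf" <<'EOF'
Section "Device"
    Identifier "Videocard0"
    Driver     "nvidia"
EndSection
EOF

# Swap the proprietary driver name for the open-source "nv" driver,
# keeping the original indentation of the Driver line.
sed 's/^\([[:space:]]*Driver[[:space:]]*\)"nvidia"/\1"nv"/' \
    "$tmpdir/xorg.conf" > "$tmpdir/xorg.conf.new"

cat "$tmpdir/xorg.conf.new"
```

On the real file, it’s worth keeping a backup copy of xorg.conf before replacing it with the edited version.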

burning files in CentOS?

Now, this one should be easy, right? Well… apparently not… This baffled me at first because I couldn’t access the DVD writer plugged into the system (I got spoiled by Ubuntu). Eventually, I figured out that it’s a file permission problem: the user must be a member of the disk group. I added my username to the disk group by running this command as root:

$ usermod -aG disk pro

After a reboot, I could access the writer using k3b 🙂

Here’s a screenshot of the headless file server. I’m accessing it using Vinagre from my laptop running Ubuntu 10.04.