Category Archives: kvm

a poke on SSD write endurance, Intel SSD 320 and iostat

The decision to move to KVM-based virtualization as our standard way of deploying servers has been a real success, given the cost savings over the past 2 years. The only downside is the performance hit under intensive disk IO workloads.

Some disk IO issues were already addressed on the application side (e.g. using caches, tmpfs, and smaller logs), but it’s apparent that if we want our deployment to be denser, we have to find alternatives to our current storage back-end. Probably not a total replacement but more of a hybrid approach.

Solid state drives are probably the best option. They are cheaper compared to a storage area network. I like the idea even more because SSDs are a simple drop-in replacement for our current SAS/SATA drives, as opposed to maintaining additional hardware. Besides, my team does not have the luxury of an “unlimited” budget.

After a lengthy discussion with my MD, he approved running some tests first to see if the SSD route is feasible for us. I chose four 120GB Intel SSD 320s. The plan was to set up these 4 drives in a RAID 10 array and see how many virtual machines it could handle.

I chose Intel because its SSDs are among the most reliable brands in the market today. If performance were the primary requirement, I’d choose an SSD with a SandForce controller (maybe OCZ), but it’s not; it’s reliability.

The plan was to set up a RAID 10 array of four 320s. But since our supplier could only provide us with 3 drives at the time we ordered, I decided to go with a RAID 0 array of 2 drives instead; I couldn’t wait for the 4th drive. (It turned out to be a good decision because the 4th drive arrived after 2 months!)

The write endurance of the Intel 320 (160GB version) is rated at 15TB. My premise was that if we wrote 10GB of data per day, it would take over 4 years to reach that limit. And in theory, if the drives are configured in a striped RAID array, it would last even longer.
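That back-of-the-envelope estimate can be sketched in the shell; the endurance and daily-write figures are just the assumptions stated above:

```shell
# Back-of-the-envelope lifespan: rated endurance divided by daily writes.
# Assumptions from above: 15TB rated endurance, 10GB written per day.
ENDURANCE_GB=15000
WRITES_GB_PER_DAY=10
DAYS=$((ENDURANCE_GB / WRITES_GB_PER_DAY))
echo "${DAYS} days (~$((DAYS / 365)) years)"
# prints: 1500 days (~4 years)
```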

It’s been over a month since I set up the ganeti node with the SSD storage, so I decided to check its total writes.

The ganeti node has been running for 45 days. /dev/sda3 is the LVM volume configured for ganeti to use. The total blocks written is 5,811,473,792, at a rate of 1,468.85 blocks per second. Since 1 block = 512 bytes, this translates to 2,975,474,581,504 bytes (about 2.9TB) at a rate of 752,051.2 bytes per second (752kB/s). That write rate translates to 64,977,223,680 bytes (about 65GB) of writes per day! Uh oh…
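The same blocks-to-bytes conversion can be reproduced with a quick awk one-liner, using the 512-byte block size that iostat reports and the block rate measured on this node:

```shell
# Reproduce the conversion above (1 block = 512 bytes, 86400 s/day);
# the block rate is the figure taken from iostat on this node.
awk -v bps=1468.85 'BEGIN {
    bytes_sec = bps * 512            # bytes per second
    bytes_day = bytes_sec * 86400    # bytes per day
    printf "%.1f kB/s, %.1f GB/day\n", bytes_sec / 1000, bytes_day / 1e9
}'
# prints: 752.1 kB/s, 65.0 GB/day
```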

65GB/day is nowhere near my premise of 10GB/day. At this rate, my RAID array will die in less than 2 years!

Uh oh indeed…

It turned out that 2 of the KVM instances I assigned to this ganeti node are DB servers. We migrated them here a few weeks back to fix a high IO problem. A move that cost the Intel 320s a big percentage of their lifespan.

64GB per day seems huge, but apparently it’s typical on our production servers. Here’s the iostat output of one of our web servers:

I’m definitely NOT going to move this server to an SSD array anytime soon.

On the whole, the test ganeti node has been very helpful. I learned a few things that will be a big factor in what hardware we’re going to purchase.

Some points that my team must keep in mind if we’ll pursue the SSD route:

  • IO workload profiling is a must (and must be monitored regularly as well)
  • leave write-intensive VMs on HDD arrays, or
  • consider the Intel SSD 710??? (high write endurance = hefty price tag)
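As a sketch of the first point, cumulative per-device writes can be sampled with iostat; the device name `sda` here is an assumption, so adjust it for your array:

```shell
# A single report with no interval shows totals since boot; Blk_wrtn is
# field 6 (cumulative 512-byte blocks written). "sda" is an assumption.
iostat -d sda | awk '/^sda/ { print $6 " blocks written since boot" }'
# Or watch the write rate live: 10-second samples, 6 reports
iostat -d sda 10 6
```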

I didn’t leave our SSD array to die that fast, of course. I migrated the DB servers to a different ganeti node and replaced them with some application servers.

That decreased the writes to 672.31 blocks/sec (344kB/s), less than half of the previous rate.

Eventually the RAID array will die, of course. Exactly how long it will last, I don’t know. More than 2 years? 🙂


ganeti notes

Listing instances with detailed info
$ gnt-instance list -o name,status,oper_ram,oper_vcpus,pnode,snodes,network_port

Creating a snapshot using gnt-backup
$ gnt-backup export -n NODE --noshutdown NAME
$ gnt-backup list -n NODE
# backup all non-drbd instances
$ for INSTANCE in $(gnt-instance list -o name,disk_template --separator=: | grep plain | cut -f1 -d:); \
do gnt-backup export -n NODE --noshutdown $INSTANCE; done

Adopting a raw file-based KVM instance from a Samba share
$ mkdir -p /mnt/images/ && mount -t cifs -ouser=guest,password='' //samba.server/images/ /mnt/images/
$ qemu-img info /mnt/images/NAME # to get the SIZE
$ lvcreate -L SIZE -n NAME vmvg
$ time dd if=/mnt/images/NAME of=/dev/vmvg/NAME
$ gnt-instance add -t plain -n NODE -o image+default -B memory=2G,vcpus=2 --no-start --disk 0:adopt=LV_NAME NAME
$ gnt-instance modify -t drbd -n NODE NAME

Changing disk type from ide to virtio (vice-versa)
$ gnt-instance modify -H disk_type=ide,root_path=/dev/hda1 NAME
$ gnt-instance modify -H disk_type=paravirtual,root_path=/dev/vda1 NAME

Changing an instance’s MAC address
$ gnt-instance modify --net 0:mac=MAC_ADDRESS NAME

ganeti and KVM Virtualization

We’ve been using KVM virtualization for almost 2 years now and we’re happy with it. But as the number of hypervisors and VM instances increases, so does the complexity of server management, which can be frustrating at times.

I realized that we had to find a way to manage it somehow, so I’ve been scouring the net for possible solutions. I’ve read about OpenStack and Eucalyptus, but the disparity between how they deploy VM instances and our current deployment is so big that migrating to either would be difficult.

I have 6 requirements for the target platform:

  1. cost
  2. centralized management
  3. learning curve / ease of deployment
  4. migration constraints (the fewer, the better)
  5. performance / high availability
  6. community support

My boss forwarded me a blog post about ganeti a few months ago. I was skeptical at first because deployment was Debian-centric, and since we’re using CentOS, that could be a problem. But after reading the documentation and mailing lists, I realized that migrating to ganeti would be less painful than the other solutions (in theory), so I decided to install a test cluster and run it for a few weeks.

The testing phase is over and ganeti is promising (drbd + live migration rocks!). Our current cluster has 5 nodes but that will surely change as we go into full production 🙂