NAS and Virtualization at Home

In between eating hot dogs and blowing up fireworks this weekend, I worked on a couple of home IT projects that I’d been planning for a while. My goals were straightforward. First, I wanted to implement a more robust backup solution. Second, I wanted to get rid of a Fedora Core 5 server and replace it with something newer. The two projects were related since the FC5 server was being used solely as a Samba file server to host my backup drives. Here’s what I ended up doing to accomplish both tasks.

I didn’t like the dedicated Fedora server for two reasons. First, I was stuck using a computer for a very narrow purpose. I have only two computers in my “server room” (my den) and the other one is my AD domain controller. I’ve been installing lots of Windows application software on the AD machine because I can’t run it on Linux. Installing random software on a DC is not a good idea. The second reason I wanted to get rid of the Fedora machine is that I wanted to run a more current distro. I worried about replacing FC5 with Ubuntu, however, because FC5 uses a funky logical volume manager. If the change failed, I might have to scramble to recover my data.

To get rid of the FC5 file server, I spent $150 on a Linksys NAS 200 device and $200 on two 500GB SATA hard disks. The Linksys device is, essentially, a cheapy Linux box with an Ethernet port and two SATA drive bays. It can be configured to use the drives separately or in a RAID 0 (striping) or RAID 1 (mirroring) array. I chose the latter configuration, which gives me 500GB of storage along with the security of knowing that I can lose a drive and still have my data.
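The capacity tradeoff between those configurations is simple arithmetic. Here is a rough sketch in Python; the function name `usable_capacity_gb` is my own, and it assumes identical drives and ignores file system and firmware overhead.

```python
def usable_capacity_gb(drive_gb, drive_count, mode):
    """Return usable space in GB for identical drives in a given mode.

    Illustrative only: ignores file system and firmware overhead.
    """
    if mode in ("separate", "raid0"):
        # Separate disks or striping: every drive contributes its full size.
        return drive_gb * drive_count
    if mode == "raid1":
        # Mirroring: each drive holds a full copy, so only one
        # drive's worth of space is usable.
        return drive_gb
    raise ValueError(f"unknown mode: {mode}")


for mode in ("separate", "raid0", "raid1"):
    print(mode, usable_capacity_gb(500, 2, mode), "GB")
```

With two 500GB drives, mirroring gives up half the raw space in exchange for surviving a single drive failure.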

The Linksys NAS 200 was pretty easy to install. I made one mistake, which was to start using it (copying over 100GB to it) before realizing that it was running very slowly. A look at the Linksys web site showed that there was a firmware update that allowed the use of a non-journaled file system. Without journaling, the Linksys device is much faster but will have to perform a “scandisk” (fsck) if it detects any disk errors. Installing the firmware upgrade and switching to the non-journaled file system required reformatting the disks and copying the 100GB all over again.

With the NAS in place, I was able to go to my FC5 computer and copy over all the old backups. I then changed the key computers in the house to use the NAS instead of the FC5 machine for backups. Along the way, I also stopped using Windows backup software and started using NTI Shadow (dumb, cheap) instead.

Now that my FC5 computer was out of a job, I could repurpose it. I increased its RAM to 2GB, deleted its Linux file system partitions and installed Windows XP on it. Deleting the partitions was necessary because, with them in place, Windows XP would get confused during installation.

The first thing I did after installing XP (well, the second, after waiting for SP2 and a million other updates to install) was to install VMware Workstation. VMware Server is free, but the Workstation version allows for multiple “snapshots” which I find very useful.

With VMware installed, the first VM I created was an Ubuntu 8.04 Linux VM.

What’s the point of replacing a Linux machine with a Windows machine running Linux in a VM? Two things: first, I can run Windows software in the host operating system. Actually, I will probably create a Windows XP VM and run the Windows software in the VM instead of on the host OS. Second, if I get tired of one Linux distribution, I can always create another VM with a different one.

With VMware, I can keep my host OS in pristine condition. I won’t install any application software there. If any problem occurs in a guest VM, I can always use the VMware snapshot features to “undo” it. Worst case, I can blow away a VM and recreate it. What about data? Here’s the key: don’t keep your important data on virtual disks. Use virtual machines, but keep your data on real drives or on a NAS device. Windows file shares or the Linux mount.cifs command can help with this (if you keep all your data on Windows file servers; if you want, you can use NFS and store data on UNIX file servers instead). Use virtual disks only to store operating system files.
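To make the mount.cifs idea a bit more concrete, here is a small Python sketch that only builds and prints the mount command a Linux guest might use; it does not run it. The server, share, mount point and credentials file names are hypothetical placeholders, not my actual setup, and the real mount would need to be run as root.

```python
import shlex


def cifs_mount_command(server, share, mount_point, credentials_file):
    """Build the argv for mounting a Windows/Samba share with mount -t cifs."""
    source = f"//{server}/{share}"
    # credentials= names a file holding the username and password, which
    # keeps them off the command line and out of shell history.
    options = f"credentials={credentials_file}"
    return ["mount", "-t", "cifs", source, mount_point, "-o", options]


# Hypothetical names, used only for illustration.
argv = cifs_mount_command("nas", "backups", "/mnt/backups", "/etc/nas-credentials")
print(shlex.join(argv))
```

The point is that the VM's virtual disk holds only the operating system, while everything under the mount point lives on the NAS and survives if the VM is deleted.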

This is exactly the architecture used in large virtualized enterprise IT departments. Application data is kept on attached storage accessed by one or more virtual machines. Deploying additional virtual server instances is easy because the data is centrally located. The same concept can be used at home, on a smaller scale.

Everything is up and running now. I’m happy running Ubuntu instead of FC5 and I’m happy knowing my data backups are mirrored. I’ll be tracking the performance of the NAS device over the next few weeks and months. Consumer NAS devices are a tricky tradeoff of simplicity vs. functionality and performance. Someday, I want to experiment with removing a drive from the array and validating that the RAID rebuild occurs properly. For now, I’m just hoping the NAS is doing the right thing.
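One way I might check the data before and after that drive-pull experiment is to compare file checksums. The following is a minimal sketch, not something I have run against the NAS; the function names are my own, and it assumes the NAS share is mounted as an ordinary directory.

```python
import hashlib
import os


def checksum_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            digest = hashlib.sha256()
            with open(full_path, "rb") as handle:
                # Read in 1 MiB chunks so large backup files don't fill RAM.
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(full_path, root)] = digest.hexdigest()
    return manifest


def changed_files(before, after):
    """Return sorted paths that are missing or differ between two manifests."""
    all_paths = set(before) | set(after)
    return sorted(p for p in all_paths if before.get(p) != after.get(p))
```

The idea would be to build one manifest before removing a drive and another after the rebuild finishes, then look at `changed_files(before, after)`. An empty list would only show that the files read back the same through the share; it would not prove that both disks hold identical copies.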