Archive for July, 2008

Be Smart About Virtualization

Tuesday, July 15th, 2008

We are heavy users of virtualization at Likewise Software. Since we develop software for over 100 different platforms (multiple flavors of UNIX, Linux and Mac OS X), we have to be able to boot up a Red Hat 2.1 machine one minute and an OpenSolaris machine the next. Both developers and testers need access to a wide variety of machines on a regular basis. Without virtualization, doing our work would either be very expensive (we’d need hundreds of machines) or very slow (we’d have to re-image machines all the time).

We also use virtualization outside of development/test. Over time, we’ve tended to collect an assortment of servers running project management tools, bug databases, internal wikis, HR, financial and other applications. A few months ago, our IT folk examined all the servers in our inventory and migrated many of them to virtual machines.

Unquestionably, virtualization can bring about good things — reduced administrative costs, increased flexibility, reduced energy use, etc. Virtualization doesn’t always make sense, however.

Occasionally, I have a conversation with someone who says something like “Virtualization is terrible! I moved my database server and my risk management grid onto VMs and now they run at half the speed they used to!” Yes, I do want to whack them upside the head when they say this.

Obviously, if you have a CPU-intensive, heavily threaded application running on a physical server, it’s going to slow down if you put it on a virtualized server along with other CPU-intensive applications. If you wouldn’t run two apps together on the same physical server, certainly don’t run them as two VMs on a single physical server. VM hypervisors can run multiple virtualized machines effectively and with little degradation in performance, but only to the extent that the virtualized systems are amenable to this. If the VMs are running applications that are not heavily threaded and do not heavily tax their CPU and I/O systems, then the hypervisor can exploit multiple cores and spare CPU cycles to provide acceptable performance.

There are some “textbook” examples of applications/systems that are ideal for virtualization. Web farms, for example, can deploy each web site in its own VM and give you complete control of a virtualized server. You can muck with system configuration to your heart’s content without worrying about other web sites that might be deployed on the same physical server. Web farms can also quickly duplicate VMs, allowing them to provide additional load-balanced capacity on demand.

Beyond the textbook examples, here are some others to consider.

Infrequently run applications are great candidates for virtualization. Consider financial apps that might only be run at quarter- or year-end. Rather than dedicating a machine to these applications that sits idle 95% of the time, they can be deployed on virtual systems that are suspended until needed. This approach is ideal for sensitive applications such as financial and reporting systems. It is best not to run these applications on shared hardware: if there are other applications on the same computer, the likelihood of intentional or unintentional access to secure data increases. With virtualization, physical systems don’t have to be “wasted” on infrequently used sensitive applications. Note, too, that by suspending sensitive VMs while they’re not in use, you reduce the attack surface for hackers.
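The suspend/resume step is also easy to script. Here’s a minimal sketch using the libvirt Python bindings, assuming a libvirt-managed hypervisor and a hypothetical VM name; other hypervisors offer equivalent APIs or command-line tools:

```python
# Minimal sketch: wake a rarely used VM for a scheduled job, then suspend it
# again. Assumes the libvirt Python bindings and a local hypervisor; the
# domain name "quarterly-finance" is a hypothetical placeholder.
import libvirt

DOMAIN = "quarterly-finance"  # hypothetical VM name

conn = libvirt.open("qemu:///system")   # connect to the local hypervisor
dom = conn.lookupByName(DOMAIN)

if dom.info()[0] == libvirt.VIR_DOMAIN_PAUSED:
    dom.resume()                        # wake the VM up for quarter-end work

# ... run the quarter-end jobs against the VM here ...

dom.suspend()                           # pause it again until next quarter
conn.close()
```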

Another great use of virtualization is for old legacy systems. If you’re running old versions of Windows NT or SUSE Linux or Solaris x86 and don’t want to update them (why fix something that’s not broken?), why not move these systems to VMs? In all likelihood, these systems are running on flaky, outdated (perhaps unsupported) hardware. It’s possible that they’ll run faster on VMs than on old metal.

Demo systems are ideal candidates for virtualization. These systems receive a lot of “wear and tear”: they’re frequently polluted with sample data and often left in weird states. Moving them to VMs allows you to use VM snapshots to quickly restore them to a known state.
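With libvirt-managed VMs, for instance, the “restore” step can be a one-liner. A minimal sketch, assuming the libvirt Python bindings and hypothetical VM and snapshot names:

```python
# Minimal sketch: roll a demo VM back to a clean snapshot after a demo.
# Assumes the libvirt Python bindings; "demo-vm" and "clean" are hypothetical.
import libvirt

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("demo-vm")             # hypothetical demo VM
snap = dom.snapshotLookupByName("clean", 0)    # snapshot taken while pristine
dom.revertToSnapshot(snap, 0)                  # discard the demo "wear and tear"
conn.close()
```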

Finally, one of my favorite uses for VMs is as security honeypots. Create a VM (especially a Windows VM) and give it a suggestive name, perhaps payroll or HR. Create some directories and files in it, again with suggestive names. Now turn on all the auditing features available in the OS. Protect this system as you would any other secure server in your network (but don’t use the same administrative passwords!). If possible, isolate this VM from your other systems: put it on its own subnet and disallow routing to other systems, for example. If you have an intrusion detection system, make sure it monitors this VM. There should be no access to this computer (other than by you, to verify its health). If your IDS or audit logs signal that someone is trying to access the system, you know you’re under attack.
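Because nobody should ever touch the honeypot, the alerting logic can be dead simple. A minimal sketch, assuming the honeypot’s audit log is exported to a monitoring host at a hypothetical path, with a plain print standing in for whatever alerting your environment actually uses:

```python
# Minimal sketch: watch an exported honeypot audit log and raise an alert the
# moment anything new is written to it. The log path and the alert mechanism
# (a print here) are hypothetical placeholders.
import os
import time

AUDIT_LOG = "/var/log/honeypot/audit.log"   # hypothetical exported log path

last_size = os.path.getsize(AUDIT_LOG)
while True:
    time.sleep(60)                          # poll once a minute
    size = os.path.getsize(AUDIT_LOG)
    if size > last_size:
        # Nobody should ever touch this machine, so any new entry is suspect.
        print("ALERT: honeypot audit log grew; possible intrusion attempt")
        last_size = size
```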

Virtualization has been around for 30+ years. I used VM/370 in college in 1977. It offers many benefits that, thanks to VMware, Xen and others, are now available to any computer user. At the end of the day, however, virtualization is simply multitasking with really, really good application isolation. Rather than multitasking applications that call a single operating system instance, hypervisors multitask entire operating system instances. The rest of the gory details (how they virtualize hardware, where drivers live, etc.) are just that: details.

Usability Testing, Revisited

Wednesday, July 9th, 2008

A couple of weeks ago, I wrote about usability testing. Mostly, I talked about testing methodology and the value that it brings. We’ve now finished 8 sessions and I thought it would be good to revisit the subject.

Once again, I’m amazed by how much value usability testing can provide. We’ve been testing our Likewise Open evaluation and download process with the goal of increasing the number of people who successfully install the software. This process begins with a user arriving at our web site and ends with that user performing a successful “join” operation to connect his/her non-Windows computer to Microsoft Active Directory. When we first decided to test the process, I thought we wouldn’t learn much. What could be simpler than clicking on a download link and running an installer program? For the 1,498,753rd time in my life, I was wrong.

The first thing that we learned is that our home page is not our Home page.

Although we’ve tried several web analytics packages, we’ve lately become enamored with Google Analytics. The free version is relatively capable and sufficient for our current needs. A few weeks of analysis with the tool told us that more customers are coming to our Likewise Open community page than to our corporate home page. Looking at the analytics report, it was obvious why: our partners are driving traffic to our site, and they’re linking to the Likewise Open page instead of the corporate home page.

This makes sense. When our Linux partners want to reference Likewise, they want to get their customers as close to their final destination as possible. They don’t want to link to a high-level page with a lot of sales-oriented material. By linking to our Likewise Open community page, they are taking users to a page that’s very relevant and only a couple clicks away from a download.

When we realized that our Likewise Open page was our effective home page, we realized we needed to improve it. We knew that it would be a bad idea to make it too sales-oriented but we also knew that it had several shortcomings. This was borne out in usability testing and quickly corrected.

The second thing we learned is that clicking on a link is non-trivial.

Our download page has a big table with many different rows for different operating systems (Linux, Solaris, etc.), different CPUs (i386, SPARC, Itanium, etc.), different CPU modes (32/64-bit) and different packaging forms (RPM, DEB, etc.). The user has to find the right row and click on a download link. Simple, no? No.

If you don’t set the MIME types properly on your download links, Firefox can make a mess of things. We had many complaints from users who would get a screenful of binary stuff instead of a downloaded file.
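The real fix is on the server side (serve the installers with a binary MIME type such as application/octet-stream so the browser offers to save them), but it’s also cheap to check your links automatically. A minimal sketch, assuming a hypothetical download URL and a server that answers HEAD requests:

```python
# Minimal sketch: verify that a download link is served with a binary MIME
# type so Firefox offers a "save file" dialog instead of rendering the bytes.
# The URL is a hypothetical placeholder.
import urllib.request

URL = "https://www.example.com/downloads/LikewiseOpen-installer"  # hypothetical

req = urllib.request.Request(URL, method="HEAD")
with urllib.request.urlopen(req) as resp:
    content_type = resp.headers.get("Content-Type", "")
    if not content_type.startswith("application/octet-stream"):
        print(f"Suspicious Content-Type for {URL}: {content_type!r}")
```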

The next thing we learned is that it’s possible to be too smart.

Linux and UNIX folk are used to painful install processes. They’ll download packages and then have to use some type of package manager (rpm, dpkg, etc.) to install them. We decided to make life easier for customers by giving them a nice, executable installer program. In the case of Linux, Mac and other operating systems that are likely to have a GUI present, we use a BitRock-based setup program. Download the software, run it. Simple, no? No.

When Firefox downloads a program to Linux, it doesn’t retain its executable file mode. Before you can run it, you have to chmod +x it. If users didn’t read our 100-page Installation and Administration Guide, they might not realize this. In fact, as usability testing pointed out, they might try to do other weird stuff.
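The fix itself is trivial; if you were scripting the download step, restoring the executable bit looks something like this minimal sketch (the installer file name below is just a typical example):

```python
# Minimal sketch: restore the executable bit that Firefox drops on a
# downloaded installer -- the programmatic equivalent of "chmod +x".
# The file name is an example, not a guaranteed path.
import os
import stat

installer = "LikewiseOpen-4.1.0.2921-linux-i386-rpm-installer"  # example name

mode = os.stat(installer).st_mode
os.chmod(installer, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```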

Our setup programs are typically called something like LikewiseOpen-4.1.0.2921-linux-i386-rpm-installer. Long, yes, but it tells you everything you need to know: product name, version, operating system, architecture and packaging format. Note that we include “rpm” (or “deb” or other) in the name. Some Linux folk would fail to realize that the installer was an executable, would see the “rpm” in the name and think, “maybe I’ve got to install this thing with the rpm program.” Wrong.

The last thing that we learned is that nobody reads anything. No documentation, for sure. They don’t spend much time reading screen output, either.

After users install the software, they need to run our domain-join utility. We tell them this at the very end of the installer program. Alas, as usability testing showed us, many users decide to just ignore that information and hit Enter repeatedly without reading anything. Right after they dismiss the last dialog, they realize that they just missed something important.

We’ve made numerous changes to the Likewise Open pages as a result of our testing. Much of it has been simple to accomplish: more prominent links; short, task-specific documents; a short video; corrected MIME types. We’ll do a new round of usability testing now to verify the results of our changes. I’m confident that we’ll see improvement, but I’m no longer confident that we won’t find a different set of problems to address. The main lesson of usability testing is that your software UI is never as good as you think it is.

More on NAS at Home

Tuesday, July 8th, 2008

After re-reading my last post, I realize that some of you might have no clue what I’m talking about when I mention network attached storage (NAS). To use an oxymoron, this post is a follow-up primer.

The idea with NAS is to centralize storage across multiple machines in a network. Instead of maintaining numerous independent disk drives on the individual machines in a network, you place all key files in a central location and only have to manage the NAS itself. This concept is frequently used with server computers but can also be used with workstations. Microsoft Active Directory, for example, supports the concept of a roaming profile that allows your personal files to be stored in one consistent place regardless of what computer you log in to. UNIX and kin can do something similar with automounts.

There are actually two main mechanisms for implementing centralized storage.

The storage area network (SAN) approach is different from that used by NAS. A SAN storage appliance provides low-level storage “blocks” to the computers connected to it. The SAN device has no concept of a “file,” only of an assortment of storage blocks assigned to a particular computer. SANs are frequently accessed over a separate, high-speed Fibre Channel network but can also be accessed over Ethernet using iSCSI and other protocols.

A NAS device, on the other hand, provides file-level operations. The device implements the SMB/CIFS protocol and/or the NFS protocol in order to provide file-oriented services to Windows or UNIXy computers (respectively).

If you have used a traditional NetWare or Windows-based file server, you have used a NAS device. There are much cooler devices now, however. Isilon, for example, makes very clever clustered-storage NAS devices that allow multiple NAS nodes to replicate data in a fashion that provides redundancy and high availability at much lower cost than SANs and many other NAS devices.

The Linksys NAS 200 device that I talked about in the last post is a dirt-cheap home NAS device. It is not particularly fast, nor does it offer much sophisticated functionality. Its security model, for example, is very crude. I run a Windows domain controller at home, but the NAS 200 does not integrate with AD-based security. To avoid authentication hassles, I simply allow the guest account (any user) read/write access to all the shared folders. Fine for home (where things are protected with a perimeter firewall and with secure wireless access points) but not fine for a more public network.

I installed the Linksys appliance in order to provide a backup destination for the 6 computers that we have strewn throughout the house. Using the appliance means that I don’t have to dedicate a general-purpose computer to this task. Additionally, Linksys has figured out how to set up RAID and how to automatically perform various recovery operations, all using a simple Web interface. It would have been much more complicated for me to figure this out myself.

The last piece of the backup puzzle that I’d like to implement is some form of offsite storage. Ideally, the NAS 200 would itself back up files to some Web-based storage provider. Since it doesn’t, I might have to implement this myself with some type of periodic job that detects new files on the NAS and copies them to a service during off hours, something like the sketch below.
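Here is a minimal sketch of that periodic job, assuming hypothetical mount points for the NAS share and the offsite target; in practice it would run from cron during off hours:

```python
# Minimal sketch of the periodic offsite job: walk the NAS share, find files
# changed since the last run, and copy them to an offsite destination.
# Both paths are hypothetical; "/mnt/offsite" stands in for whatever remote
# storage mount or sync target ends up being used.
import os
import shutil
import time

NAS_MOUNT = "/mnt/nas200"          # hypothetical mount point for the NAS share
OFFSITE = "/mnt/offsite"           # hypothetical offsite/remote target
STAMP_FILE = os.path.join(OFFSITE, ".last_backup")

last_run = os.path.getmtime(STAMP_FILE) if os.path.exists(STAMP_FILE) else 0

for root, _dirs, files in os.walk(NAS_MOUNT):
    for name in files:
        src = os.path.join(root, name)
        if os.path.getmtime(src) > last_run:          # new or changed file
            rel = os.path.relpath(src, NAS_MOUNT)
            dst = os.path.join(OFFSITE, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)                    # preserve timestamps

# Record when this run happened so the next run only copies newer files.
with open(STAMP_FILE, "w") as f:
    f.write(str(time.time()))
```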