Archive for June, 2008

The Evolution of API

Monday, June 30th, 2008

It is my premise that, despite its monopolistic practices, Microsoft has succeeded by winning the minds of software developers. The Windows API, OLE/COM, the Win32 API, the distributed systems architecture, ODBC, .NET – these are all examples of APIs provided by Microsoft that have had a tremendous impact on the industry. Even as developers decry their dependence on a single vendor, even as they install Eclipse and study Ruby, they can’t deny that it’s simply easier to run Visual Studio and program in C#/.NET.

There are chinks in Microsoft’s armor. PHP is pretty hard to beat for simple web apps. Microsoft’s share of the mobile device business is pretty small, too, regardless of how cool the Compact Edition of .NET might be. Overall, however, I can think of no other company or technology that has as much influence over developers as Microsoft.

To more clearly demonstrate how Microsoft achieved this hegemony, I thought it might be interesting to review the history of a particular API call and to note Microsoft’s involvement with it. Any programmer who’s read Kernighan and Ritchie’s The C Programming Language is familiar with the “Hello World” program. (Any programmer who hasn’t read this book should be ashamed of himself or herself. This would be akin to an English Lit major not having read any Shakespeare.) Let’s consider how the implementation of “Hello World” (on a PC) has changed over the years.

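In 1981, when the first PC became available, the BIOS provided one way of implementing the program (strictly speaking, the “write string” function used here arrived with later BIOS revisions; the original BIOS wrote one character at a time):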

.data
    Message DB "Hello World"
.ends
.code
    MOV AX,1300h       ; set function code for "Write string", don't move cursor
    MOV BX,0007h       ; write to page 0 with attribute 07h (light gray on black)
    MOV CX, 11         ; write 11 characters
    XOR DX,DX          ; start at row 0, column 0
    PUSH DS
    POP ES
    LEA BP, Message    ; set ES:BP to the address of the message
    INT 10h            ; do it!
.ends

If you didn’t care about screen location or video attributes, MSDOS provided a simpler way:

.data
    Message DB "Hello World$"   ; function 9 needs a '$' terminator
.ends
.code
    MOV AH,9           ; set function code for "write string at cursor"
    LEA DX, Message    ; set DS:DX to the address of the message (DS is already set)
    INT 21h            ; do it!
.ends

In the early 80’s when high-level languages (other than BASIC) started arriving on the PC, you could do it like K&R:

    #include <stdio.h>

    int main(void)
    {
        printf("Hello World\n");
        return 0;
    }

The arrival of Microsoft Windows in 1985 made life much more complicated. Displaying a label on a dialog box was pretty simple (you could just place it there in the dialog box editor). If you wanted to display text in your application’s client window, that meant you had to write a window procedure and process the WM_PAINT message. Processing this message typically meant deciding on a font to use, figuring out colors to use and dealing with opacity:

    void HandlePaint(HWND hwnd)
    {
        /* Get device context for painting; BeginPaint also validates the update region */
        PAINTSTRUCT ps;
        HDC hdc = BeginPaint(hwnd, &ps);

        /* Create a font and select it, keeping the old one around */
        HFONT hf = CreateFont(...lots of args...);
        HFONT hfOld = (HFONT)SelectObject(hdc, hf);

        /* Get the client rectangle for the whole window */
        RECT rect;
        GetClientRect(hwnd, &rect);

        /* Draw the text at the top left of the window */
        rect.left = rect.top = 10;
        DrawText(hdc, "Hello World", 11, &rect, DT_SINGLELINE);

        /* Select the old font */
        SelectObject(hdc, hfOld);

        /* delete created resources */
        DeleteObject(hf);

        /* Release the device context obtained from BeginPaint */
        EndPaint(hwnd, &ps);
    }

No wonder so many programmers had a hard time moving to Windows! And yet move they did! What were the alternatives? VisiOn was cool but it was a pig. TopView was character-mode only. OS/2 Presentation Manager was Windows run through the IBM bureaucracy filter (e.g. put Win in front of all the API and flip the Y axis top to bottom!). The Mac was equally complicated and less capable (no multitasking, no child windows).

The arrival of 32-bit programming didn’t affect “Hello World” all that much, but it did accomplish two things. First, it freed programmers from having to worry about segmentation (putting code into 64K groups that could be efficiently swapped in and out). Second, it performed better process cleanup, meaning that users had to worry less about depleting system resources (as could happen, for example, if you forgot to call DeleteObject in Win16).

OLE/COM affected “Hello World” to the extent that you could now write a “Hello World” application that could be embedded in another one, saved to a single file and printed in composite form. Your code had to worry about garbage collection (maintaining proper reference counts) and implementing appropriate COM interfaces. I’ll not cover this here as I still have scars from those days. Alternatives? None. CORBA was still working on PL/I and Ada support when OLE/COM first shipped. Microsoft was willing to punt on this problem in exchange for 99% market share.
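Without reopening the wounds, the reference-counting chore at least is easy to sketch. What follows is plain C rather than real COM: HelloObject, hello_create, hello_add_ref, hello_release and g_live_objects are names made up for illustration, standing in for what a COM object does through IUnknown's AddRef and Release. Treat it as a rough sketch of the bookkeeping, not as how any actual OLE object was written.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object; real COM objects expose this through IUnknown. */
typedef struct HelloObject {
    unsigned long ref_count;   /* number of outstanding references */
    const char *text;          /* payload, not owned by the object */
} HelloObject;

/* Counts objects currently alive, so a leak shows up as a non-zero value. */
int g_live_objects = 0;

/* Create an object holding one reference, owned by the caller. */
HelloObject *hello_create(const char *text)
{
    HelloObject *obj = malloc(sizeof *obj);
    if (obj == NULL)
        return NULL;
    obj->ref_count = 1;
    obj->text = text;
    g_live_objects++;
    return obj;
}

/* Every additional holder of the pointer must call this. */
unsigned long hello_add_ref(HelloObject *obj)
{
    return ++obj->ref_count;
}

/* Every holder calls this exactly once; the last call frees the object. */
unsigned long hello_release(HelloObject *obj)
{
    unsigned long remaining;

    assert(obj->ref_count > 0);   /* releasing too often is a bug */
    remaining = --obj->ref_count;
    if (remaining == 0) {
        g_live_objects--;
        free(obj);
    }
    return remaining;
}
```

Code that hands the pointer to a second owner calls hello_add_ref, and each owner calls hello_release when it is done. Forget one release and the object leaks; release once too often and you are touching freed memory.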

As the final installment, let’s consider “Hello World” in the .NET era. First of all, Winforms programs are very often forms-oriented these days; you might very well choose to implement the program by simply creating a form and dragging a label there in Visual Studio. A Webforms program would do it the same way. If you do need to write a Winforms program that fully handles its client area, you’d write an event handler for a form’s Paint event:

    void Paint(object sender, PaintEventArgs e)
    {
        // get the Graphics object from the event arguments
        Graphics g = e.Graphics;

        // draw our string
        g.DrawString("Hello World", new Font("Arial", 10),
                         Brushes.Black, new Point(10, 10));
    }

If you compare this to the previous version, you can see a lot of similarities. The Graphics object is the equivalent of the Windows device context. You still have to create a font. The difference is that C#/.NET has some “syntactic sugar” that makes life easier, for example, simplified arguments when creating fonts and the ability to create objects in-line. Additionally, C#/.NET performs automatic garbage collection so you don’t have to worry about deleting any created objects (although objects that wrap GDI resources, such as Font, are still best released promptly by calling Dispose).

“Hello World” is an extremely simple example that demonstrates how APIs have changed over the course of 27 years. I chose it because even this simple example demonstrates how Microsoft has affected developers’ lives (both positively and negatively). There are better examples (OLE/COM and ODBC/ADO/LINQ) that demonstrate Microsoft’s contributions to the industry but none that can be so easily summarized in an already too long blog!

High Hopes for Silverlight 2.0

Sunday, June 29th, 2008

Let me start out by admitting that I haven’t done a whole lot of client-side scripting. I’ve written a few DHTML scripts in VBScript and in JScript, but I haven’t written a hard-core Ajax application.

That said, the RIA (Rich Internet Application) technology has always seemed terribly crude to me compared to what’s available for thick clients and for server-side web applications. As I’ve mentioned in previous posts, I find .NET incredibly productive for writing Winforms (thick client) applications. I’ve written some ASP.NET code, too, and I think that Microsoft has done a good job of providing a reasonable continuum between how one writes Winforms and Webforms (server-based Internet) applications. Both approaches rely on event-driven programming and both let you use visual tools to design forms/pages and to hook them up with code.

When I wrote ASP.NET applications, I was happy to be able to use the Web as a delivery mechanism and I was happy to not have to worry about writing installers and updating customer software. On the other hand, I was never happy with the quality of the end-user experience (the web UI) compared to what was possible with a thick client. The page-oriented nature of Web applications resulted in poor user feedback. I felt like I was, once again, running IBM 3270 CICS applications as I did in the ’70s.

Keep in mind that one of the great benefits of cheap computing power is the luxury of using that power to provide keystroke-level interactivity. The best examples of this are Intellisense in Visual Studio or on-the-fly spell check in word processors. With slower CPUs, UI validation and other operations were typically only performed during focus transitions (for example, when the user tabbed from field to field). With even slower CPUs, validation was only performed at the transaction level (when the user pressed OK to submit the form/dialog).

With crude Web-based apps, it doesn’t matter that you’re running a two-processor, quad-core system; validation only happens when you press OK (or, for the leather crowd, Submit).
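To make the three levels concrete, here is a minimal sketch in C. It is not tied to any real UI toolkit: the event types, the digits-only rule and the function names are all hypothetical. It replays a stream of UI events and reports the index of the event at which the user first hears about a bad character under each policy.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* When the application gets around to checking the input (hypothetical names). */
typedef enum { ON_KEYSTROKE, ON_FOCUS_CHANGE, ON_SUBMIT } ValidationPolicy;

/* The kinds of UI event the sketch replays. */
typedef enum { EV_KEY, EV_TAB, EV_SUBMIT } EventKind;

typedef struct {
    EventKind kind;
    char ch;          /* only meaningful for EV_KEY */
} UiEvent;

/* Assumed rule for illustration: fields accept decimal digits only. */
int is_valid_char(char ch)
{
    return isdigit((unsigned char)ch) != 0;
}

/*
 * Replays the events and returns the index of the event at which the
 * first invalid character is reported to the user, or -1 if no error
 * is ever reported.
 */
int first_error_report(const UiEvent *events, size_t count,
                       ValidationPolicy policy)
{
    int pending_error = 0;   /* a bad character has been typed but not yet reported */
    size_t i;

    for (i = 0; i < count; i++) {
        const UiEvent *ev = &events[i];

        if (ev->kind == EV_KEY && !is_valid_char(ev->ch))
            pending_error = 1;
        if (!pending_error)
            continue;

        if (policy == ON_KEYSTROKE ||
            (policy == ON_FOCUS_CHANGE && ev->kind != EV_KEY) ||
            (policy == ON_SUBMIT && ev->kind == EV_SUBMIT))
            return (int)i;
    }
    return -1;
}
```

For the input 1, x, 2, Tab, 3, Submit, the keystroke policy flags the bad x as soon as it is typed, the focus policy waits for the Tab, and the submit policy waits until the very end, which is exactly where a crude Web app leaves you.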

RIAs are supposed to solve this problem. BTW, I’m using RIA because I want to talk about this without specifying a particular technology just yet. An RIA, after all, could be implemented various ways:

  • As a Java applet
  • As a Flash application
  • Using VBScript/JScript client-side scripts

I find something inherently wrong with the first two approaches. Java applets are often plain old ugly. Your browser page marks off a big square while the browser goes off and loads some stuff that clearly belongs to another world. The buttons and fonts look wrong. Subtle things (menus, etc.) behave differently. Yuck. Flash applications are better looking, but I hate web sites that are 100% Flash. You find yourself deep in the site and want to save a bookmark but, lo and behold, your URL is the same URL you started with. Everything you’ve done has taken place in Flash.

The last approach is the one taken by Ajax. You write conventional web applications (in Java or .NET or whatever) but the pages that you deliver to web browsers contain client-side scripts to provide a more keystroke-interactive experience. If you’ve used recent versions of Google Mail or Yahoo or Hotmail, you’ve experienced this approach. In addition to client-side scripting, Ajax also performs XML-based Web requests to “quietly” get additional information without having to perform disruptive page transitions.

The concept of client-side scripting is good. My objections to it are in its execution. VBScript and JScript are poor substitutes for the full-fledged language features found in Java or .NET languages. Typelessness might be “in” with compiler developers but it results in tremendously flaky code. Additionally, the object model available to client-side scripts is the DHTML object model. DHTML might have been fine in 1998 but it’s very limiting 10 years later.

Silverlight 2.0 promises to change this. It will allow client-side scripting to be done with any .NET language. I am a C# fan, but you can use Visual Basic or PHP# or Python# or whatever other .NET language you like. Silverlight will provide a rich visual object model for UI programming. It will also provide a nice visual programming tool for UI design that’s based on XAML and the new Windows presentation layer.

While all of this sounds very Microsoft-centric (and undoubtedly, it will take a while for Firefox and Safari to catch up), Microsoft has tapped the Novell Mono folk to provide Moonlight – an implementation of Silverlight on non-Windows systems.

So far, I’ve mostly read about Silverlight 2.0. I hope to spend some time putting it through its paces soon, and will report on this once I do. If Microsoft has provided a good continuum between Winforms, Webforms and Silverlight (not to mention Mobile apps), it will have truly accomplished a great thing.

Usability Testing

Saturday, June 28th, 2008

We’re performing another round of usability testing at Likewise. This is something that’s been critical to our success and, now that our products are getting more complex, it’s time to do it again.

I don’t remember exactly when usability testing came into vogue. I remember doing it way back in the early 1980s when I was working at Hewlett-Packard. This was in the early days of “WIMP” user-interfaces (Windows, Icons, Menus, Pointer). I remember asking users to edit some text in a word processor and they kept clicking on the “Edit” menu trying to find some editing commands other than “cut” and “paste”.

We had a nice lab at HP. This was in PSD, the “Personal Software” division. We developed a suite of office software (HP Calc, HP Drawing Gallery, HP Access, etc.) for the HP Touchscreen and, later, the Vectra personal computers.

The lab would simultaneously capture screen output and a video of the user’s facial expressions. These would be combined “in post” (postproduction) into a picture-in-picture composite videotape. You’d hear the user describing what he/she was asked to do (we asked them to vocalize their thoughts) and then you’d see how they’d try to achieve it using your UI. Their facial expressions were invaluable to understanding where they were getting confused.

At Microsoft (where I worked from 1987-1998), we did similar usability testing but we often farmed it out to other companies. The data capture was excellent but I thought that we missed out by not having developers present during the actual testing. I also think the company was less than diligent in learning from its testing.

I remember Microsoft hiring a couple of Stanford professors (Nass and Reeves) to design and evaluate our “social” user-interface: i.e. “Clippy” and kin. They’d designed the little popup characters that would “help” you in MS Office. What I most remember about the professors was the startling data they’d collected when performing usability testing. The data said that the most popular “character” was “none.” Not Clippy. Not the dog (who came in a distant second). None.

Somehow, I expected this to pretty much kill the concept but, I guess, after you spend that much money on consultants you feel compelled to go on with it anyway. 

BTW, here’s a paper, by a Nass student, analyzing why people hate Clippy.

At Likewise, we don’t have money for a fancy lab but, what we lack in money, we make up for with enthusiasm. A video projector, computer and video camera are all it takes. What we do is to take the computer output and feed it to a video projector that shows the computer screen behind the test subject. We then aim the video camera at the user, simultaneously capturing the screen output behind the user. Total cost: $50 in gift certificates and $10 in pizza. Value: priceless.

I spent a little time experimenting with Camtasia as an alternative to our low-tech approach. Camtasia is an excellent “demo capture” program that can record screen activity directly to Flash video. It can also record microphone and video input. The video input is automatically composited as a picture-in-picture – exactly what I wanted!

Alas, the quality of the recorded video (the actual PIP camera-based video) is not as good as I’d like. The screen activity looks great but the camera input looks like 15 fps at best. We’re going to experiment with video settings and maybe we can improve this. If we can get it to work, it’d be even easier than the projector setup that we’ve been using.

Update: we bought a cheapy Logitech web cam and are getting much better results than with our nice 3 CCD Sony video camera. It may be because we can better control the size and frame rate or, maybe it’s a codec thing. Regardless, with the new camera, the Camtasia approach works pretty well. As an “extra added bonus”, the cheapy web cam is able to perform face tracking. It has a little motor that pivots the camera around to face you. Effective, but creepy.

Don’t let the technology get in the way of the testing. At worst, you could have users bang on your software while you literally looked over their shoulders. It’d still be a valuable exercise. You’ll learn a tremendous amount by watching users struggle with things you thought were obvious. If you’re smart, you’ll also realize that, no matter what your excuses are, you’re wrong. The proof is in the video pudding. Fix it.