Why (OS) Architecture Matters

In previous posts (for example, What Linux Needs to Learn From Microsoft) I’ve complained about the lack of management APIs in Linux and UNIX. Microsoft Windows has a comprehensive set of management APIs for a variety of OS features:

  • Networking
  • File/printer sharing
  • Event logs
  • Registry I/O
  • Service control
  • etc.

Moreover, each of these APIs typically works either locally (on the machine where the call is made) or remotely (on another machine). How did such a comprehensive set of APIs come to be, and why do these APIs so consistently support remote management? The answer to both questions is: by design.

Let’s understand what this means by going back in time to the late 80’s and early 90’s. I had just arrived at Microsoft back then (from HP). I started working on the Windows 1.04 SDK and then later moved to the OS/2 project.

In the MS DOS (pre-Windows) days, the concept of an API, let alone a network-aware one, was very crude. MS DOS programs accessed the operating system by performing “software interrupts”. An application would set up the x86 registers and would then perform an “Int 21”. The OS would field the interrupt and interpret the registers to determine what to do.

Programming language libraries, for example, the “C run-time library”, added the first primitive API to MS DOS. Instead of performing an Int 21, you could call “open()” or “fopen()” to open a file.
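To make the contrast concrete, here is a toy model of the two calling styles. It is only a sketch and does not touch real DOS: the Registers and ToyDos classes and the open_file wrapper are hypothetical names invented for illustration. The function number (AH=3Dh, open existing file) and the error codes follow the DOS convention, and the carry flag signals failure.

```python
# Toy model of "Int 21" style calls versus a library wrapper.
# Registers, ToyDos and open_file are hypothetical names for illustration.
from dataclasses import dataclass


@dataclass
class Registers:
    ah: int = 0          # function number selected by the caller
    al: int = 0          # sub-argument (here: access mode)
    dx: str = ""         # stands in for the DS:DX pointer to a filename
    ax: int = 0          # result (file handle, or an error code)
    carry: bool = False  # set by the handler to signal an error


class ToyDos:
    """Stand-in for the OS side of the software interrupt."""

    OPEN_FILE = 0x3D  # AH=3Dh: open existing file

    def __init__(self, files):
        self.files = set(files)
        self.next_handle = 5  # handles 0-4 are predefined in DOS

    def int21(self, regs):
        # The handler inspects AH to decide which service was requested.
        if regs.ah == self.OPEN_FILE and regs.dx in self.files:
            regs.ax, regs.carry = self.next_handle, False
            self.next_handle += 1
        elif regs.ah == self.OPEN_FILE:
            regs.ax, regs.carry = 2, True  # error 2: file not found
        else:
            regs.ax, regs.carry = 1, True  # error 1: invalid function
        return regs


def open_file(dos, name, mode=0):
    """Library-style wrapper that hides the register setup."""
    regs = dos.int21(Registers(ah=ToyDos.OPEN_FILE, al=mode, dx=name))
    if regs.carry:
        raise OSError(regs.ax, "open failed")
    return regs.ax
```

A caller using the wrapper writes open_file(dos, "CONFIG.SYS") and gets back a handle or an exception. A caller using the raw interface has to fill in the registers and then check the carry flag by hand.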

With Windows 1.0, the number of APIs greatly increased. If I recall correctly, Windows added about 300 APIs, covering its “kernel”, windowing and graphics features. This number would eventually grow to 3000 in the Windows NT days.

When OS/2 arrived, something else had arrived with it: the local-area network. Networks had been available for some time (Novell Netware, Windows-for-Workgroups, MS DOS LANMAN). OS/2, however, introduced the concept of an operating system that was fundamentally network aware and could provide network services. Unlike Netware, OS/2 was a general-purpose operating system that could behave as a centralized server, too. OS/2 enabled the development of client-server applications.

The presence of networks and of network services necessitated the development of remote management APIs. For example, when a user typed:

net view \\fileserve1

the net utility had to be able to “talk” to fileserve1 and “ask it” what resources it was sharing. At the API level, the operating system needed to provide a NetShareEnum function that included a servername parameter indicating which server was being queried.
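The shape of such a call can be sketched as follows. This is not the real NetShareEnum signature, which takes more parameters. It is a minimal illustration of one function serving both local and remote queries through a servername argument. The NETWORK table, the share names, and the net_share_enum and ServerNotFound names are all assumptions made up for the example.

```python
# Hypothetical in-memory "network": server name -> shares it exports.
# The share names are invented for illustration.
NETWORK = {
    "localhost": ["C$", "IPC$"],
    "fileserve1": ["public", "printers", "IPC$"],
}


class ServerNotFound(Exception):
    """Raised when the named server is not in the toy network."""


def net_share_enum(servername=None):
    """List the shares on servername; None means the local machine."""
    # Accept UNC-style names such as \\fileserve1 by dropping the backslashes.
    host = "localhost" if servername is None else servername.lstrip("\\")
    try:
        return list(NETWORK[host])
    except KeyError:
        raise ServerNotFound(host) from None
```

The point of the design is that the caller uses the same function either way: net_share_enum() queries the local machine, and net_share_enum(r"\\fileserve1") queries the remote one.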

How did OS/2 “talk” to servers? Did it have some special protocol talking on some dedicated port just for this purpose? No. OS/2 was built on top of a basic RPC (remote procedure call) protocol that worked “on top” of named-pipes.

Now, I can’t provide all the details and history about how RPC progressed from OS/2 to today. My colleague at Likewise Software, Krishna Ganugapati, however, can. I have a link to his blog on my blogroll.

In Windows NT (later, Windows 2000, Windows XP and now Windows Server and Vista), the RPC mechanism became very formalized. Just about every OS API was written atop RPC. You would define your API using IDL (Interface Definition Language) and compile it with the MIDL compiler. The compiler would generate client-side and server-side stubs that completely hid the details of RPC. You didn’t have to worry about marshaling arguments or about handling the transport layer. The RPC run-time would take care of communications over TCP/IP or over named pipes (maybe even NetBIOS, too).
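The division of labor between stubs and transport can be sketched in a few lines. This is an illustration of the general idea only, not of what MIDL actually generates: real stubs marshal arguments in NDR format, whereas this sketch uses JSON as a stand-in, and the add procedure, the dispatch table, and the loopback transport are all hypothetical.

```python
import json


# What an interface definition would describe (hypothetical interface).
def add_impl(a, b):
    return a + b


PROCEDURES = {"add": add_impl}  # server-side dispatch table


def server_dispatch(request_bytes):
    """Server stub: unmarshal the request, call the procedure, marshal the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


def loopback_transport(request_bytes):
    """Stands in for TCP/IP or a named pipe; here the server is in-process."""
    return server_dispatch(request_bytes)


def make_client_stub(proc, transport):
    """Client stub factory: the result looks like a normal function to the caller."""
    def stub(*args):
        request = json.dumps({"proc": proc, "args": list(args)}).encode("utf-8")
        reply = transport(request)
        return json.loads(reply.decode("utf-8"))["result"]
    return stub


add = make_client_stub("add", loopback_transport)
```

Calling add(2, 3) looks like an ordinary function call. The marshaling and the choice of transport are invisible to the caller, so the transport can be swapped without touching the calling code.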

Local calls would skip the networking layers, but the RPC mechanism would also serve to marshal between user-level and kernel-level code.

The key lesson here is that Windows NT was built from the ground up with an RPC-based, distributed-system architecture. UNIX was not (although Sun was an early pioneer in RPC technology and is still a key contributor). Linux was not. Neither was Mac OS X/FreeBSD.

Admittedly, RPC libraries are available for these platforms (albeit in a neglected state; we’ve made several fixes and improvements to them). The operating systems themselves, however, are not designed from a distributed perspective.

Thus, it’s not surprising that management APIs, especially remote management APIs, do not exist for these platforms. The respective vendors are all working on WS-Management, WBEM or other protocols to help with the situation, but they are still, fundamentally, in a weaker position than the Windows NT progeny.

At Likewise, we develop interoperability software that allows UNIX, Linux and Mac OS X computers to work well on Windows networks. In order to provide this software, we have to work with services on Windows that expect to be invoked over RPC. Although we “hand-rolled” these RPCs at first, our most recent software is based on a full DCE RPC implementation. This has greatly facilitated our work and is allowing us to provide further interoperability features between Windows and non-Windows systems. Sometime soon, we will be releasing this software with both our open source and proprietary software products.