Wednesday, January 3, 2024

History and classification of the Unix-like OS family

(this is a slightly edited transcript of a YouTube video)


The Unix-like family is by far the largest OS family, with hundreds of distinct systems across multiple subfamilies released over its 50-plus-year history. It is also by far the most influential, with most non-Unix-like OSes borrowing at least a few Unix concepts.

History

Research Unix

The history of Unix starts in 1969 with an initially unnamed OS for the PDP-7 written at AT&T Bell Labs by Ken Thompson and Dennis Ritchie. AT&T had been in a partnership with MIT and GE on the Multics project, which was one of the first projects to develop a commercial OS for interactive timesharing, but had recently withdrawn because they were dissatisfied with the progress of the project. Multics would later go on to experience modest commercial success, but was never especially popular. Thompson and Ritchie had done a fair bit of work on Multics and decided they wanted to write their own OS after AT&T pulled out. 

Multics 12.8 (2023); Honeywell DPS-8M (SIMH, dps8m fork)

The name Unix, a pun on Multics, was suggested a bit later on, since the system borrowed several features from Multics but was much simpler. This version, which could be called version 0, was extremely primitive and lacked many features that are universal in later Unix versions, such as pathnames. 

AT&T Research Unix V0 (1970?); DEC PDP-7 (SIMH)

In 1970, Unix was reimplemented for the PDP-11 as version 1, at which point it gained pathnames and several other features now taken for granted. 

AT&T Research Unix V1* (1970-72); DEC PDP-11/20 (SIMH)

*actually not pure V1; this is a reconstruction with a V1 kernel and user programs that are closer to V2

In 1973, Unix was mostly rewritten in C for version 4 (the previous versions had all been written in assembly). These early versions of Unix are all considered to be Research Unix. The most widely distributed versions of Research Unix were versions 5 through 7, with everything before and after barely being distributed outside AT&T.

AT&T Research Unix V5 (1974); DEC PDP-11/45 (SIMH)

AT&T Research Unix V7 (1979); DEC PDP-11/45 (SIMH)
 

Side branches take over

In the late 70s, Unix development split into a few branches. Research Unix development continued until version 10 in 1989, with the post-V7 versions focusing on extensibility and networking. Unfortunately, since they were barely distributed externally, Research Unix versions after V7 had relatively little direct influence on anything else. The side branches and clones that ended up becoming the mainstream of Unix added networking support in much uglier ways and they barely implemented any of the extensibility features.

AT&T Research Unix V9 (1986); Sun-3 (TME)



By the early 80s, Unix had started to gain significant popularity due to a number of factors, with the main ones being its high degree of portability, the relatively clean and powerful yet practical architecture, and the availability of an inexpensive source license. Due to their near total monopoly on telecom in the US at the time, AT&T had been under a consent decree that among other things prevented them from commercializing anything computing-related, which kept the cost of Unix source licenses quite low. Even Microsoft had their own Unix called XENIX, which was one of the most popular Unices in the early 80s. 

TRS-XENIX 1.3.5 (1984); TRS-80 Model 16 (trs80gp)


However, this situation wouldn't last long, since AT&T was forced to divest their local phone service operations in 1984, resulting in the lifting of the consent decree. Almost immediately after this, they started selling their own line of computers running Unix and significantly increased the price of Unix source licenses. There was little immediate effect on the popularity of Unix, since hardware that could run it well was still rather expensive and the OS license was still cheap relative to the hardware. However, the higher license prices would later limit Unix adoption once the cost of capable hardware decreased.

For example, compare the prices of the low-end Sun 3/50 machines to SunOS...

 
with the price of mid-range Micron PCs and UnixWare Personal Edition a bit later on

It was also probably a major factor in Microsoft abandoning their plans to eventually make XENIX their primary OS around the time of the AT&T divestiture.

Standardization efforts and the Unix wars

The still-growing popularity of Unix in the 80s led to a proliferation of different variants with varying degrees of incompatibility with each other. By the mid-80s, there was a significant push to standardize Unix APIs and commands at the functional level as well as produce a common code base for vendors to build on as part of a larger movement for open standards. The functional standardization efforts would result in the first version of the X/Open Portability Guide in 1985 and the first version of the POSIX standard in 1988. Whereas the XPG standardized a relatively complete Unix environment, POSIX covered only a subset and was intended to be easy to implement on Unix-like OSes as well as compatibility layers on other systems.

Unfortunately, the efforts to build a unified reference code base for Unix resulted in what were known as the Unix Wars, with two groups competing to build their own merged system. The first was a partnership between AT&T and Sun. Other Unix vendors feared that partnering with AT&T would give Sun an unfair advantage, so many of them formed the Open Software Foundation in order to create their own merged Unix. AT&T and Sun would then expand their partnership into their own consortium called Unix International to oppose the OSF. This division into opposing factions was another factor besides the high cost of source licenses that served to limit the popularity of Unix and later on allow Windows to start making significant inroads into the previously Unix-dominated workstation and server market segments in the 90s. Unix International would eventually go on to merge with the OSF in 1994, although much of the damage had already been done by then.

AT&T sold the Unix copyrights and trademarks to Novell in 1993, having partially spun off Unix development into a semi-independent subsidiary in 1989, and Novell would transfer the trademark but not the copyrights to X/Open a few years later. X/Open and the OSF would merge to form the Open Group, who still own the Unix trademark to this day.

Free and open source Unices 

The beginnings of Linux and GNU

Around the end of the Unix wars, the first fully open-source Unices were released. The first effort to make a completely free and open-source Unix-like OS, specifically the GNU project, had actually started in the mid 80s, but had only completed the C compiler and library, the shell, and the non-kernel-specific utilities by the early 90s. In 1991 a college student named Linus Torvalds would start work on his own Unix-like kernel, Linux. This would be combined with the completed parts of GNU along with a collection of new Linux-specific utilities to produce a complete system. 

Linux 0.01 (1991); x86 PC (QEMU)


Linux 0.11 (1991); x86 PC (QEMU)


Yggdrasil Linux/GNU/X 1992 alpha (one of the first full-featured Linux distributions); x86 PC (86Box)

Linux takes over

Linus hadn't originally intended Linux to be anything more than a small hobby project, but it was the most easily available free Unix-like OS. By the late 90s it was relatively mature and had started to receive attention from various companies, and it went on to take over the majority of the Unix-like OS market by the late 2000s.

Red Hat Linux 6.0 (1999); x86 PC (QEMU)

By this point most proprietary Unices had been in a long, slow decline for a while. The main factors were the advance of commodity PC hardware, which was cheaper than the proprietary platforms that most proprietary Unices required, and Microsoft's aggressive efforts to market Windows. The dot-com crash of the early 2000s also contributed. Were it not for Linux, we might have ended up in a world where macOS was the only Unix-like OS not relegated to niche uses and legacy systems.

Free/open source BSDs

The other major effort to develop an open-source Unix OS around this time was 386BSD, which unlike Linux can trace its lineage to Research Unix but had removed nearly all Research Unix code.

386BSD 1.0 (1994); x86 PC (QEMU)
 

386BSD itself would discontinue development in the mid 90s, but a few other projects like NetBSD and FreeBSD forked it and continued development. 

NetBSD 0.9 (1993); x86 PC (QEMU)


FreeBSD 1.0 (1993); x86 PC (86Box)

However, these systems were never quite as successful as Linux even though they remain active to this day. A lawsuit from AT&T over remaining Research Unix code in the early 90s had cast a shadow on BSD-based OSes and slowed down their adoption significantly. The parties would later reach a settlement that allowed BSD development to continue as long as any remaining Research Unix code was removed, but the BSDs would never quite catch up with Linux.

A decade after AT&T's lawsuit over BSD, Linux would be hit with its own series of lawsuits from the SCO Group, who had acquired the rights to administer Unix licensing from Novell. These were over alleged copying of System V Unix code into Linux by IBM and others. By the time these lawsuits were filed, Linux was already in quite widespread use, and the suits drew extensive press coverage as well as fears that Linux would be significantly impacted. However, no System V code besides one trivial unused section was ever found and all of these suits would eventually be resolved against SCO, with relatively little effect on Linux's popularity.

Classification

Genetic Unix 

Unix-like OSes can be divided into several nested families. The innermost is genetic Unix, which includes the original Research Unix and all OSes directly descended from it.


Genetic Unix can be divided into three major branches, with Research being the first. The second is USG Unix, more commonly known as System III and System V, which replaced Research as the focus of Unix development at AT&T in the early 80s. Various divisions of AT&T outside Bell Labs had their own forks of Research Unix in the mid 70s, and USG Unix was the AT&T Unix Support Group's effort to merge these into a single system.

AT&T 3B2 System V R3.2 V2 (1988); AT&T 3B2/400 (SIMH)
 

AIX, HP-UX, IRIX, and Solaris are a few notable examples of USG-based Unices.

AIX 4.3.3 (1999); IBM RS/6000 40p (QEMU)

HP-UX 9.10 (1996); HP 9000/370 (MAME)

IRIX 6.5.22m (2003); SGI Indy (MAME)

Solaris 9 (2002); Sun SPARCstation 5 (QEMU)

BSD, the Berkeley Software Distribution, is the third major branch, which originated as a series of enhanced utilities and libraries by the University of California at Berkeley for Version 6 Unix in 1978, eventually being fully split from Research Unix a year later.

4.2BSD (1983); DEC
 

Later versions of BSD gradually replaced all code from Research Unix with original implementations, with all AT&T code being removed by the early 90s. The most popular BSD-based OS by far is Darwin, the basis for all of Apple's current OSes like macOS and iOS. 

Mac OS X 10.3 (2003); Power Mac G4 (QEMU)

iPhone OS 1.1 (2008); iPod Touch (QEMU)

Other BSDs include the open-source ones like FreeBSD and NetBSD as well as older proprietary ones like SunOS.

NetBSD 3.0 (2005); x86 PC (86Box)

SunOS 4.1.4 (1994); Sun SPARCstation 2 (TME)

Versions 8-10 of Research Unix incorporated some of the enhancements from the side branches, especially BSD, but also had a fair bit of original code.

There's also a much smaller branch of Genetic Unix called OSF/1, which was the OSF's effort to merge BSD and System V to produce a standard system during the Unix wars, but the only vendor to deploy it widely was DEC. 

DEC OSF/1 2.0 (1992); DECstation 5000 (GXemul)
 

Despite the OSF having more members, nobody besides DEC fully re-based their existing OS on OSF/1, although quite a few did incorporate components from it. On the other hand, Unix International's merged Unix, System V Release 4, would go on to be deployed fairly widely by multiple companies.

Conventional functional Unix

Moving outwards, the next subfamily of Unix-like OSes is conventional functional Unix. This includes genetic Unix as well as all OSes that don't include genetic Unix code but have directly cloned its architecture. 

The most well-known example of such a system by far is Linux, but there have been quite a few others, such as Coherent, LynxOS, and SerenityOS. The first such clone Unix of which I am aware is Idris from the mid-70s.

Coherent 4.2.14 (1994); x86 PC (PCem)

LynxOS 4.0 (2002); x86 PC (86Box)

SerenityOS 1.0 (2023); x86 PC (QEMU)

CoIdris 2.37 (1985); x86 PC (QEMU)


Functional Unix

The next subfamily out from conventional Unix is functional Unix in general. In addition to conventional Unix, this includes OSes that have significantly different architectures, but still have a primary environment that is highly compatible with conventional Unix in terms of API and shell commands. 

Usually when I talk about "Unix" without any qualifiers, I tend to mean functional Unix, since this is usually the most relevant subfamily when it comes to use and application development. Notable examples of such systems include QNX 4 and later, Minix, and GNU/Hurd.

QNX Neutrino 6.1.0 (2001); x86 PC (QEMU)

Minix 3.4.0rc6 (2017); x86 PC (QEMU)

Debian GNU/Hurd 20221029; x86 PC (QEMU)

BeOS and Haiku could also arguably be placed here but they would be much closer to the outer edge than the inner one.

Haiku R1 Beta 4 (2022); x86 PC (QEMU)

Unix-like systems in general

The outermost subfamily is Unix-like systems in general. In addition to functional Unix, this includes systems that have significant incompatibilities with functional Unix, but still have a strong architectural resemblance to Unix. 


 

The first such system of which I am aware is Thoth, which like Idris is from the mid-70s. The most notable example of such a system is probably Plan 9, written by the authors of Research Unix as a sort of successor, but ultimately held back by several factors.

Plan 9 4th Edition (2004); x86 PC (QEMU)

Other examples of such OSes are QNX 2 and earlier, Microware OS-9 (not to be confused with the completely unrelated and non-Unix-like Mac OS 9), Amoeba, and Domain/OS.

QNX 2.21 (1989); x86 PC (QEMU)

OS-9/x86 6.1 (2017); x86 PC (QEMU)

FSD-Amoeba 2002a; x86 PC (QEMU)

Domain/OS SR10.4.1 (1992); Apollo DN3500 (MAME)
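The nesting of these subfamilies can be sketched as data. This is only an illustration in Python: the names `TIERS`, `INNERMOST_TIER`, `innermost_family`, and `belongs_to` are made up for this sketch, and the table just records where some of the systems mentioned in this post sit, rather than being any kind of authoritative registry.

```python
# Subfamilies from innermost to outermost; each one contains all the tiers
# listed before it.
TIERS = [
    "genetic Unix",
    "conventional functional Unix",
    "functional Unix",
    "Unix-like",
]

# Innermost subfamily for a handful of the systems named in this post.
INNERMOST_TIER = {
    "Research Unix": "genetic Unix",
    "Solaris": "genetic Unix",
    "FreeBSD": "genetic Unix",
    "Linux": "conventional functional Unix",
    "Coherent": "conventional functional Unix",
    "Minix": "functional Unix",
    "GNU/Hurd": "functional Unix",
    "Plan 9": "Unix-like",
    "Domain/OS": "Unix-like",
}


def innermost_family(system):
    """Return the most specific subfamily a system belongs to."""
    return INNERMOST_TIER[system]


def belongs_to(system, family):
    """True if the system is inside the given subfamily (nesting included)."""
    return TIERS.index(INNERMOST_TIER[system]) <= TIERS.index(family)
```

For example, `belongs_to("Linux", "functional Unix")` is true because conventional functional Unix is nested inside functional Unix, while `belongs_to("Plan 9", "functional Unix")` is false.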

Trademark Unix

There is also another category of Unix-related systems, specifically trademark Unix, which is any OS where the vendor has paid the Open Group to certify that it passes some version of their test suite. 

However, this can apply to compatibility layers on otherwise non-Unix-like OSes, so in my opinion it isn't quite a true subfamily, since certification is the only thing defining it rather than code base or basic architecture. This category has been in decline as of late, with several vendors letting the certification of their OSes lapse (see the list of Open Group-certified Unix implementations in 2019, 2021, and 2023). This seems to be mostly due to the dominance of Linux, of which almost all distributions lack trademark Unix certification.

Design philosophy and architectural features

Some of the general tenets of the Unix design philosophy include: Write programs that do one thing and do it well, and write them to work together. Related to this, keep primitives as simple as possible while remaining sufficiently general. Use text streams wherever possible. Avoid verbose output unless specifically requested, but also try to make failures easy to diagnose. Separate mechanism and policy as much as possible.
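As a rough illustration of the first few tenets, here is a minimal sketch of a filter in Python (chosen only for brevity). The names `squeeze_blank_lines` and `main` and the `-v` flag are hypothetical and not taken from any real utility. The filter does one job, reads and writes plain text streams so it can sit in the middle of a pipeline, stays quiet unless asked, and sends diagnostics to a separate error stream.

```python
def squeeze_blank_lines(src, dst):
    """Copy text from src to dst, collapsing runs of blank lines into one.

    Returns the number of lines dropped so a caller can report it on request.
    """
    dropped = 0
    previous_blank = False
    for line in src:
        blank = line.strip() == ""
        if blank and previous_blank:
            dropped += 1          # second or later blank line in a run: skip it
            continue
        dst.write(line)
        previous_blank = blank
    return dropped


def main(argv, stdin, stdout, stderr):
    """Run the filter; returns an exit status (0 on success, 1 on failure)."""
    verbose = "-v" in argv[1:]    # quiet by default, chatty only when asked
    try:
        dropped = squeeze_blank_lines(stdin, stdout)
    except OSError as err:
        # Diagnostics go to stderr so they never pollute the data stream.
        stderr.write(f"{argv[0]}: {err}\n")
        return 1
    if verbose:
        stderr.write(f"{argv[0]}: dropped {dropped} blank line(s)\n")
    return 0
```

A real program would call something like `sys.exit(main(sys.argv, sys.stdin, sys.stdout, sys.stderr))`; passing the streams in explicitly just makes the sketch easy to exercise.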

Architectural features common to basically all functional Unices include hierarchical nested subdirectories with a single root directory and filesystems that can be mounted on arbitrary subdirectories, a purely stream-based file model that leaves structure to user programs, devices presented as special files, preemptive multitasking with low process spawn overhead, multiple user-mode shells with support for arbitrary programs as commands, redirection of command input and output to arbitrary files, and pipes for connecting input and output of commands.

4.3BSD-Reno (1990); MicroVAX 3900 (SIMH); showing directories and mount points

4.3BSD-Reno; device files and background processes

4.3BSD-Reno; set command under the default interactive shell, csh

4.3BSD-Reno; set command under sh (the default shell for scripts), redirection, and pipes
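To show how little machinery pipes and redirection need, here is a rough sketch of what a shell does for something like `echo hello | tr a-z A-Z`, written in Python using the underlying system calls. The function name `run_pipeline` is made up for this sketch, it assumes a Unix-like host with `echo` and `tr` installed, and a real shell handles much more (job control, signals, exit statuses, and so on).

```python
import os


def run_pipeline(producer, consumer):
    """Run `producer | consumer` and return the consumer's output as bytes."""
    link_r, link_w = os.pipe()    # connects producer stdout to consumer stdin
    out_r, out_w = os.pipe()      # lets this process capture consumer stdout
    all_fds = (link_r, link_w, out_r, out_w)

    pid1 = os.fork()
    if pid1 == 0:                 # first child: the producer
        try:
            os.dup2(link_w, 1)    # redirect stdout into the pipe
            for fd in all_fds:
                os.close(fd)
            os.execvp(producer[0], producer)
        finally:
            os._exit(127)         # only reached if the exec failed

    pid2 = os.fork()
    if pid2 == 0:                 # second child: the consumer
        try:
            os.dup2(link_r, 0)    # redirect stdin to read from the pipe
            os.dup2(out_w, 1)     # redirect stdout to the capture pipe
            for fd in all_fds:
                os.close(fd)
            os.execvp(consumer[0], consumer)
        finally:
            os._exit(127)

    # Parent: close every end it doesn't need, or the readers never see EOF.
    for fd in (link_r, link_w, out_w):
        os.close(fd)
    chunks = []
    while True:
        chunk = os.read(out_r, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(out_r)
    os.waitpid(pid1, 0)
    os.waitpid(pid2, 0)
    return b"".join(chunks)
```

Neither command knows it is part of a pipeline; each one just reads file descriptor 0 and writes file descriptor 1, which is what makes arbitrary programs usable as commands. A call such as `run_pipeline(["echo", "hello"], ["tr", "a-z", "A-Z"])` should return the upper-cased text.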

Most of the Unix-like OSes outside the functional Unix subfamily include most of these features as well although a few are missing one here and there. 

QNX 2.21, showing DOS-like filesystem model with drive numbers and reserved names for devices

In addition to the features common to functional Unix in general, conventional functional Unices share a monolithic kernel architecture, device files that are identified by major and minor number and stored on a disk or some other backing store, and a security model based on a root account that bypasses filesystem permissions and various restrictions on non-file-related system calls. Some conventional Unices have significantly modified security models, often as an optional feature. Any of these features may also be found in Unix-like OSes outside the conventional functional Unix subfamily, but that isn't always the case.

4.3BSD-Reno; kernel strings showing disk drivers built into the kernel

4.3BSD-Reno; on-disk device files identified by numbers; attempting to mount a disk as a regular user and as root
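On a current conventional Unix such as Linux, the same information can be inspected from a program. The following is a minimal sketch in Python using the standard `os` and `stat` modules; the helper names `describe_device` and `is_superuser` are made up for this example, and the actual numbers reported differ from system to system.

```python
import os
import stat


def describe_device(path):
    """Return (kind, major, minor) for a device file, or None for anything else."""
    info = os.stat(path)
    if stat.S_ISCHR(info.st_mode):
        kind = "char"
    elif stat.S_ISBLK(info.st_mode):
        kind = "block"
    else:
        return None
    # st_rdev packs the driver (major) and unit (minor) numbers together.
    return kind, os.major(info.st_rdev), os.minor(info.st_rdev)


def is_superuser():
    """True when the effective user ID is 0, i.e. the process runs as root."""
    return os.geteuid() == 0
```

Calling `describe_device("/dev/null")` should report a character device, while an ordinary directory such as `/` yields `None`. Privileged operations like mounting a disk typically succeed only when `is_superuser()` is true, although systems with modified security models may behave differently.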


Each branch of genetic Unix has its own typical features, such as particular filesystems, networking APIs, and command options, although there has been a lot of cross-pollination, with features that originated in one branch being fairly common in later versions of the others.

e.g. the ps command has different options under a BSD Unix like 4.3-Reno...

than it does under a USG Unix like AT&T System V R2.2

The majority of Unix-like OSes are written in C, although a few are written in assembly, and some are written in other languages. C is virtually always well-supported for user programs except on the early Research Unix versions that predate C.

Softlanding Linux System 1.0 (1992); x86 PC (86Box); showing C kernel source

AT&T Unix V1; showing PDP-11 assembly kernel source

Redox 0.6.0 (2023); x86 PC (QEMU); showing Rust kernel source


Outside the Unix-like family, the majority of OSes that have experienced active development in the mid-80s and later have some form of Unix influence. This ranges from a few borrowed architectural concepts to complete compatibility layers. By far the most common borrowed Unix feature is hierarchical subdirectories, which actually predate Unix but only became popular after Unix did. DOS and Windows borrowed pipes and redirection, as well as a rather hacky and limited implementation of device files, among other features.

MS-DOS 6.22 (1994); device files, redirection, pipes, subdirectories

MS-DOS 6.22; pipes, subdirectories

OSes with available Unix compatibility layers include Windows, later versions of OpenVMS, and later versions of MVS like OS/390 and z/OS. The MVS compatibility layer even has trademark Unix certification.

Windows XP (2001) with Services for Unix 3.5 (2004); x86 PC (QEMU)

OpenVMS 6.2 (1995); MicroVAX 3900 (SIMH)

OS/390 2.10 (2000); IBM S/390

Conclusions

To conclude, I think the success of the Unix-like family is in part due to the Unix architecture being relatively clean and quite powerful while also being quite practical and applicable to a broad range of use cases, with source availability and portability to a wide range of hardware also being significant factors. This is quite unlike some other OS families, like the DOS and Windows family which has succeeded more due to extremely aggressive marketing than technical merit, although NT- and CE-based versions of Windows are actually fairly architecturally competent at the kernel level at least. Of all the various OS architectures that have been designed over the years, I'd say the functional Unix architecture probably offers the best balance between architectural purity and practicality.

That being said, I do think that the side branches and clones that took over the mainstream Unix world in the late 70s have significantly lost sight of the original ideals of Research Unix in many ways. Many of the problems with modern Unices might be less significant had the extensibility features of Version 8-10 Research Unix been more widely adopted. The original Unix developers did release Plan 9 as a successor to late Research Unix that expanded on those extensibility features, but it never really served as a successor to Unix in general due to a number of factors, with the main ones being incompatibility with functional Unix, excessive minimalism in many places, and the original licenses under which it was released being rather restrictive. To be fair, I don't think they were really trying to make a general successor to Unix though.

I do think that it is possible to write an OS that fixes the vast majority of the issues with existing conventional Unices while still retaining compatibility with the majority of software for them. Hopefully my own OS will be at least somewhat successful as a modern Unix-like system for general use that actually tries to follow the Unix philosophy as much as possible.
