The Linux Guide Online

Chapter 01 - Introduction and History of Linux

1.1 Introduction

1.1.1 Operating System

Without software, a computer is basically a lump of plastic and metal. It could not store or process information, nor respond to the user; it would, in effect, be dead. Software is thus the most important component of a computer system, and the primary and most fundamental piece of software is the Operating System (OS), which has complete control over all of the system's resources and forms the base on which all applications execute.

The modern computer is a complex combination of highly varied hardware. A machine may contain processors, memory, storage devices and I/O devices, and each of these comes in many types; the I/O devices alone include keyboards, monitors, mice, scanners, printers, cameras, mobile devices and more. To complicate things further, a number of manufacturers make these devices, and each device may have its own means of access.

Imagine a situation without an OS, and imagine you were writing a program to add two numbers and print the sum. You would have to include code to detect what hardware is present on the user's system. You would then need functions to access every keyboard available on the market, and to convert the signals from the keyboard into meaningful information. You would have to carry out the addition while managing the movement of data between memory and the processor's registers. Finally, you would have to detect and initialize the user's monitor and present the output. Your program might well run to 200 MB, and it still would not handle future makes and models of keyboards and monitors.

Though the above scenario is simplistic (and not entirely accurate), it suffices to emphasize the importance of the OS. The Operating System is the first software layer above the hardware, and it manages all parts of the hardware by itself. It then presents the user, and every application, with a virtual machine: an interface that does not change with the underlying hardware. This uniformity of environment is perhaps the biggest reason the OS evolved.

Consider your program again. Now all you have to do is ask the OS to get two numbers from the user. You add them up, with all memory management done by the OS, and then ask the OS to print the result. You do not care about the underlying hardware or the method by which the OS accomplishes your requests. Moreover, the OS handles future additions to the hardware without you having to bother about them.
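
As a rough illustration, here is that program in C (a minimal sketch; scanf and printf belong to the standard C library, which in turn makes requests of the OS, so not a single line of hardware-specific code appears):

    #include <stdio.h>

    int main(void)
    {
        int a, b;

        /* The OS and the C library handle the keyboard, the screen
           and all the data movement between them for us. */
        if (scanf("%d %d", &a, &b) != 2) {
            fprintf(stderr, "expected two integers\n");
            return 1;
        }
        printf("%d\n", a + b);
        return 0;
    }

Compare this with the 200 MB monster above: the difference is precisely the work the OS does on the program's behalf.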

The above example involves only two relatively simple (!?) pieces of hardware. Think for a moment of the complexity involved in, say, accessing a floppy: starting the drive motor through the controller, checking its status, moving the read-write head and positioning it over the correct sector, reading the signal streams. The OS makes all this easy for the application by providing an interface. It allows you, for example, to read data as files, so the program achieves everything it needs in a single call to the OS. Through the concept of files, and operations like reading from and writing to them, the OS abstracts and insulates the program from the hardware. The OS thus creates a virtual or extended machine with its own set of different (and much simpler) rules, which can be programmed far more easily to control the underlying hardware indirectly.
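
In C, the file abstraction looks like this (a sketch; /etc/hostname is just an assumed example file). The same open() and read() calls would work whether the file lived on a floppy, a hard disk or a CD-ROM:

    #include <fcntl.h>      /* open() */
    #include <unistd.h>     /* read(), write(), close() */

    int main(void)
    {
        char buf[512];
        ssize_t n;

        /* one call to the OS; motors, heads and sectors are
           the kernel's problem, not ours */
        int fd = open("/etc/hostname", O_RDONLY);
        if (fd < 0)
            return 1;

        while ((n = read(fd, buf, sizeof buf)) > 0)
            write(STDOUT_FILENO, buf, n);   /* copy it to the screen */

        close(fd);
        return 0;
    }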

1.1.2 Multi-user and Multitasking

Modern OSes are more than just an abstraction interface; that is merely the user's point of view. Modern machines allow more than one user to use the machine simultaneously, and allow each user to run more than one application at a time. In this environment the OS acts as a resource manager. It is the arbitrator that decides who gets to use which device. It allocates the critical resources of processor time and memory among executing programs. It decides who controls I/O devices like the keyboard and the mouse. It provides each program with its own virtual environment in which to reside and execute. And in the worst of cases it terminates programs that do not follow the rules of this abstracted world, in the interest of the stability and well-being of the other programs. To sum up, the OS keeps track of the various resources, accounts for their usage and arbitrates between conflicting programs.

A few terms here would be helpful. Multitasking refers to the Operating System running more than one task (program) at the same time. Modern OSes do this by running multiple processes and dividing the resources among them. Processes are not allowed to manipulate physical memory directly; instead each refers to its own virtual memory space. Through virtual memory, a program can not only access data as if it were the only process running on the computer (other than the OS, of course), it can also address more memory than is physically present in the system. The OS does all the dirty work of managing the actual memory and mapping manipulations of virtual memory onto physical operations.
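
A small sketch makes the separation visible: after a fork() the parent and the child see the very same virtual address, yet each holds a different value there, because the two virtual address spaces are mapped to different physical memory:

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int value = 10;                 /* lives at some virtual address */

    int main(void)
    {
        if (fork() == 0) {          /* child process */
            value = 20;             /* changes only its own copy */
            printf("child : addr=%p value=%d\n", (void *)&value, value);
            return 0;
        }
        wait(NULL);                 /* parent waits for the child */
        printf("parent: addr=%p value=%d\n", (void *)&value, value);
        return 0;
    }

Both lines print the same address but different values; the OS has quietly mapped one virtual address to two physical locations.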

Processor time, too, has to be divided by the OS among the various processes. The OS splits processor time into what are known as time slices, and each program is allocated a slice in which to execute. When a program's time is up, its state is saved as-is and another program is loaded for execution. When, a few turns later, it is again the first program's turn, its state is reloaded into the processor and execution continues from the point where it left off, as if the break never happened. This happens so quickly (hundreds of times a second) that the user is not aware of it at all.
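
Two CPU-bound processes show the effect (a sketch: on a single processor their output interleaves even though neither ever yields, because the kernel preempts each one when its slice expires; on a multi-processor machine they simply run in parallel):

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    static void count(const char *name)
    {
        /* a purely CPU-bound loop that never yields voluntarily */
        for (long i = 1; i <= 3; i++) {
            for (volatile long j = 0; j < 100000000; j++)
                ;                         /* burn some CPU time */
            printf("%s finished chunk %ld\n", name, i);
        }
    }

    int main(void)
    {
        if (fork() == 0) { count("A"); return 0; }
        if (fork() == 0) { count("B"); return 0; }
        wait(NULL);
        wait(NULL);
        return 0;
    }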

The Operating System also gives programs, through the extended machine, what are known as system calls. These are general-purpose functions exposed to the program that let it accomplish a task quickly. These device-independent calls give access to disks, printers and so on, and also provide process management, signalling between processes, time management and security.
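
A program can invoke some of these calls directly (a sketch using the POSIX calls getpid(), time() and write(); the same write() works whether file descriptor 1 is connected to a screen, a file or a printer):

    #include <stdio.h>      /* snprintf(), for formatting only */
    #include <time.h>       /* time() */
    #include <unistd.h>     /* write(), getpid() */

    int main(void)
    {
        char msg[64];
        int len = snprintf(msg, sizeof msg, "pid=%d time=%ld\n",
                           (int)getpid(), (long)time(NULL));

        /* a single device-independent request to the kernel */
        write(1, msg, len);
        return 0;
    }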

1.1.3 Unix

Unix is one of the oldest and probably one of the best Operating Systems around, and it formed the basis for the development of the OS as we know it. Unix was developed by Ken Thompson, Dennis Ritchie and others at AT&T Bell Labs, and UNIX is a trademark of The Open Group. Unix is a real Operating System, fulfilling the criteria of multi-user and multitasking operation, and it has been ported to many hardware platforms.

Having been developed and extended largely in academic institutions gives Unix a head start in incorporating the cutting edge of technology. Brought up with the network, Unix has networking built into it and is arguably the best when it comes to networking. And because it arrived at a time when 'personal' computers were undreamt of, it has security and privacy built in as well.

Among its downsides is the fact that it is costly (too costly for a desktop). Secondly, the massive proliferation of Unix gave birth to a number of variants (clones), and the differences between them became so widespread that application development was non-standard. Most Unix variants therefore later adopted a common standard and now adhere to the IEEE POSIX.1 standard.

1.2 What is Linux

Linus Torvalds, a Finnish developer, wanted a version of Unix for his computer at home. The freely available Unix clone, MINIX, he found inadequate for his needs. So he took the easy way out (which is what differentiates present Linux users from other OS users) and started to write one himself. He posted requests for help on Usenet, asking for information about POSIX for x86 systems. He developed the Linux kernel and released an early version, 0.02, in 1991. He then not only put the entire source code on the Internet but invited people to take it and develop it further. And what happened later is, as they say, history.

The unique method of cooperative work on Linux added its own strengths. People could borrow, share and modify each other's ideas, so time was not wasted reinventing the wheel. Work on the kernel and on the other applications continued in parallel, and Linux grew from strength to strength.

Strangely, this did not lead to the chaos during development that might normally be expected. The spirit that drove thousands to spend their free time building and modifying the OS also ensured that the work was done to a high standard. With Linux falling under the GNU General Public License (GPL), people all over the world carefully monitored every change. The same multitude also helped with testing, to such an extent that most bugs in the software were discovered during the initial phases.

And the myth that Linux is best left to geeks is no longer true. Though Linux was initially the product of people with very strong technical backgrounds, great pains are being taken to let even ordinary users harness its incredible power. With the X Window System, Linux boasts a full-fledged GUI, giving it much-needed user friendliness. Once the early learning is done, exploring Linux is easy and rewarding. This book is therefore aimed at building the initial base of information that will embark you on the great journey of exploring the fascinating world of Linux.

1.3 Advantages of Linux

1.3.1 Full-fledged OS

Make no mistake about it: Linux is good, and it compares well with the very best out there. Linux, which until a few years ago was restricted to academic institutions, is now not only a desktop alternative but also a viable corporate one. In fact, the Apache web server running on Linux probably accounts for the largest share of web servers on the Internet.

A large number of applications are currently available for Linux: databases, web servers, networking tools, office suites, windowing interfaces and, last but not least, games.

Virtually every utility one would expect of a standard UNIX implementation has been ported to Linux, including basic commands like ls, awk, tr, sed, bc and more. The familiar working environment of other UNIX systems is duplicated on Linux: all standard commands and utilities are included. Many text editors are available, including vi, ex, pico, jove and GNU Emacs, as well as variants like Lucid Emacs, which incorporates extensions for the X Window System, and joe. The text editor you are accustomed to has more than likely been ported to Linux.

Most of the basic Linux utilities are GNU software. GNU utilities support advanced features not found in the standard BSD and UNIX System V versions. For example, the vi clone elvis includes a structured macro language that differs from the original implementation. However, GNU utilities are intended to remain compatible with their BSD and System V counterparts, and many people consider the GNU versions superior to the originals.

From a technical point of view, Linux easily measures up to the specification of an OS given earlier. It is pre-emptively multitasking, it allows multiple users, and being a Unix clone means it has security built into the system. Networking, too, is one of Linux's strong points.

The Linux system is mostly compatible with several UNIX standards (inasmuch as UNIX has standards) at the source level, including IEEE POSIX.1, UNIX System V and Berkeley System Distribution UNIX. Linux was developed with source-code portability in mind, so commonly used features behave the same across platforms, and much of the free UNIX software available on the Internet and elsewhere compiles under Linux "right out of the box." In addition, all of the source code for the Linux system, including the kernel, device drivers, libraries, user programs and development tools, is freely distributable.

Other specific internal features of Linux include POSIX job control (used by shells like csh and bash), pseudoterminals (pty devices), and support for dynamically loadable national or customized keyboard drivers. Linux supports virtual consoles that let you switch between login sessions on the same system console; users of the screen program will find the Linux virtual console implementation familiar.

The kernel can emulate 387 FPU instructions, so systems without a math coprocessor can run programs that require floating-point math.

1.3.2 Free for use

Many of Linux's components are covered by the GNU General Public License (GPL). This is a unique and very powerful method of licensing that allows software to be modified without infringing upon the rights of the original author.

What this means to the user is that, since everyone contributed to it, Linux is not owned by any single party. Linus Torvalds holds the copyright on some parts of it, and a number of other people hold copyrights, under the GPL, on other parts. And since the license grants these freedoms, Linux can be installed, copied and distributed freely, any number of times; under these terms the concept of piracy simply does not exist for Linux.

This naturally leads to the question, 'Where is the money if all this is free?' The GPL does not prevent charging for copies, but because everyone who receives the software may redistribute it freely, revenue comes chiefly from accompanying items such as support, packaging and distributions. This approach is called 'copyleft', as opposed to 'copyright'. It has spawned a number of Linux distributions, but one should keep in mind that, no matter what the distribution, the software is still Linux: the kernel is one of the official versions approved by Linus Torvalds, though the accompanying applications and installation may vary.

One additional point in favor of Linux is that the whole of its source code comes with the Operating System; that is, the actual program text that is compiled to produce the kernel and the other modules. This means you can use the source not only to customize your copy of Linux (say, for your specific hardware) but also to extend the operation of the OS itself. Linux kernel hacking is a fascinating subject of study, and a lot can be learnt from reading the code of well-written software like the kernel.

1.3.3 Stable and robust

One thing Linux offers that many commercial Operating Systems cannot is unmatched stability and resistance to crashes. The core of Linux is the kernel, a very well-written piece of code that has been tested thousands of times and is nearly bug-free. The kernel has complete control over the system's resources, and all processes are directly under its control. This allows the kernel to regain control of the machine not only when a program misbehaves but also when an application crashes. No matter what happens, it is always possible to ask the kernel to kill the errant process.
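
From a program's point of view it looks like this (a sketch): the parent below asks the kernel, via the kill() system call, to remove a runaway child. SIGKILL cannot be caught or ignored, so even a process stuck in an infinite loop is removed:

    #include <signal.h>
    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();

        if (pid == 0)            /* the "errant" process */
            for (;;) ;           /* loops forever, ignoring everyone */

        sleep(1);                /* let it misbehave for a second */
        kill(pid, SIGKILL);      /* ask the kernel to remove it */
        waitpid(pid, NULL, 0);
        printf("runaway process %d terminated\n", (int)pid);
        return 0;
    }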

This high reliability of the kernel also makes the platform an excellent choice for mission-critical applications. Systems that run intensive applications, such as servers and databases, have an excellent choice in Linux. Systems needing uptimes (the time a system runs continuously without a reboot) of weeks or months generally run a Unix clone, with Linux a first choice. People with legacy software that runs on Unix can now use Linux to run it, often with better results.

1.3.4 Native support for networking

Linux, like its predecessor Unix, has networking built into the operating system. TCP/IP, the protocol suite that runs the Internet, has been an integral part of Linux since its creation. Linux also comes with a number of tools that make networking easy, and it supports further software that makes full use of the networking facility.

Linux can be used for networking both on an intranet and on the Internet. With telnet, users can log onto and work on Linux machines from remote machines. It can serve as an HTTP or file server. With support for SMTP and POP, mailing is never difficult. With the Network File System (NFS) and the Network Information Service (NIS), full use can be made of network-wide resources. Using the SAMBA server, Linux can also talk to machines running Windows.
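
At the programming level, the built-in TCP/IP is reached through the socket interface. The sketch below assumes an HTTP server is listening on 127.0.0.1, port 80; it opens a TCP connection, sends a minimal request and prints the beginning of the reply:

    #include <arpa/inet.h>      /* inet_addr(), htons() */
    #include <netinet/in.h>     /* struct sockaddr_in */
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof addr);
        addr.sin_family      = AF_INET;
        addr.sin_port        = htons(80);               /* assumed port */
        addr.sin_addr.s_addr = inet_addr("127.0.0.1");  /* assumed host */

        int sock = socket(AF_INET, SOCK_STREAM, 0);     /* a TCP endpoint */
        if (sock < 0 ||
            connect(sock, (struct sockaddr *)&addr, sizeof addr) < 0) {
            perror("connect");
            return 1;
        }

        const char *req = "HEAD / HTTP/1.0\r\n\r\n";
        write(sock, req, strlen(req));

        char buf[512];
        ssize_t n = read(sock, buf, sizeof buf - 1);
        if (n > 0) {
            buf[n] = '\0';
            printf("%s", buf);      /* first part of the server's reply */
        }
        close(sock);
        return 0;
    }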

A regular Linux distribution also includes a number of tools that make full use of these capabilities: command-line and graphical configuration tools, browsers, mail and news readers, and so on. There is also support for very secure transfer of files, and tools exist to monitor and regulate a machine's activity over the network. Such a rich collection of tools is an advantage not only to users but also to system administrators. Combined with its unmatched stability, this lets Linux offer a complete networking solution on one platform.

1.3.5 Efficient file system, virtual memory, libraries

Linux has a native file system called ext2. This file system is very easy to maintain, does not fragment easily and has security features built in. Linux also supports a number of other file systems, allowing it to recognize, read and write file systems such as vfat, msdos and iso9660 (CD-ROM).
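
A program can ask the kernel which file system it is standing on (a sketch using the Linux-specific statfs() call; 0xEF53 is the magic number ext2 stores in its superblock):

    #include <stdio.h>
    #include <sys/vfs.h>        /* statfs(), Linux-specific header */

    int main(void)
    {
        struct statfs fs;

        if (statfs("/", &fs) != 0)
            return 1;

        printf("filesystem magic for /: 0x%lx %s\n",
               (unsigned long)fs.f_type,
               fs.f_type == 0xEF53 ? "(ext2)" : "");
        return 0;
    }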

The Linux kernel is built to use the protected-mode features of Intel 80386 and later processors. In particular, Linux uses the protected-mode, descriptor-based memory-management paradigm, among other advanced features. Anyone familiar with 80386 protected-mode programming knows that this chip was designed for multitasking systems like UNIX, and Linux exploits this functionality.

The kernel supports demand-paged loading of executables: only those segments of a program actually in use are read into memory from disk. Also, copy-on-write pages are shared among executables, so if several instances of a program are running at once they share physical memory, which reduces overall usage.

To increase the amount of available memory, Linux also implements disk paging: up to one gigabyte of swap space may be allocated on disk (up to 8 partitions of 128 megabytes each). When the system requires more physical memory, it swaps inactive pages to disk, letting you run larger applications and support more users. However, swapping to disk is no substitute for physical RAM, which is much faster.
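
The kernel will report how much swap is configured and free (a sketch using the Linux-specific sysinfo() call; sizes are returned in units of mem_unit bytes):

    #include <stdio.h>
    #include <sys/sysinfo.h>    /* sysinfo(), Linux-specific */

    int main(void)
    {
        struct sysinfo si;

        if (sysinfo(&si) != 0)
            return 1;

        /* convert from mem_unit-sized blocks to megabytes */
        printf("swap total: %.0f MB\n",
               (double)si.totalswap * si.mem_unit / (1024.0 * 1024.0));
        printf("swap free : %.0f MB\n",
               (double)si.freeswap  * si.mem_unit / (1024.0 * 1024.0));
        return 0;
    }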

The Linux kernel also implements a unified memory pool for user programs and disk cache: all free memory is used for the cache, and the cache is shrunk when large programs are run.

Linux buffers file operations so that file I/O can be done in the background, giving control back to the program before the actual transfer takes place. This extensive use of buffers allows full CPU utilization and reduces the time spent waiting for I/O.
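
The buffering is visible from an ordinary program (a sketch; demo.txt is just an example file name): write() returns as soon as the data reaches the kernel's buffers, while fsync() blocks until it is physically on disk:

    #include <fcntl.h>      /* open() */
    #include <string.h>     /* strlen() */
    #include <unistd.h>     /* write(), fsync(), close() */

    int main(void)
    {
        int fd = open("demo.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            return 1;

        const char *data = "hello, buffer cache\n";

        /* returns once the data is in the kernel's cache;
           the disk transfer happens later, in the background */
        write(fd, data, strlen(data));

        /* blocks until the data really is on the disk */
        fsync(fd);
        close(fd);
        return 0;
    }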

Executables use dynamically linked shared libraries, so many programs share the code of a single library on disk; this is not unlike the SunOS shared library mechanism. Executable files therefore occupy less disk space, especially those that use many library functions. There are also statically linked libraries, for debugging object code and for keeping 'complete' binaries when shared libraries are not installed. The shared libraries are linked at run time, and the programmer can substitute his or her own routines for the standard library ones. Shared libraries not only reduce the size of applications but also make maintenance easy and centralized.
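
A program can even load a shared library by hand at run time (a sketch using the dlopen() interface; the soname libm.so.6 is typical on Linux but may differ, and the program must be linked with -ldl):

    #include <dlfcn.h>      /* dlopen(), dlsym(), dlclose() */
    #include <stdio.h>

    int main(void)
    {
        /* ask the dynamic linker for the shared math library */
        void *lib = dlopen("libm.so.6", RTLD_NOW);
        if (!lib) {
            fprintf(stderr, "%s\n", dlerror());
            return 1;
        }

        /* look up the cos() routine inside it */
        double (*cosine)(double) =
            (double (*)(double))dlsym(lib, "cos");
        if (cosine)
            printf("cos(0) = %f\n", cosine(0.0));

        dlclose(lib);
        return 0;
    }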

1.3.6 Highly customizable GUI

One main contention against Unix and its clones was the absence of a user-friendly GUI. The XFree86 X server provides a very stable and customizable GUI environment. The X server itself supports only the basic functions needed for graphical I/O; by leaving the actual look of the system to the various window managers, XFree86 offers unmatched customizability. There are a number of window managers to choose from, letting users tailor the environment to their needs. The window manager is in charge of the placement of windows, the user interface for resizing and moving them, changing windows to icons, and the appearance of window frames, among other tasks.

XFree86 includes twm, the classic MIT window manager, and advanced window managers like the Open Look Virtual Window Manager (olvwm) are available. Popular among Linux users is fvwm, a small window manager that requires less than half the memory of twm. It provides a three-dimensional appearance for windows and a virtual desktop: the user moves the mouse to the edge of the screen, and the desktop shifts as though the display were much larger than it really is. fvwm is greatly customizable and allows access to functions from the keyboard as well as the mouse. Many Linux distributions use fvwm as the standard window manager, and a version of fvwm called fvwm95-2 offers a Microsoft Windows 95-like look and feel.

There are also GUI builders, such as Glade and KDE's own builder, that allow the development of programs using the Linux GUI.

1.3.7 A chance to learn the internals of an OS

This is probably the main reason students from all over the world flock to Linux. Because its source code is freely available, it is a good model for studying how an OS actually works. The practice of storing configuration as plain text files not only allows an intuitive understanding of the application concerned but also gives a chance to tweak performance in ways not possible with graphical front-end tools.

Also, how many of us can actually run and configure high-end server software such as web servers and mail servers? These come with default Linux distributions, and they are not scaled-down versions: they are the full-fledged servers actually deployed on the Internet.

1.4 Drawbacks of Linux

New users often have a few misconceptions and false expectations about Linux. It is important to understand the philosophy and design of Linux in order to use it effectively. We'll start by describing how Linux is not designed.

In commercial UNIX development houses, the entire system is developed under a rigorous quality assurance policy that utilizes source and revision control systems, documentation, and procedures to report and resolve bugs. Developers may not add features or change key sections of code on a whim: they must validate the change as a response to a bug report and subsequently "check in" all changes to the source control system, so that the changes may be reversed if necessary. Each developer is assigned one or more parts of the system code, and only that developer can alter those sections while the code is "checked out" (that is, under his or her control).

Organizationally, a quality assurance department runs rigorous tests on each new version of the operating system and reports any bugs. The developers fix these bugs as reported. A complex system of statistical analysis is used to ensure that a certain percentage of bugs are fixed before the next release, and that the operating system as a whole passes certain release criteria.

The software company, quite reasonably, must have quantitative proof that the next revision of the operating system is ready to be shipped; hence, the gathering and analysis of statistics about the performance of the operating system. It is a big job to develop a commercial UNIX system, often large enough to employ hundreds, if not thousands, of programmers, testers, documenters, and administrative personnel. Of course, no two commercial UNIX vendors are alike, but that is the general picture.

The Linux model of software development discards the entire concept of organized development, source code control systems, structured bug reporting, and statistical quality control. Linux is, and probably always will be, a hacker's operating system. (By hacker, I mean a feverishly dedicated programmer who enjoys exploiting computers and doing interesting things with them; this is the original sense of the term, in contrast to the connotation of hacker as a computer wrongdoer or outlaw.)

There is no single organization responsible for developing Linux. Anyone with enough know-how has the opportunity to help develop and debug the kernel, port new software, write documentation, and help new users. For the most part, the Linux community communicates via mailing lists and Usenet newsgroups, and several conventions have sprung up around the development effort. Anyone who wishes to have their code included in the "official" kernel mails it to Linus Torvalds, who will test the code and include it in the kernel as long as it doesn't break things or go against the overall design of the system.

1.4.1 Inadequate hardware support

All hardware is accessed by means of special software called device drivers, which the manufacturer of a device generally prepares for a particular platform. Since Linux was not a commercial venture, it depended on generic drivers or on drivers written by somebody across the Internet. It is therefore natural that the number of brands usable under Linux was restricted. Linux could not work to its full potential with unsupported hardware, and there were performance losses with partly supported hardware that made its use unsatisfactory.

Things are changing as more and more hardware companies realize the power of Linux and port their drivers to it. The day is not far off when this will cease to be a problem.

1.4.2 Inadequate customer oriented support

For the above reasons, and because the user base of the OS has been small, there has not been extensive customer support. With the rise of companies like Red Hat, SuSE and Caldera, things are now changing.

1.4.3 Primarily command line driven

Despite the existence of a GUI through the XFree86 server, much configuration is still done manually in configuration files, and there is a lack of good tools for the X system. There is also an acute shortage of user programs, such as office suites, which prevents Linux from becoming a popular desktop alternative.

1.4.4 Learning Curve

There is also a learning curve for any user making the transition to Linux from a regular GUI-based OS, which may be disconcerting.

1.5 GNU Public License

Linux is copyrighted under the GNU General Public License (GPL). The main highlights of the license are as follows.

  • The original author retains the copyright
  • Others can use the software as they wish, including modifying it, basing other programs on it, and redistributing or reselling it; the software may even be sold commercially for profit. The source code must accompany the software.
  • The license cannot be restricted down the line: anyone who redistributes the software, modified or not, must pass on the same GPL terms to other users. All derived works therefore automatically fall under the GPL, preventing others from taking credit for the original author's work.

Such a unique license is in place because the original authors of Linux did not intend it as a means to make money; it was intended to be freely available to everyone, without warranty. Although there is no warranty for the software, that does not mean no help is available: there are a number of sources of information on the web, including websites, newsgroups and Linux User Groups (LUGs). The absence of warranty only limits the liability of the authors.

The GPL has since proved to be the best way of distributing free, but good, software on the net.

1.6 System Requirements

Linux has a minimum system configuration on which it will execute, but thankfully this is well below what other modern OSes require. Because it is not very demanding on system resources, even old machines that have been discarded as too slow to use can be reactivated with Linux.

1.6.1 Intel platforms

The latest requirements and the list of supported hardware are given on the websites of the various distributions. A listing from the Red Hat Linux website for the Intel platform is given below.

  • Intel 386 or greater, through Pentium Pro and Pentium II
  • 40 MB of hard disk space for character mode, or 100 MB with the X Window System
  • 8 MB of RAM (16 MB recommended)
  • Most video cards are supported
  • SCSI or IDE CD-ROM drive
  • 3.5" floppy drive

1.6.2 SPARC, Alpha and other platforms

Red Hat Linux supports a variety of hardware, including the SPARC and Alpha platforms. The listing of supported platforms can be found at www.redhat.com.