
The Fragrance of Unix

2003

Abstract: This compendium is meant to be a quick introduction to using Unix-like operating systems. It is intended for people with some computer experience, but little or no experience with Unix systems. Hopefully, you will also find it useful as reference material in which you can look things up. The author hopes that The Fragrance of Unix will give you a sufficient introduction to Unix systems that you will be able to continue using them on your own.

The Fragrance of Unix
Martin Thorsen Ranang
Version 1.1

© 2001, 2002, 2003 by Martin Thorsen Ranang. All rights reserved.

This document was set in Palatino by the author using LaTeX 2ε.

The author of this document has used his best efforts in preparing this document. The author shall not be liable in any event for incidental or consequential damages in connection with, or arising out of, the furnishing, performance or use of the examples herein.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this document, and the author was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Unix is a registered trademark of The Open Group.
The Open Group and X Window System are trademarks of The Open Group.
Linux is a trademark of Linus Torvalds.
All other trademarks are the property of their respective owners.

Contents

Preface
Acknowledgments
1 Introduction
    1.1 Some History
    1.2 Overview
    1.3 Open Source
        1.3.1 Free Software
2 Getting Started
    2.1 Logging in
    2.2 Navigating in the File System
        2.2.1 Ordinary Files
        2.2.2 Directories
        2.2.3 Special Files
        2.2.4 Access Permissions
    2.3 The Shell
        2.3.1 Standard I/O
    2.4 Mental Models
3 Editors
    3.1 Emacs
    3.2 The Vi (or Vim) Editor
4 Regular Expressions
    4.1 The Syntax of Regular Expressions
    4.2 Finite Automata
    4.3 String Replacement with Regular Expressions
    4.4 Some Examples
5 A Step Further
    5.1 More on Shells
        5.1.1 The Bourne-Again Shell
    5.2 Operating on Processes
    5.3 Operating on Files
6 Networking
    6.1 Telnet and Remote Login
    6.2 Browsing the World Wide Web
    6.3 Internet File Transfer
    6.4 Mail
        6.4.1 Pine
        6.4.2 GNUS
    6.5 Security
        6.5.1 Pretty Good Privacy
        6.5.2 The Secure Shell
7 The X Window System
    7.1 Window Managers
        7.1.1 Enlightenment
    7.2 Desktop Environments
    7.3 GNOME
    7.4 KDE
    7.5 Operating in the X Window System
        7.5.1 The Xterm
8 Applications
    8.1 LaTeX
    8.2 The GIMP
    8.3 GNOME Office
9 Continuing On Your Own
    9.1 The Manual Pages
    9.2 The Info Program
    9.3 Other Sources of Information
A Resources
    A.1 On the World Wide Web
    A.2 Printed Matter
Bibliography
Command Index
Index

List of Figures

1.1 A conceptual diagram of a Unix system.
2.1 How the login screen may look on a terminal.
2.2 A typical user prompt.
2.3 How directories, file names and inodes are related.
2.4 How directories are organized in a UNIX file system.
2.5 An ordinary file, a hard link and a symbolic link.
2.6 The file descriptors normally available to a process and how they relate to the user’s terminal.
2.7 How the flow of input and output streams behave in sequences of piped commands.
2.8 How files and directories conceptually are related.
2.9 An example of how symbolic links may lead to recursive file paths.
3.1 How Emacs looks on a terminal.
3.2 The modes of Vi.
3.3 Moving the cursor around in Vi.
4.1 The syntax for a regular expression ⟨e0⟩.
4.2 An automaton generated from the regexp /[Nn][Ss][Aa]/.
4.3 An automaton constructed from the regexp /na*sa/.
5.1 How to list the help information for a command like pwd.
6.1 The Pine main menu screen.
6.2 An example .gnus-file.
6.3 An overview of how public key cryptography works.
6.4 The menu of different RSA key sizes in PGP.
6.5 The second part of the key generation procedure in PGP, where the user is prompted to enter an user ID.
6.6 How PGP ask for a passphrase.
6.7 The last stage of the key pair generation in PGP.
6.8 The contents of the plaintext file ‘test’.
6.9 How to encrypt a file.
6.10 The contents of the ciphertext file ‘test.asc’.
6.11 How to decrypt a ciphertext stored in the file ‘test.asc’.
6.12 The help information for PGP version 2.6.3i.
6.13 Creating a pair of Secure Shell authentication keys.
6.14 Entering the Secure Shell passphrase.
6.15 Secure Shell key generation confirmation.
7.1 An example Unix desktop.
7.2 The logos of GNOME and KDE.
8.1 An example of how a LaTeX file may look like.
8.2 The main menu window of the GIMP.
9.1 The top of the manual page for man.
9.2 Examples on how to use the commands apropos and man -k.

List of Tables

3.1 Commands to insert text and to start insert mode in Vi.
4.1 Common character classes and their aliases.
5.1 Some basic commands to operate on and to navigate in the file system.
5.2 Commands to list and operate on processes.
5.3 Commands to operate on text files.
6.1 Most of the commands available in ftp and ncftp.
6.2 A small set of central commands to operate GNUS.
6.3 An overview of how the command line interface for PGP has changed from version 2.x to version 5.0 and later.
8.1 Some applications available under the GNOME Office umbrella.
9.1 The different sections of the online manual pages.
A.1 The URL and a short description of each online resource.

Preface

In February 2001, the author was asked by the leader of Aktivitetsuka 2001, Petter Norli, to give a lecture introducing Unix to students at the Norwegian University of Science and Technology (NTNU). As the world of Unix is quite large and complex, it would be difficult for the attendees to remember even half of a lecture that was intended to span just a couple of hours. If the attendees received some additional reference material in which to look things up, or with which to refresh their memories, the effect of the course would hopefully be longer lasting. This was the main reason for writing the compendium you are currently reading.

This compendium is meant to be a quick introduction to using Unix-like operating systems. It is intended for people with some computer experience, but little or no experience with Unix systems. Hopefully, you will also find it useful as reference material in which you can look things up. The author hopes that The Fragrance of Unix will give you a sufficient introduction to Unix systems that you will be able to continue using them on your own.

If you have any comments or suggestions, the author may be contacted by e-mail at <[email protected]>.

Martin Thorsen Ranang
Trondheim, Norway

Acknowledgments

The author wants to thank Tayyaba Arif for her suggestions and exceptionally fast and crucial spelling corrections; Marit Limstrand for her precise proofreading and suggestions; Lars Aaserud for his clarifying suggestions; Henrikke Bugdø-Petersen for her spelling corrections and her patience; Elisabeth Bayegan for her proofreading and encouragement; Ingunn Grytting for her clarifying questions. Thank you all for your time and your input. The Fragrance of Unix would not have been the same without your help. Finally, thanks to all my friends for your support and your flexibility.

Chapter 1
Introduction

    A complex system that works is invariably found to have evolved from a simple system that worked.
    -- John Gall

An operating system (OS) is the set of programs that renders a computer useful. The Fragrance of Unix will cover basic usage of Unix systems. The term Unix will be used as a wide concept referring to a set of operating systems, including, among others, Sun Solaris (or SunOS), SGI IRIX, Digital Unix, FreeBSD, NetBSD, OpenBSD and GNU/Linux. We call these operating systems Unix clones. The GNU/Linux operating system is such a Unix clone; it is freely available and is gaining popularity at a fast rate. Therefore, to the degree that we focus on any one OS, we will focus on GNU/Linux. This way, the reader should be able to get hands-on experience with a Unix system more easily and less expensively.

1.1 Some History

In 1969 the Unix operating system was born. It was mainly developed by Dennis M. Ritchie and Ken Thompson, both from Bell Labs (Ritchie, 1979). The earliest versions of the OS ran on the Digital Equipment Corporation (DEC) PDP-7 and PDP-11 computers. The creators of Unix hoped that its users would find that the system's most important characteristics are its simplicity, elegance and ease of use.

GNU/Linux is a clone of the operating system Unix, written from scratch by Linus Torvalds with assistance from a loosely-knit team of hackers all across the Internet. It is a completely free (see section 1.3.1) reimplementation of the POSIX specification, with SYSV and BSD extensions. This means that it looks like Unix, but it does not come from the same source code base. Linux itself is only the name of the kernel. As mentioned above, an OS comprises both a kernel and a set of programs that render the computer system useful. This is the reason The Fragrance of Unix refers to the whole OS as GNU/Linux. The kernel is available in both source and binary form, as is most of the other software available for GNU/Linux.

Linux aims towards POSIX and Single Unix Specification compliance. It has all the features you would expect in a modern fully-fledged Unix, including true multitasking, virtual memory, shared libraries, demand loading [1], shared copy-on-write pages among executables [2], proper memory management, loadable device driver modules, video frame buffering and TCP/IP networking.

[Figure 1.1: A conceptual diagram of a Unix system, drawn as nested layers; from the outside in: Utilities, Shell, Kernel, Hardware. The kernel surrounds and controls the hardware. A shell enables the user to interact with the kernel. Finally, a large set of tools and utilities may be started from the shell.]

1.2 Overview

The Unix system comprises a set of encircling layers, as shown in figure 1.1. The kernel surrounds and controls the hardware. A shell encircles the kernel and acts as an interpreter between the user and the computer. On top of this, a set of tools and applications communicate with the shell and render the computer system usable. In this way, the kernel represents the core of the OS.

When you turn on a computer, it enters a phase called bootstrapping, or booting up, in which the computer loads its core program, the kernel, and bootstraps itself into operation.
After the Unix system is done bootstrapping, the first program it starts is called init.

In a Unix system, every program that is started is a process. In the original Unix system, the only way you could start a new process was to make another already running process fork itself. Forking a process means that, if the fork succeeds, the process is divided into two branches. This is sometimes also referred to as spawning a process. The first of these processes is called the parent process and the other is called the child process. After the fork, both parent and child have independent copies of the same original memory and they share all open files.

[1] Demand loading of executables means that the OS only reads from disk those parts of a program that are actually used.
[2] The shared copy-on-write technique allows several programs to share the same piece of memory that represents information in a file. If any program writes to a page in the file, that page is replaced by a copy in all of the programs, which continue to share that page with each other but no longer share it with the file. This has two benefits: increasing speed and decreasing memory usage.

As you now may see, the init process is, directly or indirectly, the parent of all processes running on a Unix system. The primary role of the init process is to create processes from a script stored in the file /etc/inittab. This file usually contains entries that cause init to spawn programs that make it possible for users to log in to the system.

Under Unix, all processes operate in an environment. This environment is a set of variables with different values. Each environment variable is a pair consisting of a name and a value, written as ‘NAME=value’. Some of the environment variables have special meanings, and therefore control how a process behaves.

Every process on a Unix system is owned by a user. The owner’s identity is called the user ID.
The user ID of a process determines what resources it may operate on, and what operations are allowed on each of them. Similarly, there is something called a group ID. Group identities are used to group users and resources to provide better control over resources. This way it is not necessary to set the appropriate permissions for every individual user of a system; one can do this on a per-group basis instead. Every user has a unique identity and may be a member of one or more groups. Every group has its own identity. Processes inherit both the user ID and the group ID from their parents.

When the kernel boots, the owner of its process is root. This user has user ID = 0 and group ID = 0. The root user is the most powerful user one can be in a Unix system. The root user has all the privileges there are. Of course, if all child processes inherit the identity of their parent, one might think that all the children have to be owned by root too. This is not the case. It is possible to change identities, provided one can authenticate oneself as the owner of an identity. This is usually done with a user name and a password. This means that you can change to another user identity, and thus start processes with another owner, if you can authenticate yourself as this other user. Note that root does not have to do this. Once authenticated as the root user, one has all the privileges, and thus does not need to authenticate oneself to change to other user identities.

1.3 Open Source

    Would you buy a car with the hood welded shut? You can open the hood of our car and fix it yourself.
    -- Bob Young, Red Hat CEO

A central concept [3] in environments where Unix is used is called open source software. The basic idea behind open source software is simple. When programmers can read, redistribute and modify the source code for a computer program, the program evolves.
When there are enough people involved in improving a piece of software, adapting it and fixing bugs in it, this process of evolution can become very rapid.

[3] Some might want to call it a religion.

Members of the open source community have learned that this rapid evolutionary process often produces better software than the traditional closed models, where only a very few programmers can see the source. If you are used to the slow pace of conventional software development, the speed of progress in open source projects may seem incredible.

Another consequence of using open source software is that investments in it cannot be lost for reasons over which one has no control. And, even more important, you have every opportunity to review and evaluate the quality of the software. More information about the social and economic aspects of open source is given in (Raymond, 1999).

A subset of all open source software is (even better) free software.

1.3.1 Free Software

In 1983 Richard Stallman made an announcement on the newsgroups net.unix-wizards and net.usoft that initiated the GNU (GNU’s not Unix) Project. The main focus of the GNU Project is that software should be free. As stated by the GNU Project:

    Free software is a matter of liberty, not price. To understand the concept, you should think of “free speech”, not “free beer.” Free software refers to the users’ freedom to run, copy, distribute, study, change and improve the software.

The interested reader should visit the GNU Project on the World Wide Web to read more about their philosophy [4]. In short, GNU is free software: everyone is free to copy it and redistribute it, as well as to make changes either large or small.

[4] See section A.1 for more information.

Chapter 2
Getting Started

    If you are going through hell, keep going.
    -- Sir Winston Churchill (1874–1965)

When starting out as a user of Unix systems, it is important to remember that no questions are too stupid to be asked and that you (at least in theory) are much smarter than the computer. So, there is no reason to panic.

Computers running UNIX systems are given names. Such a name is called a hostname. The hostname of the computer in The Fragrance of Unix is ‘nirvana’. It is common to write hostnames with lowercase letters (and/or numbers).

2.1 Logging in

Before you can start to try things on your own on a UNIX system, you need a user account. If you do not currently have a user name and a password, you should contact the IT administration of your organization and ask for a user account.

There are mainly two different modes in which you can log in on a UNIX system: the text-only mode (console) and the graphical mode (X). Most users prefer to log in under X, as most users are used to operating computers in a graphical environment.

Be aware that when you enter your user identification, UNIX systems differentiate between lowercase and uppercase letters. When you are logging in you will be prompted to enter your user name and your password. On a terminal this may look like figure 2.1. The figure shows how the author logs into Nirvana [1].

    Red Hat Linux release 6.2 (Zoot)
    Kernel 2.2.16-3 on an i686

    nirvana login: mtr
    Password: ********

Figure 2.1: How a login screen may look on a terminal. The example shows a login screen under Red Hat Linux, release 6.2. The asterisk, ‘*’, is not actually shown, but is inserted by the author to simulate keyboard input.

    nirvana:mtr$ _

Figure 2.2: A typical user prompt. This is how a terminal usually looks when just logged in to a system. The underscore, ‘_’, is inserted by the author to symbolize the user’s cursor.

If you entered the correct information at the login prompt, the next thing you see is usually something like what is shown in figure 2.2. On some systems you may first be greeted with a welcome message and the latest news from the administrators of the system. A line like the one in figure 2.2 is called a prompt. It is possible to configure the look of the prompt as one wants. We will come back to how to do this later. The prompt in figure 2.2 consists of a hostname and the current working directory in the pattern ‘<hostname>:<working_directory>$’. The dollar sign indicates that the current user is a normal user, i.e. the user is not root.

[1] The computer’s name is Nirvana.

2.2 Navigating in the File System

When Unix was developed and released, the most important role of the system was to provide a file system (Ritchie and Thompson, 1974). From the user’s perspective, there are three kinds of files: ordinary files, directory files and special files.

2.2.1 Ordinary Files

Ordinary files contain only the information that users place in them. A text file consists simply of strings of plain characters. Binary files are sequences of data not representable by plain characters. Under Unix, a file’s information is stored in its inode. The inode holds all the information about a file, including where its contents are stored, but not the file’s name.

    Directory entry    Inode
    name1              i1
    name2              i2
    name3              i3
    name4              i4
    name5              i2

Figure 2.3: How directories, file names and inodes are related.

2.2.2 Directories

A directory is a file that comprises a set of names and pointers to inodes. In this way, directories are files that map the names of files onto the files themselves, as shown in figure 2.3, and thereby induce a structure on the file system as a whole. An inode may have several names referring to it.
In the figure, where i_n indicates a pointer to inode n, the directory entries for name2 and name5 both point to the same inode.

From the user’s view a directory is a group of zero or more files, with some exceptions: a directory should always contain a file called ‘.’, which is a reference to the directory itself, and a file called ‘..’, which is a reference to the parent directory of the current one. There is one special directory, called the root directory, that is the top-level directory. It is possible to reach all other directories from the root directory.

    nirvana:unix_intro$ tree
    .
    `-- parent/
        |-- child1/
        |   |-- file2
        |   |-- subchild1/
        |   `-- subchild2/
        |-- child2/
        |   `-- file2
        |-- child3/
        `-- file1

    6 directories, 3 files

Figure 2.4: How directories are organized in a UNIX file system.

Figure 2.4 shows an example of how a (small) tree of files and directories may look. In the example one can see from the prompt that the current working directory is unix_intro. In this directory there is one subdirectory called parent. All of the directories in the figure are marked with a trailing slash (‘/’). The subdirectories of parent are child1, child2 and child3. In addition, the parent directory contains one file called file1. Both child1 and child2 contain a file called file2. Within one directory, no two entries (files or directories) may have the same name. But since the two files called file2 reside in different directories, they do not violate this rule.

The name and location of a file or directory is specified with a file path, or just path for short. An example of this is the name /dir/file, which causes the system to search for dir in the root directory and then, in dir, to look for the file named file. An absolute path is a path specified with a leading ‘/’, which indicates the root directory. It is also possible to refer to a relative path.
For example, if the current directory is either child1 or child2 in figure 2.4, the path ../child2/file2 specifies the file named file2 in the directory child2 located in the directory one level up from the present one.

A directory basically behaves like an ordinary file, but programs treat it as a special class of file. There is a set of programs that operate on directories. For plain navigation in the file system, these are pwd, which prints the current working directory; ls, which lists the contents of a directory; cd, which changes the current directory to another directory; and mkdir, which is used to create new directories. There are other programs too, but they are not considered to be used in plain navigation in the file system.

    -rw-------  2 mtr  mtr  16 Feb 21 20:12 file
    -rw-------  2 mtr  mtr  16 Feb 21 20:12 hl
    lrwxrwxrwx  1 mtr  mtr   4 Feb 21 20:16 sl -> file

Figure 2.5: An ordinary file, a hard link and a symbolic link.

As a directory basically just provides a mapping between a name and the actual contents of the file, it is possible to have several names refer to the same file. This concept is called linking. An example of this is shown in figure 2.5 and, on a more conceptual level, in figure 2.3. Figure 2.5 shows both a symbolic link, called sl, and a hard link, called hl, both referring to the ordinary file named file.

The difference between a hard link and a symbolic link is that a hard link is just another name for an existing file. The link and the original file are indistinguishable because both names point to the same inode, while a symbolic link is really just a kind of special file. A symbolic link does not refer to the target’s inode but to its name; thus, the file it refers to does not have to exist. A symbolic link is essentially a text string that sits in place of a file. When the filename of the symbolic link is accessed, the Unix kernel replaces the filename with its text value, like a macro expansion. Links are created with the command ln.
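
The difference between the two kinds of link can be tried out in a scratch directory. The following is only a minimal sketch, assuming a POSIX-style shell and the common ln, ls, cat, rm and mktemp utilities (mktemp -d is widely available but not part of every system). The names file, hl and sl mirror figure 2.5; the scratch directory and the variable names are illustrative.

```shell
#!/bin/sh
# Sketch: hard links versus symbolic links, in a throwaway directory.
workdir=$(mktemp -d)          # scratch directory (illustrative)
cd "$workdir" || exit 1

echo "some contents" > file   # an ordinary file
ln file hl                    # hard link: a second name for the same inode
ln -s file sl                 # symbolic link: a small file holding the text "file"

ls -li file hl sl             # file and hl are listed with the same inode number

rm file                       # remove the original name
hl_contents=$(cat hl)         # still readable: the inode lives on through hl
if [ -e sl ]; then            # -e follows the link, so this tests the target
    sl_state="ok"
else
    sl_state="broken"         # sl now names a file that no longer exists
fi
echo "hl contains: $hl_contents"
echo "sl is: $sl_state"
```

After the original name is removed, the hard link still gives access to the contents, whereas the symbolic link is left pointing at a name that no longer exists.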
Symbolic links that refer to files that do not exist are called broken links.

2.2.3 Special Files

The last kind of file in the Unix file system is the special file. All supported input and output (I/O) devices are associated with at least one such file. These files reside in the directory /dev and may be read and written like ordinary files. The special thing about these files is that read or write operations on them result in the activation of special hardware functions in the kernel. Many of these functions are for controlling hardware devices. A symbolic link is also a kind of special file.

2.2.4 Access Permissions

All files and directories in the Unix file system have a set of access permissions. When a file is created, it inherits the user and group identity of the user who creates it. The access information is stored in the inode. Permissions are defined for three classes of users: the user who owns the file, other users in the file’s group, and other users not in the file’s group. These three classes are represented by the symbols u (user), g (group) and o (others). Each of u, g and o is given specific access permissions. These permissions are specified by a mode. The mode can be expressed either symbolically or as an octal number.

File permissions expressed as an octal number are the sum of the numbers 4 for read permission, 2 for write permission and 1 for execute permission. If a file has the mode 755, the owner of the file has permission to read, write and execute the file (7 = 4 + 2 + 1), while other members of the file’s group and users outside that group have permission to read and execute the file (5 = 4 + 1). The digits are given in the order ‘u’, ‘g’ and ‘o’.

When the permissions are represented symbolically, the same permissions are expressed as u=rwx,go=rx, where ‘r’, ‘w’ and ‘x’ represent read, write and execute respectively.

2.3 The Shell

What you entered when you logged in is called a shell.
As mentioned in section 1.2, the shell is the medium of communication between the user and the kernel. We also mentioned that all programs that are running on a system are processes. To get a list of all the processes you are currently running, you can type the command ps. To list all processes running on the system, you may type ‘ps -e’. For more information about each process, you may supply the flag ‘-f’ too.

The shell provides a command line. The above-mentioned prompt indicates the start of this line. When you type in a command on the command line, the shell interprets the entered line as a request to execute another program. In its simplest form a command line looks like ‘command arg1 arg2 ... argn’. The shell splits up the line into separate words, i.e. one command and zero or more arguments. Then it searches the system for an executable file called command. As systems running Unix may be extremely large, one tries to put all executable programs in dedicated directories. To identify where a process [2] may look for an executable, each process has an environment variable named PATH. This variable contains a list of directories in which the current process may look for executables. If command is found, it is loaded into memory and executed. The arguments parsed by the shell are available to the new process.

2.3.1 Standard I/O

To execute a program, the shell first performs a fork. As stated earlier, after the fork, both processes have independent copies of the same original memory and they share all open files. Now, the child process starts running the new program. The parent process may either wait for the child to exit or, if the user supplied an ‘&’ (ampersand) after the last argument on the command line, continue its own thread of execution. If a program is started from the command line without the trailing ‘&’, it is possible to temporarily stop it by pressing Ctrl-z.
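
Both steps, locating an executable through PATH and then either waiting for the child or carrying on, can be observed in a short script. This is only a minimal sketch, assuming a POSIX-style shell; sleep merely stands in for any long-running command, and the variable names are illustrative.

```shell
#!/bin/sh
# Sketch: how a shell locates a command and runs it in the
# foreground or in the background.

echo "PATH is: $PATH"            # the directories searched for executables
sleep_path=$(command -v sleep)   # where (or how) the shell resolves 'sleep'
echo "sleep resolves to: $sleep_path"

sleep 1                          # foreground: the shell waits until sleep exits
echo "foreground child finished"

sleep 1 &                        # background: the shell continues immediately
child_pid=$!                     # $! holds the process ID of that child
echo "started background child with PID $child_pid"

wait "$child_pid"                # now explicitly wait for the child to exit
child_status=$?                  # exit status of the child; 0 means success
echo "background child exited with status $child_status"
```

In an interactive shell the trailing ‘&’ simply returns the prompt at once; in a script, as here, wait is what lets the parent collect the child later.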
When a process is stopped, it is possible to make it run in the background by issuing the command bg, and to bring it back to the foreground with the command fg.

The forking ensures that all new processes started from the shell have three open files available. These are called the standard input, standard output and standard error. These files are the foundation for a concept called standard I/O. By default the standard input is open for reading, and the input is whatever the user types. The standard output is open for writing and, except under circumstances explained below, the file is the user’s terminal. In addition to the standard output, there is another file opened for writing available to the process, the standard error. This file is reserved for programs to display information that has nothing to do with the normal treatment of data. Figure 2.6 on the following page shows a sketch of how a process has access to a set of file descriptors, which may be open or closed, and how the standard I/O is related to the user’s terminal.

[2] And therefore also all shells.

Figure 2.6: The file descriptors normally available to a process and how they relate to the user’s terminal. (Terminal input feeds descriptor 0, stdin; descriptor 1, stdout, and descriptor 2, stderr, lead to terminal output; further descriptors up to n may exist.)

The standard input, output and error are often referred to as stdin, stdout and stderr. A process refers to its open files through file descriptors. These are numbered 0, 1, 2, ..., n, depending on how many files the process currently has open. File descriptors 0, 1 and 2 are called stdin, stdout and stderr respectively.

Redirection of I/O

The shell is able to change the standard assignments of the file descriptors mentioned above from the user’s terminal to other files. This can be specified by the user at the command line.
By supplying a ‘>’ as an argument to a command at the command line, file descriptor 1, stdout, will refer to the file specified after the ‘>’ for the duration of the command. This is called redirection. Usually the command ps prints information about active processes to the terminal so the user may see it. When the command is written as

    ps > output

the data written to stdout is redirected into the file output, available for later retrieval. To see the contents of a file you may use cat, or you may look at it with more or less. Note that cat may cause you trouble if you try to display the contents of a binary file in your terminal, as cat is a very powerful, yet simple, tool that really just concatenates files and prints them on standard output.

Just as you use ‘>’ to redirect stdout, you can use ‘<’ to assign stdin to a file other than the keyboard. For example, the command

    tr ' ' '.' < output

will use tr to translate, or change, all occurrences of ‘ ’ in output to ‘.’ and show the result on the terminal. That is, tr takes its input from the file output instead of the terminal.

As mentioned above, there is another file descriptor available to the process, called stderr. When stdout is redirected to some file, stderr remains attached to the terminal; this ensures that the user still receives important messages that the application prints to stderr.

Using Pipes

In addition to the redirection operators ‘>’ and ‘<’ mentioned above, there is another extension to the standard I/O that is very important to how a Unix system works: the pipe. A pipe is indicated by the token ‘|’. A sequence of commands delimited by pipes causes the shell to execute the commands simultaneously and to redirect their input and output so that the output of the first command is the input of the next command, and so on. Figure 2.7 shows the flow of input and output streams in a sequence of piped commands.
Here, the stdout of Process0 becomes the stdin of Process1, and the stdout of Process1 becomes the stdin of Process2. Note that stderr is not redirected, or piped, so all error messages will be printed to the terminal. The command

    ps -ef | grep "username" | a2ps | lpr

will cause a list of all running processes, including processes owned by other users, to be the input of the grep command. Meanwhile, grep will output only lines that contain the text string ‘username’. The matching lines are forwarded to a2ps, which formats the list of ASCII [3] characters as a PostScript file, which in turn is given as input to lpr, which sends the PostScript file to the printer. This command is equivalent to the commands

    ps -ef > temporary
    grep "username" < temporary > temporary2
    a2ps < temporary2 > temporary3
    lpr < temporary3

which certainly are messier to write, and even require that one removes the temporary files after use.

Figure 2.7: How the flow of input and output streams behaves in sequences of piped commands. (The stdout of Process0 feeds the stdin of Process1, whose stdout feeds the stdin of Process2; stderr is not piped.)

A program like grep, which copies its standard input, or parts of it, to its standard output, is called a filter. The use of redirection, pipes and filters is one of the most powerful features of Unix, but it is also one of the concepts whose consequences are hardest to grasp fully.

2.4 Mental Models

Switching from a user interface based on graphical menus to a command line interface, like the one used on Unix terminals, may feel very awkward to new users. This section is meant to help users unfamiliar with the command line interface to cope with it. Most of the figures in The Fragrance of Unix are provided to ease this process.

A mental model may be thought of as the model people have of themselves, others, the environment and the things with which they interact. These models are dynamic and are formed through experience, training and instruction.
The models may be seen as parallel realities (Preece et al., 1994). The mental models are dynamically constructed and enable you to run simulations of possible things to do and to predict the outcome without really doing them. The term spatial intelligence refers to the capability to form a mental model of a spatial world and to operate using that model (Green, 1999).

[3] ASCII stands for American Standard Code for Information Interchange; a code for representing alphanumeric information.

A Unix system may be perceived as a very complex and threatening world. A structural mental model describes how a thing works, and a functional mental model describes how to use it. The sooner your models of a Unix system are formed, the sooner you will feel comfortable with the new surroundings. One way to form functional models is to read descriptions of how things can be done, as you are doing now. To form structural models, it may help to see schematic images of things. The Fragrance of Unix tries to raise the reader’s awareness of this and, to help the reader grasp the ideas of Unix, supplies several explanatory figures.

The file system is basically structured like a tree, and the meaning of a command depends on where it is issued. Thus, it is probably of great help to the reader to build a mental model of the file system [4]. Figure 2.8 shows a hierarchical tree structure. This is basically what the file system looks like, but this tree does not reflect the links.

Figure 2.8: How files and directories conceptually are related. (A tree rooted at dirroot containing the directories dir1, dir2 and dir3 and the files filea, fileb, filec and filed.)

In figure 2.9 on the next page, filed has been replaced by a symbolic link [5] pointing to a directory, dir2, two levels up in the tree. This is no problem in itself, but as we descend and try to cd into the directory the link points to, we are in fact moving upwards in the tree.
Such recursive paths may delude you into believing you are deep down in a file tree, while you physically are just moving in a loop. This is nothing to worry much about, because most system commands that recursively descend trees recognize such loops and avoid running forever, but it is mentioned to clarify how the file system is structured and what is possible.

[4] Constructing a mental model is of course not a one-time activity, but hopefully your mental model of the Unix system will grow and be refined over time.
[5] It is a symbolic link since hard links are not allowed for directories.

Figure 2.9: An example of how symbolic links may lead to recursive file paths. (The tree of figure 2.8 with filed replaced by link1, a symbolic link to dir2.)

One of the most well established findings in memory research is that people can recognize information far more easily than they can recall it from memory. These two phenomena are called recognition and recall. In command line interfaces, commands and associated parameters must be typed, maintaining correct semantic content and syntactic form. Compared to menu interfaces, they provide increased flexibility with respect to the combinations of commands and parameters that may be linked together. This way they provide a potentially more efficient means of generating complex commands (Westerman, 1997). This describes both the motivation for using the command line interface that Unix is associated with and the hardly motivating fact that this interface relies heavily on a person’s recall abilities.

Of course, it can be difficult to remember all the new commands when you start using a Unix system; it might therefore help to remember that the extent to which new material can be remembered depends on its meaningfulness. The command names under Unix are often highly abstract and disclose little of their meaning or functionality.
This is to a large extent caused by a desire to make the command names short, so that they can be typed quickly. To understand the meaning behind the command names, you should frequently look up the commands in, e.g., the manual pages. This is done by issuing the command ‘man <command name>’. Using the information available online in the Unix system is an important rule for painless and successful use of the Unix system [6].

[6] This is probably true for all kinds of computer systems, but on other systems you can often rely on your recognition abilities to a much greater extent than under Unix.

Chapter 3
Editors

I just bought a Mac to help me design the next Cray.
-- Seymour Cray (1925–1996), when he was informed that Apple Inc. had recently bought a Cray supercomputer to help them design the next Mac.

Now that you have had a brief introduction to the Unix environment and how to navigate in it, it is time to introduce you to one of the most central, but at times complex, components of the Unix system. This component is called the editor. There are a lot of editors available, but we will look at only two of them here, which are probably the most used ones.

3.1 Emacs

GNU Emacs is the GNU incarnation of the advanced, self-documenting, customizable, extensible, real-time display editor Emacs (Stallman, 1998). It is called a display editor because the text being edited is updated on the screen as you type your commands. There are a number of word processors available today that are based on the principle that you can see how your documents will look, including font sizes and colors, as you write your text. This principle is called What-You-See-Is-What-You-Get (WYSIWYG). Emacs is a WYSIWYG editor, but all text in Emacs is displayed as ASCII text. Furthermore, Emacs is much more than a word processor. As stated above, it is a self-documenting, customizable, extensible editor.
This means that you may edit whatever you want with it and that, if you know how to program it, you may add your own modes, a kind of small program, for doing things.

When Emacs is started on a text-only terminal, it occupies the whole visible area; see figure 3.1 on the following page. This is called a frame. When started under the X Window System (see chapter 7 on page 47), Emacs creates a whole new X window for itself, and then the whole X window is referred to as the frame. The whole frame minus the first and the last lines is devoted to the text you are editing, and this area is called the window. The first line in your Emacs frame is a menu bar. The last line of the frame is called the echo area or minibuffer. This area displays prompts and responses to what is going on. It is possible to split the window into two or more parts. The last line in each window is called the mode line and describes what is going on in that window. In Emacs, the place where your cursor is located shows where the next editing command will take effect. It is called the point.

Figure 3.1: How Emacs looks on a terminal.

What frustrates new users of Emacs is that it relies so heavily on keystroke commands. These keystroke commands are usually a combination of keys pressed at once or in sequence. They are often used with modifiers. The modifier keys are Shift, Control and Meta or Escape. Although Emacs is self-documenting, it is essential to know that the Control and Meta modifiers are written ‘C-’ and ‘M-’, respectively. So, the command used to open a file, which is a sequence of first pressing down Control, then pressing x, releasing both keys, pressing down Control again and then pressing f, followed by entering the filename and pressing Enter, is written as ‘C-x C-f filename’.
You may actually hold down the modifier key and only move your finger from x to f, in that order, but this works only because the same modifier is used for both keystrokes. In fact, the Meta key is not present on all keyboards, but keys labeled Alt, or keys marked with a little flag, are often used instead. If none of these keys works as a Meta key on your keyboard, you can always use the Escape key instead. Escape is not really a modifier key, i.e. it does not modify the meaning of other keys while it is being pressed. So, when you do not have a Meta key, you have to first press and release the Escape key and then press the key you want to modify.

The best way to start learning Emacs is by doing the interactive Emacs tutorial. To start it, press ‘C-h t’, then follow the instructions. To quit, or escape from, the command you are currently performing, you can always press ‘C-g’. If you are in an Emacs mode, you can most often exit it by pressing q.

When in X, the menu bar may be accessed by clicking on it with the mouse pointer. If you are using Emacs on a terminal, the menu may be accessed by issuing the command ‘M-x tmm-menubar’. Then you can select the menus you want by following the simple instructions.

When in the minibuffer, Emacs provides tab-completion, just like, e.g., BASH. That is, it tries to complete the text string you are typing every time you press the Tab key. For this to work, Emacs must be able to predict how the string can be completed. If you just type ‘M-x TAB’, Emacs will display a list of all possible completions (or commands in this case). By pressing ‘TAB’ once more, you will be able to scroll down the list.

3.2 The Vi (or Vim) Editor

To bring a MicroVAX to its knees, try twenty users running Vi – or four running Emacs.
-- Unknown, from The Jargon File

Vim [1] is a text editor that is upwards compatible [2] with Vi, and it is intended to be used by programmers.
Vi is an old editor, just like Emacs, and Vim is simply an improved version of Vi. As the quotation above suggests, Vi is known to be smaller, and therefore easier on the computer’s resources, than Emacs. Emacs is also known for its heavy reliance on keystroke commands [3]. Emacs users often resort to Vi as a mail editor and for small editing jobs because it starts up faster than the bulkier versions of Emacs.

[1] Vim is short for Vi Improved.
[2] When software is upwards compatible, it means that newer versions of the software do not break any older existing rules or systems. I.e., all things that worked in the old version should still work in the new version.
[3] Some users explain the name Emacs as an acronym for Escape Meta Alt Control Shift and Eight Megabytes And Constantly Swapping.

Figure 3.2: The modes of Vi. (Vi starts in command mode; an append or insert command (a, A, i, I, o, O) leads to insert mode; Escape leads back to command mode; :q exits.)

The commands used to start the editors are vi and vim. To start Vim, type ‘vim [options] [filelist]’.

Vi often frustrates new users, as it will neither take commands while expecting input text nor vice versa. In addition, the default setup provides no indication of which mode the editor is in. Figure 3.2 shows how the modes of Vi change. When started, Vi is in command mode. From command mode, the user can either exit, by typing ‘:q’ and pressing Enter, or enter insert mode. Insert mode can be entered by pressing a, A, i, I, o or O; see table 3.1 for an explanation of these commands. You can leave insert mode by pressing Escape.

Table 3.1: Commands to insert text and to start insert mode in Vi.

    Command  Description
    i        Enter insert mode before the cursor.
    I        Enter insert mode at the start of the current line.
    a        Enter insert mode after the cursor.
    A        Enter insert mode at the end of the current line.
    o        Create a new line below the cursor and enter insert mode on it.
    O        Create a new line above the cursor and enter insert mode on it.

When in command mode, you can move the cursor around in the text by pressing l to move forward, h to move backward, j to move down and k to move up, as shown in figure 3.3(a). In command mode, you may also move a word at a time backward or forward by pressing b or w respectively, as shown in figure 3.3(b).

Figure 3.3: Moving the cursor around in Vi. (a) How to move the cursor a character at a time (h, j, k, l). (b) How to move the cursor a word at a time (b, w).

To undo a command in Vi, you can press u in command mode. In Vi it is also possible to repeat the last command entered by pressing ‘.’ in command mode.

Chapter 4
Regular Expressions

We don’t like their sound, and guitar music is on the way out.
-- Decca Recording Company rejecting the Beatles, 1962

If you ask a set of experienced Unix users what they like most about Unix, you are almost guaranteed to hear the phrase regular expressions. Regular expressions are often called regexps or REs. They are very useful for matching and operating on strings. As Kernighan and Pike (1999) state, regular expressions are one of the most broadly applicable specialized languages, a compact and expressive notation for describing patterns of text.

4.1 The Syntax of Regular Expressions

The concept of how regular expressions work may be a bit difficult to comprehend at first. A regular expression specifies a set of strings of characters. A member of this set of strings is said to be matched by the regular expression. In many applications a delimiter character, commonly ‘/’, bounds a regular expression, as in /bob/, where bob is a regular expression that matches the string bob and only that string.

In the following specification of regexps, the word character means any character, or token, except the newline character. Figure 4.1 on the following page shows the syntax, or grammar, for a regular expression <e0>.
The parallel branches of each path symbolize alternatives. The symbols to the left of ‘::=’ are called grammar rules. When a grammar rule occurs on the right side of the ‘::=’ operator, it may be substituted by the right-hand side of the rule in question. This is called an expansion of a rule. This means that <e3> may be expanded to either a <literal>, a <character class>, any of the characters ‘.’, ‘^’ or ‘$’, which all have special meanings in regexps, or finally, an opening ‘(’ followed by an expression <e0>, followed by a closing ‘)’. The observant reader may have noticed that there are no expansions of the rules <literal> and <character class> in the figure showing the syntax. They have been left out to avoid cluttering the figure and making it more complex than needed.

    <e3>  ::= <literal> | <character class> | ‘.’ | ‘^’ | ‘$’ | ‘(’ <e0> ‘)’
    <e2>  ::= <e3> | <e2> <REP>
    <e1>  ::= <e2> | <e1> <e2>
    <e0>  ::= <e1> | <e0> ‘|’ <e1>
    <REP> ::= ‘*’ | ‘+’ | ‘?’

Figure 4.1: The syntax for a regular expression <e0>. (The parallel branches of the original diagram are written here as alternatives separated by an unquoted |.)

The characters ‘.*+?[]()|\^$’ are called metacharacters and have special meanings in a regular expression. How they are used is in fact shown in the figure. A literal is any non-metacharacter, or a metacharacter or the delimiter preceded by a ‘\’. A character class is a nonempty string ‘s’ enclosed in brackets, like ‘[s]’ or ‘[^s]’. The ‘^’ negates a character class. Therefore ‘[s]’ matches any character in ‘s’ and ‘[^s]’ matches any character not in ‘s’. A negated character class never matches the newline character. Inside a character class, a substring ‘a-z’, with ‘a’ and ‘z’ in ascending order, represents the whole inclusive range of characters between ‘a’ and ‘z’. In ‘s’, the metacharacters ‘-’ and ‘]’, an initial ‘^’ and the regular expression delimiter must be preceded by a ‘\’. The use of the preceding ‘\’ is called escaping a character.
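The behaviour of character classes can be checked with grep, which prints the input lines that contain a match. The following is a minimal sketch assuming a POSIX grep; the sample words are made up for illustration.

```shell
# Sample input: one made-up word per line.
words='cat
cot
cut
c-t'

# [ao] matches a single 'a' or 'o': selects cat and cot.
printf '%s\n' "$words" | grep 'c[ao]t'

# [^ao] matches any single character except 'a' and 'o': selects cut and c-t.
printf '%s\n' "$words" | grep 'c[^ao]t'

# [a-o] matches any single character in the inclusive range a..o:
# selects cat and cot, since 'u' and '-' lie outside the range.
printf '%s\n' "$words" | grep 'c[a-o]t'
```

Note that the escaping rules inside a character class differ between tools, so the examples deliberately avoid a literal ‘-’ or ‘]’ inside the brackets.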
Inside a character class, other metacharacters have no special meaning and may appear unescaped.

Certain character classes are used very often; therefore there exist aliases for the most common ones. Table 4.1 on the next page shows a list of aliases for common character classes. Be aware that, although regular expressions are present and supported in many places throughout the Unix system, there may be subtle differences in how they must be written to work correctly. Sometimes the metacharacters need to be escaped, the aliases may differ slightly, etc. Therefore, you should always try commands that use regexps in a safe setting before you use them where they might do harm, e.g., to your files.

Table 4.1: Common character classes and their aliases.

    RE   Expansion      Match
    \d   [0-9]          Any digit.
    \D   [^0-9]         Any non-digit.
    \w   [a-zA-Z0-9]    Any alphanumeric character.
    \W   [^\w]          Any non-alphanumeric.
    \s   [ \r\t\n\f]    Whitespace kind of characters.
    \S   [^\s]          Any non-whitespace.

Above, it was mentioned that there is a set of metacharacters; what follows is an explanation of them. A ‘.’ matches any character. The ‘^’ matches the beginning of the line, while the ‘$’ matches the end of the line. As the syntax diagram showed, the <REP> rule may be expanded into a ‘*’ (asterisk), a ‘+’ or a ‘?’. The ‘*’ matches zero or more, the ‘+’ matches one or more and the ‘?’ matches zero or one instances of the preceding regular expression, <e2>.

Regular expressions may be concatenated. I.e., <e1> and <e2> may be concatenated into <e1><e2>, and the resulting regexp will match a match for <e1> followed by a match for <e2>. On the other hand, the alternative regular expression <e0>‘|’<e1> matches either a match for <e0> or a match for <e1>. The ‘|’ (pipe) symbol represents an alternative.

The metacharacters ‘(’ and ‘)’ represent the start and stop symbols of a group; thus ‘(...)’ is called a grouping construct. The grouping construct serves three purposes.
First, it may be used to enclose a set of ‘|’ alternatives for other operations. An example of this is the regexp /(Einstein|Albert)/, which will match either Einstein or Albert. The second purpose of the grouping construct is to enclose a complicated expression for the postfix operators ‘*’, ‘+’ and ‘?’ to operate on. The last purpose is to record a matched substring for future reference, where this behavior is supported.

4.2 Finite Automata

In computer science, regular expressions are often explained using something called non-deterministic finite state automata (NFSA). When a computer program encounters a regular expression like /[Nn][Ss][Aa]/, which will match the string nsa where each character may be uppercase or lowercase, it generates an automaton like the one shown in figure 4.2. Even if the program does not implement it exactly as the figure shows, the automaton can at least be used to describe how regexps work.

Figure 4.2: An automaton generated from the regexp /[Nn][Ss][Aa]/. (States q0 to q3; q0 goes to q1 on N or n, q1 to q2 on S or s, and q2 to q3 on A or a.)

In figure 4.2 the arrows are called transitions and each circle is a state. The states are named from q0 to qn, where n is 3 in this case. The start state is marked by the incoming arrow and the final state by the double circle. The automaton, i.e. machine, starts in the start state, q0. It iterates by the following algorithm: If the next character of input matches the symbol on any arrow, or arc, leaving the current state, then cross that arc and move to the next state. At the same time, advance one character of input. If the final state, or accepting state, is reached before the automaton runs out of input, the input is matched, or accepted. If the accepting state cannot be reached, it is common to say that the machine rejects the input.
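The algorithm above can be sketched as a small shell function. This is only an illustration under stated assumptions, not how any real regexp implementation works: the function name accepts is hypothetical, the automaton is hard-coded for /[Nn][Ss][Aa]/, and the input is read from its first character, so a match later in the string is not searched for.

```shell
# Simulate the automaton of figure 4.2 for /[Nn][Ss][Aa]/.
# accepts STRING returns 0 (accept) if the accepting state q3 is reached
# while reading STRING from its first character, and 1 (reject) otherwise.
# Note: state, rest and ch are ordinary (global) shell variables.
accepts() {
    state=0
    rest=$1
    while [ -n "$rest" ]; do
        ch=${rest%"${rest#?}"}      # first character of the remaining input
        rest=${rest#?}              # advance one character of input
        case "$state$ch" in
            0[Nn]) state=1 ;;       # q0 --N or n--> q1
            1[Ss]) state=2 ;;       # q1 --S or s--> q2
            2[Aa]) state=3 ;;       # q2 --A or a--> q3
            *)     return 1 ;;      # no arc leaves this state on ch: reject
        esac
        if [ "$state" -eq 3 ]; then
            return 0                # accepting state reached
        fi
    done
    return 1                        # ran out of input before reaching q3
}

for s in nsa NSa nasa ns; do
    if accepts "$s"; then
        echo "$s: accepted"
    else
        echo "$s: rejected"
    fi
done
```

With these four inputs, nsa and NSa are accepted, while nasa (no arc for ‘a’ in q1) and ns (input ends in q2) are rejected.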
It should not be too difficult to see that in q0 the automaton can move to the next state on either an incoming ‘N’ or ‘n’, in q1 it can move to the next state on either an ‘S’ or an ‘s’, and in q2 it can accept either an ‘A’ or an ‘a’. Thus the automaton shown may accept any of the strings nsa, nsA, nSA, NSA, NSa, Nsa, nSa and NsA.

Figure 4.3: An automaton constructed from the regexp /na*sa/. (States q0 to q3; q0 goes to q1 on n, q1 loops to itself on a, q1 goes to q2 on s, and q2 goes to q3 on a.)

Another automaton, shown in figure 4.3, is constructed from the regexp /na*sa/. This regexp will match both of the strings nsa and nasa and all other strings that start with an ‘n’, continue with zero or more ‘a’ characters and end with sa.

4.3 String Replacement with Regular Expressions

Programs that support regular expressions, like text editors, commonly also support replacing strings using regular expressions. The syntax for this may vary slightly but is often like

    s/old/new/

which causes a string matching the regular expression /old/ to be replaced with the string new. It is often possible to use the operator ‘\n’, which refers to the substring matched by the nth grouping construct earlier in the regexp. This may also be used in the replacement when doing string substitution with regular expressions.

4.4 Some Examples

Now that you have been formally introduced to regular expressions, it is time to look at some examples. Regular expressions pervade Unix. A large set of Unix programs support the use of regexps, including editors, file and text processing tools like grep and sed, and scripting languages like Python, Perl and Awk.

Sed is a stream editor. A stream editor is used to perform basic text transformations on an input stream. The input stream may be a file or input from a piped command. The command grep is a pattern matching program. It applies a regular expression to each line of its input files and prints those lines that contain matching strings.
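As a quick check of the regexp /na*sa/ from section 4.2, the sketch below feeds a few made-up lines to grep. It assumes a POSIX grep; the variable name samples and the sample strings are only for illustration.

```shell
# Made-up candidate strings, one per line.
samples='nsa
nasa
naaasa
nba
xnsax'

# Lines that contain a match for na*sa somewhere:
# nsa, nasa, naaasa and xnsax (nba has no 's' after the 'n').
printf '%s\n' "$samples" | grep 'na*sa'

# Anchoring with ^ and $ requires the whole line to match,
# which excludes xnsax.
printf '%s\n' "$samples" | grep '^na*sa$'
```

The second command illustrates that grep by default looks for a match anywhere in the line, which is why the anchors ‘^’ and ‘$’ are needed when the whole line has to match.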
The command awk is also a pattern matching program, but it is especially suited to operating on files that are organized as databases. If the file file contains the string ‘nasa nsa nasa’, then the command

    cat file | sed 's/\(nasa\)/-->\1<--/'

will produce the output

    -->nasa<-- nsa nasa

If the file consisted of more lines of text, only the strings matching the regexp would be affected. Note that in the output, only the first occurrence of nasa has been substituted. To replace every occurrence on a line, the command has to be

    cat file | sed 's/\(nasa\)/-->\1<--/g'

where the difference is the trailing ‘g’, which will produce the output

    -->nasa<-- nsa -->nasa<--

To search a file called long, containing a lot of text, for lines with a price given in dollars, like $99.50 or maybe $50, you can use the following grep command

    cat long | grep -E '\$[0-9]+(\.[0-9]{2})?'

which will match expressions like the ones above, both with and without the cents. To find a file in the directory /usr/share/texmf/tex/latex/ matching the regular expression /.*[vV]erb.*/ you can use find like

    find /usr/share/texmf/tex/latex -regex '.*[vV]erb.*'

Note that the whole path is matched, not only the file’s name.

Chapter 5
A Step Further

I am enough of an artist to draw freely upon my imagination. Imagination is more important than knowledge. Knowledge is limited. Imagination encircles the world.
-- Albert Einstein (1879–1955)

A Unix system is extremely modular. Tools can be added to or removed from it, depending upon the applications required. Today most Unix systems contain an extremely high number of commands and applications. It is beyond the scope of The Fragrance of Unix to explain, or even list, all of them. Instead it will try to give the reader an overview of the most essential and commonly used commands. It is also important to provide the reader with enough knowledge to find out how new commands work, and where to look for answers. This topic will be explored further in chapter 9 on page 57.
Table 5.1 on the following page shows a list of some commands that are essential to successful, and painless, operation in a Unix file system. Practically all of the commands and programs are documented online in Unix. Most commands can be started with a flag like ‘--help’, ‘-help’, ‘-?’ or ‘-h’ to show a short documentation of the command. Figure 5.1 shows an example of how this may be done. If this is not possible, try to look up the command in the man pages; see section 9.1 on page 57.

    nirvana:unix_intro$ /bin/pwd --help
    Usage: /bin/pwd [OPTION]
    Print the full filename of the current working directory.

      --help     display this help and exit
      --version  output version information and exit

    Report bugs to <[email protected]>.

Figure 5.1: How to list the help information for a command like pwd.

Table 5.1: Some basic commands to operate on and to navigate in the file system.

    Command Name  Description
    cd            Change the current directory.
    find          Search for files in a directory hierarchy.
    locate        List files in databases that match a pattern.
    lpq           Spool queue examination program.
    lpr           Spool the named files for printing when facilities become available.
    ls            List files in a directory.
    mkdir         Create a new directory.
    touch         Change time stamps. May also be used to create new, empty files.

5.1 More on Shells

There are a number of different shells available. Until now, The Fragrance of Unix has talked about the shell as if there were only one, but there are in fact several. They all provide the functionality explained above, but each one of them also provides some specialties to make it stand out from the others. Which shell you choose to use is a matter of individual taste.

Which shell you start with, i.e. your login shell, is decided in the file /etc/passwd. This file is a list of all the user accounts in the Unix system.
There is one entry per line, in the format

    username:passwd:uid:gid:gecos:home:shell

where username is the user’s login name, passwd is the user’s password in a non-readable format [1], uid and gid are the user’s own ID and main group ID, home is where the user’s home directory is located and shell specifies which login shell the user will get. The gecos field is used for the user’s full name and other human-ID information. To change your login shell, you can either use the command chsh or, if you do not have the appropriate privileges, contact your system administrator and ask for the login shell to be changed.

[1] On systems that use shadow passwords, this field will only contain an ‘x’.

5.1.1 The Bourne-Again Shell

Here, we will look only at the Bourne-Again Shell (BASH), which is an sh-compatible shell. To start a new instance of it, issue the command bash with the arguments you find appropriate. As with many of the other shells, BASH tries to increase the usability of the shell. It supports looping constructs, conditional constructs, tab-completion, interactive use of the history and a lot of other things.

One of the features that will probably save you the most time, and that you will probably become most addicted to, is tab-completion. This feature makes the shell try to complete the text before the cursor. Generally, if you are typing a filename argument, a command, a symbol in a program that supports tab-completion or a variable in BASH, you can do tab-completion. In these cases it is called filename completion, command completion, symbol name completion and variable name completion respectively. To do this, you just tap the Tab key where you want the shell to try to complete the text. If you tap Tab twice on an empty command line, the shell will offer to show you all the available commands in your $PATH.
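The colon-separated /etc/passwd format described earlier in this section can be taken apart with the shell’s own word splitting. The following is only a sketch: the sample entry is invented, and shell_of is a hypothetical helper name, not a standard command.

```shell
# Print the login shell (the seventh field) of one passwd-style entry.
# shell_of is a hypothetical helper, not a standard command.
shell_of() {
    printf '%s\n' "$1" | {
        # IFS=: makes 'read' split the line at each colon.
        IFS=: read -r username passwd uid gid gecos home shell
        printf '%s\n' "$shell"
    }
}

# An invented entry in the username:passwd:uid:gid:gecos:home:shell format.
entry='alice:x:1000:1000:Alice Example:/home/alice:/bin/bash'
shell_of "$entry"
```

For the invented entry this prints /bin/bash. The same splitting could be applied to lines read from the real /etc/passwd, although some systems keep account information in other places as well.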
The looping constructs come in handy all the time, and they can help you perform large and complex tasks in an elegant way. BASH supports the looping constructs until, while and for. The syntax for the until loop is

    until TEST-COMMANDS; do CONSEQUENT-COMMANDS; done

which means that CONSEQUENT-COMMANDS should be performed until the final command in TEST-COMMANDS returns zero, i.e. for as long as it returns a value different from zero. (By Unix convention, a return value of zero signals success.) The while construct is in a way the opposite of until. Its syntax is

    while TEST-COMMANDS; do CONSEQUENT-COMMANDS; done

which means that the shell should perform the CONSEQUENT-COMMANDS for as long as the final command in TEST-COMMANDS returns zero. The final looping construct is for. Its syntax is given by

    for NAME [in WORDS ...]; do COMMANDS; done

which tells the shell to perform COMMANDS once for every NAME in the list of WORDS.

In all the above syntax definitions, the TEST-COMMANDS may be substituted by any command or sequence of commands that returns a value that can be tested as a truth value; and under Unix all programs are supposed to do that. The CONSEQUENT-COMMANDS and the COMMANDS may be substituted by any set of commands that you want to perform iteratively. In the for construct, the list of WORDS may be a list of filenames, expanded by the shell, a list of words typed in by the user, or a list of words generated by a command.

Let us start with a relatively easy example. You can generate a good password with the command mkpasswd. The command returns a string which is a password that is not too easy to crack, e.g.

    nirvana:unix_intro$ mkpasswd
    w1osLU1h
    nirvana:unix_intro$

which is at least not found in a dictionary [2]. Now, if the user wants 10 new passwords stored in a file, this could easily be accomplished by using the for construct and some redirection, as in the command

    for i in 0 1 2 3 4 5 6 7 8 9; do
        mkpasswd >> out
    done

which generates a file called out consisting of newly generated passwords, one per line.

[2] Words found in dictionaries are easy for automated computer programs to guess.

Note the ‘>>’ operator. The plain operator for redirecting output is ‘>’, but then the file that is redirected to is truncated if it exists beforehand, i.e. all information that was in it is deleted before the redirected stream is written to it. With the ‘>>’ operator, the redirected stream is appended to the end of the existing file, or a new file is created if it does not exist when the command is executed. Another way to do almost the same thing is to use the command

    for i in 0 1 2 3 4 5 6 7 8 9; do
        mkpasswd
    done | nl > out

which will create a file called out consisting of lines starting with a number, indicating the line number, followed by a password. The nl command is a filter that numbers the input lines. Note that the redirection has now been moved. This way we do not have to use appending redirection, and it becomes possible to use the nl filter.

5.2 OPERATING ON PROCESSES

As every program on a Unix system is a process, it is important to be able to control the behavior of these. As a user, you may experience that a program does not respond or behave as expected, or that it does not appear at all. To find out if a process is running at all, you may use the ps command, which reports the process status for a set of processes. This command was explained in section 2.3. Table 5.2 shows a set of commands to operate on processes.

Table 5.2: Commands to list and operate on processes.

    Command Name   Description
    exit           Exit the shell or terminal.
    bg             Place job in the background, as if it had been started
                   with a trailing ‘&’.
    fg             Place job in the foreground, and make it the current job.
    jobs           List the active jobs (children of a shell).
    kill           Terminate a process, or send it a specific signal.
    killall        Kill processes by name. Or, kill all active processes.
    ps             Report process status.
    top            Display and update information about the most CPU
                   demanding processes.

If a program, e.g. Netscape Navigator, has stopped responding, it may be time to kill its process. This may be done in a couple of ways. First, you should list all the processes you are running, e.g. with the command ‘ps -efw | grep <username>’. Now, you may kill the process either by using kill or by using killall. To use the kill command, you must supply the process ID, i.e. the pid, of the process. You can find this in the list output by ps. The command ‘kill 1234’ will send the default signal, TERM, to the process with process ID 1234. This usually causes the process to terminate nicely. With the killall command you can kill a program by name instead. This name must match the name in the list of processes returned by ps. Note that on some systems, e.g. the Sun Solaris operating system, the killall command will try to kill all active processes. This may have severe consequences if you are root. If this is the case for the system you are using, avoid using the killall command.

With both kill and killall it is possible to supply an argument to send a different signal than the TERM signal. Sometimes you will have to send the KILL signal instead, which will certainly cause the death of the process. This is done with the command ‘kill -KILL <pid>’ or the command ‘killall -KILL <process name>’.

Table 5.3: Commands to operate on text files.

    Command Name         Description
    cat                  Concatenate files and print on the standard output.
    cp                   Copy files and directories.
    dvips                Convert a TeX DVI file to PostScript.
    file                 Determine file type.
    grep, egrep, fgrep   Print lines matching a pattern.
    gv                   A PostScript and PDF previewer.
    gzip, gunzip, zcat   Compress or expand files.
    less                 The opposite of more.
    ln                   Make links between files.
    more                 A filter for paging through text one screenful at a
                         time. This version is especially primitive. The
                         command less provides more emulation and extensive
                         enhancements.
    mount                Mount a file system.
    mv                   Move (rename) files.
    rm                   Remove files or directories.
    sort                 Sort lines of text files.
    tar                  Archiving utility.
    tr                   Translate or delete characters.
    type                 Indicate how a string, or name, would be interpreted
                         if used as a command name.
    uniq                 Remove duplicate lines from a sorted file.
    which                Locate a program file.
    xdvi                 DVI previewer for the X Window System.

5.3 OPERATING ON FILES

Table 5.3 shows a list of commands to operate on files. Nonempty files consist of data. It may be represented in ASCII or in binary format, but in the end, all files consist of data. It is possible to use the commands in the table to operate on binary files too, but avoid listing the contents of a binary file to the terminal, as the results often get a little bit frustrating.

As mentioned earlier, all files accessible in a Unix system are arranged in one big tree, the file hierarchy, rooted at ‘/’. These files can be spread out over several storage devices. Each such storage device, e.g. a hard disk, contains a file system. These file systems may be attached to the large file tree by mounting them. After being mounted, the file systems will appear as sub-trees of the larger file tree.

To copy a file, you can use the command cp, as in

    cp <source> <destination>

where <source> and <destination> are the files to copy from and to, respectively. To move a file, you may use the command mv, as in

    mv <source> <destination>

where the file name arguments mean the same as for the cp command. If <destination> is in the same directory as <source>, the file is simply renamed. If not, it is moved to the <destination> directory and possibly renamed [3]. You can remove a file with the rm command. Be careful though, as you will get no warnings when issuing Unix commands [4].
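The copy, move and remove commands described above can be tried out safely in a scratch directory. The following is only a sketch: the file names are invented, and it assumes that the mktemp command, which is common but not universal, is available to create the temporary directory.

```shell
# Work in a throwaway directory so that nothing of value can be lost.
# Assumption: mktemp -d exists on this system.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

echo "some text" > original.txt
cp original.txt copy.txt      # copy: both files now exist
mv copy.txt renamed.txt       # same directory, so this is a plain rename
mkdir sub
mv renamed.txt sub/           # different directory, so the file is moved
rm original.txt               # removed at once, without any confirmation
ls -R                         # show what is left
```

After the last command, the only file left is sub/renamed.txt. Nothing asked for confirmation along the way.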
The Unix system expects that you know what you are doing. This may be frightening at first, but after a while, it seems like the only reasonable way to do it.

[3] It is renamed if you supply a destination filename too, not just a destination directory.
[4] Often you can supply an argument like ‘-i’ or ‘--interactive’ to these commands, which will cause them to prompt the user before any overwrite.

Sometimes you may want to gather a set of files or directories into one archive file. This may be accomplished by using the tar command. To create an archive, issue the command

    tar cvzf archive_name.tar.gz <file|directory>...

where archive_name.tar.gz is the resulting archive file, in a compressed state. The string <file|directory>... indicates that you should supply every file or directory you want to put in the archive. To decompress a file, you can use the gunzip command, and to compress a file, you can use the command gzip. To extract the contents of an archive file, use the command

    tar xvzf archive_name.tar.gz

where archive_name.tar.gz is the name of the archive file. Note that in the two examples above, the ‘z’ argument causes tar to use the gzip and gunzip filters.

To change the access permissions for a file you can use the command chmod, and to change the owner or the group of a file, you may use chown and chgrp respectively.

CHAPTER 6 NETWORKING

    Wow! They’ve got the Internet on computers now!
    -- Homer Simpson, “The Simpsons”

The Internet has its origin in a U.S. Department of Defense program called Advanced Research Projects Agency Network (ARPANET) that was established in 1969 to provide a secure communications network for organizations engaged in defense-related research. Researchers and academics in other fields began to make use of the network. At length the National Science Foundation (NSF) took over much of the technology from ARPANET and established a distributed network of networks capable of handling far greater traffic, which is today known as the Internet.
From its creation, the Internet grew rapidly beyond its largely academic origin into an increasingly commercial and popular medium. The original uses of the Internet were e-mail, file transfer, bulletin boards, newsgroups and access to remote computers (by using telnet). The World Wide Web (WWW), which enables simple and intuitive navigation of hypertext at Internet sites through a graphical interface, expanded dramatically during the 1990s to become the most important component of the Internet.

6.1 TELNET AND REMOTE LOGIN

As Unix is a multi-user, multitasking operating system, networking has always represented a large part of the Unix environment. For two or more persons to share the same computer, it is natural to have one user sitting at the machine itself and the other users logging in remotely. In some common PC operating systems this has been less usual, as they have not been multi-user operating systems. As you will see in chapter 7, it is even possible to be logged into a remote computer and display graphical windows from that machine on your screen.

To simply log into a remote computer you could use the commands telnet or rlogin, but the author will not encourage the use of these programs in practice because they are not even remotely secure. These commands transmit their communication over the network in a human readable format that is very easy to eavesdrop on. If you use telnet or rlogin, you can enable forwarding of X windows to your screen by using the command xhost, which controls the access to your local X Window System server. To have a remote computer display its graphics on your computer display, you have to allow it to connect to your X Window System server. If the computer is called name, you can add it to the list of allowed computers by issuing the command ‘xhost +name’ and you can remove it with ‘xhost -name’.
But, as mentioned above, this will not be secure at all. What you should use is the Secure Shell, which is discussed in section 6.5.2.

Both of the commands rsh and rexec execute commands on a remote host, although in slightly different manners. The command rcp enables you to copy files to and from remote computers. All of these commands can, and should, be replaced with programs from the Secure Shell package.

6.2 BROWSING THE WORLD WIDE WEB

These days, it seems like one of the most important components of an operating system, seen from the user’s perspective, is the browser. This is a program for accessing sites or information on a network, such as the World Wide Web. And like any self-respecting computing environment, Unix has a couple of different browsers to choose from. One, named Lynx, is a fully-featured WWW client for users “running cursor-addressable, character-cell display devices”, as the manual page puts it. That is basically all kinds of text terminals. Many users prefer a text-only browser to a graphical one, because they get much of the disturbing information filtered out simply by removing all the images. The user will get indications of where the images are and what they represent [1]. The Lynx browser may be started from the command line with the command lynx.

Probably the most used browser under Unix is called Netscape Navigator or Netscape Communicator. Depending on what version is installed on your system, one of these can be started with the command netscape. Along with a WWW browser called Mosaic (which may be started with the command mosaic), Netscape has been preferred by many Unix users for many years. In 1998, Netscape released large parts of the source code for Netscape Navigator, which was adopted by the Mozilla Project. They have been developing an open source browser called Mozilla since then. As it takes shape, it seems that it can become very nice and stable.
To start Mozilla, type mozilla at the command line. Another actor in the Unix [2] browser market is the browser called Opera, developed by the Norwegian firm Opera Software.

[1] This is done by displaying the value of the alt="..." attribute of the <img>-tag in the HTML source.
[2] At least in the Linux part of the market.

6.3 INTERNET FILE TRANSFER

Even though many of the WWW browsers have support for the File Transfer Protocol (FTP), they are seldom ideal for the purpose of browsing FTP servers and transferring a lot of files. Under Unix there is a dedicated program for doing this called ftp. A newer, more user friendly FTP client is called NcFTP. The command used to start it is ncftp. NcFTP supports tab-completion of filenames and commands, progress meters, background processing, auto-resume of downloads, bookmarking, downloading entire directory trees, etc. Both ftp and ncftp can be directed to a remote server at the command line when started, as in ‘ncftp ftp://ftp.hostname.org/’. Table 6.1 shows a list of commands that may be used to operate the clients. Not all the commands may work with the plain ftp program, but all should work fine with ncftp.

Like the protocols and network services mentioned above, FTP also transmits the authentication information, i.e. the user name and the password, in plain human readable format. When logging into a private user account with ftp, this should concern you; but when logging into anonymous FTP servers, you should use the user name anonymous and your e-mail address as your password. This way, you do not reveal anything but your e-mail address to potential eavesdroppers.

6.4 MAIL

    I don’t even have an e-mail address. I have reached an age where my main purpose is not to receive messages.
    -- Umberto Eco (Italian literary critic, novelist, and semiotician), quoted in the New Yorker

Unix users have been enjoying e-mail for several decades.
At first Unix e-mail permitted users on the same computer to communicate with each other via terminals. Later, Unix systems around the world were linked into a world wide web, decades before the development of today’s World Wide Web. There are a lot of different tools available for Unix systems for reading, writing and sending e-mail, and two of them are introduced briefly below.

6.4.1 PINE

Pine, which is an acronym for Program for Internet News & Email, is a tool for reading, sending, and managing electronic messages. It was developed by the Office of Computing & Communications at the University of Washington and it was designed specifically with novice computer users in mind. To start using Pine, just type in the command pine at the terminal. The screen shown in figure 6.1 should appear. If it is the very first time you use Pine, it is possible you will be met by a welcome screen; if this happens you can just press ‘E’ to exit it and enter the main menu. From here you can select items on the menu by moving the highlight up and down with your cursor keys and pressing enter to invoke the selected part. You should probably start off by visiting the help section. At all times, you can see a list of possible commands at the bottom of your screen.

Figure 6.1: The Pine main menu screen.

Table 6.1: Most of the commands available in ftp and ncftp.

    Command          Description
    help             The first command to know is help. Help keyword prints
                     information about the keyword.
    ascii            Sets the transfer type to ASCII text.
    binary           Sets the transfer type to raw binary.
    cat              Acts like the /bin/cat command, only for remote files.
    cd               Changes the working directory on the remote host.
    chmod            Acts like the /bin/chmod command.
    close            Disconnects you from the remote server.
    debug            This command is mostly for internal testing.
    dir              Prints a detailed directory listing.
    get              Copies files from the current working directory on the
                     remote host to your machine’s current working directory.
    lcd, lchmod,     These are “l” commands that work with the local host.
    lls, lmkdir,     They perform the commands following the “l”, locally.
    lookup, lpage,
    lpwd, lrename,
    lrm, lrmdir
    ls               Prints a directory listing from the remote system.
    mkdir            Creates a new directory on the remote host.
    open             Establishes an FTP control connection to a remote host.
    page             Browses a remote file one page at a time, using your
                     $PAGER program.
    pdir, pls        These commands are equivalent to dir and ls
                     respectively, only they feed their output to your pager.
    put              Copies files from the local host to the remote machine’s
                     current working directory.
    pwd              Prints the current remote working directory.
    quit             Finish using the program.
    quote            This can be used to send a direct FTP Protocol command
                     to the remote server.
    rename           Change the name of a remote file.
    rhelp            Sends a help request to the remote server.
    rm               Delete a remote file.
    rmdir            Removes a directory.
    set              Lets you configure some program variables, which are
                     saved between runs in $HOME/.ncftp/prefs.
    show             Display program variables.
    type             Change transfer types during the course of a session
                     with a server.

6.4.2 GNUS

Under Emacs there are several ways to read, write and send e-mail and news postings. One of these is called Gnus Network User Services (GNUS) [3]. GNUS’ capabilities are apparently a proper superset of all other newsreaders’ capabilities. This makes it extremely powerful. It is massively configurable, which makes it a little difficult to handle at first. And of course, GNUS does not have to invoke your editor to compose replies and followups, because it is your editor. All these features make GNUS the choice of many computer users, but it can be frustrating to get started.

[3] Which is another one of those recursive acronyms.

GNUS uses a dot-file called .gnus.
This file contains initialization information for the program. To make it a little bit easier to get started, figure 6.2 on the following page shows an example initialization file for GNUS. All you have to do is to create a file like the figure shows, and change the value of the variables ‘gnus-select-method‘, ‘nnmail-spool-file‘, ‘mailhost-address‘ and ‘user-mail-address‘ to values that reflects your environment4 . To start GNUS, just type ‘M-x gnus’ in Emacs. When in GNUS, the commands in table 6.2 on page 37 should provide you with the basics. 6.5 S ECURITY Computer security has become increasingly important since the late 1960s, when modems5 were introduced. The proliferation of personal computers in the 1980s compounded the problem because they enabled crackers to illegally access major computer systems from the privacy of their homes. There are a lot of methods to reduce the risks associated with operating computer systems in a networking environment. One of the most central techniques here, in addition to plain authentication based on names and passwords, is the field of cryptography. 6.5.1 P RETTY G OOD P RIVACY There are many indications that lead the author to believe that protecting one’s privacy will probably be one of peoples most valued abilities as the world of electronic communication 4 The values in the example (figure 6.2 on the next page) should work fine at the Norwegian University of Science and Technology (NTNU). 5 A modem is a device that allows computers to communicate over telephone lines. 36 Chapter 6. 
Networking (setq gnus-select-method ’(nntp "news.ntnu.no")) (setq gnus-secondary-select-methods ’((nnml "private"))) (add-hook ’gnus-article-prepare-hook ’gnus-article-de-quoted-unreadable) (add-hook ’gnus-article-prepare-hook ’gnus-article-remove-trailing-blank-lines) (add-hook ’gnus-article-prepare-hook ’gnus-article-remove-cr) (setq nnmail-spool-file (expand-file-name "~/mail/INBOX")) (setq nnmail-crosspost nil) (setq nnmail-split-methods ’( ("ntnu-siving2002" "\\(^To\\|^From\\|^[Cc]*\\):.*siving2002@.*ntnu\\.no") ("bindeleddet" "\\(^To\\|^From\\|^[Cc]*\\):.*@bindeleddet\\.ntnu\\.no") ("meteor-users" "\\(^To\\|^From\\|^[Cc]*\\):.*meteor-users.*@rwii\\.com") ("linux-journal" "\\(^To\\|^From\\|^[Cc]*\\):.*@ssc\\.com") ("root" "^To:.*root") ("other" ""))) (setq gnus-message-archive-group ’((if (message-news-p) "misc-sent-news" "misc-sent-mail"))) (setq mail-host-address "mail.stud.ntnu.no") (setq user-mail-address "[email protected]") (setq gnus-local-organization "Private") (setq message-default-headers "Organization: Private\n" \ "Mime-Version: 1.0\nContent-Type: text/plain; " \ "charset=\"iso-8859-1\"\nContent-Transfer-Encoding: 8bit\n") Figure 6.2: An example .gnus-file. 6.5. Security 37 Table 6.2: A small set of central commands to operate GNUS. Command ‘M-x gnus’ ‘q’ ‘m’ ‘M-x mml-attach-file’ ‘M-x o’ ‘[enter]’ ‘A A’ ‘u’ Description Start GNUS. Exit GNUS or, if in “group”-mode, exit the current newsgroup. Send a mail. When done editing, send it by entering ‘C-c C-c’. If you change your mind and don not want to send it, press ‘C-x k [enter]’. If in message editing mode, attach a file to the outgoing MIME message. Move the cursor to the next window in Emacs. Selects a group or a message (both e-mail and news-posting) for reading. If selecting a group, you enter group-mode. From there you may select messages for reading. List all available groups that are available from the server, including those not subscribed to. 
Toggle subscription of the current group (the one under the cursor). accelerates. One obvious way to achieve this is to keep things secret for other parties than the ones you want to exchange information with. The science of enciphering and deciphering of messages in secret code or cipher is called cryptography. The story of cryptography in the information age is quite dramatic. Events of The Past In 1976, Whitfield Diffie and Martin Hellman discovered public key cryptography. Figure 6.3 on the next page shows an overview of how public key cryptography works. Each party of communication has a pair of keys. These are called the private key and the public key. The private one has to be kept secret but the public key may be, and should be, distributed to the parties one wants to communicate with. Information in human readable format is called plaintext6 . When encryption is applied to the plaintext with the help of a public key, the result is a ciphertext that hides the original information. This ciphertext only gives sense to a human after applying decryption with the matching private key. Not even the party who encrypted it is able to decrypt it without the private key. In 1977, Ron Rivest, Adi Shamir, and Len Adleman discovered another, more general, public key system called (RSA), after its authors. They were researchers at the Massachusetts Institute of Technology (MIT) at the time. The U.S. National Security Agency (NSA) tried to stop MIT and Rivest, Shamir and Adleman from publishing their discoveries but were ignored. Then IDEA was developed by Xuejia Lai and James Massey at ETH in Zurich. Then in 1991 the U.S. government introduced the 1991 Senate Bill 266. It was an effort to stop 6 That is, when the information is not hidden from the reader. 38 Chapter 6. Networking Public Key Plaintext Encryption Ciphertext Private Key Decryption Plaintext Figure 6.3: An overview of how public key cryptography works. 
organized crime and terrorism by demanding that all encryption software must have a back door in it. This bill prompted Phil R. Zimmermann to write version 1.0 of his program Pretty Good Privacy (PGP). It implemented RSA encryption combined with IDEA symmetric key cipher. Some friends of Zimmermann uploaded the program to bulletin board systems (BBS) in the USA. The intention was to ensure that PGP was available before the above mentioned law came into effect. This succeeded and today PGP is the de-facto standard for e-mail encryption, but some problems remained. National borders are just speed bumps on the information super highway. — Tim May Until 1999, it was illegal to export strong encryption programs such as PGP electronically from the USA without a special license. To work around this problem the International PGP (PGPi) project was started, and the first legally available version of PGP outside the USA (and Canada) was PGP 5.0i, released in 1997. This work was initiated and lead by Ståle Schumacher Ytteborg, a student at the University of Oslo in Norway at the time. The international version of the PGP program was exported as printed books and then scanned and OCRed to make the code available in electronic form. This way, millions of users worldwide have got access to free, strong cryptography. Using the software Now that we have seen some of the history of PGP, it is time to learn how to use it. As with most other software, how to display the help screen is of high importance. Figure 6.12 on page 42 shows the help screen for pgp (invoked by the command ‘pgp -h’). 6.5. Security 39 Pretty Good Privacy(tm) 2.6.3ia - Public-key encryption for the masses. (c) 1990-96 Philip Zimmermann, Phil’s Pretty Good Software. 1996-03-04 International version - not for use in the USA. Does not use RSAREF. 
Current time: 2001/02/20 08:42 GMT Pick your RSA key size: 1) 512 bits- Low commercial grade, fast but less secure 2) 768 bits- High commercial grade, medium speed, good security 3) 1024 bits- "Military" grade, slow, highest security Choose 1, 2, or 3, or enter desired number of bits: 3 Figure 6.4: The menu of different RSA key sizes in PGP. Generating an RSA key with a 1024-bit modulus. You need a user ID for your public key. The desired form for this user ID is your name, followed by your E-mail address enclosed in <angle brackets>, if you have an E-mail address. For example: John Q. Smith <[email protected]> Enter a user ID for your public key: John Doe <[email protected]> Figure 6.5: The second part of the key generation procedure in PGP, where the user is prompted to enter an user ID. First of all you will need to create a pair of keys. This is done by issuing the command ‘pgp -kg’. It is possible that you will have to create the PGP home directory manually first. This is done with the command ‘mkdir ˜/.pgp’. You will be prompted to select a RSA key size, see figure 6.4, where 512 bits corresponds to a low commercial grade , 768 bits corresponds to a high commercial grade and 1024 bits is named “Military” grade. The larger the key size, the more secure it will be, but the encryption and decryption processes will be slower. You are safe using a key composed of 1024 bits. Next, you will be prompted to enter a user ID for your new public key, see figure 6.5. This should be your real name and your e-mail address. If you do not have an e-mail address, you may use your phone number or some other unique information. You need a pass phrase to protect your RSA secret key. Your pass phrase can be any sentence or phrase and may have many words, spaces, punctuation, or any other printable characters. 
Enter pass phrase: **************************************** Enter same pass phrase again: **************************************** Note that key generation is a lengthy process. Figure 6.6: How PGP ask for a passphrase. The asterisk, ‘*’, is not shown, but inserted by the author. 40 Chapter 6. Networking We need to generate 858 random bits. This is done by measuring the time intervals between your keystrokes. Please enter some random text on your keyboard until you hear the beep: 0 * -Enough, thank you. ..........................**** .......................**** Pass phrase is good. Just a moment.... Key signature certificate added. Key generation completed. Figure 6.7: The last stage of the key pair generation in PGP. This is my secret. Figure 6.8: The contents of the plaintext file ‘test’. When this is done, you will be prompted to enter a passphrase, see figure 6.6 on the page before. The passphrase is used to maintain exclusive access to your private key and it should preferably be longer than a standard user password. Last, the program asks you to enter some random text on your keyboard, see figure 6.7. This is done to accumulate some random bits that are used to generate your key pair. Now, you are ready to start communicating privately on the Internet. One of the things you may use PGP to is to encrypt files. Let us say that the original file, called test, contains the plaintext as shown in figure 6.8, then you can issue the command shown in figure 6.9 on the facing page to encrypt it. The general syntax of that command is ‘pgp -ea file recipient’, where the flag ‘-ea’ specifies that the plaintext file file should be encrypted and that the format of the output file should be ASCII7 . Further, it specifies that the recipient of the cryptogram is recipient. The result will be stored in a file called test.asc which contains the ciphertext, as shown in figure 6.10 on the next page. Now, this ciphertext may be communicated e.g. 
by e-mail to the owner of the public key with which it was encrypted. Most importantly, as mentioned earlier, only the person who has the matching private key is able to decrypt it.

To be able to encrypt a file with the recipient’s public key, you first have to retrieve a copy of it and add it to your public keyring[8]. The recipient may have sent you the key by e-mail, or you may have retrieved it from the recipient’s home page or from a public key server. Once you have obtained a copy of a public key, you may add it to your public keyring by issuing the command ‘pgp -ka keyfile [keyring]’, where keyfile is the file containing the public key and keyring is an optional argument that may be used to specify a keyring file other than the default. When the public key is known to pgp, it may be used to encrypt messages to the recipient.

When receiving an encrypted message (a ciphertext) you may decrypt it with the command ‘pgp -d cryptogram’, as shown in figure 6.11. This produces a plaintext file, here called ‘test’, which is identical to the file in figure 6.8.

[7] This ASCII format is really called ASCII armor, and is the ASCII radix 64 format that PGP uses for transmitting messages over channels that require ASCII data.
[8] A keyring is a file containing a set of public or secret keys.

    nirvana:mtr$ pgp -ea test 'Martin Thorsen Ranang <[email protected]>'
    Pretty Good Privacy(tm) 2.6.3ia - Public-key encryption for the masses.
    (c) 1990-96 Philip Zimmermann, Phil's Pretty Good Software. 1996-03-04
    International version - not for use in the USA. Does not use RSAREF.
    Current time: 2001/02/20 11:51 GMT

    Recipients' public key(s) will be used to encrypt.
    Key for user ID: John Doe <[email protected]>
    1024-bit key, key ID 1CCD4191, created 2000/07/08
    . .
    Transport armor file: test.asc

Figure 6.9: How to encrypt a file.

    -----BEGIN PGP MESSAGE-----
    Version: 2.6.3ia

    hIwDh7G3TxzNQZEBA/9Gt36SvdGgiaqk+HGKgQAjvwI76b2L80gFDMiNPwcGo7Y4
    ozx3opYlP9TVTpPoEyTGRy/dJqcQxEJCsu++MC0EuOtX3jAGhklUX+1ad20EULQh
    Sy9Dd+CBqkxfJkt46gKaqhKA6P6NnAF8XU6AJsPHDlJGCvfnv3IrFTepzo14X6YA
    AAAqN1TN1DuFkmZAI/1s7a7RjsBlOtF6mrcQ+ZHmKsjG8ND8YLokiIRxPraT
    =aNBL
    -----END PGP MESSAGE-----

Figure 6.10: The contents of the ciphertext file ‘test.asc’.

    nirvana:mtr$ pgp -d test.asc
    Pretty Good Privacy(tm) 2.6.3ia - Public-key encryption for the masses.
    (c) 1990-96 Philip Zimmermann, Phil's Pretty Good Software. 1996-03-04
    International version - not for use in the USA. Does not use RSAREF.
    Current time: 2001/02/20 14:13 GMT

    File is encrypted. Secret key is required to read it.
    Key for user ID: John Doe <[email protected]>
    1024-bit key, key ID 1CCD4191, created 2000/07/08

    You need a pass phrase to unlock your RSA secret key.
    Enter pass phrase: Pass phrase is good. Just a moment......
    Plaintext filename: test

Figure 6.11: How to decrypt a ciphertext stored in the file ‘test.asc’.

    Here's a short summary of commands in PGP 2.6.3i:

    Generate new key pair:   pgp -kg [keybits]
    Add key:                 pgp -ka keyfile [keyring]
    Extract key:             pgp -kx[a] userid keyfile [keyring]
    View key(s):             pgp -kv[v] [userid] [keyring]
    View fingerprint:        pgp -kvc [userid] [keyring]
    Check & view in detail:  pgp -kc [userid] [keyring]
    Remove userid or key:    pgp -kr userid [keyring]
                             (Repeat for multiple userids on a key)
    Edit trust params:       pgp -ke userid [keyring]
    Add another userid:      pgp -ke your_userid [keyring]
    Edit passphrase:         pgp -ke your_userid [keyring]
    Sign a key in pubring:   pgp -ks other_id [-u sign_id] [keyring]
    Remove a sig from key:   pgp -krs userid [keyring]
    Revoke, dis/enable:      pgp -kd userid [keyring]

    Encrypt:                 pgp -e[a] textfile TO_id [TO_id2 TO_id3...]
    Sign:                    pgp -s[a] textfile [-u MY_id]
    Sign & encrypt:          pgp -se[a] textfile TO_id [TO_id2 TO_id3...] [-u MY_id]
    Make detached cert:      pgp -sb[a] [+clearsig=on] mainfile [-u MY_id]
                             (Can do binaries)
                             (clearsig=on may be set in CONFIG.TXT)
    Encrypt with IDEA only:  pgp -c textfile
    Decrypt or check sig:    pgp [-d] [-p] cryptogram
                             (-d to keep pgp data, -p for original file name)
    Check detached cert:     pgp certfile [mainfile]
                             (If root of filenames are the same omit [mainfile])

    Use [a] for ASCII output
    Use [-o outfile] to specify an output file
    Use [-@ textfile] to specify additional userids when encrypting
    Use [-z"pass phrase"] to specify your pass phrase
    Use [+batchmode] for errorlevel returns
    Use [f] for stream redirection ( pgp -f[ARGS] <infile >outfile )
    Use [w] to wipe plaintext file (encryption operations)
    Use [m] to force display of plaintext only (no output file)
    Use [t] to alter line endings for unix, etc.

Figure 6.12: The help information for PGP version 2.6.3i.

The above examples are written for PGP version 2.6.3ia. It seems that as of PGP version 5.0, the command line interface for the program has changed. The new version is invoked through different executables for different operations. The command ‘pgp -e’ under version 2.x has become the command pgpe under versions 5.0 and later, i.e. the flag representing the main operation has been made part of the command name. An overview of the changes introduced with version 5.0 is shown in table 6.3.

Table 6.3: An overview of how the command line interface for PGP has changed from version 2.x to version 5.0 and later.

    Version 2.x   Version 5.0 and later   Functionality
    ‘pgp -e’      pgpe                    Encrypt
    ‘pgp -s’      pgps                    Sign
    ‘pgp’         pgpv                    Verify/Decrypt
    ‘pgp -k’      pgpk                    Key management
                  pgpo                    PGP 2.6.2 command-line simulator

Some systems provide an older 2.x version of PGP side by side with a newer version. Where this is the case, the command to invoke the old version is usually pgp2 or pgpo.
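The renaming summarized in table 6.3 can be sketched as a small shell function. This is only an illustration of the table, not part of PGP: the function name pgp5_command is made up for this example, and it covers only the four rows of the table that have a version 2.x counterpart.

```shell
# Hypothetical helper (not part of any PGP release): print the name of
# the PGP 5.0 executable that corresponds to a PGP 2.x flag, following
# table 6.3. Only the rows of that table are handled.
pgp5_command() {
    case "${1-}" in
        -e)  echo pgpe ;;   # 'pgp -e': encrypt
        -s)  echo pgps ;;   # 'pgp -s': sign
        -k*) echo pgpk ;;   # 'pgp -k...': key management
        "")  echo pgpv ;;   # plain 'pgp': verify/decrypt
        *)   echo "no equivalent listed in table 6.3 for '$1'" >&2
             return 1 ;;
    esac
}

pgp5_command -e     # prints: pgpe
pgp5_command -ka    # prints: pgpk
pgp5_command        # prints: pgpv
```

Combined flags such as ‘-se’ are deliberately rejected, because the table does not say how they map onto the new executables.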
6.5.2 The Secure Shell

The Secure Shell (SSH) is a program for logging into a remote computer and for executing commands on remote computers. It is a secure replacement for several older networking programs such as rsh, rlogin, rcp, telnet, rexec and ftp. Many users of these programs might not realize that their password is transmitted across the Internet in plaintext, where it is very easy for others to pick up, e.g. by using a sniffer. This is called eavesdropping. SSH encrypts all network communications (including passwords) to effectively eliminate eavesdropping, connection hijacking, and other network-level attacks. It also provides various levels of authentication depending on your needs. These secure connections are often called secure channels. With SSH you can also do secure forwarding of TCP/IP ports and X11 connections.

The Secure Shell comprises a server, named sshd, and a client program named ssh. As a user, you do not need to worry about the server. However, if you want to do a secure login into a remote computer, that computer has to run sshd; if it does not, you cannot connect, no matter how hard you try.

Before you can start to use the Secure Shell package, you must create a pair of authentication keys. This is done by issuing the command ssh-keygen at the command line. An example is shown in figure 6.13. The first information you are asked for is where to store the keys. It is safe to just use the default, which is suggested in parentheses by the program. The next step is to enter a passphrase. As with the passphrase used to generate your Pretty Good Privacy key, this should be longer than a usual password. If the passphrase is not long enough, the protection offered by the Secure Shell is strongly reduced, because the passphrase will be much easier to guess in a brute-force attack. In figure 6.14, asterisks are used to denote a hypothetical passphrase.
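To get a feeling for why length matters against brute force, the following sketch estimates the size of the search space in bits. It rests on an assumption that is not from the text: every character is drawn uniformly at random from the 95 printable ASCII characters. Phrases chosen by people are far more predictable, so the numbers are upper bounds at best. The function name passphrase_bits is made up for this example.

```shell
# Rough upper bound on the brute-force search space, in bits.
# Assumption: each character is chosen uniformly at random from the
# 95 printable ASCII characters, so a phrase of n characters has
# 95^n possibilities, i.e. n * log2(95) bits.
passphrase_bits() {
    # $1 = length of the passphrase in characters
    awk -v n="$1" 'BEGIN { printf "%.1f\n", n * log(95) / log(2) }'
}

passphrase_bits 8     # a short, password-like length
passphrase_bits 30    # a longer passphrase
```

Under this assumption each extra character adds about 6.6 bits, so every added character multiplies the number of candidates an attacker has to try by 95.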
Now, if you type the same passphrase twice, you are informed that your identity has been saved, and you are shown what your public key looks like, see figure 6.15.

    nirvana:mtr$ ssh-keygen
    Initializing random number generator...
    Generating p: ........++ (distance 148)
    Generating q: .......++ (distance 88)
    Computing the keys...
    Testing the keys...
    Key generation complete.
    Enter file in which to save the key (/home/mtr/.ssh/identity):

Figure 6.13: Creating a pair of Secure Shell authentication keys.

    Enter passphrase: ******************************
    Enter the same passphrase again: ******************************

Figure 6.14: Entering the Secure Shell passphrase.

    Your identification has been saved in /home/mtr/.ssh/identity.
    Your public key is:
    1024 35 148382454971274366643698561992246954340984185524810155
    25528070692901891133535097888259026158250414905475283935768948
    74820107972119013329770664272140610825833371135552852514114489
    22519654745970588239128638132648829244055882961906832025639706
    06654702203207043026398667511091996615480955584629921032222206
    4715641 [email protected]
    Your public key has been saved in /home/mtr/.ssh/identity.pub

Figure 6.15: Secure Shell key generation confirmation.

To log into a computer with the Secure Shell client you issue one of the following commands:

    ssh [-l login_name] hostname

or

    ssh login_name@hostname

where hostname is the name of the remote host and login_name is the name of the remote user account you want to log into. If the login name of the user is the same on both computers, you do not have to specify the login_name.

If you want to copy a file securely over the network, you should use scp (secure copy), which is part of the SSH package. It copies files between hosts on a network and uses ssh for data transfer. It also uses the same authentication and provides the same security as ssh. To copy a file you can issue the command scp [[user@]host1:]filename1...
[[user@]host2:]filename2 where user and host specify the user account and computer to be used, and filename specifies which file to copy from and to. Any file name may contain a host and user specification to indicate that the file is to be copied to or from that host. This way it is also possible to copy files between two remote hosts.

A program called ssh-agent can make life much easier when you work in a computer network. This agent represents you, and does the necessary authentication for you every time you log into a remote computer, provided that the remote computer has been configured accordingly. To make this work, you have to add your local SSH public identity, $HOME/.ssh/identity.pub, to a file called $HOME/.ssh/authorized_keys on the remote computer. This can be done with the command

    cat ~/.ssh/identity.pub | \
        ssh user@host "cat >> .ssh/authorized_keys"

When this is set up, there are mainly two different methods to start the ssh-agent. The first, and the preferable one if you are using X, is to start ssh-agent in your $HOME/.xsession[9] file. This can be done by adding a line like

    eval `ssh-agent` && ssh-add &

to this file. The command ssh-add adds identities to the authentication agent. When this program starts up, you are prompted for your passphrase. After you have entered your passphrase, your agent is able to authenticate you when you log into remote hosts that know your public identity; how to distribute it is described above. You should only need to type in your passphrase once per session.

If you are not using X, the following command should be appropriate to start the agent and add an identity:

    eval `ssh-agent` && ssh-add

Note the absence of the ‘&’: the command now prompts for input from the terminal and thus cannot run in the background.

[9] This file is sometimes called $HOME/.xinitrc. If this is the case for your system, it can be fixed with a symbolic link.
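The symbolic link mentioned in the footnote can be sketched as follows. To avoid touching real start-up files, the example works in a scratch directory; the variable demo_home is a made-up stand-in for $HOME. It also assumes that your system reads .xinitrc while you keep your settings in .xsession; if it is the other way around, swap the two names.

```shell
# Work in a throwaway directory so that no real dotfiles are changed.
# demo_home is a hypothetical stand-in for $HOME.
demo_home=$(mktemp -d)

# A start-up file holding the agent line described in the text.
# The single quotes keep the backquotes literal; nothing is executed.
printf '%s\n' 'eval `ssh-agent` && ssh-add &' > "$demo_home/.xsession"

# Make .xinitrc a symbolic link to .xsession, so that both names refer
# to the same file. The target is relative to the link's own directory.
ln -s .xsession "$demo_home/.xinitrc"

ls -l "$demo_home/.xinitrc"   # lists the link and the file it points to
cat "$demo_home/.xinitrc"     # shows the contents of .xsession
```

On a real account the same link would be created with ‘ln -s .xsession .xinitrc’ in your home directory, after checking that no .xinitrc file already exists there.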
Chapter 7
The X Window System

If you can’t make it good, at least make it look good.
— Bill Gates, Microsoft Founder

The X Window System, often simply referred to as X or X11, is a network transparent window system which runs on a wide range of computing and graphics machines. X is the native windowing system on all versions of Unix and VMS. There are also X servers available for both Microsoft Windows and OS/2. Some X servers are commercial, in contrast to the freely redistributable open source implementation XFree86. The X Window System is the software layer between the hardware and the graphical user interface (GUI).

The X Window System is based on a client/server architecture; it comprises an X server and X clients. The server runs on a computer with a bitmap display, i.e. a monitor. It distributes user input to, and accepts output requests from, various client programs through interprocess communication. Most commonly the client programs run on the same computer as the X server, but it is possible to run the clients transparently from other computers. The remote programs may run on machines with different hardware architectures and operating systems.

The X Window System supports overlapping hierarchical sub-windows and text and graphics operations. The background of the X Window System screen is often referred to as the root window.

7.1 Window Managers

The layout of windows on the screen is controlled by special programs called window managers. Although many window managers will honor geometry specifications as given, others may choose to ignore them (requiring the user to explicitly draw the window’s region on the screen with the pointer, for example). The window manager also controls the appearance of the borders of your windows, their behavior and all user interaction, including positioning, killing, resizing, moving, iconifying, shading, etc. of your windows.
Some window managers also support the use of virtual desktops and menus attached to windows. In addition they usually provide access to menus in the root window and can control the background images used for your desktop.

Figure 7.1: An example Unix desktop.

Window managers are regular client programs, so a variety of different user interfaces can be built. An example of a very small and simple window manager is the Tab Window Manager (TWM). This window manager supports overlapping windows, pop-up menus, point-and-click or click-to-type input models, title bars and icons. There is also a large set of other window managers for the X Window System. Among these are AfterStep, F(?) Virtual Window Manager (FVWM)[1], F(?) Virtual Window Manager 2 (FVWM2), WindowMaker and Enlightenment (E). There are many other window managers that could be mentioned here, but this gives a taste of the flavors available.

Which window manager you prefer is a matter of taste. TWM is a small and fast window manager. It does not look too good, but if you are running X on a slow computer, TWM would be a rational choice. On the other hand there are some window managers that can be configured to look very nice.

7.1.1 Enlightenment

Enlightenment is a window manager written by Carsten Haitzler and Geoff Harrison et al. Enlightenment is a large and quite complex program. It has evolved into a satisfactorily stable window manager with many advanced features. It includes lots of eye candy and special effects. This can be very distracting at times, but Enlightenment is designed to be highly configurable; therefore you can configure it to behave exactly as you like. Figure 7.1 shows a Unix desktop running Enlightenment.

7.2 Desktop Environments

A desktop environment is not a window manager, but it may include the functionality of a window manager, or it may coexist with one.
The desktop environment is a metaphor referring to desktops in a real office. A desktop environment may include a file manager, a “panel”[2] for task switching, launching programs, and docking windows, a “control center” for configuration, and several smaller bells and whistles. These programs hide the traditional Unix shell behind an easy-to-use graphical interface.

7.3 GNOME

GNOME, see figure 7.2(a), is a free (or “open source”) software development project started in 1997 by Miguel de Icaza of the Mexican Autonomous National University and a small team of programmers from around the world. The GNOME Project has built a complete free and easy-to-use desktop environment. It is also a very powerful application framework for the software developer. GNOME is an acronym for “GNU Network Object Model Environment” and, as the name reveals, it is part of the GNU Project.

[1] Rob Nation (the original author of FVWM) does not really remember what the F stood for originally, so we have several potential answers: Feeble, Fabulous, Famous, Fast, Foobar, Fantastic, Flexible, F!@#$%, Flashy, FVWM (the GNU recursive approach), Free, Final, Funky, Fred’s (who the heck is Fred?), Freakin’, Flawed, Father-of-all, Feivel (the mouse from “An American Tail”), etc.
[2] The author often refers to this as the handicap bar, among friends. It takes up space on the screen, and it cannot do anything you cannot manage with usual window manager operations and an Xterm.

Figure 7.2: The logos of GNOME and KDE. (a) The GNOME logo. Courtesy of Tuomas Kuosmanen. (b) The KDE logo. Courtesy of The KDE Project.

7.4 KDE

The Kool Desktop Environment (KDE), see figure 7.2(b), is a network transparent contemporary desktop environment for Unix workstations. KDE seeks to fill the need for an easy to use desktop for Unix workstations, similar to the desktop environments found under the Mac OS or Microsoft Windows 95/NT.
The KDE Project is based on a GUI toolkit for the Unix platform called Qt. Qt is developed by a Norwegian company called Trolltech. KDE is very similar to the GNOME approach, but when the project first started, the Qt library was released under a non-free license. One of the consequences of this was that KDE applications were not “free” in the GNU sense. This changed on September 4, 2000, when Trolltech announced that Qt as of version 2.2 would be licensed under the GPL.

The main difference between KDE and GNOME is that KDE resembles Microsoft Windows 95/NT much more than GNOME does. So which one you prefer is simply a matter of taste.

7.5 Operating in the X Window System

Most users are accustomed to using graphical user interfaces. But there are a few points to note when using a Unix system. The X Window System usually expects a mouse with three or more buttons. If you have a mouse with only two buttons, you can emulate the third one by pressing both buttons simultaneously.

In some operating systems it is usual practice to double-click on graphical items when you want something to happen. Under Unix, how this is handled is decided by the window manager, and it is common to require only a single click to perform an action. In KDE, it is possible to configure this behavior, but the default is to use double-clicks.

In addition you should note that to copy text from one window to another, you only have to mark the text in question[3]. Then all you have to do is either press the middle mouse button or press ‘Shift-Insert’ on the keyboard. So, there is no need to specify that you want to copy the marked area before you can paste it. Again, the X Window System way of doing this may feel a little awkward to start with, but if you think about it there really are no good reasons to do it any other way[4].
Most users should find their way around the desktop pretty fast, but it is worth mentioning that, depending on which window manager you are using, the main menus are either available from a button on the handicap bar or appear when you press different mouse buttons on the background of the desktop.

[3] This is usually done by moving the X Window System cursor to the start of the area you want to mark, pressing the button down and moving to the end of the area before you release the button.
[4] What other reasons are there to mark an area of text?

7.5.1 The Xterm

One of the most important X Window System programs is the Xterm. It is a terminal emulator for X and it is usually found in one of the main menus. It can also be started from an already existing terminal with the command xterm.

Many new users of Unix prefer to execute all programs from a menu somewhere, but if you do this you will not get access to the full power of Unix. All programs that can be started from a menu can certainly be started from a terminal, but not vice versa. In addition, it is not possible to use redirection and filtering in the same way, and with the same elegance, graphically as on the command line. So, try to make it a rule to always have at least one Xterm, or a similar program, on your desktop.

Chapter 8
Applications

It has become appallingly obvious that our technology has exceeded our humanity.
— Albert Einstein (1879–1955)

There is a large set of higher level applications available in the Unix domain. To comment on them all here would be meaningless and not worth the effort. However, there are some applications that are often requested. Among these is a set of office applications, i.e., programs to write documents, create spreadsheets and manipulate graphics. Under Unix, there are several packages, or stand-alone programs, that may help you when performing such tasks.
Two of these are WordPerfect Office and StarOffice, which provide tools to perform all the standard tasks in an office system.

8.1 LaTeX

Beware of bugs in the above code; I have only proved it correct, not tried it.
— Donald E. Knuth, on a five-page memo entitled “Notes on the van Emde Boas construction of priority deques: An instructive use of recursion.”

LaTeX is a high-quality typesetting system, with features designed for the production of technical and scientific documentation. LaTeX is the de facto standard for the communication and publication of scientific documents. LaTeX is a macro package based on Donald E. Knuth’s typesetting language TeX. Knuth started his work on TeX in 1977 to explore the potential of the digital printing equipment that was beginning to infiltrate the typesetting industry at the time. In particular, he hoped to reverse the deteriorating quality he saw affecting his own books and articles. LaTeX was first developed in 1985 by Leslie Lamport, and is now being maintained and developed by the LaTeX3 Project.

The Fragrance of Unix was written using Emacs and typeset using LaTeX 2ε. Figure 8.1 shows a simple example of a LaTeX document. Commands in LaTeX start with a leading ‘\’. The arguments to the commands (enclosed in angle brackets in the example) should all be substituted with more meaningful phrases.

    \documentclass[norsk,a4paper,twocolumn]{article}
    \usepackage{a4wide}
    \usepackage[T1]{fontenc}
    \usepackage{babel}

    \author{<Author Name>}
    \title{<Title>}

    % This document will be typeset in two columns in the form of an
    % article.

    \begin{document}
    \maketitle

    \section{<First Section>}
    Some text should go here\ldots % Comments start with a ‘%’.

    \subsection{<First Subsection>}
    And you could go on and on and on and on\ldots

    \end{document}

Figure 8.1: An example of what a LaTeX file may look like.

To learn more about
LaTeX, you should visit The LaTeX Project on the World Wide Web, where there are many freely available introductions to LaTeX. For an even better guide to this typesetting system, you should read (Goossens et al., 1994), which is a comprehensive guide to LaTeX.

To create a LaTeX document, you first write it in an editor. Then you compile the document using latex, elatex, lambda or pdflatex, which produce output files in different formats. If you use the plain latex command, a device independent file (DVI) is produced. This file may be viewed with the command xdvi and it may be converted to a PostScript file with the command dvips.

8.2 The GIMP

The GIMP is short for the GNU Image Manipulation Program. It is a program suitable for such tasks as photo retouching, image composition and image authoring. It is possible to extend the program by adding plug-ins and scripts that are available online[1]. Figure 8.2 shows the main menu of the GIMP.

The graphics for The Fragrance of Unix have been prepared using the GIMP and Xfig. The screen shots were done using the GIMP, while the schematic figures were created in Xfig.

[1] Please see section A.1 for more information.

Figure 8.2: The main menu window of the GIMP.

8.3 GNOME Office

GNOME Office is a meta-project, with the mission to coordinate productivity applications for the GNOME Desktop. The GNOME office suite is not defined by an arbitrary, fixed number of applications. The suite is defined by the underlying technologies of GNOME. By permitting multiple applications in several categories, users can select the application most suited to their needs. Table 8.1 shows the main categories of GNOME Office and the applications that represent each of them.

Table 8.1: Some applications available under the GNOME Office umbrella.
    Spreadsheets
        Gnumeric       A powerful spreadsheet application.
        OpenCalc       The OpenOffice spreadsheet.
    Word Processors
        AbiWord        A popular multi platform word processor.
        OpenWriter     The OpenOffice word processor.
    Communications
        Gfax           Allows you to easily send and receive faxes.
    Browsing
        Galeon         A fast and standards compliant web browser.
    Vector Graphics
        Sodipodi       A Vector drawing package.
        OpenDraw       The OpenOffice drawing application.
        Sketch         A Vector drawing package.
    Image Viewers
        Eye of GNOME   An image viewer.
    Raster Graphics
        GIMP           An extremely powerful and versatile image editing program.
    Email/Group Ware
        Balsa          A flexible and powerful email client.
        Evolution      An integrated calendaring, email application and personal information manager.
    Plotting
        Guppi          A plotting and graphing program.
    Diagraming
        Dia            A structured diagrams program similar to Visio.
    Project Management
        Toutdoux       A tool for project management.
    Finance
        GnuCash        A personal finance manager.
    Presentation
        Impress        The OpenOffice presentation program.
    Database Tools
        GNOME-DB       Provides database connectivity.

Chapter 9
Continuing On Your Own

If A equals success, then the formula is: A = X + Y + Z, X is work. Y is play. Z is keep your mouth shut.
— Albert Einstein (1879–1955)

So far, The Fragrance of Unix has hopefully given you an informative introduction to Unix systems and how to manage typical tasks. This chapter is meant to be your guide to a state of autonomy. Here you will learn how to get help and how to find more information about the problems you may stumble across in a Unix system.

One of the main keys to mastering any computer system is not to be afraid of it. You have to dare to experiment and to gain knowledge of the system. It may seem scary that the Unix system is so unforgiving about errors and mistakes made by the user, i.e. you get few or no warnings.
But please do not focus too much on that aspect; concentrate on the bright side: you will hardly ever be able to break anything in the Unix system as a regular user, and if you do something wrong, chances are that it will only affect you and your files. So, do not be afraid to play with different commands and applications, but remember to make backups of your work[1].

[1] In many large networks, e.g. at universities, backups are made automatically, without users knowing it.

9.1 The Manual Pages

As mentioned in chapter 5, large parts of the Unix system and its accompanying tools are documented on-line in the man pages. To look up the manual page for a command or concept, just type ‘man’ followed by the name of the command or concept at the command line. Figure 9.1 shows the top of the manual page for man, started with the command ‘man man’.

Table 9.1 shows the different categories into which the manual pages are divided. These categories are called sections. There may be different man pages with the same name, residing in different sections. If this is the case, you can specify which section’s man page you want to see by issuing the command ‘man <section> <name>’.

    man(1)                                                      man(1)

    NAME
           man - format and display the on-line manual pages
           manpath - determine user's search path for man pages

    SYNOPSIS
           man [-acdfFhkKtwW] [-m system] [-p string] [-C config_file]
           [-M path] [-P pager] [-S section_list] [section] name ...

    DESCRIPTION
           man formats and displays the on-line manual pages. This
           version knows about the MANPATH and (MAN)PAGER environment
           variables, so you can have your own set(s) of personal man
           pages and choose whatever program you like to display the
           formatted pages. If section is specified, man only looks
           in that section of the manual. You may also specify the
           order to search the sections for entries and which
           preprocessors to run on the source files via command line
           options or environment variables. If name contains a /
           then it is first tried as a filename, so that you can do
     line 1

Figure 9.1: The top of the manual page for man.

Table 9.1: The different sections of the online manual pages.

    Section   Description
    0         Introduction.
    1         Commands.
    2         System calls.
    3         Libraries.
    4         Devices.
    5         File formats.
    6         Games.
    7         Macros and language conventions.
    8         Administration.

Sometimes you may not know the exact name of the man page you are looking for, and you may therefore feel a little bit lost. But do not give up. The command apropos (equivalent to ‘man -k’) may help you. This command searches a set of database files containing short descriptions of system commands for keywords and displays the result on the standard output. Figure 9.2 shows an example where both ‘man -k’ and apropos are used to look up a man page related to the word coffe.

    nirvana:unix_intro$ man coffe
    No manual entry for coffe
    nirvana:unix_intro$ man -k coffe
    c (1) - genericised soft drink generator (ie coffee, coke etc)
    nirvana:unix_intro$ apropos coffe
    c (1) - genericised soft drink generator (ie coffee, coke etc)

Figure 9.2: Examples of how to use the commands apropos and man -k.

The man pages are usually displayed in the terminal using more or less. This means that to scroll up and down, or to search for certain parts of the document, you can use the same commands as in these programs. You may also view man pages from within Emacs. This is done with the command ‘M-x man [man page] ENTER’. There is also an X Window System application for viewing man pages, called Xman. It is started with the command xman.

9.2 The Info Program

Info is a program for reading documentation.
Just as for the man pages, many program packages supply documentation about themselves on-line, but some packages use the format read by Info. Info is a hypertext documentation system. This means that you may follow references to documentation elsewhere by following links in an Info page[2].

To start Info, you may just enter ‘info’ at the terminal. Info then starts at the directory node, where you can see a list of all major topics available. To select a topic, type ‘m’ followed by the name of the topic. You may also specify an Info node to view at the command line. This is done by issuing a command like ‘info gcc’, where gcc is the name of the topic.

Emacs has very good support for Info. In fact Info is integrated into Emacs as a major mode. To start it, just type ‘C-h i’.

9.3 Other Sources of Information

There are a lot of other sources of information. Of course, there is a lot of information available on the World Wide Web. Many good books and articles have also been written on subjects concerning the Unix system. You should be able to get your hands on a couple of good books by visiting a well equipped library or bookstore[3].

[2] Just like you do when you follow a link on the Internet.
[3] A couple of books and articles that might be of interest for further studies are listed under References.

Appendix A
Resources

As mentioned earlier in this book, one of the most important skills any user of a computer can learn is where to get help and where to find more information about a topic. The goal of this appendix is to inform the reader of some very central and important sources of information.

A.1 On the World Wide Web

Table A.1 shows a list of URLs to resources on the World Wide Web and a short description of each resource[1].

A.2 Printed Matter

All the books listed in the references are well worth reading if you want to learn more about certain parts of the Unix system. The articles written by Dennis M.
Ritchie and Ken Thompson are very concise in their description of the Unix operating system.

[1] The URLs were checked and valid on 28th August 2003.

Table A.1: The URL and a short description of each online resource.

    ftp://ftp.kernel.org/pub/linux/kernel/
        The Linux kernel archives, provided by Transmeta Corporation.
    http://linuxguiden.linpro.no/
        A nice GNU/Linux portal in Norwegian.
    http://www.bell-labs.com/history/unix/
        Web pages at Bell Labs that tell the history of Unix.
    http://www.freebsd.org/
        Official home page of the FreeBSD OS.
    http://freshmeat.net/
        Freshmeat maintains the Web’s largest index of Linux and Open Source software.
    http://www.gimp.org/
        The home page of the GIMP. It contains information about downloading, installing, using, and enhancing GIMP.
    http://www.gnome.org/
        The official home page of the GNOME project.
    http://www.gnu.org/
        The official home page of the GNU Project.
    http://www.gnus.org/
        The home page of the GNUS Newsreader.
    http://www.latex-project.org/
        The home page of the LaTeX project.
    http://www.linpro.no/
        The home page for Linpro AS, a Norwegian GNU/Linux consulting company where the author is employed.
    http://www.linux.com/
        Linux.com’s mission is to enrich the Linux community by providing a centralized place for individuals of all experience levels to learn (and teach) the power and virtues of the Linux Operating System.
    http://www.netbsd.org/
        Home page of the NetBSD Project.

Bibliography

Goossens, M., Mittelbach, F., and Samarin, A. (1994). The LaTeX Companion. Tools and Techniques for Computer Typesetting. Addison-Wesley.

Green, F. E. (1999). Brain and learning research: Implications for meeting the needs of diverse learners. Education, 119(4):682–688. ISSN: 00131172.

Kernighan, B. W. and Pike, R. (1999). Regular expressions; languages, algorithms, and software. Dr. Dobb’s Journal, 24(4):19–22.

Preece, J., Rogers, Y., Sharp, H., Benyon, D., Holland, S., and Carey, T. (1994).
Human-Computer Interaction, chapter Knowledge and Mental Models, pages 130–139. Addison-Wesley.

Raymond, E. S. (1999). The Cathedral & the Bazaar; Musings on Linux and Open Source by an Accidental Revolutionary. O’Reilly & Associates, 1st edition.

Ritchie, D. M. (1979). The evolution of the UNIX time-sharing system. In Proc. of Symp. on Language Design & Programming Methodology, Sydney. Also in BLTJ, 63 (8, Part 2), pp. 1897–1910, October, 1984.

Ritchie, D. M. and Thompson, K. (1974). The UNIX time-sharing system. Comm. Assoc. Comp. Mach., 17(7):365–375.

Stallman, R. (1998). The GNU Emacs Manual; for Emacs Version 20.3. The Free Software Foundation, 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA, thirteenth edition. ISBN: 1-882114-06-X.

Westerman, S. J. (1997). Individual differences in the use of command line and menu computer interfaces. International Journal of Human-Computer Interaction, 9(2):183–198.

Command Index

a2ps, 11
apropos, 59
awk, 23

bash, 26
bg, 9, 28

cat, 10, 29
cd, 7, 12, 26
chgrp, 30
chmod, 30
chown, 30
chsh, 26
cp, 29, 30

dvips, 29, 54

egrep, 29
elatex, 54
exit, 28

fg, 9, 28
fgrep, 29
file, 29
find, 23, 26
for, 27
ftp, 32–34, 43

gcc, 59
grep, 11, 22, 23, 29
gunzip, 29, 30
gv, 29
gzip, 29, 30

info, 59
init, 2, 3

jobs, 28

kill, 28, 29
killall, 28, 29

lambda, 54
latex, 54
less, 10, 29, 59
ln, 8, 29
locate, 26
lpq, 26
lpr, 11, 26
ls, 7, 26
lynx, 32

man, 57
mkdir, 7, 26
mkpasswd, 27
more, 10, 29, 59
mosaic, 32
mount, 29
mozilla, 32
mv, 29, 30

ncftp, 32–34
netscape, 32
nl, 28

pdflatex, 54
pgp, 38, 40
pgp2, 43
pgpe, 43
pgpk, 43
pgpo, 43
pgps, 43
pgpv, 43
pine, 33
ps, 9, 10, 28, 29
pwd, 7

rcp, 32, 43
rexec, 32, 43
rlogin, 31, 32, 43
rm, 29, 30
rsh, 32, 43

scp, 45
sed, 22
sh, 26
sort, 29
ssh, 43, 45
ssh-add, 45
ssh-agent, 45
ssh-keygen, 43
sshd, 43

tar, 29, 30
telnet, 31, 43
top, 28
touch, 26
tr, 10, 29
type, 29

uniq, 29
until, 27

vi, 17
vim, 17

which, 29
while, 27

xdvi, 29, 54
xhost, 32
xman, 59
xterm, 51

zcat, 29

Index
AbiWord, 56
access permissions, 8
    mode, 8
Adleman, Len, 37
Advanced Research Projects Agency Network, 31
Afterstep, 49
anonymous FTP, 33
appending redirected output, 28
ARPANET, see Advanced Research Projects Agency Network
Awk, 22

background, 9
Balsa, 56
BASH, see Bourne-Again Shell
Bell Labs, 1
    URL, 62
booting up, see bootstrapping
bootstrapping, 2
Bourne-Again Shell, 26
broken links, 8
browser, 32
BSD, 1

character class, 20
ciphertext, 37
command completion, 27
command line, 9
    interface, 11, 13
conditional constructs, 26
cryptography, 35, 37

de Icaza, Miguel, 49
DEC, see Digital Equipment Corporation
decryption, 37
desktop environment, 49
device independent file, 54
Dia, 56
Diffie, Whitfield, 37
Digital Equipment Corporation, 1
Digital Unix, 1
directory, 7
DVI, see device independent file

E, see Enlightenment
eavesdropping, 43
editor, 15
Emacs, 15, 33, 53, 59
    echo area, 15
    frame, 15
    menu bar, 15
    Meta key, 16
    minibuffer, 15
    mode line, 16
    modifiers, 16
    point, 16
    window, 15
encryption, 37
Enlightenment, 49
environment, 3
    values, 3
    variable, 9
    variables, 3
escaping, 20
    unescaped, 20
Evolution, 56
Eye of GNOME, 56

F(?) Virtual Window Manager, 49
F(?) Virtual Window Manager 2, 49
file descriptors, 10
file path, 7
    absolute path, 7
    relative path, 7
file system, 6, 12
File Transfer Protocol, 32
filename completion, 27
filter, 11
foreground, 9
fork, 3, 9
free, 1
free software, 4
FreeBSD, 1
Freshmeat, 62
FTP, see File Transfer Protocol
FVWM, see F(?) Virtual Window Manager
FVWM2, see F(?) Virtual Window Manager 2

Galeon, 56
Gfax, 56
GIMP, 54, 56, 62
GNOME, 49
    URL, 62
GNOME Office, 54
GNOME-DB, 56
GNU, 4
GNU/Linux, 1
GnuCash, 56
Gnumeric, 56
GNUS, see Gnus Network User Services
Gnus Network User Services, 33
graphical user interface, 47
group ID, 3
GUI, see graphical user interface
Guppi, 56

Haitzler, Carsten, 49
hard link, 8
Harrison, Geoff, 49
Hellman, Martin, 37
history, 26

I/O devices, 8
Impress, 56
Info, 59
info directory node, 59
inode, 6, 8
International PGP, 38
Internet, 31
interprocess communication, 47

KDE, see Kool Desktop Environment
kernel, 2
keyring, 40
kill, 28
Knuth, Donald E., 53
Kool Desktop Environment, 50

Lai, Xuejia, 37
Lamport, Leslie, 53
LaTeX, 53
LaTeX3 Project, 53
linking, 8
Linux, 1
literal, 20
login, 3
login shell, 26
looping constructs, 26
Lynx, 32

man pages, 25, 57
Massey, James, 37
mental model, 11
    functional mental model, 12
    structural mental model, 12
metacharacters, 20
Microsoft Windows, 47
Mosaic, 32
mounting, 30
mouse, 50
Mozilla, 32

Nation, Rob, 49
National Science Foundation, 31
NcFTP, 32
NetBSD, 1
Netscape, 32
Netscape Communicator, 32
Netscape Navigator, 28, 32
NFSA, see non-deterministic finite state automata
non-deterministic finite state automata, 21
Norwegian University of Science and Technology, 35
NSF, see National Science Foundation
NTNU, see Norwegian University of Science and Technology

octal number, 8
open source, 3
OpenBSD, 1
OpenCalc, 56
OpenDraw, 56
OpenWriter, 56
Opera, 32
Opera Software, 32
operating system, 1
    OS, 1
ordinary files, 6
OS/2, 47

PATH, 9
path, see file path
PDP-11, 1
PDP-7, 1
Perl, 22
PGP, see Pretty Good Privacy
PGPi, see International PGP
pid, 29
Pine, 33
pipes, 11
plaintext, 37
POSIX, 1
PostScript, 11, 54
Pretty Good Privacy, 38–42
private key, 37
process, 2
process ID, 29
public key, 37, 40
public key cryptography, 37
public key server, 40
Python, 22

Qt, 50

RE, see regular expression
recall, 12
recognition, 12
redirection, 10
regexps, see regular expressions
regular expressions, 19
    grouping construct, 21
Ritchie, Dennis M., 1, 61
Rivest, Ron, 37
root, 3, 29
root directory, 7
RSA, 37

Secure Shell, 32, 43
Sed, 23
semantic content, 12
SGI IRIX, 1
Shamir, Adi, 37
shell, 2, 9
Sketch, 56
sniffer, 43
Sodipodi, 56
spatial intelligence, 12
spawning, 3
special files, 8
SSH, see Secure Shell
Stallman, Richard, 4
standard I/O, 9
    error, 9
    input, 9
    output, 9
StarOffice, 53
stop, 9
storage devices, 30
Sun Solaris, 1, 29
SunOS, 1
symbol name completion, 27
symbolic link, 8, 12
syntactic form, 12
SYSV, 1

Tab Window Manager, 49
tab-completion, 17, 26
terminal, 9
TeX, 53
Thompson, Ken, 1, 61
Torvalds, Linus, 1
Toutdoux, 56
Trolltech, 50
truncated, 28
TWM, see Tab Window Manager

U.S. Department of Defense, 31
University of Oslo, 38
University of Washington, 33
Unix, 1
user ID, 3

variable name completion, 27
Vi, 17, 18
Vim, 17

WindowMaker, 49
WordPerfect Office, 53
World Wide Web, 31, 32
WWW, see World Wide Web

X, see X Window System
X Window System, 15, 47, 59
    client, 47
    server, 47
X11, see X Window System
Xfig, 54
XFree86, 47
Xman, 59
Xterm, 51

Ytteborg, Ståle Schumacher, 38

Zimmermann, Phil R., 38