Week Thirteen Online Notes



Introduction:

Link of the week

The Common Gateway Interface (CGI)




CGI Scripts and PERL

Tommy Yip

The University of Calgary

Calgary, AB

 ABSTRACT

This paper highlights the usage of CGI Scripts and PERL. The Common Gateway Interface (CGI) is a standard for interfacing external applications with information servers, such as HTTP or Web servers. It provided the first means by which these information servers could be extended to perform new or dynamic behaviors beyond serving HTML files. One of the languages of choice for CGI processing is the Practical Extraction and Reporting Language (PERL). PERL is used often because it is specifically designed to take apart multiple text files and format them nicely, making it exceptional for writing HTML. Users benefit from a consistent, powerful, and usable interface environment that can do just about anything web browsers are able to handle. This illustrates the practical usefulness of CGI Scripts and PERL in the demanding interactive World-Wide Web.

INTRODUCTION

The World-Wide Web is a distributed hypermedia information network. Users navigate through the information in mainly static but context-sensitive ways with browsing tools. Browsers are client programs that run on the user's local machine, request information from server programs on remote machines, and display the information to the user. Documents for the World-Wide Web are usually written in the Hypertext Markup Language (HTML). Support for interactive applications on the World-Wide Web is provided by the Common Gateway Interface (CGI). This interface allows transferring information from the browser back to the server, processing the information by programs invoked by the server on the remote machine, and sending the results back to the browser. With this interface users have the choice of using a number of programming platforms/languages, depending on what is available on the system. A popular choice is the Practical Extraction and Reporting Language (PERL) for CGI applications. PERL is an interpreted language optimized for scanning arbitrary text files, extracting information from those text files, and printing reports based on that information.

This paper presents CGI and PERL together. It first covers the basics and specifics of the Common Gateway Interface, then looks at some PERL basics, and finally describes issues of overhead and security.

COMMON GATEWAY INTERFACE

A DESCRIPTION

The main components of the Common Gateway Interface can be understood from its name. "Common" assures the user that CGI can be used from many languages and can interact with many different types of systems, so the user is not limited to doing things one way. "Gateway" suggests that CGI's strength lies not only in what it does by itself but in the potential access it offers to other systems such as databases and graphic generators. Finally, "Interface" means that CGI provides a well-defined way to call up its features; in other words, the user can write programs that use it.

WEB SERVER INTERACTION

All CGI scripts interact with a web server the same way. Someone reading an HTML page with a web browser invokes a specific CGI script at a specific web server through its Uniform Resource Locator (URL). The HTML page may contain a form to gather some information from the reader. The web server at that location gets the web browser's request and relays the user-defined inputs to the CGI script. The CGI script wakes up and reads any inputs provided to it by the web server. The CGI script sees to it that the request is carried out. A response, or the results of the request, is sent back to the web server that asked for the behavior. These results may be in the form of a new HTML file, a graphics file, a text file: basically anything that a web browser can read and parse. The CGI script can even write a custom HTML file (dynamic HTML). The web server then takes the results from the CGI script and sends them back to the web browser. Finally, the user of the web browser sees (or hears) the results.

The manner (or Protocol) in which the web server passes information to the CGI script and the manner (or Protocol) in which the CGI script returns the results to the web server are fixed and totally described in a Common Gateway Interface standard. Anyone who builds a commercial, shareware or freeware HTTP/Web server supporting CGI follows this Common Gateway Interface standard.

THE BIG PICTURE: WHERE DOES CGI FIT IN?

The web is composed of clients and servers. CGI is used on the server to provide additional services and functionality to the client. The following are other methods of accomplishing similar tasks:

Server Side Includes (SSI): An HTML page is parsed for SSI commands before being sent to the client. This allows limited dynamic components to be included in the web page.

Internet Server API (ISAPI): A DLL module used by the server. ISAPI (for Microsoft IIS), NSAPI (for Netscape's server), and other server-specific packages duplicate CGI functions but integrate more tightly with the server.

JAVA: Allows programs to be run on the client rather than the server. This means that animations and interactivity run more quickly, but some processes, most notably file access, are limited.

JavaScript/VBScript: Client side commands included in the HTML page

ActiveX: "subroutines" which are accessed via VBScript. Functionality is much like Java

ShockWave: Multimedia content including audio, video, animation and interactivity.

Although the methods mentioned above mirror many of CGI's functions, CGI is a common standard, agreed upon and supported by all major HTTPDs, and it therefore offers greater portability.

EXAMPLES OF CGI USAGE

With CGI support on web servers, one can do just about anything a web browser can handle. CGI can be used to create forms on web sites that allow the user to enter information, which is then processed by a CGI script and mailed to an administrator or logged. It can be used for on-the-fly pages, which are web pages created dynamically (as needed) with up-to-date information. Database interaction is an application of on-the-fly pages that uses information read from a database; a web site form can also allow a user to update database entries. Logging and counters are also common applications for CGI: a log file can record traffic data, updated with information on each visitor, and a counter can be included on the web page to advertise traffic. Further, CGI can be used for animation, in which "server-push" programs feed the client successive images in an animated sequence.

THE SPECIFICATION

Currently, the specification for CGI is version 1.1, or CGI/1.1. Further revisions of this protocol are guaranteed to be backward compatible.

The server and the CGI script communicate in four major ways - Environment Variables, the Command line, Standard Input, and Standard Output. In order to pass data about the information request from the server to the script, the server uses command line arguments as well as environment variables. These environment variables are set when the server executes the gateway program.

ENVIRONMENT VARIABLES

The following environment variables are not request-specific and are set for all requests:

• SERVER_SOFTWARE - The name and version of the information server software answering the request (and running the gateway). Format: name/version

• SERVER_NAME - The server's host name, DNS alias, or IP address as it would appear in self referencing URLs.

• GATEWAY_INTERFACE - The revision of the CGI specification to which this server complies. Format: CGI/revision

The following environment variables are specific to the request being fulfilled by the gateway program:

• SERVER_PROTOCOL - The name and revision of the information protocol this request came in with. Format: protocol/revision

• SERVER_PORT - The port number to which the request was sent.

• REQUEST_METHOD - The method with which the request was made. For HTTP, this is "GET", "HEAD", "POST", etc.

• PATH_INFO - The extra path information, as given by the client. In other words, scripts can be accessed by their virtual pathname, followed by extra information at the end of this path. The extra information is sent as PATH_INFO. This information should be decoded by the server if it comes from a URL before it is passed to the CGI script.

• PATH_TRANSLATED - The server provides a translated version of PATH_INFO, which takes the path and does virtual-to-physical mapping to it.

• SCRIPT_NAME - A virtual path to the script being executed, used for self referencing URLs.

• QUERY_STRING - The information that follows the ? in the URL which referenced this script. This is the query information. It should not be decoded in any fashion. This variable should always be set when there is query information, regardless of command line decoding.

• REMOTE_HOST - The hostname making the request. If the server does not have this information, it should set REMOTE_ADDR and leave this unset.

• REMOTE_ADDR - The IP address of the remote host making the request.

• AUTH_TYPE - If the server supports user authentication, and the script is protected, this is the protocol-specific authentication method used to validate the user.

• REMOTE_USER - If the server supports user authentication, and the script is protected, this is the username they have authenticated as.

• REMOTE_IDENT - If the HTTP server supports RFC 931 identification, then this variable will be set to the remote user name retrieved from the server. Usage of this variable should be limited to logging only.

• CONTENT_TYPE - For queries which have attached information, such as HTTP POST and PUT, this is the content type of the data.

• CONTENT_LENGTH - The length of the said content as given by the client.

In addition to these, the header lines received from the client, if any, are placed into the environment with the prefix HTTP_ followed by the header name. Any - characters in the header name are changed to _ characters. The server may exclude any headers which it has already processed, such as Authorization, Content-type, and Content-length. If necessary, the server may choose to exclude any or all of these headers if including them would exceed any system environment limits.
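As a minimal sketch of how a gateway program might read these variables, the fragment below takes the environment as a plain dictionary. It is written in Python purely for illustration (the paper's own examples are in PERL), and the helper names header_to_env_name and summarize_request are hypothetical, not part of any CGI library.

```python
import os


def header_to_env_name(header_name):
    """Return the environment variable name used for a client header line."""
    # Prefix with HTTP_, upper-case the name, and change every "-" to "_".
    return "HTTP_" + header_name.upper().replace("-", "_")


def summarize_request(environ):
    """Collect a few CGI/1.1 variables from an environment mapping."""
    return {
        "method": environ.get("REQUEST_METHOD", ""),
        "script": environ.get("SCRIPT_NAME", ""),
        "extra_path": environ.get("PATH_INFO", ""),
        "query": environ.get("QUERY_STRING", ""),
        # REMOTE_HOST may be unset; REMOTE_ADDR is the fallback.
        "client": environ.get("REMOTE_HOST") or environ.get("REMOTE_ADDR", ""),
        "user_agent": environ.get(header_to_env_name("User-Agent"), ""),
    }


if __name__ == "__main__":
    # Under a real server these values come from the process environment.
    print(summarize_request(os.environ))
```

Passing the environment in as an argument, rather than reading os.environ inside the function, makes the logic easy to try out from the command line without a server.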

COMMAND LINE

The command line is used only in the case of an ISINDEX query. It is not used in the case of an HTML form or any as yet undefined query type. The server should search the query information (the QUERY_STRING environment variable) for a non-encoded = character to determine whether the command line is to be used; if it finds one, the command line is not to be used. This trusts the clients to encode the = sign in ISINDEX queries, a practice which was considered safe at the time of the design of this specification. If the server does find a "=" in the QUERY_STRING, then the command line will not be used, and no decoding will be performed. The query then remains intact for processing by an appropriate FORM submission decoder. For example, when a QUERY_STRING sent to a finger gateway script contains an unencoded "=", nothing is decoded, the script does not recognize that it is being sent a valid query, and it simply returns its default form. If the server finds that it cannot send the string due to internal limitations (such as exec() or /bin/sh command line restrictions), the server should include NO command line information and provide the non-decoded query information in the environment variable QUERY_STRING.
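The decision described above can be sketched as follows. This is an illustrative Python fragment, not code from the specification. The function name isindex_arguments is hypothetical, and the sketch assumes that ISINDEX search words are separated by "+" and decoded one word at a time.

```python
from urllib.parse import unquote


def isindex_arguments(query_string):
    """Return decoded command-line words for an ISINDEX query, else None."""
    # An unencoded "=" marks a form submission: no command line, no decoding.
    if "=" in query_string:
        return None
    if not query_string:
        return []
    # Split on "+" first, then decode each word on its own.
    return [unquote(word) for word in query_string.split("+")]


if __name__ == "__main__":
    print(isindex_arguments("unix+security"))
    print(isindex_arguments("zip=10003"))
```

Because the check runs before any decoding, an encoded = sign (%3D) inside a search word does not switch the command line off.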

STANDARD INPUT

For requests which have information attached after the header, such as HTTP POST or PUT, the information will be sent to the script on stdin. The server will send CONTENT_LENGTH bytes on this file descriptor. Remember that it will give the CONTENT_TYPE of the data as well. The server is in no way obligated to send end-of-file after the script reads CONTENT_LENGTH bytes.
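Because end-of-file is not guaranteed, a script should read exactly CONTENT_LENGTH bytes and then stop. The Python sketch below is for illustration only. The name read_request_body is hypothetical, and a real script would pass sys.stdin.buffer as the stream.

```python
import io


def read_request_body(environ, stream):
    """Read exactly CONTENT_LENGTH bytes from the stream and no more."""
    try:
        length = int(environ.get("CONTENT_LENGTH", "") or 0)
    except ValueError:
        length = 0
    # Do not wait for end-of-file: the server is not obliged to send one.
    return stream.read(length) if length > 0 else b""


if __name__ == "__main__":
    fake_stdin = io.BytesIO(b"zip=10003 bytes beyond the declared length")
    print(read_request_body({"CONTENT_LENGTH": "9"}, fake_stdin))
```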

STANDARD OUTPUT

The script sends its output to stdout. This output can either be a document generated by the script, or the instructions to the server for retrieving the desired output.

LANGUAGE OF CHOICE: PERL

There are many choices when it comes to selecting a programming language for a CGI application, and the decision largely comes down to personal preference. PERL has become one of the most popular choices. Some other widely used languages are C, C++, TCL, BASIC and shell scripts. Reasons for choosing PERL include its powerful text manipulation capabilities (in particular regular expressions) and the fantastic WWW support modules available.

PRACTICAL EXTRACTION AND REPORTING LANGUAGE

A DESCRIPTION

Practical Extraction and Reporting Language (PERL) is an interpreted language optimized for scanning arbitrary text files, extracting information from those text files, and printing reports based on that information. PERL is also a good language for many system management tasks. The language is intended to be practical (easy to use, efficient, complete) rather than beautiful (tiny, elegant, minimal). It combines some of the best features of C, SED, AWK, and SH, so people familiar with those languages should have little difficulty with it. Unlike most UNIX utilities, PERL does not arbitrarily limit the size of your data. PERL uses sophisticated pattern matching techniques to scan large amounts of data very quickly. Although optimized for scanning text, PERL can also deal with binary data, and can make dbm files look like associative arrays (where dbm is available).

PERL is not a platform-dependent language. PERL was originally designed for UNIX systems but it has been ported to a variety of platforms. A PERL program written for use on a UNIX box will run (more or less) perfectly on a PC box. There are incompatibilities between versions and platforms, but they are minor.

THE BASICS

As mentioned in the previous section, PERL has a mix of features from a variety of other programming languages. Simple semantics and syntax will not be covered here, since they are beyond the scope of this report. However, receiving user input from forms and sending information back to the user will be discussed.

RECEIVING USER INPUT FROM FORMS

Most interactive environments on the Web involve forms. The HTML for generating a form is built around the <FORM> tag, which requires two arguments: METHOD and ACTION. The ACTION is the URL of the script that is to receive the form information. The METHOD (either GET or POST) represents the way in which the information is passed to the script. GET is slightly more limited (mostly in the maximum length of information it can pass), but is slightly easier to deal with. If there are substantial text entry fields (especially TEXTAREAs), the POST method should be used. The difference between these two methods is in the way information is passed.

Using METHOD="GET" we have:

1. FORM elements' (INPUTs, TEXTAREAs, SELECTs, etc.) names are paired with their contents. As an example, suppose the following HTML is part of a form:

<INPUT NAME="zip">

into which the user entered 10003. The name and the contents would be joined together with an = to make: zip=10003.

2. All such name/value pairs are joined together with an &.

3. The entire string is URL encoded. The resulting string from the example above is:

Name=Jane+Doe&address=35+W%27+4th+St%27&zip=10003

The string is then passed to the ACTION script in the environment variable QUERY_STRING.
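The three steps above can be reproduced with a few lines of Python (used here only for illustration, since the paper's examples are in PERL). The field values are inferred by decoding the paper's sample string, and build_query_string is a hypothetical helper name.

```python
from urllib.parse import urlencode


def build_query_string(fields):
    """Pair names with values using "=", join the pairs with "&", URL-encode."""
    return urlencode(fields)


if __name__ == "__main__":
    fields = [
        ("Name", "Jane Doe"),
        ("address", "35 W' 4th St'"),
        ("zip", "10003"),
    ]
    print(build_query_string(fields))
```

Spaces become + signs and the apostrophes become %27, which matches the encoded string shown above.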

With METHOD="POST", it is much the same except for the final step. For a POST, the encoded string is passed to the script's STDIN, and the length of the string in bytes is passed in the environment variable CONTENT_LENGTH.

The advantage of the "GET" method is that it can process command line variables. The disadvantage is that the input string is of limited length.
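A script that accepts either method only needs to pick the right source for the encoded string before decoding it. The following Python sketch is illustrative, and parse_form is a hypothetical name. It assumes the form data is ordinary URL-encoded ASCII text.

```python
import io
from urllib.parse import parse_qs


def parse_form(environ, stream):
    """Return form fields as a dict of lists, whichever METHOD the form used."""
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        # POST: the encoded string arrives on standard input,
        # and CONTENT_LENGTH gives its length in bytes.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        encoded = stream.read(length).decode("ascii")
    else:
        # GET: the encoded string arrives in QUERY_STRING.
        encoded = environ.get("QUERY_STRING", "")
    return parse_qs(encoded, keep_blank_values=True)


if __name__ == "__main__":
    body = b"Name=Jane+Doe&zip=10003"
    post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(body))}
    print(parse_form(post_env, io.BytesIO(body)))
```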

SENDING INFORMATION BACK TO THE USER

There are only a couple of basic things needed for sending appropriate information back to the user:

• Print Commands - Generally, print commands send information to STDOUT, which gets passed directly to the user's browser. In more advanced PERL applications, information can also be printed to a file, in which case the programmer needs to keep track of where each print command is sending its output.

• Header Information - The HTTP standard includes header information which tells the browser what to do with the information it receives. The browser will interpret everything it receives, up until the first blank line, as header information. Providing outgoing header information with user output is required.

1. CONTENT-TYPE - This element is borrowed from the MIME standard. The browser at the receiving end doesn't know what sort of information it's going to get in response to the query it just sent, so it has to be told. Generally, the first thing that should be printed is "CONTENT-TYPE: text/html\n\n"; anything that is printed after that will be interpreted by the user's browser as HTML, just as if it had come from a regular HTML text file.

2. LOCATION - Sometimes the owner won't want to print their own HTML to a user, but will want to send that user to some other URL. The Location header can be used to accomplish this. If "Location: " followed by the target URL and "\n\n" is printed with nothing else (no other HTML or "CONTENT-TYPE" or anything), the user's browser will send a request to the specified server for the page at that URL.
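The two kinds of output can be sketched as small helpers that build the text a script writes to STDOUT. This is Python for illustration only. The names html_response and redirect_response are hypothetical, and the URL in the demo is a placeholder.

```python
import sys


def html_response(body):
    """Header line, then the mandatory blank line, then the HTML document."""
    return "Content-type: text/html\n\n" + body


def redirect_response(url):
    """A Location header on its own, ended by the blank line."""
    return "Location: " + url + "\n\n"


if __name__ == "__main__":
    page = "<html><body><p>Thank you.</p></body></html>\n"
    sys.stdout.write(html_response(page))
    # To redirect instead, write only the Location header, for example:
    # sys.stdout.write(redirect_response("http://www.example.com/thanks.html"))
```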

COMMON PROBLEMS

Often times, problems will be encountered when engineering a CGI Script with PERL. The following is a list of common problems as well as a brief description of them.

• Control-M: UNIX and PC machines use different control codes to mark the end of a line. If a script is uploaded from a PC to a UNIX box, each line will terminate with a control-M character. This will cause an error in the first line of code. The control-M must be removed from

#!/usr/local/bin/perl^M

This problem can be avoided by FTPing the files in ASCII mode rather than "auto".

• CGI-BIN Privilege: A server is usually configured to only execute CGI programs located in particular subdirectories, traditionally the /cgi-bin/. If access to this filespace is restricted, the server administrator must place the scripts in the proper location and give the required URL.

• Execute Privilege: The script must be marked as executable. Since the server generally runs as "nobody" rather than under a specific account, the script must be marked as "world-execute" (just because it runs from the command line does not mean the server can execute it). On a UNIX box, use "chmod" to change permissions.

chmod 755 myscript.pl

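If the script reads or writes files, those files must be marked "world-read" (and "world-write" for output). Use:

chmod 777 myscript.pl

for full read, write, and execute privileges for everyone, bearing in mind that this is the most permissive setting possible.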

• Output Headers: The HTTP header must be correct for any output intended for the browser. Leaving out the header or the blank line below it will result in output being misinterpreted or lost by the browser.

• Syntax: PERL is very picky about syntax. Forgetting semi-colons at the end of lines or using commas where they don't belong, such as:

print FILEHANDLE, "log this";

will result in errors. It is useful to run the script from the command line to check for errors before testing through the server.

• Comparison Operators: Mixing string and numeric comparators will result in errors or unexpected results.

$name = "Bob";

if ($name == "Bob") {
    # == compares numerically: both strings evaluate to 0, so this test
    # is true for any non-numeric name; use eq to compare strings
}
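Two of the problems listed above, control-M line endings and a missing world-execute bit, can be detected before a script is ever requested through the server. The check below is a rough Python sketch for illustration. The function names check_cgi_script and strip_control_m are hypothetical, and the sketch assumes a UNIX-style file system.

```python
import os
import stat
import tempfile


def check_cgi_script(path):
    """Return a list of likely problems with a CGI script file."""
    problems = []
    with open(path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        problems.append("first line does not name an interpreter")
    if first_line.endswith(b"\r\n"):
        problems.append("line endings contain control-M characters")
    if not os.stat(path).st_mode & stat.S_IXOTH:
        problems.append("script is not world-executable")
    return problems


def strip_control_m(data):
    """Convert PC line endings (control-M plus linefeed) to UNIX linefeeds."""
    return data.replace(b"\r\n", b"\n")


if __name__ == "__main__":
    # Build a throwaway script with PC line endings and no execute bit.
    fd, demo_path = tempfile.mkstemp(suffix=".pl")
    os.write(fd, b"#!/usr/local/bin/perl\r\nprint 1;\r\n")
    os.close(fd)
    os.chmod(demo_path, 0o644)
    print(check_cgi_script(demo_path))
    os.remove(demo_path)
```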

CGI OVERHEAD

The CGI Overhead is a consequence of HTTP being a stateless protocol. This means that a common gateway interface process must be initialized for every "hit" from a browser.

In the first instance, this usually means the server forking a new process. This in itself is a very small overhead, but it can become important on a heavily-used server if the number of processes grows to problem levels. If the CGI programs are themselves long-running, this is heavily exacerbated.

In the second place, the CGI program must initialize. In the case of a compiled language such as C or C++ this is negligible, but there is a penalty to pay for scripting languages such as PERL.

Thirdly, CGI is often used as 'glue' to a backend program, such as a database, which may take some considerable time to initialize. This represents a major overhead, which must be avoided in any serious application. The most usual solution is for the backend program to run as a separate server doing most of the work, while the actual CGI simply carries messages.

Fourthly, some CGI scripts are just plain inefficient, and may take hundreds of times the resources they need. Programs using system() or 'backtick' notation often fall into this category.

Note that there are ways to reduce or eliminate all these overheads, but these tend to be system- or server-specific. The best-supported server is probably Apache, as commercial server-vendors like to push their proprietary solutions in preference to CGI.

CGI SECURITY

A CGI program is prone to security problems no matter what language it is written in. Any time that a program is interacting with a networked client, there is the possibility of that client attacking the program to gain unauthorized access. Even the most innocent looking script can be very dangerous to the integrity of the system.

The following is a list of guidelines to make sure a program does not come under attack:

• Beware the eval statement: Languages like PERL and the Bourne shell provide an eval command which allows the construction of a string and has the interpreter execute that string. This can be very dangerous. Observe the following statement in the Bourne shell:

eval `echo $QUERY_STRING | awk 'BEGIN{RS="&"} {printf "QS_%s\n",$1}'`

This clever little snippet takes the query string, and converts it into a set of variable set commands. Unfortunately, this script can be attacked by sending it a query string which starts with a ;. This exhibits the premise of the innocent-looking scripts being dangerous.

• Do not trust the client to do anything: A well-behaved client will escape any characters which have special meaning to the Bourne shell in a query string and thus avoid problems with the script misinterpreting the characters. A mischievous client may use special characters to confuse the script and gain unauthorized access.

• Be careful with popen and system: If you use any data from the client to construct a command line for a call to popen() or system(), be sure to place backslashes before any characters that have special meaning to the Bourne shell before calling the function. This can be achieved easily with a short C function.

• Turn off server-side includes: If the server is unfortunate enough to support server-side includes, turn them off for the script directories. Server-side includes can be abused by clients which prey on scripts which directly output things they have been sent.
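The guidelines on client data and shell command lines can be illustrated with a short Python sketch. Note that it swaps in a different technique from the backslash-escaping C function mentioned above. It either avoids the shell altogether by passing arguments as a list, or quotes the whole value with the standard shlex.quote function. The helper names are hypothetical, and the child process is simply a second Python interpreter that echoes its argument.

```python
import shlex
import subprocess
import sys


def run_without_shell(untrusted):
    """Pass client data as one argument, never as part of a shell command line."""
    # With a list of arguments no shell parses the data, so ";" stays literal.
    child = [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted]
    result = subprocess.run(child, capture_output=True, text=True, check=True)
    return result.stdout


def quote_for_shell(untrusted):
    """Quote client data for cases where a shell command line is unavoidable."""
    return shlex.quote(untrusted)


if __name__ == "__main__":
    hostile = "; echo compromised"
    print(run_without_shell(hostile), end="")
    print(quote_for_shell(hostile))
```

A query string that starts with a semicolon, the attack described for the eval example, reaches the child program here as ordinary text rather than as a second command.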

SUMMARY

In conclusion, we can see the many benefits that the Common Gateway Interface (CGI) offers to users and servers alike. CGI offers a more interactive environment for the user, and CGI programs can easily be written in many common computer languages. One of the more popular languages is the Practical Extraction and Reporting Language (PERL). PERL provides easy-to-use tools that make tasks such as manipulating data and accessing external programs and applications a cinch.

Since CGI involves interaction between servers and clients, there is considerable potential for overhead and security breaches. All languages are susceptible to such problems. Even with constant changes and revisions to the CGI specification, PERL, and other CGI tools, these concerns will remain an issue.

Several resources already exist to assist in CGI programming. The web is filled with pages such as The CGI Resource Index, which offers hundreds of ready-made scripts that can serve as templates or give an idea of how CGI scripts run and operate.

POSSIBLE EXAM QUESTIONS

The following is a list of probable exam questions that could be derived from this paper:

1. What is CGI?

2. What are some things commonly seen on the Web that use CGI?

3. What is the current specification for CGI?

4. Describe and/or name CGI: (a) Environment Variables (b) Command-Line (c) Standard Input (d) Standard Output

5. Name some languages used for writing CGI Scripts.

6. When receiving user inputs, what is the difference between using the POST and GET methods?

7. What are some of the common problems involved with writing CGI Scripts?

8. What is CGI Overhead?

9. Describe a situation where CGI Security can be breached.

REFERENCES

The following is a list of resources used for this paper.

Websites



The Common Gateway Interface, National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, IL

 



The CGI Resource Index



"How to Create Your Own Home Page" Home Page



CGI Programming FAQ by Nick Kew



Introduction to CGI Scripts



CGI Application Provider



Perl and CGI FAQ Page



CGI Programming MetaFAQ



AOL Netfind: Reviews: CGI



CGI Forms Outline



CGI Made Easy



Open CGI programming



What is CGI and How do I use it?



CGI Tutorial



CGI and Perl Page

Books

The Most Complete Reference: Special Edition: Using HTML

Book by Savola, Copyright © 1995 by QUE Corporation

 


VMware Software

Review week twelve lab assignments

Security

In the Unix operating system environment, files and directories are organized in a tree structure with specific access modes. The setting of these modes, through permission bits (as octal digits), is the basis of Unix system security. Permission bits determine how users can access files and the type of access they are allowed. There are three classes of user for all Unix system files and directories: the owner, the group, and others. Access to read, write and execute within each of the user types is also controlled by permission bits (Figure 1). Flexibility in file security is convenient, but it has been criticized as an area of system security compromise.

Aside from user ignorance, the most common area of file compromise has to do with the default setting of permission bits at file creation. In some systems the default is octal 644, meaning that only the file owner can read and write the file, while all others can only read it. (3) In many "open" environments this may be acceptable. However, in cases where sensitive data is present, the access for reading by others should be turned off. The umask utility satisfies this requirement. A suggested setting, umask 027, would enable all permissions for the file owner, disable write permission for the group, and disable all permissions for others (octal 750). By inserting this umask command in a user's .profile or .login file, the default will be overridden by the new settings at file creation.
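The arithmetic behind a umask is a bitwise mask: every bit set in the umask is cleared from the mode a program requests. A small Python illustration follows. The function name mode_after_umask is hypothetical, and the requested modes of 777 (typical for directories and executables) and 666 (typical for ordinary files) are stated assumptions.

```python
def mode_after_umask(requested_mode, umask):
    """Clear from the requested mode every permission bit set in the umask."""
    return requested_mode & ~umask


if __name__ == "__main__":
    # A directory or executable is typically requested with mode 777.
    print(oct(mode_after_umask(0o777, 0o027)))
    # An ordinary file is typically requested with mode 666.
    print(oct(mode_after_umask(0o666, 0o027)))
```

With umask 027 a request for 777 yields the octal 750 quoted above, while an ordinary file requested with 666 ends up as 640.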

The chmod utility can be used to modify permission settings on files and directories, for example by issuing a command such as chmod u+rwx,g+rw,g-w,o-rwx file. Files can also be made secure using a restrictive umask. By responsible application of such utilities as umask and chmod, users can enhance file system security. The Unix system, however, restricts the security defined by the user to only owner, group and others. Thus, the owner of the file cannot designate file access to specific users. As Kowack and Healy have pointed out, "The granularity of control that (file security) mechanisms is often insufficient in practice (...) it is not possible to grant one user write protection to a directory while granting another read permission to the same directory." (4) A useful file security extension to the Unix system might be Multics style access control lists. With access mode vulnerabilities in mind, users should pay close attention to files

Directory protection is a commonly overlooked component of file security in the Unix system. Many system administrators and users are unaware that "publicly writable directories provide the most opportunities for compromising the Unix system security" (6). Administrators tend to make these "open" for users to move around and access public files and utilities. This can be disastrous, since files and other subdirectories within writable directories can be moved out and replaced with different versions, even if the contained files are unreadable or unwritable to others. When this happens, an unscrupulous user or a "password breaker" may plant a Trojan horse in place of a commonly used system utility (e.g. ls, su, mail and so on).

For example:

Imagine that the /bin directory is publicly writable. The perpetrator could first remove the old su version (with rm utility) and then include his own fake su to read the password of users who execute this utility.

Although writable directories can destroy system integrity, readable ones can be just as damaging. Sometimes files and directories are configured to permit read access by others. This subtle convenience can lead to unauthorized disclosure of sensitive data: a serious matter when valuable information is lost to a business competitor.

As a general rule, therefore, read and write access should be removed from all but system administrative directories.

PATH is an environment variable that points to a list of directories, which are searched when a file is requested by a process. The order of that search is indicated by the sequence of the listed directories in the PATH value. This variable is established at user logon and is set up in the user's .profile or .login file.

If a user places the current directory as the first entry in PATH, then programs in the current directory will be run first. Programs in other directories with the same name will be ignored. Although file and directory access is made easier with a PATH variable set up this way, it may expose the user to pre-existing Trojan horses.

To illustrate this, assume that a Trojan horse, similar to the cat utility, contains an instruction that imparts access privileges to a perpetrator. The fake cat is placed in a public directory /usr/his where a user often works. Now if the user has a PATH variable with the current directory first, and he enters the cat command while in /usr/his, the fake cat in /usr/his would be executed instead of the system cat located in /bin.
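The search order can be simulated without touching a real file system. In the Python sketch below (illustrative only), directory contents are given as a dictionary and resolve_command is a hypothetical helper that returns the first match, which is how the fake cat wins when the current directory is listed first.

```python
def resolve_command(name, path_dirs, directory_contents, current_dir):
    """Return the full path of the first match for name, searching in PATH order."""
    for directory in path_dirs:
        # "." (or an empty entry) stands for the current directory.
        actual = current_dir if directory in ("", ".") else directory
        if name in directory_contents.get(actual, ()):
            return actual.rstrip("/") + "/" + name
    return None


if __name__ == "__main__":
    contents = {"/bin": {"cat", "ls"}, "/usr/his": {"cat"}}
    # Current directory first: the fake cat is found before the system cat.
    print(resolve_command("cat", [".", "/bin"], contents, "/usr/his"))
    # System directories first: the real cat is found.
    print(resolve_command("cat", ["/bin", "."], contents, "/usr/his"))
```

Listing system directories ahead of the current directory, or leaving the current directory out of PATH, removes this exposure.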

User authentication in the Unix system is accomplished by personal passwords. Though passwords offer an additional level of security beyond physical constraints, they lend themselves to the greatest area of computer system compromise. Lack of user awareness and responsibility contributes largely to this form of computer insecurity. This is true of many computer facilities where password identification, authentication and authorization are required for the access of resources, and the Unix operating system is no exception. Password information in many time-sharing systems is kept in restricted files that are not ordinarily readable by users. The Unix system differs in this respect, since it allows all users to have read access to the /etc/passwd file (FIGURE 2) where encrypted passwords and other user information are stored. Although the Unix system implements a one-way encryption method, and in most systems a modified version of the data encryption standard (DES), password breaking methods are known. Among these methods, brute-force attacks are generally the least effective, while techniques involving the use of heuristics (good guesses and knowledge about passwords) tend to be successful. For example, the /etc/passwd file contains such useful information as the login name and comment fields. Login names are especially rewarding to the "password breaker" since many users will use login variants for passwords (backward spelling, the appending of a single digit, etc.). The comment field often contains items such as surname, given name, address, telephone number, project name and so on.
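From the defender's side, the same heuristics can be used to reject a weak password when it is chosen. The Python sketch below is illustrative. The sample passwd line is invented, the helper names are hypothetical, and only the two variants mentioned above (backward spelling and an appended single digit) are checked.

```python
def parse_passwd_line(line):
    """Split one /etc/passwd entry into its seven colon-separated fields."""
    keys = ("login", "password", "uid", "gid", "comment", "home", "shell")
    return dict(zip(keys, line.rstrip("\n").split(":")))


def is_login_variant(login, candidate):
    """True if the candidate is the login, its reverse, or the login plus one digit."""
    login, candidate = login.lower(), candidate.lower()
    if candidate in (login, login[::-1]):
        return True
    return (
        len(candidate) == len(login) + 1
        and candidate[:-1] == login
        and candidate[-1].isdigit()
    )


if __name__ == "__main__":
    # Invented sample entry; the password field is a placeholder.
    entry = parse_passwd_line("jdoe:PLACEHOLDER:1001:100:Jane Doe, Project X:/home/jdoe:/bin/sh")
    for attempt in ("eodj", "jdoe7", "x9$kLm"):
        print(attempt, is_login_variant(entry["login"], attempt))
```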

UUCP system. The most common Unix system network is the UUCP system, which is a group of programs that perform the file transfers and command execution between remote systems. (3) The problem with the UUCP system is that users on the network may access other users' files without access permission. As stated by Nowitz (9),

The uucp system, left unrestricted, will let any outside user execute commands and copy in/out any file that is readable/writable by a uucp login user. It is up to the individual sites to be aware of this, and apply the protections that they feel are necessary.

This emphasizes the importance of proper implementation by the system administrator.

There are four UUCP system commands to consider when looking into network security with the Unix system. The first is uucp, a command used to copy files between two Unix systems. If uucp is not properly configured by the system administrator, any outside user can execute remote commands and copy files from another login user. If the file name on another system is known, one could use the uucp command to copy files from that system to one's own system. For example:

%uucp system2!/main/src/hisfile myfile

will copy hisfile from system2 in the directory /main/src to the file myfile in the current local directory. If file transfer restrictions exist on either system, hisfile would not be sent. If there are no restrictions, any file could be copied from a remote user - including the password file. The following would copy the remote system /etc/passwd file to the local file thanks:

%uucp system2!/etc/passwd thanks

System administrators can address the uucp matter by restricting uucp file transfers to the directory /usr/spool/uucppublic. (8) If one tries to transfer a file anywhere else, a message will be returned saying "remote access to path/file denied" and no file transfer will occur.

The second UUCP system command to consider is uux. Its function is to execute commands on remote Unix computers. This is called remote command execution and is most often used to send mail between systems (mail executes the uux command internally).

The ability to execute a command on another system introduces a serious security problem if remote command execution is not limited. As an example, a system should not allow users from another system to perform the following:

%uux "system1!cat/usr/spool/uucppublic"

which would cause system1 to send its /etc/passwd file to the system2 uucp public directory. The user of system2 would now have access to the password file. Therefore, only a few commands should be allowed to execute remotely. Often the only command that uux is allowed to run is rmail, the restricted mail program.

The third UUCP system function is the uucico (copy in / copy out) program. It performs the actual communication work. The uucp and uux commands do not call up other systems themselves; instead their requests are queued, and the uucico program initiates the remote processes. The uucico program uses the file /usr/uucp/USERFILE to determine what files a remote system may send or receive. Checks for legal files are the basis for security in USERFILE. Thus the system administrator should carefully control this file.

In addition, USERFILE controls security between two Unix systems by allowing a call-back flag to be set. Therefore, some degree of security can be achieved by requiring a system to check if the remote system is legal before a call-back occurs.

The last UUCP function is uuxqt, which controls remote command execution. The uuxqt program uses the file /usr/lib/uucp/L.cmd to determine which commands will run in response to a remote execution request. For example, if one wishes to use the electronic mail feature, then the L.cmd file will contain the line rmail. Since uuxqt determines which commands are allowed to execute remotely, commands that may compromise system security should not be included in L.cmd.
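The allow-list idea behind L.cmd can be sketched in a few lines of Python. This is not the uuxqt implementation: it assumes, for illustration only, that the file lists one permitted command name per line, and the function names are hypothetical.

```python
def load_allowed_commands(lcmd_text):
    """Parse L.cmd-style text, assumed to hold one permitted command per line."""
    return {line.strip() for line in lcmd_text.splitlines() if line.strip()}

def may_execute(request, allowed):
    """True if the first word of the requested command line is permitted."""
    words = request.split()
    return bool(words) and words[0] in allowed

# A file containing only the line "rmail", as in the mail example above.
allowed = load_allowed_commands("rmail\n")
```

With this list, a request to run rmail is accepted, while a request such as "cat /etc/passwd" is refused because cat does not appear in the list.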

Conclusion

Security is a complex issue for UNIX systems. The system administrator must be concerned with the physical security of the computers, with the security of the network connection, with the security of the file system, and with the security of user accounts.

Security of user accounts is of primary concern. Users must choose passwords that are difficult to crack. Note that passwords can be easy to crack for a number of reasons. Automated password crackers will compare passwords against dictionary words, so a simple word that is in the dictionary is an easy password to crack.

What is the function of the cron daemon?

The cron daemon is where all timed events are initiated. It is started during system initialization and remains active while the system is operating in multi-user mode. Cron wakes up every minute and examines all the stored configuration files, called crontabs, checking each of them for commands that are scheduled to be executed at the current time. Some systems limit the number of tasks that can be scheduled during a one-minute period. Most notable, because the number is so low, is SGI's IRIX 5.3, which has a limit of 25 jobs.
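The once-a-minute check can be sketched in Python as follows. The sketch assumes the usual five time fields (minute, hour, day of month, month, day of week) followed by the command, and it understands only "*" and plain numbers; real cron also accepts lists, ranges and steps, and treats the two day fields specially when both are restricted. The function names are made up for the example.

```python
from datetime import datetime

def field_matches(field, value):
    """A field matches if it is '*' or equals the current value."""
    return field == "*" or int(field) == value

def due_commands(crontab_text, now):
    """Return the commands whose time fields match the minute `now`."""
    due = []
    for line in crontab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        minute, hour, day, month, weekday, command = line.split(None, 5)
        # cron numbers weekdays from Sunday = 0; isoweekday() gives Sunday = 7.
        current = (now.minute, now.hour, now.day, now.month, now.isoweekday() % 7)
        fields = (minute, hour, day, month, weekday)
        if all(field_matches(f, v) for f, v in zip(fields, current)):
            due.append(command)
    return due
```

Calling due_commands with the text of a crontab and datetime.now() once a minute mimics, in a very reduced form, what the daemon does each time it wakes up.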


/proc directory

Observe on Einstein the /proc directory

Chapter 5. The proc File System

The Linux kernel has two primary functions: to control access to physical devices on the computer and to schedule when and how processes interact with these devices. The /proc/ directory contains a hierarchy of special files which represent the current state of the kernel — allowing applications and users to peer into the kernel's view of the system.

Within the /proc/ directory, one can find a wealth of information detailing the system hardware and any processes currently running. In addition, some of the files within the /proc/ directory tree can be manipulated by users and applications to communicate configuration changes to the kernel.

5.1. A Virtual File System

Under Linux, all data are stored as files. Most users are familiar with the two primary types of files: text and binary. But the /proc/ directory contains another type of file called a virtual file. It is for this reason that /proc/ is often referred to as a virtual file system.

These virtual files have unique qualities. Most of them are listed as zero bytes in size and yet when one is viewed, it can contain a large amount of information. In addition, most of the time and date settings on virtual files reflect the current time and date, indicative of the fact they are constantly updated.

Virtual files such as /proc/interrupts, /proc/meminfo, /proc/mounts, and /proc/partitions provide an up-to-the-moment glimpse of the system's hardware. Others, like /proc/filesystems and the /proc/sys/ directory provide system configuration information and interfaces.

For organizational purposes, files containing information on a similar topic are grouped into virtual directories and sub-directories. For instance, /proc/ide/ contains information for all physical IDE devices. Likewise, process directories contain information about each running process on the system.

5.1.1. Viewing Virtual Files

By using the cat, more, or less commands on files within the /proc/ directory, users can immediately access an enormous amount of information about the system. For example, to display the type of CPU a computer has, type cat /proc/cpuinfo to receive output similar to the following:

processor : 0
vendor_id : AuthenticAMD
cpu family : 5
model : 9
model name : AMD-K6(tm) 3D+ Processor
stepping : 1
cpu MHz : 400.919
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr mce cx8 pge mmx syscall 3dnow k6_mtrr
bogomips : 799.53

When viewing different virtual files in the /proc/ file system, some of the information is easily understandable while some is not human-readable. This is in part why utilities exist to pull data from virtual files and display it in a useful way. Examples of these utilities include lspci, apm, free, and top.
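As a small illustration of how such a utility might pull data from a virtual file, the following Python sketch splits /proc/cpuinfo-style text into key/value pairs. The function name parse_cpuinfo is made up for this example, and the sample text is a shortened copy of the output shown earlier so that the code runs on any system.

```python
def parse_cpuinfo(text):
    """Split /proc/cpuinfo-style text into one dict per processor."""
    processors = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line separates one processor's entries from the next.
            if current:
                processors.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        processors.append(current)
    return processors

sample = "processor : 0\nvendor_id : AuthenticAMD\ncpu MHz : 400.919\n"
cpus = parse_cpuinfo(sample)
```

On a Linux system the same function can be given the real file contents, for example parse_cpuinfo(open("/proc/cpuinfo").read()).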

Note: Some of the virtual files in the /proc/ directory are readable only by the root user.

5.1.2. Changing Virtual Files

As a general rule, most virtual files within the /proc/ directory are read only. However, some can be used to adjust settings in the kernel. This is especially true for files in the /proc/sys/ subdirectory.

To change the value of a virtual file, use the echo command and a > symbol to redirect the new value to the file. For example, to change the hostname on the fly, type the following, substituting the desired name for new-hostname:

echo new-hostname > /proc/sys/kernel/hostname

Other files act as binary or boolean switches. Typing cat /proc/sys/net/ipv4/ip_forward returns either a 0 or a 1. 0 indicates that the kernel is not forwarding network packets. Using the echo command to change the value of the ip_forward file to 1 immediately turns packet forwarding on.
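The same read and write pattern can be sketched in Python. To keep the example harmless it works on a temporary stand-in file rather than the live /proc/sys/net/ipv4/ip_forward, which normally requires root privileges to change; the function names are hypothetical.

```python
import os
import tempfile

def read_switch(path):
    """Return True if the file holds 1, False otherwise."""
    with open(path) as handle:
        return handle.read().strip() == "1"

def write_switch(path, enabled):
    """Write 1 or 0 to the file, as `echo 1 > file` would."""
    with open(path, "w") as handle:
        handle.write("1\n" if enabled else "0\n")

# Demonstrate on a temporary stand-in file rather than the live kernel setting.
with tempfile.TemporaryDirectory() as scratch:
    stand_in = os.path.join(scratch, "ip_forward")
    write_switch(stand_in, False)
    before = read_switch(stand_in)
    write_switch(stand_in, True)
    after = read_switch(stand_in)
```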

/bin directory

The /bin Directory

/bin is a standard subdirectory of the root directory in Unix-like operating systems that contains the executable (i.e., ready to run) programs that must be available in order to attain minimal functionality for the purposes of booting (i.e., starting) and repairing a system.

The root directory, which is designated by a forward slash ( / ), is the top-level directory in the hierarchy of directories (also referred to as the directory tree) on Unix-like operating systems. That is, it is the directory that contains all other directories and their subdirectories as well as all files on the system.

A directory in a Unix-like operating system is merely a special type of file that contains a list of the names of objects (i.e., files, links and directories) that appear to the user to be in it along with the corresponding inodes for each object. A file is a named collection of related information that appears to the user as a single, contiguous block of data and that is retained in storage (e.g., a hard disk drive or a floppy disk). An inode is a data structure on a filesystem that stores all the information about a filesystem object except its name and its actual data. A data structure is a way of storing data so that it can be used efficiently. A filesystem is the hierarchy of directories that is used to organize files on a computer system.

The full names (also referred to as the absolute pathnames) of all of the subdirectories in the root directory begin with a forward slash, which shows their position in the filesystem hierarchy. In addition to /bin, some of the other standard subdirectories in the root directory include /boot, /dev, /etc, /home, /mnt, /usr, /proc and /var.

Among the contents of /bin are the shells (e.g., bash and csh), ls, grep, tar, kill, echo, ps, cp, mv, rm, cat, gzip, ping, su and the vi text editor. These programs can be used by both the root user (i.e., the administrative user) and ordinary users.

A list of all the programs in /bin can be viewed by using the ls command, which is commonly used to view the contents of directories, i.e.,

ls /bin

/bin is by default in PATH, which is the list of directories that the system searches for the corresponding program when a command is issued. This means that any executable file (i.e., runnable program) in /bin can be run just by entering the file name at the command line and then pressing the ENTER key. The contents of PATH can be seen by using the echo command as follows:

echo $PATH

There are several other directories on Unix-like systems that also contain the string (i.e., sequence of characters) bin in their names, including /sbin and /usr/bin. The former contains additional programs that are used to boot the system as well as administrative and system maintenance programs that are only available to the root user. The latter contains executable programs that are not required for booting or repairing the system.

/etc directory

This is the nerve center of your system: all system-related configuration files are kept here or in its sub-directories. A "configuration file" is defined as a local file used to control the operation of a program; it must be static and cannot be an executable binary. Because so much configuration lives here, it's a good idea to back up this directory regularly. It will definitely save you a lot of re-configuration later if you re-install or lose your current installation. Normally, no binaries should be or are located here.

/etc/X11/

This directory tree contains all the configuration files for the X Window System. Users should note that many of the files located in this directory are actually symbolic links to the /usr/X11R6 directory tree. Thus, the presence of these files in these locations can not be certain.

/etc/X11/XF86Config, /etc/X11/XF86Config-4

The 'X' configuration file. Most modern distributions possess hardware autodetection systems that enable automatic creation of a valid file. Should autodetection fail, a configuration file can also be created manually, provided that you have sufficient knowledge about your system. It would be considered prudent not to attempt to type out a file from beginning to end. Rather, use common configuration utilities such as xf86config, XF86Setup and xf86cfg to create a workable template. Then, using suitable documentation, commence optimization through intuition and/or trial and error. Options that can be configured via this file include X modules to be loaded on startup, keyboard, mouse, monitor and graphic chipset type. Often, commercial distributions will include their own X configuration utilities, such as XDrake on Mandrake and Xconfigurator on Red Hat. Below is a sample X configuration file from the reference system:

### BEGIN DEBCONF SECTION
# XF86Config-4 (XFree86 server configuration file) generated by dexconf, the
# Debian X Configuration tool, using values from the debconf database.
#
# Edit this file with caution, and see the XF86Config-4 manual page.
# (Type "man XF86Config-4" at the shell prompt.)
#
# If you want your changes to this file preserved by dexconf, only make changes
# before the "### BEGIN DEBCONF SECTION" line above, and/or after the
# "### END DEBCONF SECTION" line below.
#
# To change things within the debconf section, run the command:
# dpkg-reconfigure xserver-xfree86
# as root. Also see "How do I add custom sections to a dexconf-generated
# XF86Config or XF86Config-4 file?" in /usr/share/doc/xfree86-common/FAQ.gz.

Section "Files"
    FontPath "unix/:7100"    # local font server
    # if the local font server has problems, we can fall back on these
    FontPath "/usr/lib/X11/fonts/misc"
    FontPath "/usr/lib/X11/fonts/cyrillic"
    FontPath "/usr/lib/X11/fonts/100dpi/:unscaled"
    FontPath "/usr/lib/X11/fonts/75dpi/:unscaled"
    FontPath "/usr/lib/X11/fonts/Type1"
    FontPath "/usr/lib/X11/fonts/Speedo"
    FontPath "/usr/lib/X11/fonts/100dpi"
    FontPath "/usr/lib/X11/fonts/75dpi"
EndSection

Section "Module"
    Load "GLcore"
    Load "bitmap"
    Load "dbe"
    Load "ddc"
    Load "dri"
    Load "extmod"
    Load "freetype"
    Load "glx"
    Load "int10"
    Load "pex5"
    Load "record"
    Load "speedo"
    Load "type1"
    Load "vbe"
    Load "xie"
EndSection

Section "InputDevice"
    Identifier "Generic Keyboard"
    Driver "keyboard"
    Option "CoreKeyboard"
    Option "XkbRules" "xfree86"
    Option "XkbModel" "pc104"
    Option "XkbLayout" "us"
EndSection

Section "InputDevice"
    Identifier "Configured Mouse"
    Driver "mouse"
    Option "CorePointer"
    Option "Device" "/dev/psaux"
    Option "Protocol" "NetMousePS/2"
    Option "Emulate3Buttons" "true"
    Option "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    Identifier "Generic Mouse"
    Driver "mouse"
    Option "SendCoreEvents" "true"
    Option "Device" "/dev/input/mice"
    Option "Protocol" "ImPS/2"
    Option "Emulate3Buttons" "true"
    Option "ZAxisMapping" "4 5"
EndSection

Section "Device"
    Identifier "Generic Video Card"
    Driver "nv"
    # Option "UseFBDev" "true"
    Option "UseFBDev" "false"
EndSection

Section "Monitor"
    Identifier "Generic Monitor"
    HorizSync 30-38
    VertRefresh 43-95
    Option "DPMS"
EndSection

Section "Screen"
    Identifier "Default Screen"
    Device "Generic Video Card"
    Monitor "Generic Monitor"
    DefaultDepth 16
    SubSection "Display"
        Depth 1
        Modes "800x600" "640x480"
    EndSubSection
    SubSection "Display"
        Depth 4
        Modes "800x600" "640x480"
    EndSubSection
    SubSection "Display"
        Depth 8
        Modes "800x600" "640x480"
    EndSubSection
    SubSection "Display"
        Depth 15
        Modes "800x600" "640x480"
    EndSubSection
    SubSection "Display"
        Depth 16
        Modes "800x600" "640x480"
    EndSubSection
    SubSection "Display"
        Depth 24
        Modes "800x600" "640x480"
    EndSubSection
EndSection

Section "ServerLayout"
    Identifier "Default Layout"
    Screen "Default Screen"
    InputDevice "Generic Keyboard"
    InputDevice "Configured Mouse"
    InputDevice "Generic Mouse"
EndSection

Section "DRI"
    Mode 0666
EndSection

### END DEBCONF SECTION

As you can see, the layout of the file is quite simple and tends to be quite standard across most distributions. At the top, the "Files" section gives the locations of the various font files for X (note that X will not start if you do not specify a valid font). Next is the "Module" section, which details the modules to be loaded upon startup. The best-known extensions are probably GLX (required for 3D rendering of graphics and games) and Xinerama, which allows users to extend their desktop over several monitors. Next are the "InputDevice", "Device" and "Monitor" sections, which describe the type of hardware you have. Improper configuration of these sections can lead to heartache and trauma, with seemingly misplaced keys, bewitched mice and constant flashing as X attempts to restart in a sometimes never-ending loop. In most cases, when all else fails, the vesa driver seems to be able to initialise most modern video cards. In the "Screen" section it is possible to alter the default startup resolution and depth. Quite often it is possible to alter the resolution on the fly by using the Ctrl-Alt-+ and Ctrl-Alt-- keystrokes (plus and minus on the numeric keypad). Lastly come the "ServerLayout" and "DRI" sections. Users will almost never touch the "DRI" section, and only those who wish to use the Xinerama extension of X will need to change any of the ServerLayout options.
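Because the file is just a series of Section ... EndSection blocks, it is easy to inspect with a short script. The following Python sketch lists the sections and the lines inside each one. It is a simplified reader written for this illustration, not the parser used by the X server: it treats everything after a # as a comment and keeps SubSection lines as ordinary entries. The name parse_sections and the two-section sample are made up for the example.

```python
def parse_sections(config_text):
    """Return a list of (section name, directive lines) pairs in file order."""
    sections = []
    current = None
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()    # drop comments and surrounding space
        if not line:
            continue
        if line.startswith("Section "):
            name = line.split('"')[1]          # text between the first pair of quotes
            current = (name, [])
        elif line == "EndSection":
            sections.append(current)
            current = None
        elif current is not None:
            current[1].append(line)
    return sections

sample = '''
Section "Monitor"
    Identifier "Generic Monitor"
    HorizSync 30-38
EndSection

Section "DRI"
    Mode 0666
EndSection
'''
parsed = parse_sections(sample)
```

A list of pairs is used rather than a dictionary because a section name such as "InputDevice" can appear more than once in the same file.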

/mnt directory

The /mnt directory and its subdirectories are intended for use as the temporary mount points for mounting storage devices, such as CDROMs, floppy disks and USB (universal serial bus) key drives. /mnt is a standard subdirectory of the root directory on Linux and other Unix-like operating systems, along with directories such as /bin, /boot, /dev, /etc, /home, /proc, /root, /sbin, /usr and /var. As is the case with all other first tier directories in the root directory, /mnt's name always begins with a forward slash.

Mounting is the process of attaching an additional filesystem, which resides on a CDROM, hard disk drive (HDD) or other storage device, to the currently accessible filesystem of a computer. Filesystem in this context refers to the hierarchy of directories (also referred to as the directory tree) that is used to organize files on a computer. On Unix-like operating systems, the directories start with the root directory, which is the directory that contains all other directories and files on the system and which is designated by a forward slash. The currently accessible filesystem is the filesystem that is currently in use in the computer.

The mount point is the directory in the currently accessible filesystem (typically an empty directory) to which the additional filesystem is attached (i.e., mounted). It becomes the root directory of the subtree from the newly added storage device, and that subtree becomes accessible from that directory. Any original contents of the mount point become invisible and inaccessible until the filesystem is unmounted (i.e., detached from the main filesystem).

/mnt can be empty, or it can contain subdirectories for mounting individual devices. Its subdirectories on a typical system include /mnt/cdrom and /mnt/floppy; other subdirectories can be created as desired.

Although /mnt exists specifically for mounting storage devices, other directories can also be used for this purpose. Major filesystems on non-root partitions (i.e., logically independent sections) of the hard disk drive (HDD) are typically mounted in the root directory, but they can likewise be mounted in other directories, including those created by a user for the purpose.

/opt directory

This directory is reserved for all the software and add-on packages that are not part of the default installation. For example, StarOffice, Kylix, Netscape Communicator and WordPerfect packages are normally found here. To comply with the FSSTND, all third party applications should be installed in this directory. Any package to be installed here must locate its static files (i.e. extra fonts, clipart, database files) in a separate /opt/'package' or /opt/'provider' directory tree (similar to the way in which Windows will install new software to its own directory tree C:\Program Files\"Program Name"), where 'package' is a name that describes the software package and 'provider' is the provider's LANANA registered name.

Although most distributions neglect to create the directories /opt/bin, /opt/doc, /opt/include, /opt/info, /opt/lib, and /opt/man they are reserved for local system administrator use. Packages may provide "front-end" files intended to be placed in (by linking or copying) these reserved directories by the system administrator, but must function normally in the absence of these reserved directories. Programs to be invoked by users are located in the directory /opt/'package'/bin. If the package includes UNIX manual pages, they are located in /opt/'package'/man and the same substructure as /usr/share/man must be used. Package files that are variable must be installed in /var/opt. Host-specific configuration files are installed in /etc/opt.

Under no circumstances are other package files to exist outside the /opt, /var/opt, and /etc/opt hierarchies except for those package files that must reside in specific locations within the filesystem tree in order to function properly. For example, device lock files in /var/lock and devices in /dev. Distributions may install software in /opt, but must not modify or delete software installed by the local system administrator without the assent of the local system administrator.

The use of /opt for add-on software is a well-established practice in the UNIX community. The System V Application Binary Interface [AT&T 1990], based on the System V Interface Definition (Third Edition) and the Intel Binary Compatibility Standard v. 2 (iBCS2) provides for an /opt structure very similar to the one defined here.

Generally, all data required to support a package on a system must be present within /opt/'package', including files intended to be copied into /etc/opt/'package' and /var/opt/'package' as well as reserved directories in /opt. The minor restrictions on distributions using /opt are necessary because conflicts are possible between distribution installed and locally installed software, especially in the case of fixed pathnames found in some binary software.
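The directory rules above can be summarised in a short Python sketch that builds the expected locations for a package. The function name package_layout and the package name "examplepkg" are invented for the illustration; the paths themselves follow the text.

```python
from pathlib import PurePosixPath

def package_layout(package):
    """Return the directories the text assigns to an add-on package."""
    root = PurePosixPath("/opt") / package
    return {
        "static files": root,
        "user programs": root / "bin",
        "manual pages": root / "man",
        "variable files": PurePosixPath("/var/opt") / package,
        "host configuration": PurePosixPath("/etc/opt") / package,
    }

layout = package_layout("examplepkg")   # "examplepkg" is a made-up package name
```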

The structure of the directories below /opt/'provider' is left up to the packager of the software, though it is recommended that packages are installed in /opt/'provider'/'package' and follow a similar structure to the guidelines for /opt/'package'. A valid reason for diverging from this structure is for support packages which may have files installed in /opt/'provider'/lib or /opt/'provider'/bin.

/dev directory

UNIX has a beautifully consistent method of allowing programs to access hardware. Under UNIX, every piece of hardware is a file. To demonstrate this, try viewing the file /dev/hda:

less -f /dev/hda

/dev/hda is not really a file at all. When you read from it, you are actually reading directly from the first physical hard disk of your machine. /dev/hda is known as a device file, and all of them are stored under the /dev directory.

Device files allow access to hardware. If you have a sound card installed and configured, you can try:

cat /dev/dsp > my_recording

Say something into your microphone and then type:

cat my_recording > /dev/dsp

This will play the sound back through your speakers. (Note that this will not always work, since the recording volume and the recording speed may not be set correctly.)

If no programs are currently using your mouse, you can also try:

cat /dev/mouse

If you now move the mouse, the mouse protocol commands will be written directly to your screen (it will look like garbage). This is an easy way to see if your mouse is working.

At a lower level, programs that access device files do so in two basic ways:

• They read and write to the device to send and retrieve bulk data. (Much like less and cat above).

• They use the C ioctl (IO Control) function to configure the device. (In the case of the sound card, this might set mono versus stereo, recording speed etc.)

Because every kind of device that one can think of can be twisted to fit these two modes of operation (except for network cards), UNIX's scheme has endured since its inception and is considered the ubiquitous method of accessing hardware.

Block and character devices

Hardware devices can generally be categorised into random access devices like disk and tape drives, and serial devices like mice, sound cards and terminals.

Random access devices are usually accessed in large contiguous blocks of data that are stored persistently. They are read from in discrete units (for most disks, 1024 bytes at a time). These are known as block devices. Doing an ls -l /dev/hdb shows that your hard disk is a block device by the b on the far left of the listing:

brw-r-----   1 root     disk       3,  64 Apr 27  1995 /dev/hdb

Serial devices on the other hand are accessed one byte at a time. Data can be read or written only once. For example, after a byte has been read from your mouse, the same byte cannot be read by some other program. These are called character devices and are indicated by a c on the far left of the listing. Your /dev/dsp (Digital Signal Processor -- i.e. sound card) device looks like:

crw-r--r--   1 root     sys       14,   3 Jul 18  1994 /dev/dsp

Major and Minor device numbers

Devices are divided into sets, each identified by a major device number. For instance, all SCSI disks are major number 8. Further, each individual device has a minor device number (for example, /dev/sda is minor device 0). The major and minor device numbers together are what identify the device to the kernel. The file name of the device is really arbitrary and is chosen for convenience and consistency. You can see the major and minor device numbers (8,   0) in the ls listing for /dev/sda:

brw-rw----   1 root     disk       8,   0 May  5  1998 /dev/sda
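The same information that ls -l prints can be obtained from a program. The following Python sketch, which assumes a Unix-like system, decodes the device type and the major and minor numbers from the st_mode and st_rdev values that os.stat returns. The function name describe_device is made up, and the sample values are constructed by hand to mirror the /dev/sda and /dev/dsp listings so that the code runs even where those device files do not exist.

```python
import os
import stat

def describe_device(mode, rdev):
    """Return (type letter, major, minor) for a device file's st_mode and st_rdev."""
    if stat.S_ISBLK(mode):
        kind = "b"        # block device
    elif stat.S_ISCHR(mode):
        kind = "c"        # character device
    else:
        kind = "-"        # not a device file
    return kind, os.major(rdev), os.minor(rdev)

# Build the values by hand so the example runs without a real /dev/sda or /dev/dsp.
sda_like = describe_device(stat.S_IFBLK | 0o660, os.makedev(8, 0))
dsp_like = describe_device(stat.S_IFCHR | 0o644, os.makedev(14, 3))
```

For a real device file, pass the fields of os.stat(path), for example info = os.stat("/dev/null") followed by describe_device(info.st_mode, info.st_rdev).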

Miscellaneous devices

A list of common devices and their descriptions follows. The major numbers are shown in parentheses. The complete reference for devices is the file /usr/src/linux/Documentation/devices.txt.

/dev/hd??

hd stands for Hard Disk, but refers here only to IDE devices -- i.e. common hard disks. The first letter after the hd dictates the physical disk drive:

/dev/hda (3)

First drive, or primary master.

/dev/hdb (3)

Second drive, or primary slave.

/dev/hdc (22)

Third drive, or secondary master.

/dev/hdd (22)

Fourth drive, or secondary slave.

When accessing any of these devices, you would be reading raw from the actual physical disk starting at the first sector of the first track, sequentially, until the last sector of the last track.

Partitions are named /dev/hda1, /dev/hda2, etc., indicating the first, second, etc. partition on physical drive a.

/dev/sd?? (8)

sd stands for SCSI Disk, the high-end drives mostly used by servers. sda is the first physical disk probed, and so on. Probing goes by SCSI ID and works quite differently from the scheme used for IDE devices. /dev/sda1 is the first partition on the first drive, etc.

/dev/ttyS? (4)

These are serial devices numbered from 0 up. /dev/ttyS0 is your first serial port (COM1 under DOS). If you have a multi-port card, these can go up to 32, 64 etc.

/dev/psaux (10)

PS/2 mouse.

/dev/mouse

Is just a symlink to /dev/ttyS0 or /dev/psaux. There are other mouse devices supported also.

/dev/modem

Is just a symlink to /dev/ttyS1 or whatever port your modem is on.

/dev/cua? (4)

Identical to ttyS? but now fallen out of use.

/dev/fd? (2)

Floppy disk. fd0 is equivalent to your A: drive and fd1 your B: drive. The fd0 and fd1 devices auto-detect the format of the floppy disk, but you can explicitly specify a higher density by using a device name like /dev/fd0H1920 which gives you access to 1.88MB formatted 3.5 inch floppies.

See Section 18.3 on how to format these devices.

|Floppy devices are named /dev/fdlmnnnn |
|l |0 |A: drive |
|  |1 |B: drive |
|m |d |"double density", "360kB" 5.25 inch |
|  |h |"high density", "1.2MB" 5.25 inch |
|  |q |"quad density" 5.25 inch |
|  |D |"double density", "720kB" 3.5 inch |
|  |H |"high density", "1.44MB" 3.5 inch |
|  |E |Extra density 3.5 inch. |
|  |u |Any 3.5 inch floppy. Note that u is now replacing D, H and E, thus leaving it up to the user to decide if the floppy has enough density for the format. |
|nnnn |360 410 420 720 800 820 830 880 1040 1120 1200 1440 1476 1494 1600 1680 1722 1743 1760 1840 1920 2880 3200 3520 3840 |The size of the format. With D, H and E, 3.5 inch floppies only have devices for the sizes that are likely to work. For instance there is no /dev/fd0D1440 because double density disks won't manage 1440kB. /dev/fd0H1440 and /dev/fd0H1920 are probably the ones you are most interested in. |
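The naming scheme in the table can be checked mechanically. The following Python sketch splits a floppy device name into its drive, density and size parts. The function name and the dictionary names are made up for the illustration, and the sketch does not check that the size is one of the values listed in the table.

```python
import re

DRIVES = {"0": "A:", "1": "B:"}
DENSITIES = {
    "d": "double density 5.25 inch",
    "h": "high density 5.25 inch",
    "q": "quad density 5.25 inch",
    "D": "double density 3.5 inch",
    "H": "high density 3.5 inch",
    "E": "extra density 3.5 inch",
    "u": "any 3.5 inch",
}
# /dev/fd + drive digit (l) + density letter (m) + size in kB (nnnn)
NAME = re.compile(r"^/dev/fd([01])([dhqDHEu])(\d{3,4})$")

def parse_floppy_name(device):
    """Return (drive, density description, size in kB), or None if the name does not fit."""
    match = NAME.match(device)
    if match is None:
        return None
    drive, density, size = match.groups()
    return DRIVES[drive], DENSITIES[density], int(size)
```

For example, /dev/fd0H1920 is the A: drive, high density 3.5 inch, formatted to 1920kB, while the auto-detecting name /dev/fd0 does not follow the long form and yields None.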

/dev/par? (6)

Parallel port. /dev/par0 is your first parallel port or LPT1 under DOS.

/dev/lp? (6)

Line printer. Identical to /dev/par?.

/dev/random

Random number generator. Reading from this device gives pseudo-random numbers.

/dev/st? (9)

SCSI tape. SCSI backup tape drive.

/dev/zero (1)

Produces zero bytes, and as many of them as you need. This is useful if you need to generate a block of zeros for some reason. Use dd (see below) to read a specific number of zeros.

/dev/null (1)

Null device. Reads nothing. Anything you write to the device is discarded. This is very useful for discarding output.

/dev/pd?

parallel port IDE disk.

/dev/pcd?

parallel port ATAPI CDROM.

/dev/pf?

parallel port ATAPI disk.

/dev/sr?

SCSI CDROM.

/dev/fb? (29)

Frame buffer. This represents the kernel's attempt at a graphics driver.

/dev/cdrom

This is just a symlink to /dev/hda, /dev/hdb or /dev/hdc. It may also be linked to your SCSI CDROM.

/dev/ttyI?

ISDN Modems.

/dev/tty? (4)

Virtual console. This is the terminal device for the virtual console itself and is numbered /dev/tty1 through /dev/tty63.

/dev/tty?? (3) and /dev/pty?? (2)

Other TTY devices used for emulating a terminal. These are called pseudo-TTYs and are identified by two lowercase letters and numbers, such as ttyq3. To non-developers, these are mostly of theoretical interest.

The dd command and tricks with block devices

dd probably originally stood for disk dump. It is actually just like cat except it can read and write in discrete blocks. It essentially reads and writes between devices while converting the data in some way. It is generally used in one of these ways:

dd if= of= [bs=] [count=] [seek=] [skip=]
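To make the block-oriented behaviour concrete, here is a minimal Python sketch of a dd-like copy. It is not how dd is implemented; the function name copy_blocks is hypothetical, and the parameters merely borrow the names of the dd operands bs, count, skip and seek. As in dd, skip counts blocks on the input side and seek counts blocks on the output side.

```python
import io


def copy_blocks(src, dst, bs=512, count=None, skip=0, seek=0):
    """Copy blocks of bs bytes from src to dst, loosely in the manner of dd.

    skip: number of input blocks to pass over before reading.
    seek: number of output blocks to pass over before writing.
    Both streams must be seekable in this sketch.
    """
    src.seek(skip * bs)
    dst.seek(seek * bs)
    copied = 0
    while count is None or copied < count:
        block = src.read(bs)
        if not block:          # end of input
            break
        dst.write(block)
        copied += 1
    return copied


source = io.BytesIO(bytes(range(16)))
target = io.BytesIO()
# Skip the first 4-byte block, then copy the next two blocks.
copy_blocks(source, target, bs=4, count=2, skip=1)
print(target.getvalue())
```

Working on file-like objects keeps the sketch independent of any real device; the same function could be handed files opened in binary mode.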

/root directory

Observe the /root directory on Einstein.

Unix Directory Tree

Like most operating systems, Unix arranges files and directories in an inverted tree topology. The root directory, shown at the top, is written /. It contains several main directories, which you can see using the command ls /. Normally this includes the directories /bin, /home, and /usr. These directories in turn contain subdirectories and ordinary files.

When you log on to a Unix computer, you are initially put into your home directory. Users' home directories are specified in the file /etc/passwd, which you can view (type the command more /etc/passwd).
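As a rough illustration of how the home directory is recorded, the Python sketch below parses text in the /etc/passwd format, where each line has seven colon-separated fields and the sixth is the home directory. The function name home_directories and the sample accounts are made up for the example.

```python
def home_directories(passwd_text):
    """Map each login name to its home directory.

    Expects lines of the form
    name:password:UID:GID:comment:home:shell
    """
    homes = {}
    for line in passwd_text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) >= 7:
            homes[fields[0]] = fields[5]   # sixth field is the home directory
    return homes


# Hypothetical sample data, not taken from a real system.
sample = (
    "root:x:0:0:root:/root:/bin/sh\n"
    "johndoe:x:1001:100:John Doe:/home/students/j/johndoe:/bin/sh\n"
)
print(home_directories(sample))
```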

Special symbols:

/   refers to the root directory

~   refers to your home directory

.   refers to your current working directory

..   refers to the parent directory -- the directory one level up


pwd  At any time, you can type the command pwd to see where on the tree you are currently sitting. (pwd = print working directory)

ls  To list the files located in your current directory, type ls. Useful variations on this command are

• ls -l   list full information about the files

• ls -a   list all files, including ``dot files''

• ls -al   list full information about all files

To list the files in a directory other than the one you are currently sitting in, just add the directory name, e.g.,

ls /         (the root directory)

ls -l /usr/local

ls -al ..       (one directory higher)

ls -F bin       (in subdirectory named ``bin'')

cd  To change directory, use the command cd, e.g.,

cd /usr/local

cd             go to your personal home directory

cd ..     move up one level

cd bin     descend one level into subdirectory ``bin'' (assuming it exists)

cd ../lib     move up one level and back down into subdirectory ``lib'' (assuming it exists)
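The way the special symbols combine with a directory name can be sketched in Python. This is a purely textual illustration: the function name resolve and the default home directory /home/user are assumptions, and unlike a real shell the sketch ignores symbolic links.

```python
import posixpath


def resolve(current_dir, target, home="/home/user"):
    """Return the directory that 'cd target' would lead to from current_dir."""
    if target == "":
        return home                      # plain 'cd' goes home
    if target == "~" or target.startswith("~/"):
        target = home + target[1:]       # expand ~ to the home directory
    # join() keeps absolute targets as they are; normpath() folds . and ..
    return posixpath.normpath(posixpath.join(current_dir, target))


for target in ["/usr/local", "", "..", "bin", "../lib"]:
    print(repr(target), "->", resolve("/usr/local", target))
```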

file  Every object represented on disk or in memory is a ``file'', including text files, commands and programs, directories, disks, printers, etc. When in doubt about the identity of something you see with ls, just ask the computer using the file command, e.g.,

file README

file /

file /usr/local/bin/netscape

file /bin/ls

which  The commands you type usually correspond to actual files somewhere on the directory tree. If you are curious, find out where a command is located using the command which, e.g.,

which vi

which netscape

which dir

which which
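The lookup that which performs can be approximated in a few lines of Python. This is a simplified sketch under the assumption that a command is found by scanning the directories in the PATH variable in order; the function name find_command is hypothetical, and Python's standard library already offers shutil.which for real use.

```python
import os


def find_command(name, search_path):
    """Return the first executable file called name on search_path, or None."""
    for directory in search_path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        # A match must be a regular file that we are allowed to execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


print(find_command("ls", os.environ.get("PATH", "")))
```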

Copy-on-write (sometimes referred to as "COW") is an optimization strategy used in computer programming. The fundamental idea is that if multiple callers ask for resources which are initially indistinguishable, you can give them pointers to the same resource. This fiction can be maintained until a caller tries to modify its "copy" of the resource, at which point a true private copy is created to prevent the changes becoming visible to everyone else. All of this happens transparently to the callers. The primary advantage is that if a caller never makes any modifications, no private copy need ever be created.
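The idea can be shown with a toy Python class. This is a minimal sketch, not how an operating system implements copy-on-write for memory pages; the class name CowList and the helper shares_storage_with are invented for the example, and the sketch assumes nobody modifies the shared list directly.

```python
class CowList:
    """A list view that shares storage until the first write."""

    def __init__(self, shared):
        self._data = shared      # pointer to the shared resource
        self._owned = False      # True once a private copy exists

    def __getitem__(self, index):
        return self._data[index]

    def __len__(self):
        return len(self._data)

    def __setitem__(self, index, value):
        if not self._owned:
            # First modification: make the private copy now.
            self._data = list(self._data)
            self._owned = True
        self._data[index] = value

    def shares_storage_with(self, other):
        return self._data is other._data


original = [1, 2, 3]
a = CowList(original)
b = CowList(original)
print(a.shares_storage_with(b))   # both callers point at the same list
b[0] = 99                         # b gets its own copy here
print(a[0], b[0], a.shares_storage_with(b))
```

Reads never trigger a copy, so a caller that only reads pays nothing beyond the shared pointer.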

Next Lab Assignment

Super-Block

The super-block information should be stored in a secure location, so that it can be used in case someone accidentally overwrites the real super-block. This is important because the super-block stores a few pieces of static or "mostly static" information that can be critical in recovering data. Some examples include blocksize, clustersize, sysdir location, rootdir location, file system generation, and number of slots.

The super-block defines the file system to which it belongs. It holds information such as the file system type, the inode pointers, the block size and more. It basically allows the virtual file system to understand the physical file system that it is communicating with. If the super-block is damaged, the system will not be able to use that file system, which is why multiple copies of the super-block are kept.

A super-block is a section of a computer hard disk drive that contains information about the file system. The majority of computer file systems have some type of super-block.

5. Device Identifier

Inode Pointers

Block Size

Superblock Operations (pointer to list of operations)

File System Type

File System Pointer

6. The magic number is the number inside the super-block that allows the kernel to know that the super-block and the file system being mounted match. It needs to confirm the relationship in order to be able to correctly use the file system. The ext2 magic number is 0xEF53.
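A magic-number check of this kind can be sketched in Python. The two offsets used below are assumptions that do not come from the notes above: the sketch takes the ext2 superblock to start 1024 bytes into the volume and the 16-bit little-endian magic field to sit 56 bytes into the superblock. The function name has_ext2_magic is hypothetical, and the check runs against an in-memory fake image rather than a real disk.

```python
import io
import struct

SUPERBLOCK_OFFSET = 1024   # assumed start of the superblock on the volume
MAGIC_OFFSET = 56          # assumed position of the magic field inside it
EXT2_MAGIC = 0xEF53        # value quoted in the notes


def has_ext2_magic(image):
    """Return True if the seekable binary stream carries the ext2 magic."""
    image.seek(SUPERBLOCK_OFFSET + MAGIC_OFFSET)
    raw = image.read(2)
    if len(raw) < 2:
        return False                       # image too short to hold a superblock
    (magic,) = struct.unpack("<H", raw)    # 16-bit little-endian value
    return magic == EXT2_MAGIC


# Build a fake 2 kB image with only the magic number filled in.
fake = bytearray(2048)
struct.pack_into("<H", fake, SUPERBLOCK_OFFSET + MAGIC_OFFSET, EXT2_MAGIC)
print(has_ext2_magic(io.BytesIO(bytes(fake))))
```

A matching magic number only says the superblock looks like the expected type; a real mount performs many more consistency checks.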

• The super block owes its name to its heritage, from when the first data block of a disk or partition was used to hold meta information about the partition itself. The super block is now detached from the concept of data block, but it still contains information about each mounted file system. The actual data structure in Linux is called struct super_block and holds various housekeeping information, like mount flags, mount time and device block size. The 2.0 kernel keeps a static array of such structures to handle up to 64 mounted file systems.

• An inode is associated with each file. Such an ``index node'' holds all the information about a named file except its name and its actual data. The owner, group, permissions and access times for a file are stored in its inode, as well as the size of the data it holds, the number of links and other information. The idea of detaching file information from file name and data is what allows the implementation of hard-links--and the use of ``dot'' and ``dot-dot'' notations for directories without any need to treat them specially. An inode is described in the kernel by a struct inode.

• The directory is a file that associates inodes to file names. The kernel has no special data structure to represent a directory, which is treated like a normal file in most situations. Functions specific to each file system type are used to read and modify the contents of a directory independently of the actual layout of its data.

• The file itself is associated with an inode. Usually files are data areas, but they can also be directories, devices, fifos (first-in-first-out) or sockets. An ``open file'' is described in the Linux kernel by a struct file item; the structure holds a pointer to the inode representing the file. file structures are created by system calls like open, pipe and socket, and are shared by parent and child across fork.
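The separation of name and inode described above is what makes hard links possible, and it can be observed from user space. The Python sketch below is a small demonstration under the assumption that the temporary directory lives on a file system that supports hard links; the function name link_demo and the file names are invented.

```python
import os
import tempfile


def link_demo(directory):
    """Create a file plus a hard link and compare their inode data."""
    original = os.path.join(directory, "original.txt")
    alias = os.path.join(directory, "alias.txt")
    with open(original, "w") as handle:
        handle.write("hello\n")
    os.link(original, alias)               # second name for the same inode
    first, second = os.stat(original), os.stat(alias)
    # Same inode number under both names, and a link count of two.
    return first.st_ino == second.st_ino, first.st_nlink


with tempfile.TemporaryDirectory() as workdir:
    print(link_demo(workdir))
```

Both directory entries refer to one inode, so the owner, permissions and size reported for either name are identical.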

These days, a major system without a Web server is a rarity. Naturally, it falls to the system administrator to make certain that the Web server is available reliably and to diagnose problems. Large companies may have a separate person to administer the Web server itself. In this module you’ll take a look at administering Web servers and writing scripts that are executed in response to Web-based actions.

Key Points

7.1 Apache Web Server

The Apache Web server is freely available, which makes it an ideal example for us to use for exploring Web servers. You’ll learn to install and configure the Apache Web server.

At the Apache Web site, you’ll find more information about the various products the Apache Software Foundation provides. You’re interested in the Web server, referred to on the Apache Web site as the HTTP server. The Apache Web site contains documentation for the HTTP server, including a guide to compiling on UNIX systems.

The Apache HTTP server is distributed as source files. Using the HTTP server will require that you go through several steps:

• Configure the makefiles for your system using the configure script.

• Compile the HTTP server using the makefile.

• Install the HTTP server, again using the makefile.

• Configure the HTTP server itself, if necessary.

• Run the HTTP server.

The Apache HTTP server docs cover this process in detail in the section titled, "Compiling and Installing."

Definition: Apache is generally recognized as the world's most popular Web server (HTTP server). Originally designed for Unix servers, the Apache Web server has been ported to Windows and other network operating systems (NOS). The name "Apache" derives from the word "patchy" that the Apache developers used to describe early versions of their software.

The Apache Web server provides a full range of Web server features, including CGI, SSL, and virtual domains. Apache also supports plug-in modules for extensibility. Apache is reliable, free, and relatively easy to configure.

Apache is free software distributed by the Apache Software Foundation. The Apache Software Foundation promotes various free and open source advanced Web technologies.

7.2 CGI Scripts

A Web site that consists of only HTML pages can provide only the data that was written into the HTML page itself. In the early days of the World Wide Web, that was sufficient. These days, any Web site that does not provide dynamic data is out-of-date a few days after it is written. There are many ways to insert dynamic data into an HTML page. We’ll look at one of the ways, using CGI scripts.

CGI stands for Common Gateway Interface. It is a mechanism that lets a Web server interact with an external program. These external programs are referred to as CGI scripts. A CGI script can be written in any programming language; Perl is a common choice because it is a full-featured programming language in which short programs are still relatively easy to write.

To use CGI scripts with the Apache HTTP server, the HTTP server must be told where to look for the CGI scripts. There are several ways to configure this; look at the HTTP server docs on the Apache Web site for more details. Once the HTTP server knows where to find the CGI scripts, you need two further pieces. The first is a program in that directory to be used as a CGI script, and the second is a link in an HTML page to your CGI script.

The CGI script itself simply produces output to stdout. The HTTP server captures that output and sends it to the Web browser. The format of the output must be in a form that the Web browser can understand (typically HTML). The difference between a normal HTML page and an HTML page created by a CGI script is that the CGI script creates the HTML page every time it is run. So the data provided on the HTML page can change each time the script is run.
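A CGI script along these lines might look like the following Python sketch. It is an illustration only: the function name build_response is hypothetical, and the sketch assumes the server expects a Content-Type header line followed by a blank line before the HTML body, which is the usual shape of CGI output.

```python
import html
import time


def build_response(now=None):
    """Return the complete CGI output: header, blank line, HTML page."""
    stamp = time.ctime(now)                 # current time when now is None
    body = (
        "<html><body>"
        "<p>Generated at {}</p>"
        "</body></html>".format(html.escape(stamp))
    )
    # The header tells the browser how to interpret what follows.
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # The HTTP server captures whatever the script writes to stdout.
    print(build_response())
```

Because the page is rebuilt on every request, the timestamp changes each time the script runs, which is exactly the difference from a static HTML file.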

To link to a CGI script, you simply need to provide a URL that points to the CGI script. The exact URL will vary widely depending on your host name and the way you configured the HTTP server to run CGI scripts. For example, if you configured your HTTP server to look for CGI scripts in a directory called "cgi-bin," and your Web server is available at , and your CGI script is a Perl program named "test.pl," then the URL might look like this: .

When you are testing a Web server on your home machine, you usually do not have a domain name assigned to use for the Web server. You can, however, test it from a Web browser on the same machine by using the host name "localhost." For example, going to the URL will try to find a Web server on the same machine as the Web browser.


If you plan on using CGI scripts, please read the following; it provides a fair amount of information on what you will need for setting up CGI scripts.

CGI-SCRIPTS

This is a list of cgi-scripts on mesa5. If you find or create a cgi-script, feel free to talk to Jerry Nolan about getting it put on mesa5 for others to use.

Guest books - The following html code allows the use of a guestbook in your homepages. Create an empty file called gbook.html with 666 permissions in your public_html directory. gbook.html will be the collection file for folks who register in your guestbook.

In the code below you will need to replace xxxxx with your mesa5 login name. The code below should be a separate file that is called from a hypertext link from one of your pages.

Your Name:

Your E-Mail Address:

Your Homepage Location:

Include Your Email Address and Homepage?

CGI Wrapper

The CGI Wrapper allows you to run your own cgi scripts. To use it, the end-user has to

• have a valid cgi-bin program

• place the program in ~/public_html/cgi-bin

• access it as:

This will attempt to execute the script ~user/public_html/cgi-bin/scriptname

Limitations:

• The script HAS to be owned by the user.

• The script HAS to belong to the primary group that the user belongs to (typically faculty or students).

• The script CANNOT be set-uid or set-gid (this is easily bypassed, though).

Debugging:

It is possible to do some debugging as well. To debug your CGI script, change the URL reference from:

....

to: ....

Note that this program is the ONLY way that the server will support end-user cgi-scripts.

Look at:

as an example.

For more information on cgi wrapper, check the man page (man cgiwrap)
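Before asking why a script refuses to run, it can help to check it against the limitations listed for the CGI Wrapper. The Python sketch below is a hypothetical self-check, not the wrapper's own logic: the function name check_cgi_script and the wording of the messages are invented, and the executable test is added because scripts must also be executable.

```python
import os
import stat


def check_cgi_script(path, uid, gid):
    """Return a list of problems that would likely stop the script running.

    uid and gid are the user's numeric user id and primary group id.
    """
    info = os.stat(path)
    problems = []
    if info.st_uid != uid:
        problems.append("script is not owned by the user")
    if info.st_gid != gid:
        problems.append("script does not belong to the user's primary group")
    if info.st_mode & (stat.S_ISUID | stat.S_ISGID):
        problems.append("script is set-uid or set-gid")
    if not info.st_mode & stat.S_IXUSR:
        problems.append("script is not executable")
    return problems
```

A call such as check_cgi_script(path, os.getuid(), os.getgid()) returns an empty list when none of the listed limitations is violated.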

Perl FAQs:

Is Perl installed on the Mesa State system?

Yes, Perl is installed on mesa5. Here is where to find it:

/usr/local/bin/perl version 4.036

/usr/local/bin/perl4 version 4.036

/usr/local/bin/perl5 version 5.003_01+

What is the comment line to make a Perl script file execute (e.g., #!/usr/bin/perl)?

It's a '#!' on the first line (and first column) of the script, followed by the full path to the perl interpreter. For example: #!/usr/local/bin/perl4 or: #!/usr/local/bin/perl5

What, if any, are the required filename extensions for CGI programs (e.g., .cgi, .pl, .bin)?

There are no mandatory extension names. Just make sure that they are in the cgi-bin subdirectory (under your public_html subdirectory), and that they are executable (i.e.: chmod ugo+x {my_cgi_script(s)}). The exception is PHP: PHP scripts need a .php extension.

Are my scripts required to reside in a special directory? If so, what is it called (e.g., cgi-bin, cgi, htbin etc)? Are there any directory names I cannot use? Do I add this directory under "public_html"?

Yes. It has to be in: ~/public_html/cgi-bin.

Can I use the SSI #exec command on this Web server? SSI, as in: Server Side Includes?

It's there, but it's not officially supported. If the following works, use it, but if it doesn't, tough cookies. To use SSI, make sure the web page has a .shtml extension (the file mode doesn't have to be executable, though), and to include something, use the following construct in the file: If you want to include "active" content, you'll want to use something like the following:

What is the absolute path to my root Web directory (e.g., /home/users/my_login_name/public_html/)?

It's the public_html subdirectory under your home directory. The best way to find where your home directory is, is to log in to mesa5 and type: echo $HOME The path that's printed out is your home directory. For example, if your home directory was: /home/students/j/johndoe, the absolute path to the root of your web directory would be: /home/students/j/johndoe/public_html

What should my URL look like to access my Web pages (e.g., )?

The "default" page is: index.html When people go to: They're looking at: /home/students/j/johndoe/public_html/index.html

Which, if any, Perl CGI function modules or libraries are installed in the default Perl library path (e.g., cgi-lib.pl, CGI.pm etc.)? Which, if any, are installed elsewhere?

There are too many to type in by hand. Log into your mesa5 account (a.k.a. mesastate.edu), and look in the following locations: For perl4: look in /usr/local/lib/perl For perl5: look in /usr/local/lib/perl5 and /usr/local/lib/perl5/site_perl

Does the Web server have a sendmail command installed? If so, what is the path (e.g., /usr/sbin/sendmail)? If not, is an equivalent SMTP mail sending command available and what is its path? If the syntax is different than sendmail's, what do I change (from a sendmail script) to send mail from a CGI script using this command?

Yes, sendmail is installed on mesa5 (the web server). The best path to it is: /usr/sbin/sendmail

Where can I get more information?

Try the online Perl tutorials at WWW. and stein. Another wonderful source of information concerning anything to do with computers and programming . Because the Tomlinson Library subscribes to , Mesa State students may check out and read books from .





Perl Double quoted strings are subject to various forms of character interpolation, many of which will be familiar to programmers of other languages.

Perl Single quoted strings must be separated from a preceding word by a space because a single quote is a valid character in an identifier. Its use in identifiers has been replaced by the more visually distinct :: sequence ($main::var).

A system call is a call to the kernel in order to execute a specific function that controls a device or executes a privileged instruction. Usually a call to the kernel is due to an interrupt or exception.
