Faculty - Naval Postgraduate School



OA 4108 Notes on Cygwin (and OS X “Terminal” and Linux console) 3 Oct 12

Introduction: Cygwin is a Windows system for emulating the command-line of Linux and other Unix-style operating systems (including the “terminal” window of Apple OS X). Personally, I install Cygwin on every Windows machine I get my hands on. Cygwin is installed on the machines in the Glasgow lab and at least a few of the machines downstairs in the STBL.

I have access to Apple machines, but I am not really an OS X user. I have no access to any Linux machines except at the NPS high-performance computing facility. I do have access to 32-bit Windows XP but I’ve mostly moved to Windows 7. So in some cases I may make claims about your computer that just aren’t true. As they say, “your mileage may vary.” This document reflects my thoughts at one point in time, as I prepare a class. I want to spread my facility with Cygwin as efficiently as possible. However, there are certainly things I’ve missed and better ways to do things. Feedback is welcome!

Installing and Starting Cygwin: The Cygwin web site will guide you through the installation of the Cygwin system. I recommend the packages listed at the bottom of this document. Once you start Cygwin for the first time, you’ll see the dollar sign prompting you for input. The program that prompts you is called bash, the “sh” standing for “shell,” since this is a program that “surrounds” other programs in some sense. There are other shells with other fine features; in most cases they’re similar to bash, which is, in any case, the default shell on Apple computers as well. We will be working with bash in this course.

Once you’ve started Cygwin you can type commands. Perhaps the most basic commands are ls (list all the files in this directory), cd (change directories), mkdir (make a new directory), rm (remove a file) and man (look at the manual). So try this (some of this stuff will be explained below):

cd c:/ # “#” is the comment character. Switch to c drive.

mkdir OA4108 # create a new directory

cd oa4108 # Cygwin is case-sensitive; Windows’s file system isn’t

echo "Hi there" # sends output to the screen

echo "Hi there" > myresult # sends output into a new file

cat myresult # show the contents of that file

ls -l myresult # “long” listing: why 10 chars?

rm myresult # remove the file

Startup Options: Right-click the window bar of your Cygwin screen to set the properties. I recommend you pick both “QuickEdit Mode” and “Insert Mode” in the Edit Options section. To copy text to the Windows clipboard, highlight it and press Enter. To paste to a Cygwin window, right-click on the title bar and select paste.

Some Background: Unix systems go ‘way back, to the 1970’s. The philosophy of the developers was to construct a large number of small portable tools, each of which did a small number of tasks, and enable a way to connect the tools. The tools tend to have very short names, which saves on typing at a cost of readability. Linux is named after one Linus Torvalds, who (with collaborators) re-wrote the “kernel,” the central part of the operating system, so that it could be released as open source. The web is filled with information on the history of the development of Unix and Linux; I won’t go over that here. “Unix” is a trademarked name; I will use “Linux” in the generic sense to mean any Linux- or Unix-like operating system.

Cygwin versus command.com and cmd.exe: Windows offers not one but two command-line interfaces. command.com is the old-fashioned one that requires short file names. The newer command interface is cmd.exe. This is, in fact, a fully-featured program that should be more widely known, but we will use Cygwin partly because I know how, and partly because essentially the same commands are available on Apple and Linux systems.

Special Place of the C Language: Much of Linux is written in the C language, and perhaps as a result, references to the C language are common in the Linux world. I will introduce any vocabulary as needed, but just be aware that C holds a special place in the Linux world.

The File System: The Linux file system starts at the “root,” designated by a single forward slash. Directory names are separated by forward slashes, whereas of course Windows uses backslashes; this will force us to pay attention when we use Linux tools in Cygwin. Under the root there is a more-or-less standard set of directories; most important among these is the /bin directory, which holds executable (that is, “binary”) programs.

The Path: The “path” is a list of directories. When you ask the system to run a command, it looks down this list of directories, in order, until it finds a program by that name, and then it runs it. So if you have two versions of “myprog,” for example, the system will run whichever one sits in the directory that comes first in the path. This is very much analogous to Java’s “classpath,” but for executables. To see your path, type echo $PATH; the dollar sign indicates that you want the value of an “environment variable.” (env will show you the current set of environment variables.) Your path is set both in Windows and in your “startup script,” which is in a file named .bashrc (note the dot) in your home directory. This is an important concept and one to which we will return.
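As a rough illustration of the search order described above, the sketch below prints the path one directory per line and asks the shell where it finds a command. The helper name path_entries is made up for this example; command -v is a standard shell built-in.

```shell
# Hypothetical helper: print a colon-separated path list one entry per line.
path_entries() {
    echo "$1" | tr ':' '\n'
}

# Show the directories in the order the shell searches them.
path_entries "$PATH"

# Report which file the shell would run for the name "ls".
command -v ls
```

The first directory listed that contains a program called ls is the one whose copy gets run.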

Basic Differences between Windows and Cygwin/Linux:

(a) The path separator (forward versus backward slash) is one primary difference between the two flavors of Operating System. There are a few others that bear remembering:

(b) Case-sensitivity: Linux is case-sensitive with regard to file names. Windows is not. In general, Cygwin is case-sensitive – cat is not the same as Cat – but when it comes to the file system, Cygwin uses Windows.

(c) Disk or Volume id: in Windows a physical (or logical) disk is identified by a letter followed by a colon (as in C:). In Linux all drives appear as directories under the root; in Cygwin they will normally be under /cygdrive (so C: appears as /cygdrive/c), although the colon notation is usually interpreted correctly if you use it in commands.

(d) CR/LF: lines in Windows text files end with two characters, CR and LF (\r\n); lines in Linux text files just end with \n. Just be aware that this can get in your way if you move from one to the other.

(e) Long file names: Linux has no problem with long file names, but certain special characters cause pain. Primary among them are embedded spaces. You can access a file with an embedded space either by enclosing the whole file name in double quotation marks, or by preceding each space with a backslash. (This forms an example of an escape sequence, below.)
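Items (d) and (e) can be tried out safely in a scratch directory. This is only a sketch: the file names are invented, and tr -d is one of several ways to strip carriage returns.

```shell
# Work in a scratch directory so nothing important is touched.
cd "$(mktemp -d)"

# (e) Two equivalent ways to name a file with an embedded space.
echo "hello" > "my file.txt"    # double quotes around the whole name
cat my\ file.txt                # backslash before the space

# (d) Build a small file with Windows-style \r\n line endings ...
printf 'one\r\ntwo\r\n' > dos.txt
# ... and delete the carriage returns to get Linux-style endings.
tr -d '\r' < dos.txt > unix.txt

# The sizes differ by one byte per line.
wc -c dos.txt unix.txt
```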

The Role of the Extension: The “extension” is the last name of a file, the part that comes after the final dot. Windows uses the extension (and, often, nothing but the extension) to determine what sort of a file it is. So, for example, a file whose name ends in .XLS is presumed to be an Excel file. If you double-click it from the Windows Explorer, Excel will be started (even if the file in question is not, in fact, an Excel file). It’s rarely a good idea to give a file a name that has no extension at all unless it’s an executable program. Even then, you might consider adding .exe as the extension so that you yourself know that it’s executable.

In Cygwin, extensions generally mean nothing. In a nod to Windows, you can start the program associated with a file using the cygstart command. So if you have an Excel document named mything.xls, the command cygstart mything.xls will start Excel and open that document. If you re-name your Excel file so that it ends in, say, .doc, then if you double-click on it, Windows will try to open it with Word. However, if you give it an extension that isn’t already in use (maybe .ZLS), Windows will correctly deduce that it is an Excel file and open it with a warning. In this case cygstart will fail. The file command can help you to figure out what sort of data you have, but it’s not 100% accurate.

Special Characters: Some characters have special meanings. The special-est of them all, maybe, is the backslash, but here we describe some of the places you need to be careful when typing.

Backslash: The backslash (\) has a couple of purposes in Cygwin. First, it acts as a line-continuation character. Place a backslash as the very last character on the line to indicate that two lines are acting as one. Second, the backslash introduces other special characters: \n means new-line, as it does in Java, and \t means tab. (It takes two keystrokes to write these, but they are single characters.) Inside a long file name, a backslash followed by a space indicates an embedded space. The backslash also “protects” characters from being expanded when you don’t want them to be. So, for example, contrast:

rm me* # remove all files whose names start with me (wild card)

rm me\* # remove the file whose name is m, e, asterisk

When the backslash is used in this way we call the resulting set of characters an escape sequence.
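Since rm is unforgiving, one cautious habit is to preview an expansion with echo first. The sketch below does that in a scratch directory with two made-up files, me1 and me2.

```shell
cd "$(mktemp -d)"
touch me1 me2

echo me*      # the shell expands the wild card; prints: me1 me2
echo me\*     # the backslash protects the asterisk; prints: me*
echo "me*"    # double quotes protect it as well; prints: me*
```

Whatever echo prints is what rm would have been handed as its arguments.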

Quotes: there are three kinds of quotes on your keyboard (although Word tends to mess up their display with its smart-quote function): the double-quote, which surrounds text; the apostrophes, which do the same (well, not quite, but almost); and the backtick (above the tilde in the top-left of my keyboard). A pair of backticks act as a run command. We will see an example in a minute.

Wild cards: Bash supports the familiar wild-card expansion characters, * for any characters, ? for exactly one character, and square-bracket range like [a-f] for “one character in that range.” If you need to include an asterisk, question mark or square bracket in your command, escape it with the backslash. Expressions using these characters are called globs for historical reasons. That name serves to distinguish them from regular expressions, which we will meet later.
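Here is a small sketch of the three glob forms at work. The file names are invented for the example, and echo is used so that the expansion itself is visible.

```shell
cd "$(mktemp -d)"
touch apple.txt avocado.txt banana.txt b1.dat b2.dat

echo a*           # any characters after "a": apple.txt avocado.txt
echo b?.dat       # exactly one character between "b" and ".dat": b1.dat b2.dat
echo [a-b]*.txt   # first character in the range a to b: all three .txt files
```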

Special Characters in File Names: I can’t recommend putting special characters into file names. Naturally the backslash, forward slash, quotes, asterisk and question mark seem like bad choices. Because of their special place in bash programming, the dollar sign (used in variable expansion), hash mark (comment character) and semicolon (command separator) are bad choices, too. If you need to get at a file with these characters in its name, use the backslash on each suspect character. When a file has embedded spaces, as we mentioned, enclose the name in double-quotes or escape the spaces.

Dot and double-dot: Wherever you are, a single dot refers to the current directory, and two dots to its parent. So you can always move up one directory with the command cd ..
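A quick way to see dot and double-dot in action, using two made-up directory names in a scratch area:

```shell
cd "$(mktemp -d)"
mkdir -p parent/child
cd parent/child

cd ..                # up one level, into "parent"
basename "$(pwd)"    # prints: parent
ls .                 # "." is the current directory; prints: child
```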

Running Programs: To run a program, type its name. You don’t have to type the ending if it’s .exe. If a file by that name is found on the path, the program runs. The current directory is often not on your path (putting a dot in the path is considered a security risk, I think), so for a program named tester in the current directory, you may have to run it with ./tester. As noted above, you can run the program associated with a file’s extension, and have the program start up that file, with cygstart.

(a) Shell scripts: Many programs (like ls, for example) are compiled programs whose source code is invisible to us. However, it’s also possible to put a set of commands into a file and execute them. (Sometimes bash will complain that you don’t have permission to run a particular file. You can change the permission with a command like chmod a+x followed by the file’s name.) Bash will read and execute the commands one at a time. If the first line of the file starts with #!, then it specifies a particular shell or interpreter; so, for example, a script designed to run in Python might start #!/cygdrive/h/python/python, which is where Python lives on my machine.

Example: Here’s a simple shell script in action. Create a file called tester and give it this one line:

echo "I like" $1

The $1 part refers to the first argument (if there is one). This script, then, writes “I like” and then the first argument to the standard output. Try entering ./tester bunnies. Look! Bunnies!
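The whole sequence (create the file, make it executable, run it) might look like the sketch below. The #!/bin/sh first line is an addition for illustration and assumes a shell lives at /bin/sh; the second line is the script from the text.

```shell
cd "$(mktemp -d)"

# Write the script: a "#!" line (added here) plus the one line from the text.
printf '%s\n' '#!/bin/sh' 'echo "I like" $1' > tester

# Give everyone permission to execute it, then run it from the current directory.
chmod a+x tester
./tester bunnies    # prints: I like bunnies
./tester            # no argument, so $1 is empty; prints: I like
```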

Here’s an example using the backticks from above:

echo bunnies > bunfile # write text to file

./tester bunfile # “bunfile” interpreted as text

./tester 'cat bunfile' # apostrophes enclose text passed as argument

./tester `cat bunfile` # backticks cause command to be executed first
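The difference between the quote styles can also be seen with echo alone. The last line uses $( ), which is not mentioned above but is the standard modern spelling of the backtick form in bash.

```shell
cd "$(mktemp -d)"
echo bunnies > bunfile

echo "I like" 'cat bunfile'     # apostrophes: plain text; prints: I like cat bunfile
echo "I like" `cat bunfile`     # backticks: command runs first; prints: I like bunnies
echo "I like" $(cat bunfile)    # same effect as the backticks; prints: I like bunnies
```

One practical advantage of $( ) is that it nests without extra escaping.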

Shell programs include variables and if-tests and all sorts of things that allow the production of powerful programs. Today, though, this approach is considered to be an old-fashioned one that produces code that may be inefficient and difficult to maintain. Instead many programmers would use a dedicated scripting language to take care of simple and repetitive tasks. These languages include Perl, Ruby and Python; we may look at the last of these in this class, but of course we can also do almost anything in R.
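For a taste of what shell variables and if-tests look like, here is a minimal sketch. The variable names count and verdict are invented for the example.

```shell
# A variable assignment (no spaces around "=") and a numeric if-test.
count=3
if [ "$count" -gt 2 ]; then
    verdict="more than two"
else
    verdict="two or fewer"
fi
echo "$count is $verdict"    # prints: 3 is more than two
```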

(b) Command-line arguments: Most programs accept command-line arguments, which are pieces of information you pass in at run time. Originally command-line arguments were specified by a hyphen and a single character. So, for example, ls -l means “list the files in this directory with a long-style listing”; the “dash l [el]” is a command-line argument. You can combine arguments: ls -al is the same as ls -a -l. The newer convention is that arguments are words introduced by two hyphens: ls --format=long is the same as ls -l. Some arguments stand alone; others introduce values, either with an equals sign (as above) or separately. For example man -M looks for man pages in specific places; here the “dash M” argument precedes the argument’s value. By the way, on the Windows command line, arguments are preceded by a forward slash, not a hyphen.
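The equivalences above are easy to check in a scratch directory. This is a sketch with made-up file names; the long-word form is guarded because it belongs to GNU ls (the version on Cygwin and Linux) and may be missing elsewhere.

```shell
cd "$(mktemp -d)"
touch visible .hidden

ls -a -l    # two single-letter arguments given separately
ls -al      # the same two arguments combined

# Long-word form; print a note instead of failing if this ls lacks it.
ls --format=long 2>/dev/null || echo "this ls has no --format option"
```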

(c) Redirection: most commands respect a convention that data comes from a particular file or “stream” called “standard input” and goes to another called “standard output.” (A third stream called “standard error” is where error, warning and status messages get put.) By default, standard input is the keyboard and standard output, the screen. Redirection gives the command-line approach a lot of power. The redirection operators are:

>: send standard output to a file (starting with the file empty)

>>: append standard output to the end of an existing file

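<: take standard input from a file

|: send the standard output of one command to the standard input of the next (a “pipe”)

For example, echo 9 > nine creates a file named “nine” containing the digit 9.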
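The sketch below tries out >, >>, < (read standard input from a file) and | (pipe one command's output into the next) on a scratch file. The file name notes is invented for the example.

```shell
cd "$(mktemp -d)"

echo "first line"  > notes     # ">" creates the file (or empties an existing one)
echo "second line" >> notes    # ">>" appends to it
wc -l < notes                  # "<" feeds the file to standard input; counts 2 lines
sort -r notes | head -n 1      # "|" connects two commands; prints: second line
```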

Example: Surprisingly often I want to answer the question “how many columns wide is that tab-separated variable file?” I cut off the top line with head -1, convert the tabs to new-lines with tr, and count the number of lines that result using wc -l. So for a file named myfile, that command would be

head -1 myfile | tr '\t' '\n' | wc -l
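To see the column-counting pipeline at work, it can be run on a small made-up file. The contents below are invented, and head -n 1 is just the more portable spelling of head -1.

```shell
cd "$(mktemp -d)"

# Build a tiny tab-separated file with three columns and a header row.
printf 'id\tname\tscore\n1\tAnn\t90\n' > myfile

# Header line -> one field per line -> count the lines; reports 3 columns.
head -n 1 myfile | tr '\t' '\n' | wc -l
```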

Filename expansion: One last tip. If you have only one file starting with e, then when you type e and press the Tab key you get that file name (or, if you’ve set shopt -s nocaseglob, one starting with E); press Tab a couple of times for a listing if you have several files starting with that character.

Further Resources: If you had to pick the one subject where the Internet had the most information, you might well pick computer stuff. Seriously, it’s everywhere. Type “bash” into your favorite search engine and you will find more than you can digest. It may not be great reading, but you can’t go wrong with the semi-official manual: .

Installation: If you have a Linux or Apple machine, you don’t have to install anything, since all or most of the tools will be available automatically. For Windows, visit the Cygwin site and download setup.exe. Then run it. You’ll see a huge set of possible things to install, broken down by category. Among the packages I get are these (and then the system automatically adds other stuff):

Archive: unzip, zip

Base: (everything)

Doc: man

Editors: ed (I also get vim, and you will need a good editor of some sort, but I can’t recommend vim, which is old and primitive)

Python: I hope to look at Python in this course, although in fact I got my Python from somewhere else.

Shells: bash completion

System: util-linux

Text: I get flip, to manipulate line endings, but you probably won’t need it

Utils: bzip, cygutils, diffutils, file, which

If you’re curious about some of the Cygwin tools, try listing /usr/bin and looking at the man pages for all the .exe files. You might discover something useful!
