Adapted from

The critical commands are in bold (the others you may eventually find useful). Gavin Schnitzler 11/2011.

The best way to communicate with the cluster on a Mac is to open a command (Terminal) window and use ssh. Normal Mac copy and paste key combinations work here, and the up/down arrows review recent commands. To transfer files use sftp (below).

The best way to communicate with the cluster from a PC is to use the free program “Tera Term VT”. To copy and paste, you need to use [alt]-c and [alt]-v (since control-c is the UNIX interrupt command). Up/down arrows review recent commands. To transfer files most easily, use the WinSCP program.

* A wildcard that matches any string of characters (including none) in filenames etc. for most commands. Thus ‘head *.txt’ prints the top 10 lines of each file in the current directory ending in ‘.txt’.

> or < Indicators of direction for command output (>) or input (<). ‘command file1 > file2’ is extremely useful for directing the output of running command on file1 to file2 (instead of, by default, to the screen).

| (pipe) Used to link commands together, taking the output of the command on the left as input to the command on the right. E.g. ‘head -n 100 file | more’ prints the top 100 lines of a file one screenful at a time.
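The wildcard, redirection and pipe symbols can be tried safely in a scratch directory. The following is a minimal sketch; the file names (a.txt, b.txt, top.out) and the use of mktemp are assumptions for illustration only.

```shell
# Work in a scratch directory so nothing real is touched.
demo=$(mktemp -d)
cd "$demo"

# Make two small text files (hypothetical names).
printf 'a1\na2\na3\n' > a.txt
printf 'b1\nb2\n' > b.txt

# '*' expands to every matching filename: here a.txt and b.txt.
head -n 1 *.txt

# '>' sends output to a file instead of the screen.
head -n 2 a.txt > top.out

# '|' feeds the output of one command into the next.
# (tr strips the padding some versions of wc add.)
first_two_count=$(head -n 2 a.txt | wc -l | tr -d ' ')
echo "$first_two_count"
```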

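awk 'BEGIN{FS=OFS="\t"} {print $1,$4+$5,"etc"}' infile > outfile Creates columns in outfile based on specified tab-separated columns in infile (\t means tab; FS and OFS set the input and output column separators), where $1=column 1. In this example column 2 gets the sum of infile columns 4 & 5, and column 3 is always “etc”. Note that the items to print are separated by commas and that the quotes must be plain (straight) quotes.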
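A small worked run of this kind of awk command, as a sketch. The two-line input file and its gene names are made up for illustration, and the tab separators are set in a BEGIN block.

```shell
demo=$(mktemp -d)
cd "$demo"

# Hypothetical 5-column, tab-separated input.
printf 'geneA\tx\ty\t2\t3\n' > infile
printf 'geneB\tx\ty\t10\t5\n' >> infile

# Column 1, then the sum of columns 4 and 5, then the constant "etc".
awk 'BEGIN{FS=OFS="\t"} {print $1,$4+$5,"etc"}' infile > outfile

cat outfile
```

With this input, the first line of outfile holds geneA, 5 and etc, separated by tabs.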

bjobs Tells you about jobs you have running in batch. ‘bjobs -l’ gives detailed information.

bkill job# Kills a job you have running in batch (find job number with ‘bjobs’).

bsub -oo output.logfile command [parameters] Submits a run of any command as a background batch process (a job). You must do this for processes that take more than about 30 seconds or the sysadmins will get upset. The -oo file contains information about the run, including any errors and anything that would have been sent to the screen (thus if you use “bsub command infile > outfile” the output will end up in the -oo file).

bsub -Ip -q int_public command Makes the job run interactively. Normal program output goes to the screen (or is redirected by > as appropriate). You can’t do anything in that window while it’s running, but the sysadmins won’t kill your program (which is what they do for long-running programs started w/o bsub).

bzip2 -d file.bz2 Decompresses a .bz2 file, removing the .bz2 file.

cat f1 prints the contents of f1 to the screen (‘more’ is generally better).

cat f1 f2 > f3 Joins f1 and f2 (f2 after f1) and puts the result in f3. f1 and f2 are unchanged.

cd directoryname Change directory. ‘cd ..’ moves you up one directory, ‘cd ../..’ moves up 2. ‘cd ../dirname’ moves up one & then down into dirname. ‘cd /’ takes you to the root (top level) directory, and you can specify subdirectories between /’s. Thus ‘cd /cluster/shared/gschni01’ takes you to my shared directory. ‘cd ~’ takes you to your home directory (the only place that is not cleaned out every 28 days on the cluster… but has limited storage). One last thing about pathnames: “.” stands for the current directory. This is sometimes useful to get UNIX to recognize an executable file in the current directory – if ‘command’ doesn’t work, try ‘./command’.

clear Clears the screen

cp f1 f2 Copies f1 into f2 (f1 is unchanged).

cp -r d1 ../d2 Copies the directory d1 and everything in it to d2 (in this case d2 will be placed up one directory).

^-c (ctrl c) Kills the current running process

diff f1 f2 Lists differences between f1 & f2, line by line.

export NAME=definition Generally used to tell UNIX where to find things (often necessary to run programs), as in…

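export CREAD=/mydir/cread_folder … export PATH=$PATH:$CREAD/bin This adds the folder /mydir/cread_folder/bin to the end of the places UNIX searches for programs to run, where $PATH was your original list. Note the ‘$’ in front of CREAD, which tells UNIX to substitute the variable’s value. Be sure to include ‘$PATH:’ or UNIX will no longer find its standard commands and you may need to log out & restart.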
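The effect of extending PATH can be checked with a stand-in program. This is only a sketch: the folder is created under a scratch directory rather than /mydir, and the program name mytool is hypothetical.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/cread_folder/bin"

# A stand-in executable (hypothetical name) placed in the new bin folder.
printf '#!/bin/sh\necho "hello from mytool"\n' > "$demo/cread_folder/bin/mytool"
chmod +x "$demo/cread_folder/bin/mytool"

# Define the variable, then append its bin folder to the existing PATH.
export CREAD="$demo/cread_folder"
export PATH="$PATH:$CREAD/bin"

# The shell can now find the program by name alone.
mytool_output=$(mytool)
echo "$mytool_output"

# The new folder is the last entry in PATH.
printenv PATH | tr ':' '\n' | tail -n 1
```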

find ./ -name "*namepart*" -print Finds files whose names contain “namepart” in the current or lower directories.

find . -exec touch '{}' \; The magic phrase that “touches” all files in your shared directory (run it from within that directory), so their access date becomes the current date and they are not deleted when the sysadmins wipe old files every 28 days.
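Both uses of find can be rehearsed on a throwaway directory tree. In this sketch the directory and file names are invented, and one file is given an old timestamp with ‘touch -t’ purely to show that the touch-everything command refreshes it.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir -p project/sub
touch project/reads_sample1.txt project/sub/reads_sample2.txt project/notes.txt

# Find files whose names contain "sample", at any depth.
found=$(find . -name "*sample*" -print | sort)
echo "$found"

# Give one file an old timestamp (1 Jan 2011), then refresh everything.
touch -t 201101010000 project/notes.txt
find . -exec touch '{}' \;

# List files last modified more than a day ago; nothing should remain.
stale=$(find . -type f -mtime +1)
echo "stale files: ${stale:-none}"
```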

du lists all subdirectories and their sizes

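expr first_number operator 2nd_number (e.g. expr 1 + 1) Does simple whole-number math. Leave spaces around the operator, and write multiplication as \* so that UNIX does not treat it as a wildcard. Either number can be replaced by a command in backquotes (`command`) that returns a number.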
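A short sketch of expr, including a command in backquotes standing in for a number. The files f1 and f2 are created only for this illustration.

```shell
demo=$(mktemp -d)
cd "$demo"
printf 'l1\nl2\nl3\n' > f1   # 3 lines
printf 'm1\nm2\n' > f2       # 2 lines

# Plain numbers.
sum=$(expr 1 + 1)

# Backquotes run a command and substitute the number it prints.
total_lines=$(expr `wc -l < f1` + `wc -l < f2`)

# '*' must be escaped so the shell does not expand it as a wildcard.
product=$(expr 4 \* 5)

echo "$sum $total_lines $product"
```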

env lists current environment settings, such as PATH, etc.

ftp account (or sftp for secure connection) Makes an ftp connection to a machine account you have access to (login & password). Once there, ‘ls’ and ‘cd’ commands work on the remote account. Use ‘lls’ and ‘lcd’ for local directories. ‘get’ gets a file from that account. ‘put’ puts a file there (using current local & remote directories). Use ‘sftp your_account@cluster.uit.tufts.edu’ to transfer files between the cluster and a UNIX shell on your Mac or PC.

grep "pattern" file > output Searches for ‘pattern’ within a file and outputs only those lines containing it. The pattern can match anywhere in the line, so grep "dog" finds any line with ‘dog’ in it (no wildcards are needed). In the pattern, ‘^’ means start of line, ‘$’ means end of line, ‘.’ means any one character, and ‘*’ means any number of repeats of the character before it (so ‘.*’ means any number of any characters).

grep -c "pattern" file Tells the number of lines containing “pattern”, where -c means ‘count’.
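The pattern characters can be compared side by side on a small file. This is a sketch with a made-up word list; the patterns are written in single quotes here so the shell leaves the ‘$’ alone.

```shell
demo=$(mktemp -d)
cd "$demo"
printf 'dog\nhotdog stand\ndogs bark\ncat\ndig\n' > animals.txt

grep 'dog' animals.txt     # any line containing "dog"
grep '^dog' animals.txt    # lines that start with "dog"
grep 'dog$' animals.txt    # lines that end with "dog"
grep 'd.g' animals.txt     # "d", any one character, then "g"

# The same searches, counted with -c.
contains=$(grep -c 'dog' animals.txt)
starts=$(grep -c '^dog' animals.txt)
ends=$(grep -c 'dog$' animals.txt)
any_char=$(grep -c 'd.g' animals.txt)
echo "$contains $starts $ends $any_char"
```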

gunzip file.gz Decompresses a .gz file (removing the .gz file)

gzip file Compresses a file into .gz format, removing the original file (useful to save space on the cluster)
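A round trip through gzip and gunzip shows that each command replaces its input file. The file name reads.txt is hypothetical.

```shell
demo=$(mktemp -d)
cd "$demo"
printf 'some data\n' > reads.txt

# Compress: reads.txt is replaced by reads.txt.gz.
gzip reads.txt
after_gzip=$(ls)
echo "$after_gzip"

# Decompress: reads.txt.gz is replaced by reads.txt.
gunzip reads.txt.gz
after_gunzip=$(ls)
echo "$after_gunzip"
```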

head -n # file Prints only the top # lines of a file (default 10). If # is negative, prints all lines except the last # lines (this works with the GNU version of head found on Linux).

history > file Records all of your past ~1000 or so command entries to a file.

ls Lists the contents of the current directory. ‘ls -l’ gives details. ‘ls path/directory/’ lists that directory’s contents.

man command Gives usage help on a command (often full of many un-needed details). Use [space] to move forward, “q” to stop.

mkdir D Makes a new directory named ‘D’

module available Tells what “modules” (special environments to run specific programs) are available on the cluster. To start one, type ‘module add’ followed by the name before the / in that listing (the number after the / is the version number).

module add R Makes R available on the cluster. Follow the instructions to begin (usually typing ‘bsub -Ip -q int_public R’).

module add python Makes python 2.6.5 your default version, instead of 2.4 (necessary to run some programs).

more F Lists contents of file F one screen at a time. To move forward hit [space], to stop hit “q”

mv f1 f2 Changes the name of f1 to f2

mv f1 directory Moves f1 to the specified directory (in current directory, or specified by ../dir/dir, /dir/dir, etc. path). See ‘cd’ for details.

perl program.pl parameters Runs a perl program with associated program-specific parameters. Without parameters, most programs will print a brief usage summary.

printenv PATH Prints just the PATH variable of your environment settings.

python program.py parameters Runs a python program with associated program-specific parameters.

pwd Gives the pathname of the current directory (stands for “print working directory”)

rm f Removes a file. ‘rm *.extension’ removes all files ending with “.extension” (use caution)

rm -r directory Removes directory and all files in it (use extreme caution)

rmdir D Removes directory D (only if empty)

sort f1 > f2 Alphabetically sorts the lines of file f1 and puts the results in f2
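Alphabetical sorting with the sort command can be tried as follows. This is a sketch on a three-line file whose contents are invented.

```shell
demo=$(mktemp -d)
cd "$demo"
printf 'pear\napple\nmango\n' > f1

# Sort the lines of f1 alphabetically; f1 itself is not changed.
sort f1 > f2

cat f2
```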

sed -n '#p' file Prints line number # of a file

sed '#d' file1 > file2 Deletes line number # from file1 (not changing the original file) and writes the new file to file2
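The two sed forms, with 2 standing in for #, look like this on a three-line file. The file contents are made up for the sketch.

```shell
demo=$(mktemp -d)
cd "$demo"
printf 'one\ntwo\nthree\n' > file1

# Print only line 2.
second=$(sed -n '2p' file1)
echo "$second"

# Write a copy without line 2; file1 keeps all three lines.
sed '2d' file1 > file2
cat file2
```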

sftp Secure ftp. See ‘ftp’.

ssh account@cluster.uit.tufts.edu connects you to your cluster account from any UNIX shell (e.g. on your PC).

tar -cf archive.tar file[s] Combines multiple files into one archive. More usefully, tar unpacks such archives. ‘tar -xf file.tar’ unpacks anything with a .tar extension. ‘tar -xzf’ unpacks gzip-compressed archives (.tgz or .tar.gz extensions) and ‘tar -xjf’ unpacks bzip2-compressed archives (.tar.bz2 extension). Adding ‘v’ (as in ‘tar -xvzf’) lists the files as they are unpacked. For files with multiple packings, such as file.tar.gz, you can alternatively use gunzip first & then ‘tar -xf’.
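Packing and unpacking can be rehearsed on a small directory. In this sketch the directory name data and its files are hypothetical, and ‘-C dir’ (unpack into dir) is used only to keep the unpacked copies apart.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir data
printf 'x\n' > data/a.txt
printf 'y\n' > data/b.txt

# Pack the directory: once as a plain archive, once gzip-compressed.
tar -cf data.tar data
tar -czf data.tar.gz data

# List the contents of an archive without unpacking it.
tar -tf data.tar

# Unpack each archive into its own folder.
mkdir plain gz
tar -xf data.tar -C plain
tar -xzf data.tar.gz -C gz
```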

tail -n # file Prints the last # lines of a file. The default (without -n) is 10.
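head and tail can be compared on a numbered file. This sketch builds a hypothetical 20-line file with a small loop.

```shell
demo=$(mktemp -d)
cd "$demo"

# Build a file with 20 numbered lines.
i=1
while [ "$i" -le 20 ]; do
  echo "line $i"
  i=$((i + 1))
done > twenty.txt

first3=$(head -n 3 twenty.txt)   # lines 1 to 3
last2=$(tail -n 2 twenty.txt)    # lines 19 and 20

# Without -n, tail prints 10 lines.
default_count=$(tail twenty.txt | wc -l | tr -d ' ')

echo "$first3"
echo "$last2"
echo "$default_count"
```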

unzip Unpacks files with .zip extensions

vi Opens the vi text editor. The basics are: Use arrow keys to move. Type “i” to begin entering text, “[esc]” to stop. “:wq” to save changes and quit, and “:q!” to quit without saving. “/” to find a pattern, “:#” to go to line #, “:$” to go to the end of the file, “$” to go to the end of a line and “^” to go to the beginning of a line. “[#]dd” deletes # lines starting with the current line (default 1 line), “[#]yy” (yank) copies # lines starting with the current line. “p” pastes the lines most recently deleted or yanked.

wc < filename “Word count”: tells lines, words & characters in file. ‘wc -l’ (only lines), ‘wc -w’ (only words)

wget URL Gets a file from a URL. To get the full URL, right- or command-click on a link & choose ‘copy link location’.

which command Gives directory where ‘command’ is run from by default.

whereis command Tells locations of all executable versions of ‘command’
