Graham Kemp > Teaching > Programming Tools


Practical: UNIX 1

Aims

Objectives

After this practical you will:

Exercises

  1. Many of the UNIX commands that you have been using reside in /usr/bin. Experiment with patterns that match file names (shell wildcards such as *, ? and [...]) by listing subsets of the files in this directory, e.g. list those commands that begin or end with a particular letter; list all three-letter commands; list commands with two adjacent vowels (a, e, i, o, u) in their names.
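As a rough sketch of the kind of pattern this exercise asks for, the block below builds a scratch directory with a handful of made-up command names (an assumption, so that the result does not depend on what is installed in /usr/bin) and lists subsets of it. The variable names are illustrative only.

```shell
# Scratch directory standing in for /usr/bin; the file names are made up.
dir=$(mktemp -d)
touch "$dir/awk" "$dir/cat" "$dir/grep" "$dir/head" "$dir/sed" "$dir/sort" "$dir/tail"

# Each listing runs in a subshell so the current directory is left unchanged.
starts_with_s=$(cd "$dir" && ls -d s*)                 # names beginning with "s"
ends_with_t=$(cd "$dir" && ls -d *t)                   # names ending with "t"
three_letters=$(cd "$dir" && ls -d ???)                # names exactly three characters long
two_vowels=$(cd "$dir" && ls -d *[aeiou][aeiou]*)      # names with two adjacent vowels

echo "$two_vowels"

rm -r "$dir"
```

Against the real directory the same patterns can be written as, for example, "ls -d /usr/bin/???".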

  2. Type "ls /usr/bin > filename" to create a file containing a list of the files in directory /usr/bin. Use grep to find lines in this file that match the criteria in the previous question.
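The same criteria look slightly different when given to grep, because grep uses regular expressions: "." matches any single character, and "^" and "$" anchor a match to the start and end of a line. The sketch below assumes a small made-up listing file instead of the real output of "ls /usr/bin".

```shell
# Made-up listing standing in for the file created by "ls /usr/bin > filename".
list=$(mktemp)
printf '%s\n' awk cat grep head sed sort tail > "$list"

grep '^s' "$list"               # lines beginning with "s"
grep 't$' "$list"               # lines ending with "t"
grep '^...$' "$list"            # lines exactly three characters long
grep '[aeiou][aeiou]' "$list"   # lines with two adjacent vowels

# grep -c prints the number of matching lines instead of the lines themselves.
n_three=$(grep -c '^...$' "$list")
n_vowels=$(grep -c '[aeiou][aeiou]' "$list")

rm "$list"
```

Without the anchors, '...' would match any line with at least three characters, which is a common slip when moving from shell patterns to grep.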

  3. Directory /users/mdstud/kemp/ptools/beta_trefoil contains several files taken from the Protein Data Bank. All of these files contain at least one beta-trefoil domain.

    You should use one grep command in answering each of the following questions, although this may be connected to other UNIX commands using pipes.

    (a) Display all lines in these files that contain the word "HUMAN". How many lines are displayed?

    (b) How many files contain the word "CYTOKINE" on their first line?

    (c) Find files which contain data published in the journal "BIOCHEMISTRY".

    (d) Which data files were described in journal articles published between 1990 and 1995?

    (e) Which data files were produced by someone called "REES"?

    (f) Which data files have an author with "A" as one of their initials?

    (g) Data sets determined by X-ray crystallography have a resolution value indicating the accuracy of the data set. Display the line containing this value for each file that has a resolution value (display only one line per file).
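One possible shape for a few of these pipelines is sketched below. It runs on two made-up, heavily abbreviated PDB-style files in a scratch directory (the entry names, authors and values are hypothetical and the records are simplified), so the counts are not the answers for the real beta_trefoil directory. The remaining questions are left as exercises.

```shell
# Two made-up PDB-style files; real entries are much longer.
dir=$(mktemp -d)
cat > "$dir/1aaa.pdb" <<'EOF'
HEADER    CYTOKINE                                01-JAN-92   1AAA
SOURCE    HUMAN (HOMO SAPIENS)
AUTHOR    A.B.EXAMPLE,X.Y.REES
REMARK   2 RESOLUTION. 2.0 ANGSTROMS.
EOF
cat > "$dir/1bbb.pdb" <<'EOF'
HEADER    GROWTH FACTOR                           01-JAN-94   1BBB
SOURCE    BOVINE (BOS TAURUS)
AUTHOR    C.D.SAMPLE
REMARK   2 RESOLUTION. 1.8 ANGSTROMS.
EOF

# In the style of (a): count the lines that contain HUMAN across all files.
human_lines=$(grep HUMAN "$dir"/*.pdb | wc -l | tr -d ' ')

# In the style of (b): take the first line of each file, then count matches.
cytokine_files=$(for f in "$dir"/*.pdb; do head -n 1 "$f"; done | grep -c CYTOKINE)

# In the style of (e): grep -l prints only the names of files with a match.
rees_files=$(cd "$dir" && grep -l 'AUTHOR.*REES' *.pdb)

echo "$human_lines $cytokine_files $rees_files"
rm -r "$dir"
```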

  4. As an alternative to Netscape, the text-based browser Lynx can be run in a terminal window. It is best viewed in a terminal window that has been resized to fill a large area of the screen. Run lynx and view a few web pages.

    Find all lines in the manual page for Lynx that contain the string "HTML".

  5. The Lynx program can be used to send the HTML source of a web document to standard output, e.g.

    unix> lynx -source http://www.cs.chalmers.se/~kemp/publications/

    Use UNIX commands to count the number of publications listed on that web page. Try to find at least two different ways to do this.
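Two ways of counting are sketched below on a made-up HTML fragment standing in for the output of "lynx -source". The assumption that each publication sits on its own line containing an "li" tag is hypothetical; the markup of the real page may differ, so inspect the source first.

```shell
# Made-up HTML standing in for the output of "lynx -source URL".
page=$(mktemp)
cat > "$page" <<'EOF'
<html><body>
<h1>Publications</h1>
<ul>
<li>First hypothetical paper</li>
<li>Second hypothetical paper</li>
<li>Third hypothetical paper</li>
</ul>
</body></html>
EOF

# Way 1: let grep count the matching lines itself (-i ignores case, so <LI> also matches).
count_a=$(grep -ci '<li>' "$page")

# Way 2: pass the matching lines to wc and count them there.
count_b=$(grep -i '<li>' "$page" | wc -l | tr -d ' ')

echo "$count_a $count_b"
rm "$page"
```

With the real page, the file argument would be replaced by a pipe from the lynx command shown above.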

  6. The LaTeX source file described in the lecture is in /users/mdstud/kemp/ptools/latex/file.tex
    Copy this file, replace my name with your own, then create and view a PostScript file produced from this file.

    Find all lines in file.tex that start with a backslash character.
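The backslash is special both to the shell and to grep, so it needs care. A minimal sketch follows, using a made-up LaTeX file instead of the one from the lecture: inside single quotes the shell passes both backslashes through unchanged, and grep then reads the pair as one literal backslash.

```shell
# Made-up LaTeX file standing in for file.tex.
tmp=$(mktemp -d)
cat > "$tmp/file.tex" <<'EOF'
\documentclass{article}
\begin{document}
Hello from a hypothetical author.
% a comment line
\end{document}
EOF

# "^" anchors the match to the start of the line; "\\" matches one backslash.
grep '^\\' "$tmp/file.tex"

backslash_lines=$(grep -c '^\\' "$tmp/file.tex")
rm -r "$tmp"
```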

  7. The UNIX reference manual pages can be viewed using the command man. Look at the manual pages for a few commands, including the manual page for man itself.
    Use "man -k" to display one-line summaries that contain a given keyword. How can we find one-line summaries that contain two or more keywords?
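One approach is to filter the summaries for the first keyword through grep for the second. The sketch below uses three made-up summary lines standing in for the output of "man -k copy", since the real output depends on which manual pages are installed.

```shell
# Made-up summaries standing in for the output of "man -k copy".
summaries='cp (1) - copy files and directories
dd (1) - convert and copy a file
memcpy (3) - copy memory area'

# Keep only the summaries that also mention the second keyword.
both=$(printf '%s\n' "$summaries" | grep -i 'file')
echo "$both"

n_both=$(printf '%s\n' "$both" | wc -l | tr -d ' ')
```

On a real system the equivalent would be "man -k copy | grep -i file", and further keywords can be added with further grep stages.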

Supplementary Material

The University of Edinburgh has developed UNIXhelp for Users - a web site with useful information for users of the UNIX operating system.

Protein Data Bank entry 2AAI contains the structure of ricin from the castor bean plant. Chain B contains two beta-trefoil domains (although RasMol does not detect any beta strands in this chain). This chain is a lectin that binds to galactose, a carbohydrate on the cell surface, and thus enables ricin to enter the cell, where chain A can attack the ribosome and irreversibly inhibit protein synthesis. Ricin was used to assassinate a Bulgarian journalist in London in 1978. There have been some more recent news stories about ricin.