Practical: UNIX 1

Aims

Objectives

After this practical you will:

Exercises

  1. Some of the UNIX commands that you have been using are in directory /bin. Experiment with patterns that match file names (shell wildcards such as *, ? and [...]) by listing subsets of the files in this directory. For example: list those commands that begin or end with a particular letter; list all three-letter commands; list commands with two adjacent vowels (a, e, i, o, u) in their names.
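
    As a minimal sketch of the kind of pattern meant here, the following uses a scratch directory holding a few hypothetical command names (an assumption for illustration, not the real contents of /bin). The patterns are expanded by the shell before the command runs.

```shell
# Scratch directory with a few hypothetical command names (not the real /bin).
demo=$(mktemp -d)
touch "$demo/cat" "$demo/cp" "$demo/sed" "$demo/sleep" "$demo/tar" "$demo/zcat"

# Each pattern is expanded by the shell inside the scratch directory.
starts_s=$(cd "$demo" && echo s*)              # names beginning with "s"
ends_t=$(cd "$demo" && echo *t)                # names ending with "t"
three=$(cd "$demo" && echo ???)                # names of exactly three characters
vowels=$(cd "$demo" && echo *[aeiou][aeiou]*)  # names with two adjacent vowels

printf '%s\n' "$starts_s" "$ends_t" "$three" "$vowels"
```

    The same patterns can be given to ls with a directory prefix, for example "ls /bin/s*".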

  2. Type "ls /bin > filename" to create a file containing a list of the files in directory /bin. Use grep to find lines in this file that match the criteria in Exercise 1.
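
    Unlike shell wildcards, grep patterns are regular expressions: "." matches any single character, and "^" and "$" anchor a match to the start and end of a line. The sketch below assumes a small hand-made list of names rather than a real listing of /bin.

```shell
# A hand-made list of hypothetical command names, one per line.
list=$(mktemp)
printf '%s\n' cat cp sed sleep tar zcat > "$list"

begins_s=$(grep '^s' "$list")             # lines beginning with "s"
ends_t=$(grep 't$' "$list")               # lines ending with "t"
three=$(grep '^...$' "$list")             # lines of exactly three characters
vowels=$(grep '[aeiou][aeiou]' "$list")   # lines with two adjacent vowels

printf '%s\n' "$begins_s" "$ends_t" "$three" "$vowels"
```

    Without the anchors, a pattern such as "..." would match any line of three or more characters, because grep looks for the pattern anywhere in the line.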

  3. Directory /chalmers/users/kemp/DAT160/beta_trefoil contains several files from the Protein Data Bank. All of these files contain at least one beta-trefoil domain.

    You should use one grep command in answering each of the following questions, although this may be connected to other UNIX commands using pipes.

    (a) Display all lines in these files that contain the word "HUMAN". How many lines are displayed?

    (b) How many files contain the word "CYTOKINE" on their first line?

    (c) Find files that contain data published in the journal "BIOCHEMISTRY".

    (d) Which data files were described in journal articles published between 1990 and 1995?

    (e) Which data files were produced by someone called "REES"?

    (f) Which data files have an author with "A" as one of their initials?

    (g) Data sets determined by X-ray crystallography have a resolution value indicating the accuracy of the data set. Display the line containing this value for each file that has a resolution value (display only one line per file).
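
    Several of these questions depend on how grep reports matches across more than one file. The sketch below shows two standard options, -l (print only the names of files that contain a match) and -c (print a count of matching lines), together with a pipe from head. The two text files and their contents are invented for illustration and are not Protein Data Bank records.

```shell
# Two small invented files (their contents are placeholders, not PDB data).
d=$(mktemp -d)
printf 'HEADER alpha\nREMARK apple pie\nREMARK apple tart\n' > "$d/one.txt"
printf 'HEADER beta\nREMARK cherry pie\nREMARK alpha again\n' > "$d/two.txt"

# -l: names of the files that contain at least one matching line
files_with_apple=$(cd "$d" && grep -l 'apple' *.txt)

# -c on piped input: total number of matching lines across both files
n_pie=$(cat "$d"/*.txt | grep -c 'pie')

# Restrict the search to the first line of each file by piping from head;
# the anchored pattern cannot match the "==> name <==" banners head prints.
first_line_alpha=$(cd "$d" && head -n 1 *.txt | grep -c '^HEADER alpha')

echo "$files_with_apple $n_pie $first_line_alpha"
```

    When grep is given several file names directly, -c prints one count per file instead of a total, which is why the second command reads from a pipe.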

  4. The program lynx is a web browser that runs in a terminal window. It is best viewed in a terminal window that has been resized to fill a large area of the screen. Run this program and view a few web pages, e.g.

    unix> lynx http://www.cs.chalmers.se/Cs/Research/Bioinformatics/

    Find all lines in the manual page for Lynx that contain the string "HTML".

  5. The Lynx program can also send the source of a web document to standard output, without displaying it, e.g.

    unix> lynx -source http://www.rcsb.org/pdb/files/1crn.pdb

    Use UNIX commands to count the number of alpha-carbon atoms in that data file.

    Use UNIX commands to count the number of nitrogen atoms (main chain and side chain) that are in that data file. Find two ways to do this.
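
    PDB files use fixed columns: in an ATOM record the atom name occupies columns 13-16 and, in files that include it, the element symbol occupies columns 77-78. The following sketch counts atoms in a few invented records laid out in those columns. The helper function rec, the residues and the zero coordinates are all assumptions made for illustration and are not taken from 1crn.

```shell
# rec TYPE SERIAL NAME RESIDUE RESSEQ ELEMENT
# Hypothetical helper: prints one record in PDB column layout, with
# placeholder coordinates, occupancy and temperature factor.
rec() {
  printf '%-6s%5d %-4s %3s A%4d    %8s%8s%8s%6s%6s%10s%2s\n' \
    "$1" "$2" "$3" "$4" "$5" 0.000 0.000 0.000 1.00 0.00 "" "$6"
}

pdb=$(mktemp)
{
  rec ATOM   1 " N"  LYS 1 N
  rec ATOM   2 " CA" LYS 1 C
  rec ATOM   3 " C"  LYS 1 C
  rec ATOM   4 " O"  LYS 1 O
  rec ATOM   5 " NZ" LYS 1 N
  rec ATOM   6 " N"  ALA 2 N
  rec ATOM   7 " CA" ALA 2 C
  rec HETATM 8 "CA"  " CA" 3 CA   # a calcium ion, not an alpha carbon
} > "$pdb"

# Alpha carbons: ATOM records whose atom name field is " CA ".
n_ca=$(grep '^ATOM' "$pdb" | cut -c 13-16 | grep -c '^ CA ')

# Nitrogens, first way: atom names whose element letter (column 14) is N.
n_by_name=$(grep '^ATOM' "$pdb" | cut -c 13-14 | grep -c '^ N')

# Nitrogens, second way: the element symbol in columns 77-78.
n_by_element=$(grep '^ATOM' "$pdb" | cut -c 77-78 | grep -c '^ N$')

echo "$n_ca $n_by_name $n_by_element"
```

    To apply the same pipelines to the real file, the output of the lynx -source command can be piped into the first grep in place of "$pdb". Older PDB files may leave columns 77-78 blank, in which case only the atom-name approach applies.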

  6. The LaTeX source file described in the lecture is in /chalmers/users/kemp/DAT160/latex/file.tex

    Copy this file into your own filespace, replace my name with your own, then create and view a PostScript file produced from this file.

    Use the dvipdf command to create a PDF file, and then use acroread to view the PDF file.

    Find all lines in file.tex that start with a backslash character.
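
    The backslash is itself the escape character in regular expressions, so it has to be doubled to match a literal backslash, and the pattern should be in single quotes so that the shell passes it to grep unchanged. A minimal sketch, assuming a small stand-in LaTeX file rather than the file.tex from the lecture:

```shell
# A small stand-in LaTeX file (not the lecture's file.tex).
tex=$(mktemp)
cat > "$tex" <<'EOF'
\documentclass{article}
\begin{document}
Hello, world.
% a comment mentioning \LaTeX
\end{document}
EOF

# '^\\' means: start of line, followed by one literal backslash.
commands=$(grep '^\\' "$tex")
n_commands=$(grep -c '^\\' "$tex")

printf '%s\n' "$commands"
```

    The comment line contains a backslash but does not start with one, so the anchored pattern leaves it out.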

  7. The UNIX reference manual pages can be viewed using the command man. Look at the manual pages for a few commands, including the manual page for man itself.

    Use "man -k" to display one-line summaries that contain a given keyword. How can we find one-line summaries that contain two or more keywords (e.g. PostScript and PDF)?
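
    One general technique for requiring two keywords at once is to pipe the output of one grep into a second grep, so that only lines matching both patterns survive. The sketch below applies this to a hand-written file of summary lines that stands in for the output of "man -k"; the entries are illustrative and will differ from what a real system prints.

```shell
# Hand-written stand-in for one-line manual summaries (illustrative only).
summaries=$(mktemp)
cat > "$summaries" <<'EOF'
ps2pdf (1) - convert PostScript to PDF
pdf2ps (1) - convert PDF to PostScript
dvips (1) - convert a TeX DVI file to PostScript
xpdf (1) - view PDF files
EOF

# Keep lines that mention PostScript, then keep those that also mention PDF.
# -i makes each match case-insensitive.
both=$(grep -i 'postscript' "$summaries" | grep -i 'pdf')
n_both=$(grep -i 'postscript' "$summaries" | grep -ic 'pdf')

printf '%s\n' "$both"
```

    On a real system the first grep would be replaced by the man -k command itself, with its output piped into a grep for the second keyword.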

Supplementary Material

The University of Edinburgh has developed UNIXhelp for Users - a web site with useful information for users of the UNIX operating system.

Protein Data Bank file format