Research School in Genomics and Bioinformatics

Basics in Bioinformatics and Databases

Part 1: Basics in Databases

Practical: Bioinformatics Schemas

Aim

Objectives

After this practical you will:


Ensembl

Ensembl is a joint project between EMBL-EBI and the Sanger Institute to develop a software system that produces and maintains automatic annotation of eukaryotic genomes.

The Ensembl documentation includes a description of the schema.

The Ensembl relational schema is available on-line. Go to the Ensembl documentation web page, and follow the link to "WebCVS - Ensembl Code Repository". From there, follow "ensembl/", then "sql/", then "table.sql". You will see from the revision numbers and dates that the schema is still evolving. To see the table creation statements for any particular revision, click on the revision number. Look at the most recent schema on the "MAIN" branch.
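To get a feel for what table creation statements look like before opening table.sql, the following is a minimal sketch in Python. It assumes nothing about the real Ensembl schema: the tables seq_region_demo and gene_demo and all of their columns are hypothetical and heavily simplified, and SQLite (from the Python standard library) stands in for MySQL, which Ensembl uses. The helper describe_schema is likewise an illustration only.

```python
import sqlite3

# Hypothetical, heavily simplified table creation statements. The real
# table.sql defines many more tables and columns and targets MySQL.
SCHEMA = """
CREATE TABLE seq_region_demo (
    seq_region_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
CREATE TABLE gene_demo (
    gene_id       INTEGER PRIMARY KEY,
    seq_region_id INTEGER NOT NULL REFERENCES seq_region_demo(seq_region_id),
    start_pos     INTEGER NOT NULL,
    end_pos       INTEGER NOT NULL,
    strand        INTEGER NOT NULL
);
"""


def describe_schema(conn):
    """Return {table_name: [column_name, ...]} for every table in the database."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    return {t: [col[1] for col in conn.execute(f"PRAGMA table_info({t})")]
            for t in tables}


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
schema = describe_schema(conn)
for table, columns in schema.items():
    print(table, columns)
```

When reading the real schema, look for the same ingredients: a primary key per table, and foreign key columns (here seq_region_id in gene_demo) that link one table to another.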

ArrayExpress

ArrayExpress is a public repository for microarray-based gene expression data.

The ArrayExpress web site contains a description of the schema.

The ArrayExpress relational schema is available on-line. Go to the ArrayExpress Implementation web page, follow the link to "ArrayExpress database scripts" then click on MAGE-RS.tab to see the table creation statements.

SWISS-PROT

A relational schema is being designed for SWISS-PROT in a project at the European Bioinformatics Institute. Look at some of the schema diagrams.

Functional Data Model

Look at the ensemblFDM and aeFDM schemas and web interfaces. See how Daplex queries against the ensemblFDM schema are translated to SQL.
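The idea of translating a Daplex query to SQL can be sketched as follows. This is an illustration under stated assumptions, not the output of the actual translator: the Daplex text in the comment is only approximate in syntax, the tables gene_demo and transcript_demo and their columns are hypothetical and not taken from ensemblFDM, and SQLite stands in for the real database. The point is that nested "for each" loops over related entities typically become a join, and the "such that" condition becomes a WHERE clause.

```python
import sqlite3

# Daplex-style query (hypothetical schema, approximate syntax):
#   for each g in gene such that biotype(g) = "protein_coding"
#   for each t in transcripts(g)
#   print(name(g), name(t));
#
# One possible SQL rendering: the relationship function transcripts(g)
# becomes a join on the foreign key, the predicate becomes a WHERE clause.
DAPLEX_AS_SQL = """
SELECT g.name, t.name
FROM gene_demo AS g
JOIN transcript_demo AS t ON t.gene_id = g.gene_id
WHERE g.biotype = 'protein_coding'
ORDER BY g.name, t.name
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gene_demo (
    gene_id INTEGER PRIMARY KEY,
    name    TEXT,
    biotype TEXT
);
CREATE TABLE transcript_demo (
    transcript_id INTEGER PRIMARY KEY,
    gene_id       INTEGER REFERENCES gene_demo(gene_id),
    name          TEXT
);
-- Made-up rows, purely for illustration.
INSERT INTO gene_demo VALUES
    (1, 'geneA', 'protein_coding'),
    (2, 'geneB', 'pseudogene');
INSERT INTO transcript_demo VALUES
    (10, 1, 'geneA-1'),
    (11, 1, 'geneA-2'),
    (12, 2, 'geneB-1');
""")

rows = conn.execute(DAPLEX_AS_SQL).fetchall()
print(rows)
```

When comparing with the SQL shown by the web interface, check which Daplex functions map to columns of a single table and which map to joins between tables.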


kemp@cs.chalmers.se