Original text Marcus Andersson 2000
Revision Anders Gidenstam 2002
Translation Anders Egneus 2002

Web pages, lecture notes

Basic HTML | General commands | Lists | Text marking | Tables | Frames | Character codes | Good Coding Style | Different browsers give different results | HTML editors | CSS | Building a site consisting of several pages | Links

Introduction

One of the cornerstones of the WWW is the URL (Uniform Resource Locator), which provides a convenient way of referring to a file on a computer anywhere in the world. A typical URL consists of three parts and might look like this:

http://www.cs.chalmers.se/Cs/Grundutb/Kurser/index.html

  http://                          Protocol
  www.cs.chalmers.se               Domain name
  /Cs/Grundutb/Kurser/index.html   File name, including the search path

Sometimes the file name is left out, in which case the server typically defaults to a standard file, usually index.html. The URL can also contain extra information besides the three basic parts.

A URL can also be relative, i.e. instead of providing a complete path to a file, the position of the file is given relative to the current page. This is practical when providing the URL of pictures that are part of a web page and are stored in the same directory as the page. In this case it is sufficient to provide the file name, e.g. mypic.gif. If the picture file is in a subdirectory the directory name is included, e.g. my_pics/mypic.gif. It is also possible to reach the parent directory by using the special directory name ".." (two dots).
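As a small sketch of how relative and absolute URLs look in practice, the lines below use the <IMG> and <A> elements that are explained later in these notes. The file names mypic.gif and my_pics/mypic.gif come from the text above; the ALT texts and the link to index.html in the parent directory are made up for illustration.

```html
<!-- Image in the same directory as the page -->
<IMG SRC="mypic.gif" ALT="My picture">

<!-- Image in a subdirectory of the page's directory -->
<IMG SRC="my_pics/mypic.gif" ALT="My picture">

<!-- Link to a page in the parent directory, using ".." -->
<A HREF="../index.html">Up one level</A>

<!-- Absolute URL: protocol, domain name and file name with path -->
<A HREF="http://www.cs.chalmers.se/Cs/Grundutb/Kurser/index.html">Courses</A>
```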

A typical web transaction involves two computers; the server, which is the computer which has the web page file, and the client, which is the computer running the browser, i.e. the computer that you are using. The basic chain of events is as follows:

  1. You ask your browser for a file with a certain location (URL), either by entering the address by hand, or by following a link on another page.
  2. The browser contacts the server which has the file (the server address is in the URL) and asks to be sent a copy.
  3. The server sends a copy of the file to the computer you are using.
  4. The browser receives the file, interprets the HTML code in it and presents the result on the screen.
  5. If there is extra content on the page (e.g. pictures) this is retrieved in the same way as any other file and the content is placed in its proper place in the document.

Where to put your files

In practice, this varies depending on which system you are using. The common procedure is to create a directory with a special name in your home directory. Alternatively, everyone on the system puts their web pages in a common place, e.g. on a particular server. This is how it works on some of the systems at Chalmers:

Datavetenskap/Matematik/Bioinformatik - mdstud
Create a directory with the name .www (note lower-case letters!) in your home directory and put the files there. The address to the pages will be http://www.mdstud.chalmers.se/~accountname/filename
Teknisk Fysik / GU-Fysik - dd
Create a directory with the name WWW (note capital letters!) in your home directory and put the files there. The address to the pages will be http://www.dd.chalmers.se/~accountname/filename
Datorteknik/Elektroteknik/IT-linjen - tekno
Use the .public_html directory in your home directory.

Regardless of which system you are on, it is important to ensure that all files and directories are accessible to all users. This is done using the command chmod. All files, the www directory and the home directory (if the www directory is a subdirectory of the home directory) have to be readable. This also holds for pictures used in the web pages. Setting incorrect permissions for files is the most common beginner's error, so check this first if your web pages are not working correctly. A useful set of access rights (which doesn't relax security) is -rw-r--r-- for files and drwxr-xr-x for directories. (To see the current access rights, use the command ls -l.)

Basic HTML

HTML is a text format which makes it possible to produce good looking documents using only text. In principle HTML uses common text and adds layout and formatting using a number of commands called elements (or tags). These elements are written in a specific way to separate them from the text.

Text content is simple - with the exception of some characters (we will look at these exceptions further on in these notes), you just write text as in any ordinary document. However, there is one important rule to remember: All blanks, tabulations and line feeds are translated by the browser into a blank, and multiple blanks are compressed into a single blank. So if you want the text to be displayed with a line break, you use an element to tell the browser to insert a line break.
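The whitespace rule can be illustrated with a short sketch. It assumes the <P> and <BR> elements, which are described later in these notes; the sentences themselves are only illustrative.

```html
<!-- The extra blanks and the line feeds are all shown as single blanks,
     so this paragraph is displayed as one continuous line of text. -->
<P>This   text    is
written on
three lines.</P>

<!-- To get a line break on the screen, an element is needed. -->
<P>This text is<BR>shown on two lines.</P>
```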

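There are two ways to write elements in HTML:

  • With a start marker and a stop marker: <ELEMENT_NAME any_extra_information> text and other elements </ELEMENT_NAME>
  • With a start marker only: <ELEMENT_NAME any_extra_information>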

The stop marker (</ELEMENT_NAME>) is there to make it possible to contain text and other elements inside an element. The element will then affect everything inside the start and stop markers. The browser doesn't care if you use capitals or lower-case letters for the elements, but for legibility it is recommended that you are consistent with either.

The extra information in an element (also called the attributes of the element) is sometimes necessary but in most cases optional. For example, it is meaningless to insert a picture in your web page without telling the browser where the picture file is to be found, or to show a link without saying where it leads. Usually the extra information only modifies the element in some way. It is possible to have any number of attributes for each element by separating them with blanks (or line feeds if the lines become too long). An element of the first type mentioned above might look like this:

<ELEMENT_NAME ATTRIBUTE1="Value1" ATTRIBUTE2="Value2" .....> Text and other elements affected by the element </ELEMENT_NAME> <ELEMENT_NAME any_extra_information>

Most attributes are written as a combination of the attribute name and a value within quotes, ATTRIBUTE_NAME="Value", but some attributes can be written without any value. It is not necessary to include the quotation marks for all values, but it is a good general rule to include them anyway, both to increase legibility and to ensure that you do not forget them when they are needed.
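A hedged sketch of the attribute syntax described above, using elements and attributes that appear later in these notes (<P> with ALIGN, <IMG> with SRC and ALT, <DL> with COMPACT). The file name and the texts are made up for illustration.

```html
<!-- Start and stop marker with one attribute: affects the enclosed text -->
<P ALIGN="CENTER">This paragraph is centred.</P>

<!-- Start marker only, two attributes separated by a blank -->
<IMG SRC="mypic.gif" ALT="My picture">

<!-- An attribute written without a value -->
<DL COMPACT>
 <DT>URL</DT>
 <DD>Uniform Resource Locator</DD>
</DL>
```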

We will now run through some commonly used elements that are supported by most browsers. There are many more elements than these, and there are also more attributes to the elements shown here.

Basic page structure

The basic structure of an HTML page is almost always the same. Outermost there is the element <HTML>, which encloses the entire page. Inside this element there are two parts, a <HEAD> part and a <BODY> part. In the <HEAD> part there is information about the page, e.g. the title (which is often shown in the browser's title bar), and in the <BODY> part we put everything that makes up the visible part of the page. A typical framework for a page might thus look like this:


<HTML>
<HEAD>
 <TITLE>
  Page title
 </TITLE>
  Commands for the <HEAD> part
</HEAD>
<BODY>
  Page contents
</BODY>
</HTML>

Elements used in the <HEAD> part

The optional element <!DOCTYPE> shows which version of HTML the page is written in and is useful to include if you wish to use a validation tool for your page, e.g. the World Wide Web Consortium's HTML validator.

If you use non-ASCII characters (e.g. 'Å', 'Ä', 'Ö') you should include the character encoding used in the <META> element.
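A minimal sketch of a page that uses both elements. It assumes that the page is written in HTML 4.01 Transitional and that the file is saved in the ISO-8859-1 (Latin-1) encoding; if your editor saves files in another encoding, the charset value has to be changed to match.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<HTML>
<HEAD>
 <TITLE>Page title</TITLE>
 <!-- Tells the browser which character encoding the file uses -->
 <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
</HEAD>
<BODY>
 <P>Page contents with Å, Ä and Ö.</P>
</BODY>
</HTML>
```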

Elements used in the <BODY> part

For starters, the <BODY> element itself can use a number of attributes:

Attributes to the <BODY> element

BGCOLOR
The background color of the page. A color can - in this and all other cases in HTML - be written in two ways:
  • As a color name, e.g. RED, YELLOW or BLACK. Which color names are supported varies between browsers.
  • As a hexadecimal value. This is written as the character # followed by three two-digit hexadecimal numbers. The three numbers in turn describe the amount of red, green and blue in the color. The lowest value is 00 and the largest is FF. Examples of colors are #000000 (black), #FFFFFF (white), #FF0000 (red) and #FF8000 (orange).
BACKGROUND
A background image. The image is given as a URL and is, if smaller than the page, repeated until it fills the page. Do not choose a messy image which makes the contents of the page hard to read.
TEXT
The standard color of text on the page. The color used is selected in the same way as for background color above.
LINK
The standard color of a link on the page. Links are usually shown underscored and in a different color, which can be set here.
VLINK
The color of a 'visited' link. Many browsers use a different color for visited links, in order to ease navigation. Set this color here.
ALINK
The color of the link when clicked on. Perhaps not the most useful attribute around.

A more advanced <BODY> line might look something like this.

<BODY BGCOLOR="#C8DEFC" BACKGROUND="mybackgroundpicture.gif" TEXT="BLACK" LINK="BLUE" VLINK="#000040" ALINK="RED">

Images

To include an image in a web page we use the <IMG> element. Images can be in the same directory as the page, in a different directory or on a different server altogether. This element has no stop tag as it isn't meaningful to let an image 'enclose' other images or text. There are a number of more or less necessary attributes:

SRC
The address of the image as a URL. Necessary for obvious reasons.
ALT
Gives the text which is used as an alternative if the browser is unable to or is set not to show images. This is an important attribute in order for the page to be legible with such browsers.
BORDER
Creates a border around the image. This attribute shows the thickness of the border as an integer, equal to or larger than 0. If no number is given the browser uses a default value, which varies with the browser.
ALIGN
Shows where you want the image positioned in relation to the text. If no value is given the image is placed as part of the text. Possible values are e.g. LEFT, RIGHT, TOP and BOTTOM, but which exact values are available and their interpretation is browser specific.
WIDTH
An integer showing the width of the image in pixels. It is not strictly necessary to include this as the data is stored in the image, but there is an advantage to include it anyway; since the browser loads images after loading the web page it won't know how much space to reserve for the picture and might thus have to rebuild the page after downloading the image. If you provide the WIDTH attribute the browser at least knows how much space to put aside and can just insert the image later on after downloading it.
HEIGHT
Shows the height of the picture and works just as the WIDTH attribute. It is possible to create an extra effect with these two attributes: If you state a different width or height than the actual one, the browser will scale the image to fit the shown size.
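A small sketch that combines the attributes above. The file name, the ALT text and the size of 200 by 150 pixels are assumptions made for the example; use the actual size of your own image unless you want the browser to scale it.

```html
<!-- WIDTH and HEIGHT let the browser reserve space before the image arrives.
     ALIGN="LEFT" places the image at the left edge with text beside it. -->
<IMG SRC="my_pics/mypic.gif" ALT="A picture of my house"
     WIDTH="200" HEIGHT="150" BORDER="0" ALIGN="LEFT">
```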

Links

Links are written with the element <A>. The attribute HREF tells where the link leads. Links can enclose text and other elements like images. Here are two examples of links:

The second example shows a link consisting of an image. When using images as a link take care to put the element for link and image next to each other (without blanks or line feeds) to avoid having the link include a blank space (line feeds are as we mentioned earlier replaced with blanks).
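As a hedged illustration of the two kinds of links discussed here, the sketch below shows one text link and one image link. The URL is the one used in the introduction; the link text, holiday.html and the image file name are made up for the example.

```html
<!-- A text link: the enclosed text becomes the link -->
<A HREF="http://www.cs.chalmers.se/Cs/Grundutb/Kurser/index.html">Course pages</A>

<!-- An image link: no blanks or line feeds between <A>, <IMG> and </A>,
     so that no blank space becomes part of the link -->
<A HREF="holiday.html"><IMG SRC="my_pics/mypic.gif" ALT="Holiday pictures" BORDER="0"></A>
```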

Paragraph layout

<P>
This element encloses paragraphs of text and adds space above and below the paragraph. Many people use the <P> element without the stop part, as a divider, but the element is intended to mark paragraphs of text and should be used in that way. It can also use the ALIGN attribute to justify the text (left, right or centre). ALIGN is a good example of an attribute that works on several elements.
<Hx>
Shows text as a header. x is an integer between 1 and 6, where 1 gives the largest header.
<BR>
Forces a line break in the text. No end tag. The attribute CLEAR can take the values NONE, LEFT, RIGHT or ALL and forces the element to break more than one line to achieve an 'empty' left and/or right edge. This is practical if you have placed images on the edges and want to be sure the text or new images don't end up beside but below the first images. The following examples show a left-justified image, a right-justified image and text divided in three parts. The <BR> element is in front of the third (and largest) part.
The value of CLEAR is NONE: <BR CLEAR="NONE">
The value of CLEAR is LEFT: <BR CLEAR="LEFT">
The value of CLEAR is RIGHT: <BR CLEAR="RIGHT">
The value of CLEAR is ALL: <BR CLEAR="ALL">
Most of the time we do not use the <BR> element; it is easier to enclose paragraphs with the <P> element.
<HR>
Creates a horizontal line across the page. No end tag.
<BLOCKQUOTE>
Longer quotes can be enclosed by this element which puts the text in a separate paragraph which is also indented somewhat relative to the surrounding text.
Here is an example of some different layout elements, first the HTML code and then the result:

<H2>A headline</H2>
<P>Some introductory text.</P>
<H3>A subheading</H3>
<P>Some more text.</P>
<HR>
<P>Another paragraph. These are the better way to introduce line breaks,
instead of using the &lt;BR&gt;<BR> command, which was done here.</P>
<BLOCKQUOTE>Here is a quotation. This text should be indented. </BLOCKQUOTE>

A headline

Some introductory text.

A subheading

Some more text.


Another paragraph. These are the better way to introduce line breaks, instead of using the <BR>
command, which was done here.

Here is a quotation. This text should be indented.

Lists

Lists are a bit complicated to write. They are written by enclosing the entire list in a single element, and then enclosing each list element in another. Each list element can then contain other elements, such as other lists. Lists inside of other lists are called nested lists. There are three kinds of lists, all with somewhat different features.

Ordered lists

This list type automatically numbers the items. The list is enclosed by the element <OL> (ordered list) and each item is enclosed by the element <LI> (list item).

<OL>
<LI>first element</LI>
<LI>second element</LI>
</OL>

The code above gives the following result:
  1. first element
  2. second element

Unordered lists

Works as ordered lists, but each element is preceded by a dot instead of a number. The element is <UL> (unordered list), but we still use <LI> for the list elements.


  • first element
  • second element
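A minimal sketch of code that would produce an unordered list like the one above. As an illustration of nested lists, the second item also contains an ordered list; the sub-element texts are made up for the example.

```html
<UL>
 <LI>first element</LI>
 <LI>second element
  <!-- A nested list is placed inside the list item it belongs to -->
  <OL>
   <LI>first sub-element</LI>
   <LI>second sub-element</LI>
  </OL>
 </LI>
</UL>
```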

Definition lists

Definition lists are the most complicated kind of lists. They have two kinds of list elements: one for whatever is to be defined and another for the definition itself. These are best written in pairs. The elements are <DL> (definition list), <DT> (definition term) and <DD> (definition description). If we want the list to take up less space we can use the attribute COMPACT (which uses no value).

<DL COMPACT>
<DT>First element</DT>
<DD>Definition of first element</DD>
<DT>Second element</DT>
<DD>Definition of second element</DD>
<DT>3</DT>
<DD>Definition of third element</DD>
</DL>

The code above gives the following result:
First element
Definition of first element
Second element
Definition of second element
3
Definition of third element

The COMPACT attribute only has effect if the name of whatever is defined is short enough to let the definition start on the same line. Without the COMPACT attribute the definition always starts on a new line.

Text marking

In HTML there are two ways to change how text looks: logical text marking and physical text marking. These differ at the conceptual level but will look the same in the browser. Often you will see recommendations to use only logical marking and abstain from physical marking, but this is not the best use of HTML. The reason for these recommendations is that many users resort to physical marking when logical would have been a better choice. However, both kinds of marking have their uses.

Logical text marking

Logical text marking means that you mark a word or piece of text for its meaning. This could be to mark a word as extra important, or to show that a part of the text is program code. Different browsers show the different logical formats in different ways, but always change the text in a way that tries to show the intended significance (by using the ordinary physical formats - italics, bold, underscore, a different font and so on). Text formats always have an end marker since they affect part of the text, the part between the start tag and the stop tag of the element. The following logical formats are available. (In the examples the two middle letters are marked. Remember that the actual look might vary with the browser used.)

<EM>
"Emphasis", for text that is to be marked as more important. Example: abcdef
<STRONG>
"Strong emphasis", for even more important text. Example: abcdef
<DFN>
Definition. Example: abcdef
<CODE>
Program code. Example: abcdef
<KBD>
A key on the keyboard. Example: abcdef
<VAR>
Variable. Example: abcdef
<CITE>
Title of a cited work. Example: abcdef

The meanings of the above elements are not exactly defined, so you have to use what fits best from case to case.
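A short sketch of how some of the logical formats might be combined in running text. The sentence, the variable name count and the code fragment are made up for illustration; how each element is displayed depends on the browser.

```html
<P>Press <KBD>Enter</KBD> to run the program. It is <EM>important</EM>,
in fact <STRONG>very important</STRONG>, that the variable <VAR>count</VAR>
is set first, e.g. with <CODE>count = 0;</CODE>.</P>
```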

Physical text marking

Physical text marking is used when the actual look of the text on the screen is more important than the meaning. With these elements you can choose the exact result shown on the screen. As mentioned earlier, many people use this instead of logical marking, but physical marking comes with a few disadvantages. If you use a physical format which the browser doesn't support the text won't be marked in any way, but if you use logical formats you can be sure that the text will look different in some way (if the browser supports the logical element), even if you cannot be sure how it will look. This is especially important for speech generating browsers.

<I>
Italics. Example: abcdef
<B>
Boldface. Example: abcdef
<U>
Underscored text. Example: abcdef
<TT>
"Teletype", all letters are of the same width. Example: abcdef
<BIG>
Larger text. Example: abcdef
<SMALL>
Smaller text. Example: abcdef
<SUB>
Subscript. Example: abcdef
<SUP>
Superscript. Example: abcdef

There is yet another way of changing the looks of the text, and that is with the multi-faceted <FONT> element. This element has three possible attributes (which are supported will as usual depend on the browser):

FACE
Changes the typeface of the text. The value is a list of typefaces, and the text will be shown in the first typeface in the list that is installed on the system. If none are installed the element will have no effect. Example: FACE="Arial, Helvetica" (abcdef)
COLOR
Changes the color of the text. The color is set in the same way as shown earlier, that is either as a keyword or as a # followed by three two-digit hexadecimal numbers. Example: COLOR="RED" (abcdef)
SIZE
Changes the size of the text. The value can be given in two ways: either as an absolute value (an integer) between 1 and 7 or as a relative value. Relative values are given as a + or a - followed by an integer and the text will change that many steps in size. Example: SIZE="-2" (abcdef)
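A hedged sketch that combines some physical formats with the <FONT> element. The text is made up for illustration, and as noted above the result depends on which attributes the browser supports and which typefaces are installed.

```html
<P>Water is written H<SUB>2</SUB>O, and <B>this is bold</B>.
<!-- FACE lists typefaces in order of preference; SIZE="+1" is a relative value -->
<FONT FACE="Arial, Helvetica" COLOR="#FF0000" SIZE="+1">This text is red
and one step larger than the surrounding text.</FONT></P>
```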

Tables

HTML tables were intended to make it possible to present data in tables, but are nowadays used more for layout purposes. Tables in HTML are very flexible, but may at first glance seem difficult to write. Using tables is also complicated by the large differences in support and interpretation by the different browsers.

To build a table we use three elements: <TABLE>, <TR> and <TD>. There is also a fourth element, <TH>, but this is in principle used in the same way as <TD>. The outermost element of a table is <TABLE>, which encloses one or more <TR> elements (one per row in the table), which in turn enclose one or more <TD> elements (one per cell in the row). If you use a <TH> element instead of <TD> the cell will look like a header for the actual row or column. Here is a simple example showing the Swedish winners of the Eurovision Song Contest:


<TABLE>
 <TR>
  <TH>Year</TH>
  <TH>Title</TH>
  <TH>Artist</TH>
 </TR>
 <TR>
  <TD>1974</TD>
  <TD>Waterloo</TD>
  <TD>ABBA</TD>
 </TR>
 <TR>
  <TD>1984</TD>
  <TD>Diggi-loo diggi-ley</TD>
  <TD>Herreys</TD>
 </TR>
 <TR>
  <TD>1991</TD>
  <TD>Fångad av en stormvind</TD>
  <TD>Carola</TD>
 </TR>
 <TR>
  <TD>1999</TD>
  <TD>Take me to your heaven</TD>
  <TD>Charlotte Nilsson</TD>
 </TR>
</TABLE>

Year  Title                    Artist
1974  Waterloo                 ABBA
1984  Diggi-loo diggi-ley      Herreys
1991  Fångad av en stormvind   Carola
1999  Take me to your heaven   Charlotte Nilsson

There are lots of attributes to the table elements, but with large differences in support from the browsers.

Some attributes for the <TABLE> element

BORDER
This attribute uses a value equal to or larger than zero and gives the table a border of this thickness. The border is often drawn so as to give a 3D effect, i.e. it is drawn using two colors, one lighter and one darker. If this attribute isn't used you get the default value, which takes up space as if you had used BORDER="1" but doesn't show a visible border! This is a good reason to use the BORDER attribute. You can also give the BORDER attribute without a value, which gives the same effect as if you had written BORDER="1".
WIDTH
How wide you want the table to be, stated either as a number of pixels or as a percentage of the window (or of the surrounding table cell if the table is enclosed by another table). Examples: WIDTH="130" (gives a table 130 pixels wide), WIDTH="50%" (gives a table half as wide as the window).
HEIGHT
Gives the table's height in the same way as the WIDTH attribute.
ALIGN
Just as with images it is possible to show where a table should be placed relative to the surrounding text. Possible values are LEFT and RIGHT, which work as with the <IMG> element. If you do not set this attribute, the browser will not allow text on the sides of the table but will use an entire row for the table.
BGCOLOR
Used to give a background color for the table, just as with the <BODY> element. The color is also specified as in the <BODY> element, as either a keyword or in hexadecimals.
BACKGROUND
Sets a background image, used in the same way as the same attribute in the <BODY> element. The value is given as an URL.
CELLPADDING
An integer value larger than zero which shows how much space is to be reserved around the contents in the table cells.
CELLSPACING
Set in the same way as for CELLPADDING and shows how much space to leave between table cells and between cells and the table edge.
[Figure: what CELLPADDING and CELLSPACING affect]

The code for the example looks like this:


<TABLE CELLSPACING="20" CELLPADDING="20" BORDER><TR>
 <TD>three</TD>
 <TD>four</TD>
</TR></TABLE>

Some attributes for the <TR> element

Several of the attributes for the <TR> element do not affect the <TR> element itself but are a shortcut to affect all <TD> (and <TH>) elements which the <TR> element encloses. Examples are ALIGN, VALIGN and BGCOLOR.

Some attributes for the <TD> and <TH> elements

WIDTH
Cell width. The value can be set as either a number of pixels or as percentage of the screen, just as for table width. If you give different values for cells in the same column or if the total width of the cells is larger than the table width the browser will try to work out a compromise. If the browser cannot fit the contents of a cell into the given size it will expand the cell to fit the contents. It is best to make sure that the table is consistent with itself (at least theoretically), i.e. that there is a decent chance for it to follow all the set values. If you don't set a width on the cell it will become as large as it needs to be, or of a size that allows other cells in the column to be as large as they need to.
HEIGHT
The height of a cell. Works in the same way as WIDTH, with the same potential problems for cells in the same row.
ALIGN
Where cell content is placed inside the cell. Possible values are LEFT, CENTER and RIGHT. Default value is LEFT.
VALIGN
Vertical alignment of cell content inside the cell, if the cell is larger than the content. Possible values are TOP, MIDDLE and BOTTOM, with MIDDLE being the default value. (Note that the value is MIDDLE, not CENTER; CENTER belongs to the horizontal ALIGN attribute.)
NOWRAP
This attribute uses no value. If you wish to use the attribute, you write <TD NOWRAP any_other_attributes>. The effect of NOWRAP is that the browser won't insert any line breaks in the text. (Normally the browser is allowed to break a line anywhere there is a blank space, but with NOWRAP set the browser will only break where you insert line break elements <BR>.)
BGCOLOR
Background color of the cell, exactly as with the entire table.
BACKGROUND
Sets a background image for the cell, as with the entire table.

The following table shows how some table cell attributes work. In this example all cells are WIDTH="20%" and HEIGHT="20%", and the entire table is WIDTH="60%", HEIGHT="60%", BORDER="1" and ALIGN="CENTER".

Row 1, cell 1: ALIGN="LEFT" VALIGN="TOP" BGCOLOR="#EFABCD"
Row 1, cell 2: ALIGN="CENTER" VALIGN="TOP"
Row 1, cell 3: ALIGN="RIGHT" VALIGN="TOP"
Row 2, cell 1: ALIGN="LEFT" VALIGN="MIDDLE"
Row 2, cell 2: ALIGN="CENTER" VALIGN="MIDDLE"
Row 2, cell 3: ALIGN="RIGHT" VALIGN="MIDDLE" BGCOLOR="WHITE"
Row 3, cell 1: ALIGN="LEFT" VALIGN="BOTTOM"
Row 3, cell 2: ALIGN="CENTER" VALIGN="BOTTOM" BACKGROUND="b.gif"
Row 3, cell 3: ALIGN="RIGHT" VALIGN="BOTTOM"

There are a few other elements which are used with tables, but the four mentioned above are the commonly used ones.
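A small sketch that combines several of the table, row and cell attributes described above. The data is taken from the Eurovision example; the particular attribute values (60% width, padding of 5, the row color) are assumptions chosen for illustration, and the exact result varies between browsers.

```html
<TABLE BORDER="1" WIDTH="60%" CELLPADDING="5" CELLSPACING="2" ALIGN="CENTER">
 <!-- BGCOLOR on the row is a shortcut that affects every cell in the row -->
 <TR BGCOLOR="#C8DEFC">
  <TH>Year</TH>
  <TH>Title</TH>
 </TR>
 <TR>
  <!-- Right-justified, placed at the top of the cell -->
  <TD ALIGN="RIGHT" VALIGN="TOP">1974</TD>
  <!-- NOWRAP keeps the title on a single line -->
  <TD NOWRAP>Waterloo</TD>
 </TR>
</TABLE>
```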

Frames

One problem with the basic HTML format is that when you follow a link to a new page, the entire page changes. Sometimes it is practical to change only part of the page, and this is where frames are useful. Frames allow you to have several independent documents open in the same window by splitting that window into several parts. Each part then contains an ordinary HTML document, making it possible to change the contents of part of the window while still leaving the other parts unchanged. Frames have become very popular on the WWW, unfortunately sometimes because they are perceived as "fresh" and "cool" rather than because they improve the page.

When you use frames it becomes somewhat more complicated to keep track of the documents shown, since the documents can now end up in one of several possible places. Extra attributes allow us to place the documents where we want them to go.

The first thing to consider is that you will have to handle more files, one file which just tells how to divide the window into frames and then one HTML document for each frame. (The ordinary HTML files do not have to be modified by any special elements - in fact you can use any HTML document as the content of a frame.) Frame-related elements are used only in the file which defines the frame structure.

Let's look at an example of HTML code that defines the frames used in a page. Note the three elements used, <FRAMESET>, <FRAME> and <NOFRAMES>.


<HTML>
 <HEAD>
  <TITLE>Winners of the Eurovision Song Contest</TITLE>
 </HEAD>
 <FRAMESET ROWS="100,*">
  <FRAME SRC="title.html" NAME="title"
         SCROLLING="NO" NORESIZE>
  <FRAMESET COLS="20%,*">
   <FRAME SRC="menu.html" NAME="menu">
   <FRAME SRC="2000.html" NAME="contents">
  </FRAMESET>
  <NOFRAMES>
   Go <A HREF="noframesversion.html">here</A> to get to
   a non-frames version of this page.
  </NOFRAMES>
 </FRAMESET>
</HTML>

Besides the code in winners.html we need the files title.html, menu.html and 2000.html. These are ordinary HTML files and are not shown here (for reasons of space).

[Figure: the result of the above code]

It is now possible to exchange the contents of the three frames independently of each other. If you click the link named "1999", the contents of the largest frame are exchanged from the file 2000.html to the file 1999.html. We will look at exactly how to write the link below.

[Figure: if you press the link "1999", the contents of the lower right frame are changed]

The <FRAMESET> element

In a document using frames there is no <BODY> element. This is because the document has no content of its own; instead the entire page is divided into frames which are filled with content from other files. The <BODY> element is replaced with the <FRAMESET> element. This is the element used to partition the screen, by stating how many rows and columns of frames you want, as well as their sizes.

The attributes ROWS and COLS show how the screen is divided. Their values are given as lists of sizes. Example:


<FRAMESET ROWS="100, *, 100" COLS="40%, 60%">
...
</FRAMESET>

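The example states that the screen is to be divided into six frames: three rows (100 pixels high, the remaining space, and 100 pixels high), each divided into two columns (40% and 60% of the width).

There are three ways to state the size (width and height) value, all of which appear in the example: as a number of pixels (100), as a percentage of the available space (40%), or as * for the space that remains.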

It is possible to mix the three ways in any way, but you should be careful to maintain internal consistency. If you leave out the ROWS or COLS attributes, the browser defaults to ROWS="*" or COLS="*" respectively. It is common to just give values for either ROWS or COLS meaning that you just wish to divide the space in one direction.

The <FRAME> element

<FRAMESET> is used to divide available space. <FRAME> is used to put an HTML document in a frame. To fill the frames we specify the contents from left to right, from top to bottom (just the way we read text). A frame can contain two kinds of documents:

In a <FRAME> element we have two important attributes:

SRC
Specifies which document is to be shown in the frame. The value is a URL.
NAME
Sets a name for the frame. These names are used later on to direct linked documents to the correct frame (see more below).

There are also a few other attributes which allow us to specify whether the frame should have a scroll bar, whether the frame size is fixed or can be changed by the user, and finally the color and thickness of the borders between the frames.

The <NOFRAMES> element

Many people use browsers which cannot handle frames. Others turn off frames in their browsers just because they do not like them. These people shouldn't be ignored, which is why there is a <NOFRAMES> element. This element provides the alternative content that is shown if frames cannot be shown. If the browser can handle frames it ignores the <NOFRAMES> element. (This is one of the bothersome aspects of frames: you have to do everything in two versions, one with frames and one without.)

Opening documents in the correct frame

When a page has multiple frames, we have to direct in which frame a link shall open a new document. We do this using an additional attribute in the link (<A>) element, the TARGET attribute. The value of TARGET is simply the name of the frame in which we want the new document to be shown. In the example above there are only links in one frame, the frame named "menu". Once you click on a link in this frame a new document should open in the frame named "contents". This example shows a small extract from menu.html which illustrates how to use the TARGET attribute.


...
<UL>
 <LI><A TARGET="contents" HREF="2000.html">2000</A></LI>
 <LI><A TARGET="contents" HREF="1999.html">1999</A></LI>
 <LI><A TARGET="contents" HREF="1998.html">1998</A></LI>
 <LI><A TARGET="contents" HREF="1997.html">1997</A></LI>
...

If you do not specify a TARGET the new document will open in the same frame as the link is in (just as when not using frames). In the example above that would not have been very practical, but sometimes that is the effect you want. Unfortunately this is also the result if we forget to write a TARGET attribute.

When you link outside your own site you usually want to remove the frame structure and open the new document in the entire browser window, exactly as if you were not using frames. To do this we use the special frame name "_top" (note the underscore!). This "frame" is always available and causes all old frames to be removed, so that the new document is shown in the entire window. Another possibility is to use TARGET with a frame name that has not been defined. This causes the new document to be opened in a new window, which is sometimes a practical effect.
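A minimal sketch of how the two kinds of TARGET values might be mixed in a menu such as menu.html. The first link is taken from the example above; the second link, its text and its URL are assumptions made for illustration.

```html
<UL>
 <!-- Opens in the frame named "contents"; the menu stays in place -->
 <LI><A TARGET="contents" HREF="1999.html">1999</A></LI>
 <!-- Link outside the site: "_top" removes all frames and uses the whole window -->
 <LI><A TARGET="_top" HREF="http://www.cs.chalmers.se/">Chalmers Computer Science</A></LI>
</UL>
```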

Character codes

Some characters have a special meaning in HTML, e.g. < and >. If you write these characters in your text the browser will interpret them as part of an element, with unpredictable effects. If you want the browser to display such a character, you have to write it using a special code.

The notation for character codes is &character_code;. (Note the ; at the end.) The character code can be either a keyword or a # sign followed by a number. Besides < and > we also have to write any & characters with this special notation, to avoid the browser interpreting the & as the start of a character code. It is also best to write "uncommon" characters, e.g. å, ä and ö, using the special notation. Most of the time these characters will work when typed directly, but a user with a different character set might see some completely different character. Characters written in the special notation always turn out right. Finally, the special notation allows us to get at some characters that are not very common on keyboards, such as © or ®.

Here is a table of some of the more common and useful character codes. As you can see there is a logic to the keywords and it is often possible to guess the keywords for other characters. There are many, many more character codes available than these and a complete list can be found in e.g. Mikodocs Guide to HTML(http://www.idocs.com/tags/) under the header "Character Entity References".

Character              Keyword     Number
&                      &amp;       &#38;
<                      &lt;        &#60;
>                      &gt;        &#62;
(non-breaking space)   &nbsp;      &#160;
©                      &copy;      &#169;
®                      &reg;       &#174;
°                      &deg;       &#176;
µ                      &micro;     &#181;
½                      &frac12;    &#189;
Ä                      &Auml;      &#196;
Å                      &Aring;     &#197;
Æ                      &AElig;     &#198;
É                      &Eacute;    &#201;
Ñ                      &Ntilde;    &#209;
Ö                      &Ouml;      &#214;
Ø                      &Oslash;    &#216;
ß                      &szlig;     &#223;
ä                      &auml;      &#228;
å                      &aring;     &#229;
æ                      &aelig;     &#230;
é                      &eacute;    &#233;
ñ                      &ntilde;    &#241;
ö                      &ouml;      &#246;
ø                      &oslash;    &#248;
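A short sketch of character codes in running text; the sentences themselves are made up for illustration. In the browser the first paragraph displays the tags literally as <P> and </P>, and the second displays an ampersand and a copyright sign.

```html
<P>In HTML, a paragraph starts with &lt;P&gt; and ends with &lt;/P&gt;.</P>
<P>Black &amp; white photos, &copy; 2002.</P>
<P>G&ouml;teborg is written with an &ouml;, &Aring;re with an &Aring;.</P>
```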

Good Coding Style

Using all these elements and attributes means that an advanced HTML page contains a heavy proportion of elements and attributes in relation to the amount of text. For this reason it is important that the code is written so that it is readable both by other users and by yourself (especially as you will probably update and reuse the code).

CAPITALS or not?

You are not forced to write your HTML elements or attributes using capitals; any way will work as long as you spell correctly. The one thing to stick to is consistency: write your elements either in capitals or in lower-case letters, but use the same style throughout the entire document! The advantage of using capitals is that it often becomes very clear what is code and what is not. Lower-case letters are on the other hand easier to write, as you won't have to use the SHIFT key all the time.

Indentation

Another good way of making your code easy to read is by indentation. This is a very common practice in programming and pays large dividends in legibility. (In fact, in computer science classes it is not unknown for supervisors and lecturers to return assignments unread if they are not properly indented. Good indentation is one of the hallmarks of a programming professional.)

Indentation means that extra blanks are added at the start of those code lines which are contained inside an element (recall that most elements have a start and a stop part). A code line which is contained in yet another element (so-called nested elements) has more blanks at its beginning, and so on. Such blanks will not be shown by the browser, since browsers remove additional blanks. A code example is shown below, first without and then with indentation.


<TABLE BORDER="0" WIDTH="100%" ALIGN="RIGHT">

<TR>

<TD ALIGN="CENTER">

<IMG SRC="bilder/minbild.gif" ALT="My picture" BORDER="0">

</TD>

</TR>

<TR>

<TD ALIGN="CENTER"><I>

I drew this picture myself using

the program Xpaint.

</I></TD>

</TR>

</TABLE>




<TABLE BORDER="0" WIDTH="100%" ALIGN="RIGHT">

 <TR>

  <TD ALIGN="CENTER">

   <IMG SRC="bilder/minbild.gif" ALT="My picture" BORDER="0">

  </TD>

 </TR>

 <TR>

  <TD ALIGN="CENTER"><I>

    I drew this picture myself using

    the program Xpaint.

  </I></TD>

 </TR>

</TABLE>

(As an aside, the code above shows an image with a caption in italics.)

There is no single way to indent, so you are free to use as much indentation as you are comfortable with. Often some elements are not indented, so as not to get too many blanks (e.g. we do not usually indent for the <BODY> element as this covers most of the file). If you prefer a more visible indentation you can add two blanks or a tab. Find your own style, but try to be consistent.

Different browsers give different results

The same code usually does not look the same in different browsers. There are two reasons for this:

So you can never be sure that your page looks the same for others as it does for you. The best alternative is to try to write logical HTML code and try to write a page that is not browser dependent. A very bad alternative - unfortunately quite common on the Internet - is to demand that a certain browser be used, encouraging visitors to 'upgrade' if they are using an 'old' browser.

Different image formats

You can save your images in various formats. The ones used on the web are mainly GIF and JPEG. They have very different properties, so it is important to choose the right one depending on the picture. Both formats compress the data (other formats, e.g. Windows BMP, store images uncompressed), which makes them suitable for Internet use.

GIF: Uses so-called indexed colours. That is, there is a palette of 256 colours (or fewer) defining what colours can be used in the picture.
JPEG: Each pixel can be of any colour (in terms of a red/green/blue mixture). Thus, several million colours are available.

GIF: Does not lose any image information, provided that the original picture did not contain more than 256 different colours.
JPEG: To compress the data, the JPEG format makes some changes in the picture. These are often not even noticeable, but sometimes the image looks a bit 'patterned'. Thus, JPEG 'destroys' image information.

GIF: Can mark a colour as transparent, making pictures with transparent areas possible.
JPEG: Does not support transparency.

GIF: Supports animation.
JPEG: Does not support animation.

Another format, designed specifically for web use as a replacement for GIF, is PNG (pronounced "ping", http://www.libpng.org/pub/png/), which has not been much used yet. PNG improves on GIF in most respects, e.g. it is not limited to 256 colours.

Java

People talk about 'having Java' on their homepages. This is not a very precise way of expressing things. Java is a programming language, just like C, C++ or Pascal. 'Java on the homepage' can in practice mean two different things: Javascript and applets.

Javascript

Javascripts aren't really Java, but small pieces of program code embedded in the page which interact with the HTML code. For example, they can react to and change other HTML elements, like images or text. Javascript is not written in the same way as Java, but there are some similarities, hence the name.

We won't look into Javascript in detail, since it gets quite complicated and is easier to understand if you already know how to write programs. There are however lots of ready-made Javascripts available on the Internet which can be downloaded if you can't or won't write your own. We will look at an example of a complete Javascript here, specifically how to get an image to change if you hold the mouse pointer over it.

Since the effect isn't visible when you print this page we should put down what happens in writing. When you move the mouse pointer over the image the closed window is exchanged for an open window. When the pointer is moved away from the image the closed window comes back. Two images are used, and are shown here as ordinary pictures.

The closed window
The file name is closed.gif
The open window
The file name is open.gif

The code for the example, including both HTML and Javascript code is this:

<A HREF="http://www.dd.chalmers.se/~f95maan/" OnMouseOver="document.fonster.src='open.gif';" OnMouseOut="document.fonster.src='closed.gif';"><IMG NAME="fonster" SRC="closed.gif" BORDER="0"></A>
As you can see this is a normal image inside a link. The difference is that there is also some Javascript code which shows what will happen if we move the pointer over the link. What happens is that the browser switches which file is the source for the image. In order to be able to refer to the image it has also been given a name attribute.

Applets

Applets belong to a special category of complete Java programs. What makes them different from other programs is that they can be run from a web browser. They are started within a box in the browser, the size of which is defined in the HTML code. Within its box the applet has full control, but it cannot affect anything outside the box. In order to write applets you have to be able to write programs in Java, which is why we leave that for later courses.
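As a rough sketch of how the box is defined, HTML 4 provides the <APPLET> element. The class file name Clock.class and the sizes are hypothetical; the text between the tags is shown by browsers that cannot run applets.

```html
<!-- WIDTH and HEIGHT give the size, in pixels, of the box the applet runs in -->
<APPLET CODE="Clock.class" WIDTH="200" HEIGHT="100">
 Your browser cannot run Java applets.
</APPLET>
```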

HTML editors

There are alternatives to writing your HTML code by hand. There are several examples of HTML editors, i.e. programs which act as word processors, but are used to construct HTML pages. There are both advantages and disadvantages to using editors, and many people have a strong opinion in either direction.

Advantages of HTML-editors

HTML editor drawbacks

CSS

The HTML format has evolved in a stepwise fashion, and was not originally intended for layout. But people want layout, and often achieve it by using the table element for things it was not meant for. As an alternative, Cascading Style Sheets, CSS, were invented.

The basic rule for style sheets is this:

Separate content and layout!
Style sheets make it possible to have just about any layout on your web pages, but the subject is for that reason also rather extensive. What follows is a small introduction to how to use style sheets and what they do. If you want a complete overview, one can be found at Cascading Style Sheets: CSS1, CSS2 Reference and Help (http://www.westciv.com/style_master/academy/css_tutorial/).

There is only one real drawback of using style sheets, and that is that they are not fully supported by the browsers. The amount of support for style sheets varies with the browser (e.g. Internet Explorer appears to have slightly better support than Netscape), but so far there is no browser that provides complete support.

Structure and placement of style definitions

Writing simple style definitions is not very hard. All style definitions (called statements) are composed of two parts, a selector and a list of properties. The selector decides which elements of the web page are to be affected, and the properties decide how those elements are affected. Examples of selectors might be "all headers", "entire document" or "a particular element", and examples of properties might be background color, font size or text alignment. It is possible to use any number of statements on a page, and as we will see shortly it is possible to place statements in several places.

The structure for style statements is as follows:

selector {property1: value1; property2: value2; ...}
As in HTML, line breaks can be placed anywhere there is a space, which helps to cut down on line length.

Placement of style statements in HTML documents

The easiest placement for style statements is in the <HEAD> part of the HTML document. To do so, we use the HTML element <STYLE> to enclose the statements. The result might look as follows:


<HTML>

<HEAD>

 <TITLE>title</TITLE>

 <STYLE>selector {property1: value1; property2: value2; ...}</STYLE>

 <STYLE>selector {property1: value1; property2: value2; ...}</STYLE>

 ...

</HEAD>

<BODY>

 ...

</BODY>

</HTML>

Note that it is possible to have any number of statements. Here each statement is placed in a separate <STYLE> element, but several statements can equally well share one <STYLE> element.
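To make the skeleton more concrete, here is a small sketch with two real statements sharing one <STYLE> element. The properties used are described later in these notes; the page content is made up for illustration. The TYPE attribute tells the browser that the style language is CSS.

```html
<HTML>
<HEAD>
 <TITLE>A styled page</TITLE>
 <STYLE TYPE="text/css">
  BODY {background-color: white; color: black;}
  H1   {text-align: center;}
 </STYLE>
</HEAD>
<BODY>
 <H1>Welcome</H1>
 <P>This text is black on a white background.</P>
</BODY>
</HTML>
```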

Placement of style statements in a separate file

Placing style statements in the beginning of a document may be practical, but if you use lots of statements this can take up large amounts of space. Another consideration is that you may want a uniform style for an entire site. In that case it might be useful if it wasn't necessary to duplicate the style information in each document. This is exactly what CSS allows you to do.

Instead of placing style statements in the HTML document it is possible to have a separate style document which is referred to from the HTML document. Several pages can refer to the same style document. Style documents use the file ending .css and the content is almost the same as if it had been written into the HTML document. The difference is that <STYLE> and </STYLE> are left out (since all the information is style information). The contents of a style document might look like this:

selector {property1: value1; property2: value2; ...}
selector {property1: value1; property2: value2; ...} ...

Then there has to be a reference in the HTML document to the style sheet. This is done using a <LINK> element like this in the <HEAD> part of the HTML code:

<LINK REL="STYLESHEET" HREF="stylefilens_url" TYPE="text/css">
It is not necessary to choose between the different ways of defining styles. It is possible to combine a separate style sheet with general rules (which is linked into the page) with page-specific statements inside <STYLE> elements in the page. It is also possible to link to more than one style sheet. As long as the style sheets don't contradict each other it works fine, and if they do contradict each other there are rules for how the browser should resolve the conflict.
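A sketch of such a combination is shown below. The file name site.css and the rule inside <STYLE> are assumptions made for illustration: the linked file would hold the general rules for the whole site, while the <STYLE> element adds a rule for this page only.

```html
<HEAD>
 <TITLE>News</TITLE>
 <!-- General rules shared by all pages of the site -->
 <LINK REL="STYLESHEET" HREF="site.css" TYPE="text/css">
 <!-- Extra rule that applies to this page only -->
 <STYLE TYPE="text/css">
  H1 {color: red;}
 </STYLE>
</HEAD>
```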

Importing style statements from another style sheet

It is possible to build very complicated structures of style sheets by letting a style sheet import (refer to) another style sheet. This is done like this:

@import url(address_to_the_other_style-document);

This causes the HTML document to be affected both by the rules in the style sheet it refers to directly and by the rules in the style sheet which that style sheet imports.

An example might be useful: Suppose that Chalmers decides that all (official) web pages inside the Chalmers domain have to conform to some standard (e.g. they have to use a certain typeface). In order to make things easy for everyone who has to write web pages at Chalmers, a style sheet which everyone can use is created, containing all the necessary statements. All that has to be done now is for all (official) pages to refer to the style sheet with the rules, like this:

<LINK REL="STYLESHEET" HREF="http://www.chalmers.se/regler.css" TYPE="text/css">
Now every document will automatically get the correct typeface and so forth.

Suppose that the Electrical Engineering section (Elektrosektionen) decides that for some reason all their pages are to have a yellow background and a small picture of Donald Duck in the upper left corner. To make this easy they create a file called eregler.css which they place on the web. Besides style rules which make the background yellow and put Donald Duck in the top left corner, they add this line to the file:

@import url(http://www.chalmers.se/regler.css);
Now everyone who makes web pages at the E-section can add the following link to the <HEAD> part of their pages:
<LINK REL="STYLESHEET" HREF="http://www.etek.chalmers.se/eregler.css" TYPE="text/css">
Now they get both the correct Chalmers typeface and the E-section background and duck, without having to link to both the Chalmers and the E-section style sheets.

Selectors

As we have mentioned earlier, selectors are used to describe exactly which of the elements on the HTML page are to be affected by the properties that follow. Selectors can be written in many ways, and here are a few of the more useful ones:

HTML-element selectors

This is the simplest type of selector. With this you simply select all HTML elements of a certain type, e.g. all paragraphs or all tables. The selector is written as the name of the element without < and >. Some examples:

P {property1: value1; property2: value2; ...}
TABLE {property1: value1; property2: value2; ...}
BODY {property1: value1; property2: value2; ...}

Class selectors

Sometimes you do not wish to select all elements of a certain type, but only some. In order to mark exactly which elements to select there is an HTML attribute which works for all elements: CLASS. By saying that an element belongs to a certain class, you can later on refer to that class name when writing your style sheet. Some examples of using the CLASS attribute in HTML code:

<P CLASS="question">So how do you do that?</P>
<IMG CLASS="photo" SRC="me.jpg" WIDTH="60" HEIGHT="80" ALT="Photo of me">

You make up the classnames yourself, but they should not contain any weird characters. Afterwards you can refer to the names in your style statements like this:

P.question {property1: value1; property2: value2; ...}
IMG.photo {property1: value1; property2: value2; ...}
.question {property1: value1; property2: value2; ...}

The first example affects all paragraphs of the class question, but no other paragraphs or elements. The second affects all images of the class photo. The third example is a bit different: it selects all elements that are of the class question, no matter what kind of element they are.

In order to be able to mark a part of the page or a part of the text as belonging to a certain class without having to place that part in some other HTML element, there are two special HTML elements which can be used: <SPAN> (which marks sections of text) and <DIV> (which marks entire blocks of the page). They have no effect on the text, except to make it possible to use CLASS on the section or block.
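The sketch below puts class selectors, <SPAN> and <DIV> together in one small page. The class name warning and the chosen properties are assumptions made for illustration; the class question is the one used in the examples above.

```html
<HTML>
<HEAD>
 <TITLE>Classes</TITLE>
 <STYLE TYPE="text/css">
  P.question {font-style: italic;}
  .warning   {color: red;}
 </STYLE>
</HEAD>
<BODY>
 <P CLASS="question">So how do you do that?</P>
 <P>An ordinary paragraph with <SPAN CLASS="warning">a few red words</SPAN> in it.</P>
 <DIV CLASS="warning">
  <P>Both paragraphs in this block</P>
  <P>are shown in red.</P>
 </DIV>
</BODY>
</HTML>
```

The paragraphs inside the <DIV> turn red even though they have no CLASS of their own, because the color property is inherited from the enclosing element.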

Link-pseudoclass selector

There are several specific selectors that are used to style links. Links differ from other HTML elements in that they can have several states, such as 'unvisited', 'visited' and 'under pointer'. To take these states into account, pseudoclasses were added to CSS. How they work is best shown by an example:

A:link {property1: value1; property2: value2; ...}
A:visited {property1: value1; property2: value2; ...}
A:active {property1: value1; property2: value2; ...}
A:hover {property1: value1; property2: value2; ...}
A.question:active {property1: value1; property2: value2; ...}

Remember that the element for links is <A>. The examples above ensure that links will have different looks depending on their state. The possible states are link (unvisited), visited, active (being clicked on) and hover (under the mouse pointer).

The last example shows that it is possible to combine a normal class with a pseudoclass; the last rule will only affect links belonging to the class question which are being clicked on.

Selector groups

Properties can be given for several selectors at once. This is useful as there is no need to write the same property multiple times. This is done by simply writing all the selectors on a line, separated by commas, like this:

H1, H2, H3, H4, H5, H6 {property1: value1; property2: value2; ...}
TABLE, OL, UL, DL {property1: value1; property2: value2; ...}
IMG.photo, IMG.drawing {property1: value1; property2: value2; ...}
A:link, A:visited {property1: value1; property2: value2; ...}

The first row affects all headers, regardless of size. The second affects all lists and tables. The third affects images, but only if they belong to the classes photo or drawing. The last affects links in either the unvisited or the visited state.

Properties

There are a lot of properties that can be set for the HTML elements, and for reasons of space we can only mention a few of them. A very important fact to be aware of is that some properties are inherited by the elements which are inside the element. A good example is any property which sets the typeface for <BODY>: this typeface will be inherited by all <P>, <TD>, <BLOCKQUOTE> or whatever kind of element is inside the document.

Text styles

These are some of the most useful text style properties.

color
Sets foreground color, which in principle amounts to the same as text color. Colors can be written in three different ways: the two standard HTML methods of keywords (e.g. green, black) and hexadecimals (#B5C6E3), and a way unique to style sheets, rgb(), which gives the amounts of red, green and blue. The values are either integers between 0 and 255 or percentages. Examples: rgb(255, 0, 128), rgb(30%, 70%, 50%). (Note that a percentage gives the amount of that color: 100% equals the value 255.)
font-weight
Thickness of text. Possible values are the keywords normal, bold, lighter and bolder. The first two are absolute values; the last two change the thickness relative to the preceding text.
font-family
Typeface used in the text. Just as for the <FONT> element this is a list of typefaces, separated by commas. The first available typeface in the list is used. There is also a set of special "typefaces" available: serif, sans-serif, cursive, fantasy and monospace. Since these are always available, it is a good idea to include one of them at the end of the list.
font-size
Size of the text. There are several different ways to set font size. The easiest is the relative keywords smaller and larger, which change the size relative to the preceding text. It is also possible to set the size in pixels or millimetres (e.g. 30px and 10mm respectively). Using a percentage of the standard size is also possible (130%).
font-style
Text "style". Possible values are normal, italic and oblique, where the last two probably will end up looking the same.
text-decoration
This is a collection of properties for the text, and it is possible to use one or more of the keywords underline, overline, line-through and blink, separated by a blank. none gives no decorations and is also the default value.

Text layout

text-indent
Indentation size on the first line of a paragraph or piece of text. The values are set in the same way as for font size. It is possible to use negative values, which move the first line to the left of the rest of the text.
text-align
Text justification. Possible values are left, right, center and justify.

Background

With style sheets it is possible to set a background on any element, not just on <BODY>, <TABLE> and <TD> as in plain HTML.

background-color
Specifies a solid background color. Colors are set as above, or with the special value transparent which shows any elements "behind" this element.
background-image
Puts an image in the background. The image is given as a URL, written in the form url(address). It is also possible to set the property to none.

Border

With style sheets it is possible to add borders to any element. Borders are defined by their thickness, style and color, and there is no need to have the same border on all four sides of the element. There are several ways of defining a border, depending how complex the border should be. Here we will only show how to make borders that are the same on all four sides.

border-width
The thickness of the border. This is a keyword (thin, medium or thick) or a size. The size is given in pixels or millimetres, just as with text size.
border-color
States the color of the border, the exact color being given as above.
border-style
The style is set by a keyword, which can be none, dotted, dashed, solid, double, groove, ridge, inset or outset. Unfortunately, many of these styles are not supported by today's browsers.

Margins and padding

The difference between margin and padding is small. Both add empty space around the content of an element. The difference is that margins add space between the element and surrounding elements, while padding adds space between the edges of the element and its content. Padding is a part of the element while margins are not, which can be important when you add borders or backgrounds.

There is another small difference; margins can have negative values. Apart from this values are given in the same way for both padding and margins. The values are sizes (as above) and it is often useful to use percentages. It is possible to add different values to different sides. The following properties are available:

margin-top
margin-bottom
margin-left
margin-right
margin
(which sets the same value on all four sides. Use this instead of the other four).

padding-top
padding-bottom
padding-left
padding-right
padding
(which sets the same value on all four sides. Use this instead of the other four).
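The difference is easiest to see with a border and a background, as in the following sketch. The class names padded and spaced and the value 10px are assumptions made for illustration.

```html
<HTML>
<HEAD>
 <TITLE>Margin and padding</TITLE>
 <STYLE TYPE="text/css">
  P.padded {background-color: yellow; border-style: solid;
            border-width: thin; padding: 10px;}
  P.spaced {background-color: yellow; border-style: solid;
            border-width: thin; margin: 10px;}
 </STYLE>
</HEAD>
<BODY>
 <P CLASS="padded">Padding: the extra space is inside the border.</P>
 <P CLASS="spaced">Margin: the extra space is outside the border.</P>
</BODY>
</HTML>
```

In the first paragraph the yellow background fills the added space, since padding is part of the element. In the second the added space lies outside the border and stays uncoloured.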

Some complete examples

Finally, here are a few examples on how to use style sheets.

H1, H2, H3, H4, H5, H6, TH {font-family: Arial, Helvetica, sans-serif; color: blue;}
H1 {text-align: center;}

This sets a different typeface and text color for all headers and table headers, and centers the largest header.

BODY {background-color: #ABCDEF; padding: 10%; color: #012345; font-size: 5mm;}

This sets the properties for the <BODY> element, but since all other visible elements are part of this element their properties will also be affected. This gives the document a light blue background color, dark blue text and some padding on the sides of the element. Text size is set to 5 millimetres. (This is really not recommended, as there might be users who for some reason, e.g. visual impairment, have set a large standard font size in their browser.)

A:link, A:visited {text-decoration: none;}
A:active, A:hover {text-decoration: underline;}

Usually links are underlined when shown in a browser. The code above changes this and makes links underlined only when the mouse pointer is above them or they are being clicked.

IMG.photo {border-width: medium; border-color: black; border-style: double;}

The code above gives all images that belong to the photo class a border. The border is of medium thickness, black and consists of double lines.

Building a site consisting of several pages

A good site is usually made up of several pages. A logically built site with several small pages is easier for the user to navigate.

If the site is not too large, one way of achieving this is to have one main page which links to several subpages, which in their turn link back to the main page. This is called a 'flat' tree structure. Sometimes a menu (with links to the different pages in the site) is used, which is a part of all the pages of the site. The menu is usually placed at the top of the page, or along the left edge. For a bigger site a more extensive tree structure may be more appropriate, where pages link to their 'parents' and 'children'. This is called a 'deep' tree structure. (With a larger site it will probably not be possible to have a menu with links to all pages.) Flat and deep tree structures are illustrated in the figure below. Each circle represents a page within the site and the lines represent links. The circle on top is the main page.

Figure: a site with a deep tree structure, and a site with a flat tree structure.
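A menu of the kind described above can be as simple as a row of links repeated at the top of every page. The sketch below assumes a small flat site; the file names and page titles are hypothetical.

```html
<!-- The same menu is placed at the top of every page of the site -->
<P>
 <A HREF="index.html">Main page</A> |
 <A HREF="courses.html">Courses</A> |
 <A HREF="links.html">Links</A>
</P>
<H1>Courses</H1>
<P>The content of this page goes here.</P>
```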

It is not mandatory to arrange the pages as a tree structure; there are no rules. But it is important to consider the structure of your site before creating it. Just typing some pages and linking them randomly usually renders a poor result. Think ahead and check the result afterwards.

Rules and laws

Even if homepages are supposed to follow certain public laws, laws tend to be rather unclear when it comes to the Internet. A reason for this is that technology tends to advance faster than the democratic process. The domain in which your homepage is published may also have certain rules that you are required to follow. At Chalmers (where you will probably be constructing a few web pages), web pages are subject to three sets of rules:

  1. SUNET, the Swedish university network, which provides Chalmers with its Internet connection, has its own rules. See the SUNET homepage (http://www.sunet.se).
  2. Chalmers has some rules concerning homepages published within its domain (that is, where the address ends with .chalmers.se). These rules are given in The Chalmers policy for the spread of information on the WWW (http://www.chalmers.se/HyperText/regler.html). It states that all pages should have information about the author and should contain a so-called disclaimer.
  3. Each section within Chalmers may have local rules, which you have to adhere to if your pages are in that section's domain.

A final tip: steal!

You cannot instantly learn everything there is to know about HTML and making web pages. Most browsers, however, are able to show the HTML source code for any page, which gives you an opportunity to pick up ideas and ways of implementing them. Learning by imitation (or outright borrowing of good ideas) is one of the best methods of learning more HTML. If you find something you like, use it!

However, a final caution to go with the final tip: Be careful when copying text or images. They may be copyrighted, in which case you must get the permission of the copyright owner before using them.

Relevant links