HTML stands for HyperText Markup Language. In recent years it has got completely out of hand, with all sorts of things catered for that were not in the original spec. There is a move to tidy this up, called the XHTML standard. We refer to it later on, and everything said in this web page is compatible with XHTML.
What we do on the Athelstane website is to put books on the web, and what we use is here, in this article. What we've got in this article is somewhere round about HTML 3.0, with a couple of useful things that aren't in that level. We don't indicate how to use JavaScript, although that useful language can be used to do all sorts of semi-desirable things. All we do is to get you off the ground, and we should take no more than two hours over it.
If you are very slack you can use Word to create web pages, but it really is a good idea to create your own web page from the ground up, so as to have a good feel for what it is all about.
The basic HTML file is very simple. In fact it is possible to create an HTML file that does exactly nothing, yet remains within the rules. Here it is:-
<html>
<head>
<title> </title>
</head>
<body>
</body>
</html>
You can see that this HTML file consists of two parts, a HEAD and a BODY. The HEAD of this HTML file contains a null TITLE, while the BODY contains nothing at all. For clarity we have written all the keywords, HTML, HEAD, TITLE, BODY in capitals, though when you come to use these words, and other keywords you will learn, you should write them in lower-case: <html>, <head>, <title>, <body>.
Nevertheless this example of an HTML file is important, for nearly all properly constructed HTML files contain these elements. The exception is seen when we come to use FRAME, which replaces BODY in certain circumstances.
Within the HEAD is the TITLE. You will have noticed that we wrote our null title as <title></title>. These strings with <> round them are called TAGS. Most tags, though not all, have a closing tag, at which the effect of the opening tag ends. For instance <title> is followed by the title, and that title is ended by the closing tag </title>.
The text of the TITLE has to be rather simple, because it is what is to be displayed at the very top of the screen when your HTML file is being interpreted and displayed. It doesn't appear anywhere else, nor does that heading on the screen come from anywhere else.
Also within the <head></head> range you will put the <meta> statements that will tell the web-crawlers what wares you have to offer on your web page. We will return to these later, as there are other important things that appear within the HEAD of an HTML file.
Now we come to the BODY of the file. As we have said, it starts with <body> and ends with </body>.
Within the <body> tag we will put the colours you would like to be used by default on the display. These are four in number: the background, bgcolor (white); the text (black); the highlight for links to other HTML files or to different parts of this one or to images, link (red); and fourthly, the colour the highlight is to become when you have visited that link, vlink (green). The example below also sets a fifth, alink, the colour a link takes while it is being clicked (grey). We can also put the name of a GIF file that we can use to provide a background to our page. The author of this sometimes uses a graphic called graph.gif as a background, so the <body> tag contains background="graph.gif" within it.
The BODY code is:
<body bgcolor="#ffffff" text="#000000" link="#ff0000" vlink="#008000" alink="#808080">
Let us explain the codes for the colours. You notice that after the # (hash) there are six numbers or letters. The numbers could be 0123456789, and the letters could be abcdef: sixteen choices, in other words. Although there are six numbers or letters, you should think of them as three pairs. Thus each pair could have any of 256 values, 0 to 255, which we represent as 00 to ff. The first pair represents RED, the second pair GREEN, and the third pair BLUE. That's what we call RGB. So ffffff means RED GREEN and BLUE all at full strength, which makes WHITE. Then also 000000 means RED GREEN and BLUE all at zero strength, which makes BLACK. So the above could resolve to bgcolor="white" text="black" link="red", but the other colours are best left as they are.
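As a worked example of reading the pairs, here is the same BODY tag again with each colour decoded in a comment. This is only a sketch: the paragraph and its link text are made up for illustration, and anewpage.htm is just a placeholder file name.

```html
<!-- Each colour is three hex pairs: red, green, blue (00 = none, ff = full). -->
<!-- #ff0000: red ff (255), green 00, blue 00, which is pure red. -->
<!-- #008000: red 00, green 80 (128, about half strength), blue 00, a dark green. -->
<!-- #808080: all three pairs at 80 (128), which is a mid grey. -->
<body bgcolor="#ffffff" text="#000000" link="#ff0000" vlink="#008000" alink="#808080">
<p>Black text on a white background, with <a href="anewpage.htm">a red link</a>.</p>
</body>
```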
We could start the web page with a headline. These come in six different sizes, ranging from <h1></h1>, the biggest, down to <h6></h6>. It is useful to keep to <h3>, <h4> and <h5>, thus avoiding extremes. What you put in your headlines is as important as what you put in your META statements, as far as the search engines are concerned.
Now we come to the text. A paragraph of text begins with the tag <p> and ends with the tag </p>. It doesn't matter how the paragraph is laid out in the source text, it will be displayed as a paragraph, left-justified, and followed by a blank line.
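A small sketch of a headline followed by two paragraphs may help here. The wording is invented for illustration; the point is that the odd line breaks and extra spaces in the source make no difference to what is displayed.

```html
<h3>My first-aid kit</h3>
<p>This paragraph is
split over several lines      in the source,
but it is displayed as one left-justified paragraph.</p>
<p>This second paragraph is separated from the first by a blank line.</p>
```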
Now is the time to say that all text in an HTML file is supposed to be in 7-bit ASCII characters. This, by and large, is the character set on the keyboard, as things like e-acute are not directly obtainable, being 8-bit characters. It is obtained by writing &eacute; - for instance m&eacute;nage-&agrave;-trois. What that said was ménage-à-trois. The character & is written as &amp;. We will come back to all this later.
Within a paragraph you may wish to display a word or phrase in italics. For instance it is customary to display the names of newspapers in italics, so we might say <i>The Times</i> - to appear as The Times. Thus you will realise that <i></i> surround italics. You can take note that <b></b> surround bold, <u></u> surround underlines, <center></center> surround centred text, such as headers. You can use combinations of these, if you like. Always make sure that the closing tags appear in the reverse order to the opening tags.
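Here is a short sketch of combining two of these tags, with the closing tags in the reverse order to the opening ones. The sentence itself is made up for illustration.

```html
<!-- Right: <b> opens first, so </b> closes last. -->
<p>We read about it in <b><i>The Times</i></b> this morning.</p>
<!-- Wrong: <b><i>The Times</b></i> closes the tags in the same order as they were opened. -->
```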
What if you want a series of lines within a paragraph, that end with a
new line? Use <br />. This tag has no corresponding </br>,
which is obvious if you think about it. That is why the opening tag has
"space slash" in it. You can use this to set out a list, such as
the articles in your first-aid kit, where you do not want a blank line
between each item.
Sticking-plaster<br />
Tweezers<br />
Money for phone<br />
This brings us to an unnumbered list. Each item on the list will appear
as starting with a little blob, called a bullet.
<ul>
<li>Item one.</li>
<li>Item two.</li>
<li>Item three.</li>
</ul>
It appears like this:
• Item one.
• Item two.
• Item three.
If you want numbers instead of bullets, use <ol></ol>
instead of <ul></ul>. The O stands for Ordered, while the U
stands for Unordered.
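For comparison, here is the same list written as an ordered list. Only the outer tags change; the browser supplies the numbers 1, 2 and 3 itself.

```html
<ol>
<li>Item one.</li>
<li>Item two.</li>
<li>Item three.</li>
</ol>
```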
Indicating a link to another web page is done with an "anchor" tag,
<a></a>. The A stands for Anchor.
The first part of the anchor tag contains the link.
<a href="anewpage.htm">Go to a new page</a> is the way you
make this link. The words "Go to a new page" will appear underlined and
highlighted in the colour you declared for links in the <body> tag (probably
red, as we have described).
If we want to display as indented (like this paragraph) a letter someone has written, or a proclamation, or a lengthy quotation, we can do so with indented text by surrounding the paragraph with <blockquote></blockquote>.
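A minimal sketch of an indented quotation follows. The text of the letter is invented for illustration; the paragraph inside the blockquote keeps its own <p> tags.

```html
<p>The landlord's letter read as follows:</p>
<blockquote>
<p>Dear Sir, I regret to inform you that the rent is overdue.</p>
</blockquote>
```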
Tables are useful, but they get a bit difficult if you have too many columns. The TABLE starts with <table> and ends with </table>.
Each row starts with <tr> and ends with </tr>.
Each item on the row normally starts with <td> and ends with </td>. We discuss the alternative, <th></th> later on.
You can put all sorts of things within the elements (cells) of a table. For instance you can put pictures (small ones), links to other pages, or little paragraphs of text: whatever you find useful. These will need their own justify or center instructions.
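Putting the three tags together, a small table of two rows and two columns might be sketched like this. The contents are made up, and anewpage.htm is only a placeholder file name; the last cell shows that a link can sit inside a cell.

```html
<table>
<tr>
<td>Sticking-plaster</td>
<td>1 roll</td>
</tr>
<tr>
<td>Tweezers</td>
<td><a href="anewpage.htm">Go to a new page</a></td>
</tr>
</table>
```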
You can make a rule across a page by using <hr />. Of course there are variants on this tag, but we'll just stick to <hr /> for now. It doesn't have a matching </hr>, and that is why there has to be a "space slash" in the opening tag, just like the line-break BR.
Below we include some miscellaneous items that will prove useful.
These are:
One: in your HTML source text you can make the lines as long or short as you like. The ends of line in your source file are ignored when the HTML page is being displayed.
Two: in your HTML source text you can have as many blank spaces as you like, wherever you like, in the source file, but only one blank space is displayed at a time, and that is between words. There are several ways of making it look as though you have more than one blank space at a time.
Three: We wrote the keywords in upper case earlier for clarity, but tags should be written in lower case, as most people do. The reason is that there is emerging a new standard for HTML, called XHTML, which requires lower case, and it has strong rules about what is allowed, what is not allowed, and what is discouraged (which "they" call "deprecated").
Four: you don't need any whizz-kid program to write an HTML file. You can start it off by using Notepad, but that has a limit of 32 kilobytes, after which you might prefer to use WordPad to cope with larger files. The author of this little note uses WebIt! for developing web pages, because you can see what the page looks like at the same time as you write it. This works OK on Windows 3.1 or 9x, but possibly will not work on anything else. It also has a 32 kb limit, but it is brilliant for small files, like this one. (Yes, I am using WebIt!)
You can't directly alter the text of someone else's web page, but the source code for most web pages can be looked at on your computer. You click on View Source, and it will load into Notepad or whatever editor you specify. If that doesn't work, then right-click anywhere on a blank bit of page, and you will see "View Source", so click on that. So you can look at how things were done, but these days it is often too baffling to do that, as web page construction has gone crackers in recent years. You won't be confused if you look at any of the athelstane pages, or any of the other pages the webmaster of athelstane is responsible for. Here they are:-
A London Canoe Club
A British National Canoe Club
A British boarding school
You ought always to make references to files in lower case, and also any files that are to go onto your website need lower case names.
Now we come to the utterly important item of getting noticed by the web-crawlers, or search engines. There are two important things. The first: make sure that within the first 256 bytes of your page it says what it does. There are two places where the web-crawlers look. One is the header lines, the ones you put within <h3></h3> tags, for instance. The other is what you write in the content of your <meta> tags. A friend of mine complained that nobody ever visited his website, which he had paid a lot to have designed (not by me!). One short look sufficed to pin-point the problem. There were no CONTENT statements, and no HEADLINES, and hardly any words. He was a photographer, and he wanted a graphic web page - hence nothing a web crawler could ever latch onto. Yet the web page looked very attractive, and that was what he had paid for. It just didn't actually attract via a search engine, so it was useless.
The second important thing is that you should arrange to get linked to as many websites as possible, with reciprocal links (from them to you). It is by following up these links that the web-crawlers know where your website is. This is the way to do it for a non-commercial site. For a commercial one you had better pay one of the services that will notify several hundred web-crawlers of your site.
Link to a picture (or some other entity, zip file for
instance)
Just put the name of the picture instead of the name of an HTML document
in your anchor statement
<a href="my_pic.jpg">See my pic</a>.
You will probably know that most images seen on the web are either of GIF
format or of JPG format. There are other formats, but those are the main
ones. I once counted 36 different formats, not counting variants.
Or you could put anything else that you wanted to be downloaded, a ZIP file for instance, or a TXT file, or a LIT file (a Microsoft eBook), or whatever else was desired.
How to click on an image to jump to something else (page or
picture)
<map name="itsname">
<area shape="rect" coords="0,0,94,61"
href="itsname.jpg" />
</map>
Here the coords
are the size of the small image you are going to declare in the next
line. itsname.jpg could be itsname.htm if you were going to
link to a new page.
<img src="smallimage.gif" alt="some text" align="middle" usemap="#itsname" />
Now you may put some text, if that is not included in the GIF image. The reference to "some text" was an option in case the picture didn't load.
Link to a different place in the current page
At the place to which you may want to jump place the anchor
tag, thus
<a name="page2"></a>
and at the place from which you may wish to jump, probably the bottom of the home page you are working on, appears this line, in which the # (hash) tells the browser that the name is a place within the current page and not another file:
<a href="#page2">Page 2</a> (probably with <br /> tacked on as well.)
Include an invitation to send an e-mail
<a href="mailto:BurbleBurble@btinternet.com"> E-mail the
author of this</a>
appears like this:
E-mail the author of this
Making the Heading row for a table
Instead of <td></td> use <th></th>. Within the
<th> tag you can put colspan="2" (or 3, or whatever)
and align="center"
or align="right". Between the <th> and the </th> you
can put whatever text you want to appear as the heading. Within the
<table> tag you can put the kind of lines you want to see in your
table, thus,
<table border="0"> (no lines) up to border="6",
which I use regularly.
It is a good idea to put <center> before the <table> and </center> after the </table>
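As a sketch of how these pieces fit together, here is a centred two-column table with a heading row spanning both columns. The contents are invented for illustration; the border value of 6 is the one mentioned above.

```html
<center>
<table border="6">
<tr>
<!-- One heading cell stretched across both columns. -->
<th colspan="2" align="center">First-aid kit</th>
</tr>
<tr>
<td>Sticking-plaster</td>
<td align="right">1 roll</td>
</tr>
<tr>
<td>Tweezers</td>
<td align="right">1 pair</td>
</tr>
</table>
</center>
```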
Making your text lines fully justified.
Each paragraph starts with the two following lines:
<p class="MsoNormal" style="text-align:justify">
<span style="font-family:Times New Roman">
and ends with the following line:
<o:p></o:p></span></p>
That is the old way of getting justified lines, but it is clumsy, and at the end of this article we shall reveal how it was achieved for this file, and for other files in the various websites named above, amounting to several thousand pages.
The META lines within the HEAD section
<meta name="keywords" content="aaa bbb" />
<meta name="description" content="blah blah" />
Try to make the keywords content be the main words you would
like people to be looking in your website to find, and avoid very common
words.
Try to make the description content so that every word
tells, and avoid the 300 commonest words in the English Language, because
they would not help in a search.
Getting the message to the Search Engines. Review your TITLE, META and H1-H6 lines. Check over the ALT strings in the picture descriptions. Do they give a good handle on what your website is about? If not, then fix them so that they do. Each search engine changes its rules about how to get to the top of its list of findings weekly, or at any rate quite often. This is to ensure that the big companies who have paid for high places keep getting them. Just make sure your site is linked to by plenty of other sites, and that you have observed the guidelines in this paragraph.
How to do the formatting on the fly. We use a file of a
type known as a Cascading Style Sheet. I have, with permission, been using
a style sheet generated by the University of Adelaide. It is called
et_style.css. Here are the lines you need in the HEAD section of your
webpage:
<meta http-equiv="Content-Type"
content="text/html; charset=iso-8859-1" />
<link rel="StyleSheet"
href="et_style.css" type="text/css" />
This may look a bit complicated, but don't even think about it: just use
the same two lines on every web page, like I do. It works. You will have
to download et_style.css from this website, but I expect you know how to
do that by now.
You will (possibly) then need to divide the web page up into divisions.
These are typically dochead, bodytext, navigation, signature. As an
example the first division starts
<div
class="dochead"> and ends </div> and so on through the
BODY of the file. The Style Sheet is very easy to understand, so have a
look at it, and see how it will influence the appearance of the page. For
example text will always be justified, and in a serif typeface;
headings will always be centred, and in a sans-serif typeface. Just have a
look, and you will see how easy it is. You only need the divisions if you
want different parts of the page to be treated differently. You can invent
your own divisions: for example I invented one called toc with which to
handle certain requirements for the Table of Contents in a website that
uses frames.
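A sketch of a BODY divided up in this way is given below. The class names are the four mentioned above; it is an assumption that your copy of the style sheet defines each of them, so check it before relying on this. The contents and the file name anewpage.htm are placeholders.

```html
<body>
<div class="dochead">
<h3>Title of the page</h3>
</div>
<div class="bodytext">
<p>The main text goes here.</p>
</div>
<div class="navigation">
<p><a href="anewpage.htm">Go to a new page</a></p>
</div>
<div class="signature">
<p>N.H.</p>
</div>
</body>
```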
How to make blank spaces. You can use either of two ways.
&nbsp; is one of them and &#160; is the other.
Here they are in action, four each time: &nbsp;&nbsp;&nbsp;&nbsp; and &#160;&#160;&#160;&#160;. OK?
The DOCTYPE declaration. And now finally, or perhaps we should say initially, we need to put a single line at the start of the file, even before the HTML line, which always used to be first. This line names the DTD (Document Type Definition), and so tells the browser what level of HTML or XHTML we are using. And if you use one of the validator programs out there on the net, to which you can submit the code of your web page, you will have to have that line. There are several forms of it.
The simplest is
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01
Transitional//EN">
The one that will suit most cases is
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
The one needed where you have used FRAMES is:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
"http://www.w3.org/TR/1999/REC-html401-19991224/frameset.dtd">
Comments within the HTML file. It may be desirable to
write something in an HTML file which is to be ignored by the browsers.
An example would be the date on which you wrote the file, and your name.
A comment begins with <! just as the DOCTYPE declaration does, but it is enclosed between <!-- and -->:
<!-- Your comment -->
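To show where the DOCTYPE line and a comment sit, here is a sketch of a complete small page. The title, the comment and the paragraph are placeholders. The xmlns attribute on the <html> tag is not mentioned elsewhere in this article; it is included on the assumption that you are using the XHTML Strict form of the DOCTYPE, which expects it.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>A very small page</title>
</head>
<body>
<!-- Written by (your name) on (the date). Browsers ignore this line. -->
<p>Hello.</p>
</body>
</html>
```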
Frames. This is a bit more complicated than all of the foregoing. However, the two Canoe Club references given above use frames extensively, and you can study them to see how it is done. Actually it is quite easy, once you get the hang of it, but to try to explain it in detail here would take us over your two-hour limit. What frames actually do, is to allow you to have two or even three web pages on the screen at a time. One of these would be a heading across the page, just a few pixels deep. The rest of the page would be divided into two vertical halves. The left-hand one of these would be the Table of Contents, while the right-hand one would be the page you want to look at. Each of these areas of the page, or frames, is given a name, for example "main" is usually the name given to the lower right-hand frame. Click on an item in the Table of Contents, and the relevant page is displayed in "main". Now you've got the intuitive side of it, the actual working out will be easy for you.
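For those who would like a starting point, the layout just described might be sketched as follows. This is not the code of either canoe club site: the file names heading.htm, toc.htm, welcome.htm and chapter1.htm, the frame names "heading" and "toc", and the 60-pixel depth of the heading are all assumptions made for illustration. The DOCTYPE line for frames is left out to keep the sketch short. Notice that FRAMESET takes the place of BODY.

```html
<html>
<head>
<title>A page made of frames</title>
</head>
<!-- Two rows: a shallow heading, then everything else. -->
<frameset rows="60,*">
<frame src="heading.htm" name="heading" />
<!-- The lower row is split into two vertical halves. -->
<frameset cols="50%,50%">
<frame src="toc.htm" name="toc" />
<frame src="welcome.htm" name="main" />
</frameset>
</frameset>
<!-- Inside toc.htm, a link such as
     <a href="chapter1.htm" target="main">Chapter 1</a>
     loads the chosen page into the frame named "main". -->
</html>
```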
And that is all I do to make these websites.
N.H. Xmas 2002, reviewed Xmas 2003.