Lesson 14
 

Finishing Touches

We've covered enough material in this tutorial for you to go off and develop some pretty serious web pages. Now that we've come this far, I want to tell you about something I skipped earlier for the sake of simplicity, and to give you some tips on "quality assurance" for your web sites.

 

The DOCTYPE declaration

Right back in Lesson 1, I told you that the very first thing that appears in an HTML document should be the <HTML> tag, to tell browsers the kind of file (i.e. HTML) they are receiving when downloading from the Web.

Actually, this is not strictly true. In practice, all browsers I have ever encountered will have no trouble dealing with an HTML file as we have described it so far. But if we are following the rules (and please do -- "It's a good thing" as Martha Stewart might say), we should begin the document with a document type declaration, written with <!DOCTYPE> (note the exclamation mark immediately after the opening angle bracket -- rather like in a comment). Strictly speaking, it is a declaration rather than an element or tag, although you will often see it called the "DOCTYPE tag". It is the DOCTYPE declaration that really tells a user agent (browser) what kind of file it is. Here's an example, which we'll break down in a moment:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">

This tells the browser that the type of document (DOCTYPE) is HTML; that the Document Type Definition (whoa there! We'll come back to that) is Public; that the public DTD (yes, that's Document Type Definition) is maintained by the World Wide Web Consortium (the W3C); that the DTD is for the Final version of HTML 3.2; and that the DTD is written in English (EN).

So what's this DTD business? Put simply, HTML is actually one member of a larger family of markup languages, the "parent" of which is SGML (Standard Generalized Markup Language). Each markup language under SGML is defined by a DTD, which specifies the elements and attributes that make up the language and how they may be nested. I don't propose to go into any more detail about DTDs here; you can find more information at the W3C web site and in the W3C HTML specifications (also on the web site). But you do need to know about the DOCTYPE declaration at the top of your HTML files in order to validate them properly, which we'll discuss shortly.

If you have created an HTML document that complies with HTML 3.2 (that is, it doesn't contain any browser-specific elements or attributes, or any that were not introduced until a later version -- like frames, for example), then you can use the DOCTYPE declaration above. So your file would begin:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
 
<HTML>
 
  <HEAD>
    --- any HEAD material in here ---
  </HEAD>
 
  <BODY>
    --- etc.

I mentioned way back that HTML 4.0 was released in December 1997. HTML 4.0 introduced a greater separation between structure and style in the markup methods. I have talked before about how HTML was originally conceived as a purely structural markup scheme, but gradually more tags like <B> and <I> were introduced to gain control over the style or physical appearance of text. Web site designers have become adept in the use of tables and other hacks like the "single pixel GIF trick" to position elements on the page more predictably. But these measures have been less than 100% satisfactory.

HTML 4.0 did two things. In one sense, it returned to being a purely structural markup scheme, in terms of the elements (tags) given in the Strict DTD (I'll explain that "Strict" thing shortly). However, it also allows Cascading Style Sheets (CSS) to be included in or linked to HTML documents, which give the browser instructions on how text is to be displayed (typeface, font size, colour, leading, etc.) and where text and images are to be positioned on the page (without using tables or other tricks).
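To make "included in or linked to" concrete, here is a minimal sketch of the two ways a style sheet can be attached to an HTML 4.0 document. The file name site.css and the particular style rules are made up for illustration; in a real page you would normally pick one mechanism, though both can be used together.

```html
<HEAD>
  <TITLE>Style sheet sketch</TITLE>

  <!-- Linked: the rules live in a separate file
       (site.css is a hypothetical name) -->
  <LINK REL="stylesheet" TYPE="text/css" HREF="site.css">

  <!-- Included: the rules are written directly in the document HEAD
       (these particular rules are only examples) -->
  <STYLE TYPE="text/css">
    BODY { font-family: Arial, Helvetica, sans-serif; color: #333333 }
    H1   { font-size: 150%; line-height: 1.2 }
  </STYLE>
</HEAD>
```

A linked style sheet has the advantage that one file can control the appearance of every page on the site.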

This might sound confusing at first, but in fact, it's pretty wonderful. However, there has long been a problem: browser support. It has taken the browser vendors a long time to get around to providing adequate and correct support for CSS in their products.

Internet Explorer version 3 had some support, but parts of it were faulty, and could make a real mess of pages that did use CSS. Version 4 was a lot better, but still far from perfect or complete. Versions 5 and higher, available as I am updating this tutorial early in 2001, are the best yet from Microsoft. They still have some weaknesses, though.

Netscape Navigator version 3 had no support at all for style sheets, and actually, that was fine: because it just completely ignored them and presented everything in its default way. Version 4 introduced some support but it was so seriously flawed that many pages using perfectly correct and valid CSS looked like a nightmare, because Navigator broke them. Netscape decided to abandon its version 5 update, concentrating on a supposedly perfectly standards-compliant (HTML and CSS) browser, version 6. This was released a few months before I started writing this update. Version 6 is certainly the most standards-compliant browser Netscape has released to date, and I would urge anyone who is still using the abomination that is version 4 to upgrade soon.

Opera has always been a very standards-compliant browser; indeed that was one of the avowed intentions of its creators. Since version 3.6, it has had some support for CSS, and at the time of writing, the latest is version 5 with excellent support for HTML 4.0 and CSS Level 1.

Where the browsers still show weaknesses and variations is in support for the positioning properties in the CSS Level 2 specification, and until these are sorted out, web designers are still going to resort to techniques like using tables to control the layout of pages rather than risk unpredictable results in different users' browsers if they embrace CSS2 wholeheartedly.

Even if the browsers did everything in the specs and did it right, many people would not have the latest versions, and so would not properly see some CSS effects when they are employed.

Web sites that have been prepared taking advantage of CSS and other new features of HTML 4.0 should, if they have been designed carefully, "degrade gracefully" in older browsers. That is, the site will not crash the browser, and the text and images will be displayed, albeit not in the styles or positions that the designer would have wished. In fact, in appearance terms, the page will probably look less appealing than if it had been designed to HTML 3.2 using tables to control layout.

Another consideration with HTML 4.0 is that in this edition of the specification, many of the established tags and attributes are deprecated (note: that's deprecated, not depreciated, as many people mistakenly write it). A deprecated tag or attribute is one that has been outdated by a newer mechanism (usually CSS). Deprecated elements might become obsolete in future versions of the specification. For these reasons, the W3C recommends that new web sites are developed in line with HTML 4.0. But then, most viewers will not see such sites as they were intended to be seen.
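As a rough illustration of what deprecation means in practice, the sketch below shows a heading styled first with the deprecated CENTER and FONT elements, and then with structural markup plus CSS. The heading text and colour value are invented for the example, and the two versions will not necessarily look identical in every browser.

```html
<!-- Deprecated in HTML 4.0: presentation written into the markup -->
<CENTER><FONT FACE="Arial" SIZE="5" COLOR="#990000">Special Offers</FONT></CENTER>

<!-- The newer mechanism: a structural H2 element,
     with its appearance left to CSS -->
<H2 STYLE="text-align: center; font-family: Arial, sans-serif; color: #990000">Special Offers</H2>
```

A browser with no CSS support still renders the second version as an ordinary heading, which is the kind of behaviour meant by "degrading gracefully".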

This may be an "ideal world" recommendation, and quite appropriate in the light of deprecation of many elements. However, for reasons of backward compatibility, it is likely that browser suppliers will continue to support the deprecated tags for a long time to come, probably long after they are actually (if ever) made obsolete. So in practical terms, I see no good reason to stop using these tags just at the moment.

The long and the short of it is that until:

  • a wide range of browsers across all the major platforms implement HTML 4.0 and CSS fully and correctly, and

  • the majority of the "viewing public" is likely to be using such browsers

I won't get too heavily into developing strict HTML 4.0 compliant web sites; that is, sites that don't use any deprecated tags or attributes, and don't use frames. And that's my recommendation to you. When you've mastered HTML as I've described it in this tutorial (and you need to do that to be able to read and understand the source code on current web sites), you'll still have time to learn about using CSS in conjunction with strict HTML 4.0 before this technique becomes dominant on the Web.

I'd say all that again, but I think blood would come from my ears... Anyway, back to DOCTYPE.

If or when you do come to write HTML documents that comply strictly with HTML 4.0 (that is, they don't contain any deprecated or obsolete tags or attributes) and they do not appear in a frameset, then the document type declaration you should use is as follows:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" "http://www.w3.org/TR/REC-html40/strict.dtd">

The URL included in the document type declaration tells browsers and other user agents where to download the DTD and any other information they may need in order to understand the HTML file.

If you are creating documents that make use of HTML 4.0 constructs but still use elements that are deprecated by 4.0, then you should use the transitional document type declaration (again, this applies where the document does not appear in a frameset):

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">

If you are using frames, then you should use the frameset document type declaration (this includes everything in the transitional DTD plus frames as well):

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Frameset//EN" "http://www.w3.org/TR/REC-html40/frameset.dtd">

I generally use the transitional DTD as my "standard" document type declaration -- as you'll see if you look at the source code of the files in this tutorial. In fact, you'll see I'm actually using HTML 4.01, a minor update to HTML 4.0. I use the transitional DTD because I'm still using some deprecated elements or attributes to influence the presentation of documents in browsers that don't support CSS (naughty me!).

Using this DTD, I can still validate my documents correctly.
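For reference, here is a sketch of how a complete, minimal page might begin and end under the HTML 4.01 transitional DTD. The title, heading and character set are placeholders of my choosing; the point is the DOCTYPE line and the overall skeleton.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">

<HTML>

  <HEAD>
    <!-- Declaring the character set helps a validator read the file
         (ISO-8859-1 is assumed here; use whatever your files really are) -->
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <TITLE>My page title</TITLE>
  </HEAD>

  <BODY BGCOLOR="#FFFFFF">
    <H1>My page heading</H1>
    <P>The BGCOLOR attribute above is deprecated, which is why this page
       needs the transitional rather than the strict DTD.</P>
  </BODY>

</HTML>
```

Take out the BGCOLOR attribute and the same page could be declared against the strict DTD instead.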

 

Validating your code

We all make mistakes. Sometimes we do it out of ignorance of the facts; sometimes we know the facts, but our minds drift off into the wild, blue yonder and we get it wrong. And where computers are concerned, the old fingers vs. keyboard problem comes to the fore.

When building web pages, there are limitless possibilities for "getting it wrong". If you make an error in your code, it can be difficult to spot just by scanning through line after line of a long document. If you don't notice it, it might cause your page not to render properly in a browser. And the fact that you tested it on your browser and it worked OK doesn't mean that it's right -- some browsers are more tolerant than others of faulty HTML, and can make quite inspired guesses at what you actually meant! But not all browsers are possessed of such powers of divination.

The best way to ensure that your pages display as well as possible across all browsers is to ensure that your code isn't broken; and the best way to do that is to validate it against the HTML specifications. A number of tools are available to help you do this quickly and effectively. A validator checks your code for syntax and other errors, and points out to you problems like:

  • incorrect tags and attributes

  • "overlapping" tags (that is, opening a new tag before closing another that should be closed first)

  • incorrect use of tags, for example, placing one tag inside another where the HTML specifications don't permit it.
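Here is a small sketch of each of those three kinds of error, with a corrected version alongside. The snippets are invented for illustration, and the exact wording of the error messages will vary from one validator to another.

```html
<!-- Incorrect attribute: HTML has no COLOUR attribute on FONT -->
<FONT COLOUR="red">Sale now on</FONT>
<!-- Corrected: the attribute is spelled COLOR -->
<FONT COLOR="red">Sale now on</FONT>

<!-- Overlapping tags: B was opened first, so it should be closed last,
     but here </B> arrives while I is still open -->
<P><B>bold <I>bold italic</B> italic</I></P>
<!-- Corrected: I is closed before B, then reopened -->
<P><B>bold <I>bold italic</I></B> <I>italic</I></P>

<!-- Incorrect use: B may only contain text-level content,
     so a P is not permitted inside it -->
<B><P>An important paragraph.</P></B>
<!-- Corrected: the B goes inside the P -->
<P><B>An important paragraph.</B></P>
```

Most browsers will display all of the faulty versions more or less as intended, which is exactly why errors like these go unnoticed without a validator.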

A validator will take note of the document type declaration at the top of your HTML file, and check the code against the specification it quotes: so that's a good reason for having the declaration and having the right one. So where can you get one of these wonderful beasts? Going straight to the horse's mouth, the W3C has a free validation service on its web site at http://validator.w3.org/. To use it, you enter the URL of the page you want to validate. The validator goes off and reads it, checks it, then gives the report back to you.

While the W3C validation service is very reliable, and might be considered by some to be the validator, it's a little inconvenient if you do a lot of HTML work -- like if you're a professional web site designer, for example. The key disadvantages are:

  1. You can't validate a file locally, that is, on your own computer. You have to upload it to your web server.

  2. You have to be signed on to the internet, so you can go to the W3C site and then point the validator at your URL.

  3. The whole process is rather slow: too slow (I feel) for "production" work.

There are a few other web-based validation services like this, and they all suffer the same disadvantages. The Web Design Group validator at http://www.htmlhelp.com/tools/validator/ uses the same basic validation "engine" as the W3C, but it has a few added features, such as the ability to enter batches of files for validation, or to validate an entire site by entering a single URL. But you still have to be online, which is the major downside.

Some tools are available to let you check your files locally on your own computer, but you might not find something suitable for your particular platform. If you're a Windows user, for a small fee you can get Liam Quinn's excellent A Real Validator. This is a true SGML parser (I'm not going to explain that!) that works in the same way as the W3C validator, but offline. There are also a number of HTML checkers (rather than true validators) on the market. What's the difference? Well, as I indicated a few paragraphs back, a true validator checks your code strictly against the DTD you specify in the <!DOCTYPE> tag that you placed at the top of the file (you did put in a <!DOCTYPE> tag, didn't you?). A checker (or as they are sometimes called, a lint) checks your code against a set of rules, which are often customisable. But with careful use and proper understanding of the rules, checkers and lints are useful tools that speed up the process of verifying your code.

A Real Validator is my tool of choice because it's fast (you can check files in batches) and accurate. Every single page in this tutorial was checked after it was built using A Real Validator. And then just to be extra sure, once I uploaded the pages to my server, I validated them using the W3C validation service (indicated by the logo below).

(Logo: Valid HTML 4.01!)

You are only entitled to use this logo on a web page if it has validated without errors against the HTML 4.01 specification using the W3C validator. If you don't believe me, just click on the version of this logo that you see at the foot of any page, and you'll be whisked to the W3C validator site which will show you the validation results for the page!

I find it astonishing how many web sites created by professional web site designers are built with quite seriously broken HTML. You might find it interesting when looking at the web sites of people offering web site design services or tutorials (like this one!) to try validating their own pages.

I must add one rider to this business of validation. A true validator will report the use of "browser-specific" tags or attributes as an error, because these are not part of the HTML specification (even if they have been correctly used in the context of the browsers that support them). HTML purists would say that the file simply "doesn't validate", and you should never use browser-specific tags. Others take the view that in practice, there is nothing really wrong with including browser-specific tags or attributes if:

  • doing so will enhance the function or appearance of your document for users of that browser, and

  • you use the tag or attribute correctly, and

  • you are sure that its use will not "break" your page in other browsers, or cause those browsers to crash.

After all, if a browser encounters a tag or attribute that it doesn't recognise, it should simply ignore it. And if ignoring it doesn't break your pages, what's the problem? This is quite different from using bad or incorrect HTML.
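A common case is removing the default margin around a page. The sketch below is only an illustration: it uses browser-specific BODY attributes that are not in the HTML 4.01 DTDs, so a true validator will report each one as an error, yet a browser that doesn't recognise them should simply ignore them.

```html
<!-- LEFTMARGIN and TOPMARGIN are Internet Explorer extensions;
     MARGINWIDTH and MARGINHEIGHT on BODY are Netscape Navigator extensions.
     None of the four is part of the HTML specification. -->
<BODY LEFTMARGIN="0" TOPMARGIN="0" MARGINWIDTH="0" MARGINHEIGHT="0">
```

Whether the validation errors are an acceptable price is the judgment call described above.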

So the wise HTML author uses a validator to point out errors and to advise where caution should be exercised due to browser-specific features. Perhaps you are familiar with the saying:

"Rules are for the observance of fools and the guidance of wise men."

 

Use that spellchecker!

Few things make a web page look quite so amateur as bad spelling (actually there are quite a few things, but we won't go into those). Even when we know how to spell a word, we are still capable of typos -- and isn't it funny how it doesn't matter how many times you read the same piece of text, you just don't see the typo?

So if you have a spellchecker, use it! It only takes a few extra moments. Some HTML editors come with built-in spellcheckers; if you are using something as simple as Windows Notepad you won't have that benefit. But if your word processor has a spellchecker, you could compose your text in the word processor and spellcheck it before importing it into your HTML editor. Alternatively, you might be able to acquire an "add-on" spellchecker.

Now all you've got to worry about is the grammar...

 

Testing, testing, testing...

Back in Lesson 6 I warned you about the vagaries of browsers. Good, clean, compliant HTML code will be rendered differently by different browsers. Even the "same" browser, like Netscape Navigator, may display a page differently on different platforms, like Windows and Mac.

The only way that you can be sure that your pages will look as good as you would wish in as many browsers as possible is to test them on as many browsers as possible! You'll find that you have to make compromises; for example, a particular effect might look good in most browsers except one. But if you tweak the code to make it look good in that troublesome one, then it looks wrong in all the others...

Of course, it isn't practical for most of us to test on every version of every browser on every possible platform. We just have to do the best we can. But I would suggest to you that testing on Netscape and Explorer is essential, because of their market domination, and on as many versions as you can get your hands on. Alternatively, if you are really familiar with the reactions of earlier versions, you can probably get away with using the most recent (though I've been caught out that way before, I readily admit).

It helps if you know your audience too. For whatever reasons, we might find that the visitors we attract to our web sites tend to use a relatively small range of browsers, versions, and platforms. For example, I know from the server logs that most of the visitors to my company web site use either Netscape or Explorer on Windows 95/98, and a few on NT. We get very few visitors using any other browser, Opera being the only one that pops up anything like regularly, and very few Mac users. Naturally this information colours our choices when developing and testing pages.

Unfortunately, not everyone has access to the server logs -- especially if you are using the "free" web space that comes with your ISP (Internet Service Provider) package. So it's back to best guessing!

Testing isn't only about checking the appearance of your web pages, of course. You need to test all of the functional elements as well. Check that every link on your pages works, both the internal links (that link to other pages on your site) and the external links (that link to other web sites). Another sign of amateurism is when a visitor to a site clicks on a link to find it doesn't work, or to be given the ubiquitous "404" error.
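As a reminder of what to click through, these are the kinds of link you are likely to have on a page. The file name lesson13.html and the anchor name "top" are assumptions made up for this sketch; substitute your own.

```html
<!-- Internal link: a relative URL to another page on the same site
     (lesson13.html is a hypothetical file name) -->
<A HREF="lesson13.html">Go back to Lesson 13</A>

<!-- Internal link to a named anchor on the same page
     ("top" is a hypothetical anchor name) -->
<A HREF="#top">Top of page</A>

<!-- External link: an absolute URL to another web site -->
<A HREF="http://validator.w3.org/">W3C validation service</A>
```

Relative links are the ones most likely to break when you move files from your own computer to the server, so test them again after uploading.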

There. Now that you've added the correct DOCTYPE declaration to each of your pages, you've validated every file, you've checked the spelling of every word, and you've tested it to death, you're ready to upload your site to your server or your ISP!

 

My final word

Well, that closes the last lesson in this tutorial (at least, in its present form!). I hope that it will set you on the road to becoming an HTML wizard.

After learning the rules, reading books and other tutorials, and lots of practice, you might be attracted by the idea of setting up as a "web site designer" yourself. But let me throw in a cautionary note.

Good web site design is more than just HTML code. It doesn't matter how competent a coder you may be; you need to have imagination and a sense of design to make it all work. And if you are building commercial web sites, you need some business management and marketing skills too, because often your client will lack them (at least, as they are applied to the Web), and will expect you to fill the gap!

Above all, if you are taking money from people for your services, you have a moral obligation (and heaven knows what sort of legal obligations, depending on where you live) to be able to provide the service that you claim. You cannot justify bad work on the basis of being cheap. A purchaser still has a right to expect competent work from anyone who charges a fee. And doing bad work cheaply does a disservice to the rest of the web design profession. Sadly, there's too much of it about.

If you are confident about your abilities, then go right ahead and set up shop. This tutorial is about HTML rather than web site design; however, I would offer one piece of advice above all others when designing a site. Begin by establishing the purpose and objectives of the site, and who the likely or desired users are: this will determine the kind of content required, and will suggest the appropriate look, feel and functions. Just like an architect designing a building...

I introduced this tutorial with a quotation from Frank Lloyd Wright, so I'll close with one too:

"I never design a building before I've seen the site and met the people who will be using it."

--  Frank Lloyd Wright (1867-1959)

There now, preaching over... go ahead and enjoy building your web pages!

 

What we covered...

The <!DOCTYPE> declaration tells browsers, validators and other agents what to expect: that the file is HTML of a given version. It may also give the source of the document type definition. The DOCTYPE should be declared right at the top of the HTML document, before the <HTML> tag.

Validating your code will help you to identify and correct errors. Valid, error-free code gives good assurance that your pages will work in a broad range of browsers and platforms (as long as the browsers comply with the version of HTML you have used).

Using a spellchecker to check your page content is a simple way to eliminate one of the commonest factors in giving web pages an amateur appearance.

Testing your pages in a variety of browsers ensures not only that they work, but that they display fairly closely to the way you intended.

 

Valid HTML 4.01! Copyright © Keith W Bell, 1999 - 2001
This page last updated 1 February 2001
http://www.campanile.org/tutorials/html/lesson14.html
  Keith's HTML Tutorial
 
 

 
 