Lesson 6
 

Structure, Specs and Browsers

In this lesson, we'll take a break from learning about the tags, attributes and other features of HTML. We've reached a point where we need some understanding of other subjects that are relevant to our aim of building perfect web pages.

 

Structure

HTML was originally conceived as a structural markup language. In other words, it was intended to convey the structure and organisation of a document: its title, section headings, paragraphs, lists, and so on -- not its actual appearance on the screen. That's why the tags in the early versions of HTML, and most of those we've looked at so far, were to designate hierarchical headings, paragraphs, and lists. There was (quite deliberately) no attempt to influence the typefaces that would be used, the precise size of fonts, the space that would be introduced between lines of text (known in typography as "leading" -- pronounced as "ledding"), or the space between headings and paragraphs, and so on. HTML left all of these decisions to the browser software (or rather, the software company that developed it!). HTML was truly universal, and the actual presentation of a document varied from browser to browser.
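To make the idea concrete, here is a minimal sketch of a purely structural page. The subject matter and wording are invented for illustration. The point is that every tag says what a piece of text is (a title, a heading, a paragraph, a list item) and none says how it should look.

```html
<HTML>
<HEAD>
  <TITLE>Garden Birds: Field Notes</TITLE>
</HEAD>
<BODY>
  <!-- Top-level heading: the browser chooses its size and typeface -->
  <H1>Garden Birds</H1>
  <P>These notes record the birds seen in one garden during a single week.</P>

  <!-- Second-level heading marks a subsection -->
  <H2>Regular visitors</H2>
  <UL>
    <LI>Robin</LI>
    <LI>Blackbird</LI>
    <LI>Blue tit</LI>
  </UL>
</BODY>
</HTML>
```

Two browsers given this page would agree about its organisation, but might well disagree about fonts, sizes and spacing.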

There were good reasons for this approach. Initially, the purpose of the World Wide Web and its markup language HTML was to permit the easy distribution of technical documents around the global scientific community. Of course, these were largely text documents, and their content and structure were more important to readers than their visual appearance. HTML served this purpose extremely well (and still does).

But when the world at large began to see the possibilities of the web for entertainment, advertising, education, public relations, online selling and all the other activities we find on the web today, "structural" HTML began to prove rather limiting: it was hard to make eye-catching, appealing, interesting pages that would attract and captivate visitors. In traditional print media, graphic designers have been used to "pixel perfect" positioning of elements on the page, coupled with fine control of typography. Those designers who started moving into the field of web site design found the constraints of HTML particularly challenging.

Outside of its original purpose, HTML was a bit too plain vanilla and not enough knickerbocker glory.

 

Specifications

The specifications for HTML are under the control of an organisation called the World Wide Web Consortium, generally abbreviated to W3C. The W3C looks at proposals to change or enhance HTML, and decides what should be included in the specification. The specification describes the elements and their tags and attributes, and how these should typically be rendered by a "user agent", the technical term for a browser or any other program that retrieves and presents web pages.

To a large extent, the browser manufacturers have been responsible for the development of HTML. For example, Netscape (the suppliers of the Navigator browser) and Microsoft (suppliers of Internet Explorer) recognised some of the limitations I talked about above. To give web designers more creative options, they each introduced their own "extensions" to HTML -- additional tags and attributes that designers could build into their pages. However, these extensions only worked in their own browsers -- which was fine if you knew that everyone would be reading your pages only in Netscape, or only in Internet Explorer, but not otherwise. Sometimes the browser manufacturers did incorporate each other's extensions in later versions of their browser software.

Many of these extensions were adopted by the W3C and incorporated into the HTML specification, which encouraged other browser suppliers to build them into their next software updates. Slowly, then, HTML began bringing in features that drew it away from its text-based roots and made it (slightly) easier for designers to control the final appearance of pages on the screen.

Many people argued that it was wrong to include in the HTML specifications tags or attributes that only had the purpose of affecting the presentation of the document, and frankly, I agree with this view. Some of the style tags that were introduced, that we'll meet in the next lesson, do nothing to contribute to the organisation of a document, or the sense that readers will make of it. Their only purpose was "prettifying" it. And many kinds of user agents simply wouldn't be able to interpret these tags/attributes (what might a screen reader do when it encounters an instruction that this text should be coloured red?). There is a good case for separating a document's content or structure from its style, so that different kinds of browsers or other user agents can all render the document in a sensible, logical way: but some may be able to take advantage of additional styling instructions to spice up the presentation for those who can enjoy it.
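The difference can be sketched with two versions of the same sentence. The FONT element and the wording are chosen purely for illustration. The first version tells the user agent only what colour to use, which means nothing to a screen reader. The second says what the word is (strongly emphasised) and leaves each user agent to express that in its own way.

```html
<!-- Presentational: the only information given is "make this red" -->
<P>Do <FONT color="red">not</FONT> switch off the power.</P>

<!-- Structural: the word is marked as strongly emphasised;
     a graphical browser might show it in bold, a screen reader
     might change its tone of voice -->
<P>Do <STRONG>not</STRONG> switch off the power.</P>
```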

The biggest steps in this direction have come with the introduction of the latest version of the HTML specification, HTML 4.0 (actually, it's had a minor update to make it HTML 4.01). You could say that the direction has changed quite dramatically, because in this version, the "style" tags and attributes that web designers have come to love (or not) are being pushed out -- the official term is deprecated, meaning that they may be dropped altogether from a future version of the specification -- in favour of using Cascading Style Sheets (CSS) to define text and heading styles, alignment, colours, and all other aspects of visual presentation. Through CSS, defined in its own specifications, real typographic features have now been introduced. These include the ability to set the leading between lines of text, and to specify margins and paragraph indents. The new spec also gives us the ability to position elements much more precisely on the page than was possible using the kind of "hacks" that we've had to adopt until now (and which we'll mention later in this tutorial).
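Purely as a taste of what those typographic features look like, here is a minimal sketch of a page whose paragraphs have their leading, margins and first-line indent set by a style rule. The particular values are arbitrary assumptions, and how faithfully they are honoured depends on the browser.

```html
<HTML>
<HEAD>
  <TITLE>Typographic control with CSS</TITLE>
  <STYLE type="text/css">
    /* Applies to every paragraph in the document */
    P {
      line-height: 1.5;    /* leading: one and a half times the font size */
      margin-left: 2em;    /* left margin */
      margin-right: 2em;   /* right margin */
      text-indent: 1.5em;  /* first-line paragraph indent */
    }
  </STYLE>
</HEAD>
<BODY>
  <P>This paragraph carries no presentational markup at all. Its leading,
     margins and indent all come from the style rule in the document head.</P>
</BODY>
</HTML>
```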

The basic principle is that the HTML file should be purely structural, providing only the document's content and its organisation. The HTML file can then include or refer to a separate CSS file that defines the presentation of the document. It can even call up different CSS files for different kinds of media, like screen or print for example, to which the browser or other user agent would react depending on whether you are viewing the document on screen, or printing it.
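As a minimal sketch of that arrangement, the head of a document might refer to two style sheets, one for each medium. The file names screen.css and print.css are hypothetical. The user agent is expected to apply whichever sheet matches the medium in use.

```html
<HEAD>
  <TITLE>One document, two presentations</TITLE>

  <!-- Used when the page is displayed on a monitor -->
  <LINK rel="stylesheet" type="text/css" href="screen.css" media="screen">

  <!-- Used when the page is sent to a printer -->
  <LINK rel="stylesheet" type="text/css" href="print.css" media="print">
</HEAD>
```

The body of the document stays exactly the same in both cases. Only the presentation changes.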

There's a drawback, of course, and it's a subject I'll return to in Lesson 14 when I talk about DOCTYPES and testing. To date, support for CSS in web browsers has been patchy and unreliable, which has limited the practical usefulness of this wonderful technology. Now, as I write these words in February 2001, the situation is improving with the latest versions of browsers, and in this revision to my tutorial I have used some very simple style sheets to control some aspects of the visual appearance of these pages. I'm not going to say anything about the use of CSS here; that's the subject of another tutorial.

All that said, I've written this tutorial in line with the current published version of HTML, version 4. With only a few exceptions, all of the tags and attributes I describe are part of the HTML 4.0 specification. If users' browsers are properly compliant with 4.0, then pretty much everything we do should work: well, within the inherent variations that are permitted amongst the browsers, that is. As to the few exceptions I mentioned, these are elements I thought worth including because they are useful and are well supported amongst current browsers, even though they may not be recommended by HTML 4.0.

 

Browsers

One of the biggest variables we have to contend with in designing for the web is the issue of browsers. Let's look at some of the trouble spots:

  1. Text or graphical? Windows and Mac users may be so used to a graphical environment that they forget anything else exists. You might find it remarkable, but there are actually some text-only browsers available (probably the best known is called Lynx). So all the work that you may put into a page with high-quality, attractive graphics may be completely lost on part of your audience using a text browser. This raises issues of what we should do, and how far we should go, to accommodate these users.

  2. Graphics on or off? This is rather similar to the first point. Most (perhaps all) graphical browsers allow the user to turn graphics off. Actually, this can be very useful, especially if the user has a slow connection to the Internet. It allows the user to see the text content of the page, which is delivered much more quickly than graphics. If the user decides there's an image on the page that might be worth looking at, he or she can turn graphics on and see it in all its beauty. Many people browse with graphics turned off as the norm, and only turn on for specific images of interest. So we have to give them a clue as to whether there might be something of interest!

  3. Which fonts? Although (through CSS) there are ways to influence the typeface that will be used to display your text, these depend on the right fonts being installed on the user's computer. If the user doesn't have those fonts, then the browser will substitute them with its default. Most browsers allow the user to actually select the default, though lots of users don't know that. To further complicate matters, many browsers allow users to override any font instructions delivered by the HTML -- so even if they do have the specified fonts on their systems, the users won't see them in use on your pages if they have set the override feature.

  4. What font size? I said earlier that the default font size is a nominal "3", and that you can put tags in your HTML to increase or decrease the font size. However, this can also be influenced by the user. Users with sight difficulties may take advantage of a feature in their browser that allows them to set the default font to a large size. Alternatively, some people may reduce the size from the norm; then, if your code contains a <SMALL> tag, the text may become unreadable on the user's system. There is no way you can tell how the user's browser is set up.

  5. Rendering differences. Even if your viewers are all looking at your web pages

    • with graphical browsers that comply with the version of HTML you have used in your coding

    • with graphics switched on

    • with "document specified" fonts enabled

    • and with the necessary fonts installed on their system

    you still cannot be sure that they will see identical presentations of your page. Different browsers will pick slightly different actual font sizes for the nominal default size of 3; they will put different amounts of "vertical white space" between features like headings, paragraphs, lists, and horizontal rules; they will have different "screen offsets" (margins between the top and left sides of the browser window, and the top and left sides of your page); and a number of other differences.

  6. Which platform? Your viewers may all be looking at your work on different platforms. Some may be working with PCs running Windows 95, or 98, or NT, or even 3.1! Some will be viewing on Apple Macs, some on Unix boxes... This can give rise to significant differences. For example, a Mac system assumes a screen resolution of 72 dots per inch, while a Windows system typically assumes 96 dots per inch. The net result is that the same piece of text renders at different sizes on the screens of Windows and Mac systems: text specified as 12 points (one sixth of an inch) is drawn 12 pixels high on the Mac but 16 pixels high on Windows. Even the same browser (like Netscape) behaves differently in its Mac and Windows versions.

  7. Which browser version? Some users are a bit tardy in updating their browser software to the latest version, even though the market-leading browsers are available for free. So features in the current or even the previous version of the HTML specification might not be visible to some of your viewers, because their browser is too old to recognise the tags!

And all that, before we've even thought about what monitors the viewers might be using. 640 x 480 pixel resolution? 800 x 600? 1024 x 768?
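Two of the trouble spots in the list, graphics being unavailable and fonts being missing, can at least be softened in the markup itself. The following is a minimal sketch rather than a recipe: the image file name, its dimensions and the choice of typefaces are all assumptions made for illustration.

```html
<!-- The ALT text is shown (or spoken) in place of the image by text-only
     browsers and by graphical browsers with graphics turned off, so it
     tells the user whether the picture is worth loading -->
<IMG src="harbour.jpg" width="400" height="300"
     alt="Photograph: fishing boats in the harbour at dawn">

<!-- Typefaces are listed in order of preference. The browser uses the
     first one installed on the user's system; the generic family at the
     end is the last resort -->
<P style="font-family: Verdana, Arial, Helvetica, sans-serif">
  This paragraph asks for Verdana, but still reads sensibly without it.
</P>
```

Neither technique guarantees a particular result, since the user's own settings may still override the font request.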

So how do we cope with all of this? The answer is: as best we can. We have to make use of the knowledge we acquire about the behaviour of different browsers, and make guesses about which browsers and versions our viewers (or at least, most of them) are likely to be using. So let's review the current situation.

Netscape for a long time held a dominant position in the marketplace. This has been steadily diminishing since Internet Explorer came on the scene, and especially since Microsoft started bundling the browser with its Windows operating system software. But the important thing is that some 80% or more of web viewers use one of those two (both are available in Windows and Mac versions).

Amongst the remaining 20%, you will find browsers like Opera, which is growing in popularity due largely to its compactness as a program. NCSA Mosaic still has its adherents: Mosaic was the "original" graphical browser and was popular in earlier days. But no more work is being done on Mosaic and so eventually its usefulness will seriously diminish. Small numbers of people use Sun's HotJava browser, and even Amaya, the W3C's very own browser which they use to develop and test HTML -- it's available free from the W3C web site. Some viewers are using the "built-in" browsers that come with CompuServe and AOL, both of which are slightly customised (but not necessarily current) versions of Internet Explorer.

Just taking the market leaders for the moment, a proportion of their users are browsing with the latest version. A possibly larger proportion are doing their browsing with earlier versions.

Can you see where I'm going with this? Of course, you should use all the knowledge and skill at your disposal to make your web pages as accessible as possible for the biggest number of users and browser preferences, but at the very least if you design and test your pages with the most recent and the previous versions of both Netscape and Explorer in mind, then you should satisfy about 80% of your viewers. Recommendations like that tend to start lots of arguments and flaming in web design forums, but this isn't a forum -- it's my site. So my rules. So there!

Just before we close this lesson, here's a useful and important piece of information, so listen carefully. A general principle with browsers is that when they encounter a tag or attribute that they don't recognise, they simply ignore it. This is helpful or not, depending on the way you use the knowledge in building your web pages. If you are using a tag from a recent version of HTML that will enhance your pages for those viewers whose browsers support it, that's fine -- so long as you bear in mind how the page would look if that tag were simply omitted. Which is effectively what will happen when the page is rendered by a browser that doesn't know the tag.
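Here is a minimal sketch of that principle in action, using the ABBR element from HTML 4 as the example of a newer tag. The sentence itself is invented for illustration.

```html
<!-- A browser that knows ABBR may offer the expansion held in the TITLE
     attribute, for instance as a tooltip. A browser that does not know
     ABBR ignores the opening and closing tags and displays "W3C" as
     ordinary paragraph text -->
<P>The <ABBR title="World Wide Web Consortium">W3C</ABBR> publishes
   the HTML specification.</P>
```

Either way the sentence reads correctly, because the text between the tags is still displayed. That makes this a safe enhancement: nothing essential is lost when the tag is ignored.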

 

What we covered...

HTML was originally conceived to mark up the structure rather than the physical appearance of a document. This brought attendant design limitations, some of which have caused web site designers to resort to ingenious tricks and hacks to get pages looking the way they want.

Specifications for HTML are under the control of the World Wide Web Consortium, or W3C. Specifications have been updated steadily to include new tags and features introduced by the browser suppliers, enlarging control over physical appearance. The intention of the latest specification, HTML 4, is to completely separate structural and "stylistic" elements (while significantly increasing control over physical appearance) with the introduction of Cascading Style Sheets. However, HTML 4 and style sheets are not well supported except by the latest (at the time of writing) generation of browsers.

One of the greatest challenges in building pages for the Web is the varied responses of browsers to the same HTML code. Even two browsers that comply fully and correctly with a given HTML specification (and that is rare enough in itself!) may render the same page a little differently, due to variations that are permissible within the interpretation of the specification. It's an impossible task to test pages on every available browser, but the market dominance of Netscape Navigator and Internet Explorer indicates the common sense in testing web pages on these two at least, and in as many versions on as many platforms and operating systems as you can reasonably lay hands on.

 

Valid HTML 4.01! Copyright © Keith W Bell, 1999 - 2001
This page last updated 1 February 2001
http://www.campanile.org/tutorials/html/lesson6.html
  Keith's HTML Tutorial
 
 

 
 