Since I started writing this blog a month or so ago I've been beating the standards drum. However, I put the cart before the horse a bit, since I've been talking standards, standards, standards - and never really explaining what those standards are. Now, I'm sure many of you already know what I'm talking about, but I really think it is especially important to help out those getting started who may not know what the alphabet soup of web standards means.

But before I get into that, there was a great article in The Register today. I just have to quote a bit of it here, but you should really read the whole thing:

Deri Jones, SciVisum's CEO warns that companies are in danger of damaging their brands by not addressing accessibility issues properly. "When webmasters design first for Internet Explorer and not standards-compliant browsers, they so often end up restricting user access to the website which has detrimental affects for a company," he said.

He went on to describe it as surprising that companies still fail to accommodate a variety of browsers, and warns that taking a non-standard approach limits a website's audience, and risks alienating some users.

The company recommends that web designers switch to using Cascading Style Sheets, and check pages using browsers other than Internet Explorer. It also suggests that anyone planning a redesign should consider using an open source content management system, such as Plone or Mambo.

Usability firm Human Factors International also expressed surprised that businesses still have not dealt with the issue of multiple browsers. Company managing director Jerome Nadel described the question of making a site accessible to multiple browsers as "a no brainer".

That's what I've been saying. Beyond just being the right thing to do, it is good business. As IE's market share shrinks, sites that don't work well in other browsers are basically turning away customers. Can you afford to turn away customers?

So, the standards. You will see a lot of alphabet soup tossed around in web discussions - HTTP, SSL, TLS, HTML, XML, XHTML, CSS, DOM, DHTML, XSL, XSLT, WML, SVG, GIF, JPEG, PNG, URL... OK, that's enough, for now anyway. First a quick summary, then a little detail.

OK, is your brain full yet? ;-) You think this is bad? Try networking - TCP, UDP, ICMP, ESP, AH, RIP, OSPF, BGP, ISIS, EIGRP, VLSM, CIDR, ISDN, BRI, B8ZS, AMI, RADIUS, TACACS... Geeks and acronyms, we're inseparable. With all that alphabet soup, there are really two standards I think we should focus on - XHTML and CSS.

XHTML you're probably familiar with, at least its HTML ancestor. Tim Berners-Lee created HTML in 1990, and for the first several years there was no real standard. HTML was enhanced and grown by different parties, with informal documentation available from a number of sources. However, by 1994 the World Wide Web was beginning to really take off and it was important that a standard be developed to codify the state of the art. This standard was HTML 2.0, published by the IETF. The IETF, or Internet Engineering Task Force, sets most of the standards for the protocols used on the net. Check out HTML 2.0, it'll give you an idea of how far we've come.

After HTML 2.0 a number of proposals for enhancements were floated - form-based file uploads, HTML tables (that's right, HTML 2.0 didn't have tables yet), client-side image maps (Anyone else ever actually use server-side image maps and ISMAP?), and others. All of these fed into the effort to create HTML 3.0.

HTML 3.0 was overly ambitious, to put it mildly. With the explosive growth of the web people wanted it to be everything to everyone. The proposals for HTML 3.0 included concepts to be able to mark a range of content in a document, cryptographic checksums for included content (like images), figures, mathematical markup (symbols, equations, etc - these concepts returned, in a better form, as MathML), new form controls (sliders, knobs, Scribble On Image)... OK, I need to jump in there. I have long held Scribble On Image up as the example of the bloat and over-reach that doomed HTML 3. This is from the draft spec:

Scribble on Image --(type=scribble)--

These fields allow the user to scribble with a pointing device (such as a mouse or pen) on top of a predefined image. The image is specified as a URI with the SRC attribute. If the user agent can't display images, or can't provide a means for users to scribble on the image, then the field should be treated as a text field. The VALUE attribute can be used to initialize the text field for these users. It is ignored when the user agent provides scribble on image support.

Keep in mind this was 1995! They wanted people to be able to input arbitrary doodles via HTML forms! I just think this is ridiculous. I'm not sure I can think of any reasonable use for this now, other than for games or something. Anyway, HTML 3.0 was also loaded down with tons of presentational markup, etc - which eventually showed up in CSS instead.

During the HTML 3.0 effort the IETF decided they really weren't the proper body to handle a document format standard. The IETF deals mainly in protocols for moving data around, not in the data formats. So the W3C was founded to take over stewardship of the web standards. Since HTML 3.0 was dragging on, both Netscape and Microsoft had run off and implemented pieces of the 3.0 proposal, as well as completely new things like Frames, in their browsers. The web was diverging, and the HTML 3.0 standards work was becoming increasingly meaningless. Even if a standard were developed, the browser vendors had already 'voted with their feet' and gone off in another direction.

To try to prevent the web from dissolving into multiple, incompatible camps the W3C decided to scrap HTML 3.0, and instead develop a new standard which would codify the then state of the art, and act as a stepping stone to a later recommendation for enhancements. This new standard was HTML 3.2, published in early 1997. You can see where HTML 3.2 is an evolution of HTML 2.0. It wasn't a radical change, but it represented the common ground between the different browser vendors. It was designed to help herd the cats in the same direction.

And that direction was HTML 4 - OK, OK, that link is actually to HTML 4.01, which is a minor revision and the final, and current, version of HTML. HTML 4 originally appeared in late 1997, and the 4.01 revision at the end of 1999. If you look at the changes from 3.2 to 4.0, and then changes from 4.0 to 4.01 you can see that this was the big leap for HTML. Internationalization, frames, object, table changes, form changes, and more. HTML 4.01 is basically the foundation for the web as we know it today. So things have been fairly stable on that front for over five years.

That's not to say there haven't been changes. After XML was developed, the W3C decided that the future direction for HTML would be XML-based. In early 2000 XHTML 1.0 was published, with a revised edition following in August 2002. As the title of the recommendation itself says, it is "A Reformulation of HTML 4 in XML 1.0". There are only minor differences from HTML 4.01, and most of these changes are just good ideas anyway. In HTML you could leave some elements unclosed, for example opening a paragraph with <p>, but never using </p>. In XML all elements must be closed, so you need to use </p> on the end of your paragraphs. But this is just cleaner markup anyway, and if you want to use CSS or do any DHTML you really need to do this or you will get odd behavior on different platforms. This is no big deal because you can make a document that is XHTML 1.0 compliant and HTML 4.01 compliant at the same time, simply by following the HTML Compatibility Guidelines in the XHTML 1.0 recommendation.
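
If you want to see the "all elements must be closed" rule in action, here is a rough sketch in Python. It just feeds two fragments to the standard library's XML parser, one written in the lazy HTML style and one in the XHTML style. The function name `is_well_formed` and the sample fragments are my own inventions for illustration. Keep in mind this only checks XML well-formedness; it does not validate against the XHTML DTD, so it is not a substitute for a real validator.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# HTML 4 style: the <p> elements are opened but never closed.
html_style = "<div><p>First paragraph<p>Second paragraph</div>"

# XHTML style: every element is explicitly closed.
xhtml_style = "<div><p>First paragraph</p><p>Second paragraph</p></div>"

for label, markup in (("HTML style", html_style), ("XHTML style", xhtml_style)):
    verdict = "well-formed" if is_well_formed(markup) else "not well-formed"
    print(label, "->", verdict)
```

The unclosed version fails because an XML parser has no idea that a new <p> is supposed to end the previous one; that shortcut comes from HTML's SGML heritage, and XML has no such rule.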

And that's where we stand today. There is an XHTML 1.1 and an XHTML 2.0 draft, but these really don't have client support at this time. So you're better off sticking with XHTML 1.0 for now. XHTML 1.1 is really about breaking XHTML 1.0 into modules, to allow pieces to be used independently to produce new recommendations such as XHTML Basic for mobile devices, or to be combined with other recommendations to allow for more capabilities to be added in the future. An example of this is the XHTML + MathML + SVG Profile. So use XHTML 1.0 on your new pages.

Right, well, I guess that's it for... What? Oh, right, CSS! One of the things people have wanted from the web is control over the appearance of the documents. Early on elements and attributes were added to HTML to provide some of this control, and the doomed HTML 3.0 recommendation would've added many more. However, embedding these controls into the document is severely limiting. It also mingles structure and presentation, which causes problems on many levels. HTML is derived from SGML, and the SGML world handled this with DSSSL - Document Style Semantics and Specification Language. But DSSSL is a fairly hefty language, and quite a bit of overkill for the web.

Instead of adapting DSSSL (it was considered), a more declarative system was developed, and this system became CSS - Cascading Style Sheets. CSS1 was first published in late 1996, and it gave web developers previously unheard-of control over the appearance of their sites. CSS2 was published in May 1998 and added many features. Currently CSS2.1 is pending as the next official revision. It updates CSS2 a bit, and paves the way for CSS3, which is under development.

Most of the current browsers have solid support for CSS2.1 at this point - of course, IE has the weakest support for CSS2/2.1. (So use the IE7 library already!) Some support for CSS3 is already appearing in the latest browsers, and it promises quite a bit more power. CSS is absolutely vital to producing good-looking pages that really take advantage of today's browsers. Hey, just be glad that JavaScript Style Sheets, aka JSSS, never caught on.

OK, OK, I'm done. I bet you're glad to know that. Oh - use PNG more, it really is a nice format. Shame it doesn't get a lot of use. If you have any questions, comments, criticisms, or anything else to say, just leave me a comment. Let me know if there is something you'd like covered and I'll see what I can do. Until next time...