Introduction to XHTML

Welcome to the exciting world of XHTML. If you've never heard of XHTML, or have heard of it but never really knew what it was or what it can do for you, then after reading this tutorial you should not only know what it is but also have grasped the basics of its syntax (assuming you have previous knowledge of HTML).

What is XHTML

XHTML stands for Extensible HyperText Markup Language. It is a hybrid language developed by the W3C and was created to enforce web standards. XHTML is a cross between the cutting-edge language XML and standard HTML, which has been used since the start of the web to create static web documents.

If you want to find out more about XHTML, check out the W3C website and the New York Public Library Style Guide.

Setting the scene

Before we get to know the syntax of XHTML we need to set the scene. When you create your first XHTML document you will need to set the right DOCTYPE (short for "document type declaration"). By using the following DOCTYPE, you're telling the browser and validation services to work in standards-compliant mode, which means they will be able to interpret the XHTML code correctly.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

This DOCTYPE tells your browser to interpret your page as XHTML 1.0 Transitional, which means that it is XHTML but some old methods from HTML are still acceptable. I recommend that you use this DOCTYPE, but you could also use XHTML 1.0 Strict, which looks like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

If you decide that DOCTYPEs aren't for you, or you use an incomplete or outdated DOCTYPE, then you are telling browsers to render your page in "Quirks" mode, where the browser assumes you have used deprecated or invalid mark-up.

One more thing to watch out for when using DOCTYPEs is that you use a full URL. For example, if you look at the W3C's website you can see they have used:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"DTD/xhtml1-strict.dtd">

This DOCTYPE uses a relative link, "DTD/xhtml1-strict.dtd", which points to a file on their server. You are not on their server and must therefore use a full URL, as shown in my examples.

Now if you're getting a bit confused by all this then take a look at the next code example to see how it all fits together.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
	"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Setting the right doc type!</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body>
	<div>Content Here</div>
</body>
</html>

A few ground rules

All tags must be typed in lower case

XML is case-sensitive, and XHTML has carried this across by defining all of its tag and attribute names in lower case. So the following code would not be valid XHTML.

<FORM ACTION="Validateform.php" METHOD="post">

To make that code example valid XHTML you would need to transform it into the following.

<form action="validateform.php" method="post">

Notice that everything has been converted to lower case. Only the tag and attribute names have to be lower case for the code to validate as XHTML, but I also suggest that you name your files in lower case and without spaces, as this can reduce problems when transferring files to different platforms. Windows doesn't currently have a problem with capitals, but Linux and Unix servers treat file names as case-sensitive, so if you move to one your whole site may stop working. If you have been sporadically handing out capitals here and there, you might have half of the pages working and the other half not.
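
To illustrate the split between names and values, here is a small sketch with a made-up field name and value (neither comes from the example above): the tag and attribute names are lower case, while free-text values inside the quotes can still contain capitals.

```html
<!-- Tag and attribute names are lower case; free-text values may contain capitals -->
<input type="text" name="fullName" value="Jane Doe" />
```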

Quote all attributes and close every tag

As you can see, so far XHTML has been a tidier (stricter) version of HTML, but wait, there's more to come. To write valid XHTML you need to make sure that all attributes are quoted. For example, the following code is not valid mark-up.

<img src="image.jpg" height=50 width=30>

To make this a bit more compliant you would write:

<img src="image.jpg" height="50" width="30">

This solves the first problem of quoting all the attributes. The second rule you need to comply with is to close all open tags. Now by using the img tag as an example I am killing two birds with one stone. By this I mean that not only are we closing all tags, we're also closing an empty tag.

If you're a bit confused then don't worry. In HTML an opening tag like <p> on its own would be sufficient, but in XHTML you must close it with </p>. For an image tag you might be thinking, well, there is no closing tag. This is why it's called an empty tag. To close the above code example we could adapt it to look like the following:

<img src="image.jpg" height="50" width="30" />

NB: Note the space before the closing slash. It is there so that older browsers, which do not recognise this format, can still read the tag.
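
The same closing rule applies to every element, not just img. As a short illustrative sketch (the text and list items are made up for the example), paired tags such as <p> and <li> get an explicit closing tag, while empty tags such as <br> and <hr> are closed in the same way as the img tag above:

```html
<p>First line of the paragraph<br />
second line of the paragraph</p>
<hr />
<ul>
	<li>First item</li>
	<li>Second item</li>
</ul>
```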

Finishing touches

In the above examples I have used the img tag for a very specific reason.

There's one key attribute missing from the examples above: the alt attribute, which supplies alternative text for the image and is required in XHTML. The code should now look like the example below.

<img src="image.jpg" height="50" width="30" alt="Company logo" />

Whilst we're on the subject of extra attributes, there's one more that you could use for good measure, the title attribute, but note that it is optional.

<img src="image.jpg" height="50" width="30" alt="Nildram logo"
title="Nildram: Internet service provider" />