Posted on July 9th, 2007

We have reached a point where many pages on the WWW contain “bad” HTML. Many will fall, my friend. Talk about strict rules. Let's explore a little history of HTML and XHTML before we answer this ‘WHY’.

The Extensible Hypertext Markup Language, or XHTML, is a markup language that has the same depth of expression as HTML, but also conforms to XML syntax. While HTML prior to HTML 5 was defined as an application of Standard Generalized Markup Language (SGML), a very flexible markup language, XHTML is an application of XML, a more restrictive subset of SGML. Because they need to be well-formed, true XHTML documents allow for automated processing to be performed using standard XML tools—unlike HTML, which requires a relatively complex, lenient, and generally custom parser. XHTML can be thought of as the intersection of HTML and XML in many respects, since it is a reformulation of HTML in XML. XHTML 1.0 became a World Wide Web Consortium (W3C) Recommendation on January 26, 2000. XHTML 1.1 became a W3C Recommendation on May 31, 2001.
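
To make the point about standard XML tools concrete, here is a minimal sketch in Python that uses only the standard library's xml.etree.ElementTree. The sample page, the SAMPLE constant and the extract_title_and_links helper are made up for illustration; the only thing taken from the specification is the XHTML namespace URI.

```python
import xml.etree.ElementTree as ET

# Namespace that XHTML documents declare on their root element.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample page, written as well-formed XHTML.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Sample page</title></head>
  <body>
    <p>See <a href="http://www.w3.org/">the W3C</a> and
       <a href="http://validator.w3.org/">its validator</a>.</p>
  </body>
</html>"""


def extract_title_and_links(xhtml_text):
    """Return (title, [href, ...]) from a well-formed XHTML string."""
    root = ET.fromstring(xhtml_text)  # raises ET.ParseError if not well-formed
    ns = {"x": XHTML_NS}
    title = root.findtext("x:head/x:title", default="", namespaces=ns)
    links = [a.get("href") for a in root.iterfind(".//x:a", ns)]
    return title, links


if __name__ == "__main__":
    print(extract_title_and_links(SAMPLE))
```

Because the input is well-formed, a generic XML parser is enough and no HTML-specific parser is needed. One caveat: this parser does not read the external DTD, so named entities such as &nbsp; would need extra handling.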

There are three formal DTDs for XHTML 1.0, corresponding to the three different versions of HTML 4.01:

XHTML 1.0 Strict is the XML equivalent of HTML 4.01 Strict, and includes only elements and attributes that have not been marked deprecated in the HTML 4.01 specification.
XHTML 1.0 Transitional is the XML equivalent of HTML 4.01 Transitional, and includes the presentational elements (such as center, font and strike) excluded from the strict version.
XHTML 1.0 Frameset is the XML equivalent of HTML 4.01 Frameset, and allows for the definition of frameset documents—a common Web feature in the late 1990s.
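
Each of the three DTDs is selected by the DOCTYPE declaration at the top of the document. As a rough sketch, the declarations can be kept in a small Python table. The public and system identifiers are the ones published in the XHTML 1.0 Recommendation, while the XHTML10_DOCTYPES table and the doctype_for helper are hypothetical names used only for this example.

```python
# Public identifier and DTD URL for each of the three XHTML 1.0 DTDs.
XHTML10_DOCTYPES = {
    "strict": (
        "-//W3C//DTD XHTML 1.0 Strict//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
    ),
    "transitional": (
        "-//W3C//DTD XHTML 1.0 Transitional//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd",
    ),
    "frameset": (
        "-//W3C//DTD XHTML 1.0 Frameset//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd",
    ),
}


def doctype_for(flavor):
    """Build the DOCTYPE declaration for 'strict', 'transitional' or 'frameset'."""
    public_id, system_id = XHTML10_DOCTYPES[flavor]
    return '<!DOCTYPE html PUBLIC "%s" "%s">' % (public_id, system_id)


if __name__ == "__main__":
    for flavor in XHTML10_DOCTYPES:
        print(doctype_for(flavor))
```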

The XHTML 2 Working Group is considering the creation of a new language based on XHTML 1.1. Between August 2002 and July 2006 the W3C released the first eight Working Drafts of XHTML 2.0, a new version of XHTML able to make a clean break from the past by discarding the requirement of backward compatibility. This lack of compatibility with XHTML 1.x and HTML 4 caused some early controversy in the web developer community.

HTML 5 initially grew independently of the W3C, through a loose group of browser manufacturers and other interested parties calling themselves the WHATWG, or Web Hypertext Application Technology Working Group. The WHATWG announced the existence of an open mailing list in June 2004, along with a website bearing the strapline “Maintaining and evolving HTML since 2004.” The key motive of the group was to create a platform for dynamic web applications; they considered XHTML 2.0 to be too document-centric, and not suitable for the creation of forum sites or online shops.

An XHTML document that conforms to an XHTML specification is said to be valid. Validity assures consistency in document code, which in turn eases processing, but does not necessarily ensure consistent rendering by browsers. A document can be checked for validity with the W3C Markup Validation Service. In practice, many web development programs provide code validation based on the W3C standards.
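
Validity is a stronger property than well-formedness: a document can parse cleanly as XML and still break the rules of its DTD. The sketch below is not a validator and is no substitute for the W3C Markup Validation Service. It assumes a Strict or Transitional document and checks only a handful of structural rules (root element, one title inside head, a body), simply to show the difference. The structure_problems function and the two sample strings are hypothetical names invented for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def structure_problems(xhtml_text):
    """Return a list of problems found; an empty list means none of these checks failed."""
    try:
        root = ET.fromstring(xhtml_text)
    except ET.ParseError as err:
        # Not even well-formed, so no further checks are possible.
        return ["not well-formed: %s" % err]

    problems = []
    if root.tag != "{%s}html" % XHTML_NS:
        problems.append("root element is not <html> in the XHTML namespace")
    head = root.find("{%s}head" % XHTML_NS)
    if head is None:
        problems.append("missing <head>")
    elif len(head.findall("{%s}title" % XHTML_NS)) != 1:
        problems.append("<head> must contain exactly one <title>")
    if root.find("{%s}body" % XHTML_NS) is None:
        problems.append("missing <body>")
    return problems


# Well-formed XML, but the head has no title.
NO_TITLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head></head><body><p>Hi</p></body></html>"
)

# Passes all of the checks above.
WITH_TITLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Hi</title></head><body><p>Hi</p></body></html>"
)

if __name__ == "__main__":
    print(structure_problems(NO_TITLE))
    print(structure_problems(WITH_TITLE))
```

A real validator checks the whole document against the DTD named in the DOCTYPE, including which elements may nest inside which, so passing these few checks says nothing about full validity.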

Why XHTML is Preferable and Standard

The following HTML code will work fine if you view it in a browser, even if it does not follow the HTML rules:

<html>
<head>
<title>This is bad HTML</title>
<body>
<h1>Bad HTML
</body>


XML is a markup language where everything has to be marked up correctly, which results in “well-formed” documents. XML was designed to describe data, and HTML was designed to display data. Today’s market consists of different browser technologies: some browsers run on desktop computers, and others run on mobile phones and handhelds. The latter do not have the resources or power to interpret a “bad” markup language.
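
The difference between a forgiving HTML parser and a strict XML parser can be sketched with Python's standard library. The two fragments, the TagCollector class and the helper functions below are assumptions made up for this illustration. Note also that html.parser is only a tokenizer: it stands in for a browser's lenient parser but does not behave exactly like one.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical "bad" fragment: unclosed <p>, bare <br>, unquoted attribute value.
BAD_FRAGMENT = "<body><p>First paragraph<p>Second paragraph<br><img src=logo.png></body>"

# The same content rewritten as well-formed XHTML markup.
FIXED_FRAGMENT = (
    "<body><p>First paragraph</p>"
    '<p>Second paragraph<br /><img src="logo.png" alt="Logo" /></p></body>'
)


class TagCollector(HTMLParser):
    """Records every start tag the lenient HTML tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def lenient_parse(text):
    """Tokenize text as HTML without complaining about sloppy markup."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.start_tags


def is_well_formed(text):
    """Return True if a strict XML parser accepts text."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(lenient_parse(BAD_FRAGMENT))
    print(is_well_formed(BAD_FRAGMENT), is_well_formed(FIXED_FRAGMENT))
```

The lenient tokenizer reports every tag in the bad fragment without an error, while the XML parser refuses it outright and accepts only the corrected version.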

Therefore, by combining the strengths of HTML and XML, we got a markup language that is useful and functional as well as compatible now and in the future: XHTML.

XHTML pages can be read by all XML-enabled devices. While waiting for the rest of the world to upgrade to browsers that support XML, XHTML gives you the opportunity to write “well-formed” documents now that work in all browsers and are backward compatible!