!-- (HTML comment) | !DOCTYPE | a | abbr | acronym | address | applet | area | b | base | basefont | bdo | big | blockquote | body | br | button | caption | center | cite | code | col | colgroup | dd | del | dfn | dir | div | dl | dt | em | fieldset | font | form | frame | frameset | h1 | h2 | h3 | h4 | h5 | h6 | head | hr | html | i | iframe | img | input | ins | isindex | kbd | label | legend | li | link | map | menu | meta | noframes | noscript | object | ol | optgroup | option | p | param | pre | q | s | samp | script | select | small | span | strike | strong | style | sub | sup | table | tbody | td | textarea | tfoot | th | thead | title | tr | tt | u | ul | var

Intro

XHTML is a reformulation of HTML as an XML application. XHTML 1.0 is in effect the next version after HTML 4, i.e. XHTML supports all HTML 4 tags and attributes, but documents must now be well-formed XML. XHTML 1.0 was declared an official recommendation by the W3C in January 2000. So far the fusion of HTML and XML into XHTML is not that popular. Most of the time people use HTML and XML side by side to complement each other. See also my section on XML.

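An XSL style sheet is itself an XML document, so any HTML markup in its templates must be well-formed. If XML data is being poured into an XSL style sheet template, then it is in effect being poured into an XHTML template.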
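As a rough illustration of the well-formedness requirement, here is a minimal Python sketch that feeds two fragments to the standard library XML parser. The helper name is_well_formed and both sample fragments are made up for this example. The parser only checks well-formedness; it does not validate against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, i.e. it is well-formed."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# HTML 4 style: unquoted attribute value, unclosed <p>, <br> without a slash.
html_style = "<div><p align=center>One<br>Two</div>"

# XHTML style: attribute values quoted and every element closed.
xhtml_style = '<div><p align="center">One<br />Two</p></div>'

print(is_well_formed(html_style))   # False
print(is_well_formed(xhtml_style))  # True
```

A browser in HTML mode would render both fragments; only the second survives an XML parser.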

Comparing HTML and XML

Here is a comparison of HTML and XML, point by point.

HTML: Focuses on the visual appearance and display of data.
XML: Focuses on the transmission of structured data.

HTML: Uses pre-defined tags, EG: <p>, <h1>. These are defined by the HTML DTD set by the W3C. A DTD describes the structure of a document, i.e. the permitted elements and attributes and their relationships.
XML: Uses custom-defined elements, EG: <film>, <book>. All XML documents must follow certain rules to be Well Formed. If an XML document also references a DTD and conforms to it, then the document is said to be Valid.

Both: Based on SGML (Standard Generalized Markup Language). A key difference between XML and SGML is that not all XML documents need be valid, whereas all SGML documents must reference a DTD.

HTML: CSS (Cascading Style Sheets) provides rich display formatting.
XML: XSL (Extensible Stylesheet Language) provides rich display formatting that can deal with the complexities that XML presents. XSLT (XSL Transformations) is the part of XSL that transforms XML documents into other documents, such as HTML.

HTML: "Static HTML" utilizes the HOM (HTML 3.2 Object Model). DHTML (Dynamic HTML) utilizes the DOM (Document Object Model, as set forth by the W3C) to make the HTML tags into elements that can generate events and be used in scripting. Changes to the attributes, styles, and contents of elements show instantly on the page without additional trips to the web server.
XML: Utilizes the XML Object Model (XOM), which is very much like the DOM.

HTML: Case insensitive.
XML: Case sensitive. EG: In XHTML, use <body onload="test()"> instead of <body OnLoad="test()">.
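The case sensitivity point can be seen directly in an XML parser. The following is a small sketch using Python's standard library; the sample markup is invented for the demonstration and is not meant as valid XHTML.

```python
import xml.etree.ElementTree as ET

# To an XML parser, <body> and <BODY> are two unrelated element names.
root = ET.fromstring('<html><body onload="test()" /><BODY OnLoad="test()" /></html>')
names = [child.tag for child in root]
print(names)  # ['body', 'BODY']

# Attribute names are case sensitive as well.
body = root.find("body")
print(body.get("onload"))  # test()
print(body.get("OnLoad"))  # None

# A start tag and an end tag that differ only in case do not match.
try:
    ET.fromstring("<body></BODY>")
    mismatch_ok = True
except ET.ParseError:
    mismatch_ok = False
print(mismatch_ok)  # False
```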

Additional XHTML Prohibitions

Here are prohibitions placed on some of the elements in XHTML. Each prohibition applies at any depth of nesting, not just to direct children:

a
cannot contain other <a> elements.
pre
cannot contain the <img>, <object>, <big>, <small>, <sub>, or <sup> elements.
button
cannot contain the <input>, <select>, <textarea>, <label>, <button>, <form>, <fieldset>, <iframe> or <isindex> elements.
label
cannot contain other <label> elements.
form
cannot contain other <form> elements.
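The prohibitions above lend themselves to a mechanical check. Here is a minimal sketch in Python, assuming the markup is well-formed and carries no namespace declaration (with a namespace, the tag names seen by the parser would be prefixed). The names PROHIBITED and find_violations are hypothetical helpers for this example.

```python
import xml.etree.ElementTree as ET

# Prohibited descendants, copied from the list above.
PROHIBITED = {
    "a": {"a"},
    "pre": {"img", "object", "big", "small", "sub", "sup"},
    "button": {"input", "select", "textarea", "label", "button",
               "form", "fieldset", "iframe", "isindex"},
    "label": {"label"},
    "form": {"form"},
}


def find_violations(markup: str) -> list:
    """Return (ancestor, descendant) tag pairs that break a prohibition."""
    root = ET.fromstring(markup)
    violations = []
    for ancestor in root.iter():
        banned = PROHIBITED.get(ancestor.tag)
        if not banned:
            continue
        # iter() yields the element itself first, so skip that entry.
        for descendant in list(ancestor.iter())[1:]:
            if descendant.tag in banned:
                violations.append((ancestor.tag, descendant.tag))
    return violations


bad = ('<div><a href="#x">outer <em><a href="#y">inner</a></em></a>'
       '<pre><img src="x.png" alt="" /></pre></div>')
good = '<div><a href="#x">link</a><pre>text</pre></div>'

print(find_violations(bad))   # [('a', 'a'), ('pre', 'img')]
print(find_violations(good))  # []
```

Note that the nested <a> is caught even though it sits inside an <em>, since the check walks all descendants.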

Valid XHTML

Valid XHTML requires well-formed, standard HTML markup. Here are some additional requirements for strictly conforming XHTML documents:

  • The DOCTYPE declaration must be one of the following three declarations:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
    
  • The root element must be <html>.
  • The <html> element must have a namespace declaration, i.e. xmlns="http://www.w3.org/1999/xhtml".
  • The <head> and <body> elements must be present.
  • The <head> element must contain a <title> element.
  • An XML declaration (<?xml version="1.0" encoding="UTF-8"?>) is recommended but not required.
  • Element and attribute names must be in lower case, i.e. <body> and <BODY> are not equivalent.
  • Script blocks tend to contain characters that interfere with XML parsing (EG: < and &), so either those characters should be escaped as entities (EG: &lt; and &amp;) or the script block should be enclosed in a CDATA section (EG: <script> <![CDATA[ .... ]]></script>).
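To illustrate the script block requirement, here is a small Python sketch that runs three versions of the same script through the standard library XML parser. The script content (a, b, c, run) and the helper parse_text are made up for the example.

```python
import xml.etree.ElementTree as ET

# The same hypothetical script written three ways.
raw = "<script>if (a < b && c) { run(); }</script>"
escaped = "<script>if (a &lt; b &amp;&amp; c) { run(); }</script>"
cdata = "<script><![CDATA[ if (a < b && c) { run(); } ]]></script>"


def parse_text(markup):
    """Return the element's text, or None if the markup is not well-formed."""
    try:
        return ET.fromstring(markup).text
    except ET.ParseError:
        return None


print(parse_text(raw))            # None: the bare < and & break XML parsing
print(parse_text(escaped))        # if (a < b && c) { run(); }
print(parse_text(cdata).strip())  # if (a < b && c) { run(); }
```

Both the escaped form and the CDATA form hand the parser the same script text; the CDATA form just keeps the source readable.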

EG:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>This is a Title</title>
        ...
    </head>
    <body>
        ...
    </body>
</html>
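Some of the structural requirements listed earlier can be spot-checked in code. The sketch below is a partial check only, under the assumption that the document is available as bytes: it confirms well-formedness, the root element, the namespace, and the presence of <head>, <body> and <title>. It does not validate against the DTD, which the Python standard library cannot do. The function check_structure is a hypothetical helper, and the sample document fills the "..." placeholders with arbitrary content.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <title>This is a Title</title>
    </head>
    <body>
        <p>Hello</p>
    </body>
</html>"""


def check_structure(data: bytes) -> list:
    """Return a list of problems; an empty list means every check passed."""
    root = ET.fromstring(data)  # raises ParseError if not well-formed
    problems = []
    # The parser reports namespaced tags as "{namespace}localname".
    if root.tag != "{%s}html" % XHTML_NS:
        problems.append("root element is not <html> in the XHTML namespace")
    head = root.find("{%s}head" % XHTML_NS)
    if head is None:
        problems.append("missing <head>")
    if root.find("{%s}body" % XHTML_NS) is None:
        problems.append("missing <body>")
    if head is None or head.find("{%s}title" % XHTML_NS) is None:
        problems.append("missing <title> inside <head>")
    return problems


print(check_structure(DOC))                      # []
print(check_structure(b"<html><body/></html>"))  # four problems reported
```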


GeorgeHernandez.com. Some rights reserved.