XML (Extensible Markup Language) is a markup language that uses plain text to flexibly structure data so that it is both human and machine readable. XML is well suited to transmitting and storing structured data across the Web, across platforms, and across processes.
There are 2 versions: XML 1.0 and XML 1.1. XML 1.0 came out in 1998, and its 5th edition from 2008 is still current as of 2012. XML 1.1 came out in 2004 and sees very little use. I prefer JSON myself!
Here are some applications of XML:
As a portable, disconnected database.
No database engine required!
No connection required (once data is loaded)!
No server load (once data is loaded)!
Data may be sorted, filtered, calculated, etc.
To encapsulate data (for interfaces) between processes. Think Web Services.
Input from any source.
Output in different formats, i.e. data is separated from presentation.
Open and extensible. It is independent of location, platform, process, or language.
Machine and human readable. It's not just a bunch of data: the elements make it instantly identifiable.
The structure, relationships, and context of the data are transported along with the data. In contrast, data marked up as HTML instead of XML carries just the data.
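As a rough illustration of the "portable, disconnected database" idea above, here is a minimal Python sketch that sorts, filters, and calculates over XML data with no database engine involved. The element names, attribute names, and values are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; not taken from any real source.
XML_TEXT = """
<reindeer>
  <deer name="Dasher" age="7"/>
  <deer name="Rudolph" age="3"/>
  <deer name="Comet" age="5"/>
</reindeer>
"""

root = ET.fromstring(XML_TEXT)
deer = root.findall("deer")

# Sort: order the records by the numeric value of the age attribute.
names_by_age = [d.get("name") for d in sorted(deer, key=lambda d: int(d.get("age")))]

# Filter: keep only the deer older than 4.
older = [d.get("name") for d in deer if int(d.get("age")) > 4]

# Calculate: average age across all records.
average_age = sum(int(d.get("age")) for d in deer) / len(deer)

print(names_by_age, older, average_age)
```

Once the text is loaded, everything happens in memory, which is the point of the "no connection required" item above.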
An XML document has data marked up with customized elements. EG:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Some tags can look like HTML. -->
<p>Won't you guide my sleigh tonight?</p>
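To show that a document like this is machine readable, here is a minimal Python sketch that parses it with the standard library's ElementTree module. The variable names are just for illustration.

```python
import xml.etree.ElementTree as ET

# The sample document, passed as bytes so the parser honors the
# encoding named in the XML declaration.
DOCUMENT = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b"<!-- Some tags can look like HTML. -->\n"
    b"<p>Won't you guide my sleigh tonight?</p>\n"
)

root = ET.fromstring(DOCUMENT)

# The root element is the custom <p> element; the comment is skipped
# by the default parser.
root_tag = root.tag
root_text = root.text

print(root_tag, "->", root_text)
```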
Namespaces. If an XML document was created from multiple sources, then some of the elements may have the same names. Namespaces can be used to qualify names.
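Here is a small Python sketch of the name clash described above: two sources both define a "title" element, and namespaces keep them apart. The namespace URIs, prefixes, and element names are assumptions invented for the example.

```python
import xml.etree.ElementTree as ET

# Two hypothetical vocabularies that both define a <title> element.
XML_TEXT = """
<order xmlns:book="http://example.com/ns/book"
       xmlns:person="http://example.com/ns/person">
  <book:title>XML Basics</book:title>
  <person:title>Dr.</person:title>
</order>
"""

# Prefix-to-URI map used for lookups. The prefixes here are local to
# this script; only the URIs have to match the document.
NS = {
    "book": "http://example.com/ns/book",
    "person": "http://example.com/ns/person",
}

root = ET.fromstring(XML_TEXT)
book_title = root.find("book:title", NS).text
person_title = root.find("person:title", NS).text

# Internally ElementTree stores the qualified name as {uri}localname.
qualified_tag = root.find("book:title", NS).tag

print(book_title, person_title, qualified_tag)
```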
XML Grammar. The basic grammar and data structure of XML documents for particular purposes or organizations are defined by a Document Type Definition (DTD) or an XML Schema (XSD), which followed earlier proposals such as XML-Data. Validating an XML document against its grammar verifies that the content conforms.
There are several major ways that XML documents are manifested:
XML File. An XML document can be saved as an XML File with a .xml extension.
Data Islands. An XML document can be embedded within another document as a Data Island. See LoadXML Data Island.
XML String. A process can generate an XML String which is equivalent to an XML document. This is often used when sending XML data from the server to the client.
XML Object. A process can use an API (Application Programming Interface) to make an XML Object (usu. an XML DOM object) that can build an XML document programmatically, often node by node.
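The last few forms can be sketched together in Python: build an XML object node by node, then turn it into an XML String and into the bytes of an XML File. This uses the standard library's ElementTree rather than a W3C DOM object, and the element and attribute names are made up.

```python
import io
import xml.etree.ElementTree as ET

# XML Object: build the document programmatically, node by node.
root = ET.Element("sleigh")
deer = ET.SubElement(root, "deer", {"position": "lead"})
deer.text = "Rudolph"

# XML String: serialize the object to text, e.g. to send to a client.
xml_string = ET.tostring(root, encoding="unicode")

# XML File: write the same document, with an XML declaration, to a
# file-like object (a real script would pass a path such as "sleigh.xml").
buffer = io.BytesIO()
ET.ElementTree(root).write(buffer, encoding="UTF-8", xml_declaration=True)
xml_file_bytes = buffer.getvalue()

print(xml_string)
```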
Generating XML Documents. XML documents can be written by hand or generated by the application itself, but often some process takes data from a database and outputs it in XML format, i.e. as an XML document. The data source of an XML document may be one or more databases, or even non-databases. Since XML documents are plain text, they can be generated by all sorts of computer processes, APIs, and languages including (but not limited to) scripts, executables (.exe), and components (.dll).
ASP (Active Server Pages).
EGs: Receiving XML from an ASP file.
objDOM.load("xmlMade.xml"). If the source ASP saved its generated XML as an XML file.
objDOM.load("xmlMaker.asp"). If the source ASP returns an XML document.
EGs: Various ways an ASP file may generate an XML document.
Get the POST data from a form on the client and construct an XML document node by node.
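The same database-to-XML idea works outside ASP. Here is a minimal Python sketch, assuming an in-memory SQLite database; the table name, column names, the "row" element name, and the helper rows_to_xml are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(connection, table):
    """Dump every row of a table as XML text.

    Sketch only: the table name is interpolated directly, so it must
    come from trusted code, never from user input.
    """
    cursor = connection.execute(f"SELECT * FROM {table} ORDER BY rowid")
    columns = [description[0] for description in cursor.description]

    root = ET.Element(table)
    for row in cursor:
        record = ET.SubElement(root, "row")
        # One child element per column, named after the column.
        for column, value in zip(columns, row):
            ET.SubElement(record, column).text = str(value)
    return ET.tostring(root, encoding="unicode")


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE deer (name TEXT, age INTEGER)")
connection.executemany(
    "INSERT INTO deer VALUES (?, ?)", [("Dasher", 7), ("Rudolph", 3)]
)

xml_text = rows_to_xml(connection, "deer")
print(xml_text)
```

Mapping each column to a child element is only one possible design; attributes would work as well for simple values.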
Once you have the XML document, an API is needed to access and modify the data. There are two generic APIs for XML. DOM is a W3C standard; SAX is a de facto standard that was developed by the XML-DEV mailing list community rather than by the W3C:
DOM (Document Object Model). A tree-based API. See also my section on DOM.
SAX (Simple API for XML). An event-based API.
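To make the tree-based versus event-based difference concrete, here is a Python sketch that reads the same document both ways using the standard library's xml.dom.minidom and xml.sax modules. The sample document and the DeerHandler class are made up for the example.

```python
import xml.dom.minidom
import xml.sax

XML_TEXT = "<team><deer>Dasher</deer><deer>Rudolph</deer></team>"

# DOM: the whole document is loaded into a tree, then queried.
document = xml.dom.minidom.parseString(XML_TEXT)
dom_names = [
    node.firstChild.data for node in document.getElementsByTagName("deer")
]


# SAX: the parser calls back as it meets each piece of the document;
# nothing is kept unless the handler keeps it.
class DeerHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.names = []
        self._in_deer = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "deer":
            self._in_deer = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect them.
        if self._in_deer:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "deer":
            self.names.append("".join(self._buffer))
            self._in_deer = False


handler = DeerHandler()
xml.sax.parseString(XML_TEXT.encode("utf-8"), handler)
sax_names = handler.names

print(dom_names, sax_names)
```

DOM is the easier fit when you need random access or edits; SAX tends to suit large inputs that only need one pass.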
Different companies provide their own implementations of these generic APIs. EG: Microsoft provides the following objects via its MSXML Parser, a COM component shipped as msxml.dll, msxml2.dll, etc.
XML DOM Document object. The XML document is usually loaded into an XML DOM Document object.
XML DSO (XML Data Source Object). The XML document may be loaded into an XML DSO object. The source data must be structured like a typical table.
SAX object. The data is processed as it comes in, much like a read-only stream of data.
There are many tools you can use to access, modify, or validate raw XML, either manually or from programs that you write.
Once you are able to access and modify the XML document, then you need to present it. Many times the processing of the XML can be done at the presentation level instead.
No stylesheet referenced
Internet Explorer. If you open a raw, non-styled XML file in IE, then it is automatically presented in a tree-like structure where you can expand or contract branches of the tree by clicking on the "+" or "-" icons. EG: open egSimple.xml in IE.
Notepad. If you open a raw, non-styled XML file in Notepad, you will see the plain text. EG: open egSimple.xml in Notepad to view the source code.
CSS for XML (Cascading Style Sheet). The XML document references a CSS style sheet (.css) that specifies styles for the custom elements.
XSL (Extensible Stylesheet Language). XSL is a W3C standard for restructuring and styling XML data. The XML document references an XSL style sheet (.xsl).
XSLT (XSL for Transformations). XSLT transforms XML documents into a restructured XML document or another kind of document.
XPath. XPath is a query expression language used by XSLT for accessing different parts of an XML document.
XSL-FO (XSL Formatting Objects). XSL-FO is for generating output-specific material, EG: PDF output.
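XPath expressions can also be tried outside XSLT. Here is a minimal Python sketch using the standard library's ElementTree, which supports only a limited subset of XPath; the sample document and its element and attribute names are made up.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """
<team>
  <deer role="lead"><name>Rudolph</name></deer>
  <deer role="pull"><name>Dasher</name></deer>
  <deer role="pull"><name>Comet</name></deer>
</team>
"""

root = ET.fromstring(XML_TEXT)

# Path step: every <name> under every <deer> child of the root.
all_names = [name.text for name in root.findall("./deer/name")]

# Attribute predicate: only the deer whose role attribute is "pull".
pullers = [d.find("name").text for d in root.findall("./deer[@role='pull']")]

# Positional predicate: XPath positions start at 1, so [2] is the second deer.
second = root.find("./deer[2]/name").text

print(all_names, pullers, second)
```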
Bind to XML. Certain HTML elements can be bound directly but this is browser specific. EG:
There are other supporting technologies related to XML.
XLink (XML Linking Language). Allows elements in an XML document to create and describe links between resources. Sort of like an XML version of hyperlinks found in HTML.
XPointer. A language for accessing a specific part of an XML document by suffixing to the URL of the XML document. It has its own syntax or can utilize XPath. Sort of like an XML version of bookmarking in HTML.
Specific XML applications of note, most of them W3C standards:
MathML (Mathematical Markup Language). For presenting mathematical equations.
SMIL (Synchronized Multimedia Integration Language). Pronounced "smile". For multimedia presentations. Used by HTML+TIME.
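VML (Vector Markup Language). For vector graphics. Submitted to the W3C by Microsoft and others but never standardized; superseded by SVG.
SVG (Scalable Vector Graphics). For two-dimensional vector graphics, which may also embed raster images.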
SVG Tiny. For cell phone graphics.
SVG Basic. For PDA graphics.
CDF (Channel Definition Format). For news feeds. This has been largely hijacked by RSS.
SOAP (Simple Object Access Protocol). For web services.
XML Fragments. Supports logical documents composed of multiple entities. Hopefully this will enable systems to deal with large XML data that is in chunks.
XHTML. This is simply a reformulation of HTML as XML, so that HTML documents are also well-formed XML documents.