XML Files
XML stands for Extensible Markup Language. It became a W3C Recommendation in 1998 and uses a tag-based syntax.
Unlike HTML, which comes with a "pre-made" set of tags, XML lets you create your own tags.
The purpose of XML is to give information structure and meaning.
It was designed for use over the Internet, to exchange information between fundamentally different systems.
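As a minimal sketch of the "create your own tags" idea, the Python snippet below parses a small document with the standard library's xml.etree.ElementTree module. The tag names (library, book, title) and the titles are invented for this illustration; XML does not predefine any of them.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: every tag name here is made up by the author.
DOCUMENT = """
<library>
  <book>
    <title>Book One</title>
  </book>
  <book>
    <title>Book Two</title>
  </book>
</library>
"""

def list_titles(xml_text):
    """Return the text of the <title> child of every <book> under the root."""
    root = ET.fromstring(xml_text)
    return [book.findtext("title") for book in root.findall("book")]

titles = list_titles(DOCUMENT)
print(titles)
```

The structure (a library contains books, a book has a title) is carried by the tags themselves, which is what lets a program pull out the titles without guessing.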
Advantages of XML
- It keeps content separate from presentation.
- It is an open format that many applications can read.
- It can be used on both the client and the server.
- It is widely supported across different languages and runtimes.
- If two computers do not understand each other's "native" way of representing data, XML can serve as a bridge between them.
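The "bridge" point can be sketched as a round trip: one side turns its native data into XML text, and the other side rebuilds its own native data from that text. The function names record_to_xml and xml_to_record, the record tag, and the sample data are all assumptions made for this illustration, and the sketch only handles a flat set of non-empty string values.

```python
import xml.etree.ElementTree as ET

def record_to_xml(record):
    """Sender side: serialize a flat dict of strings into an XML string."""
    root = ET.Element("record")
    for key, value in record.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")

def xml_to_record(xml_text):
    """Receiver side: rebuild the flat dict from the XML string."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}

original = {"name": "Ada", "city": "London"}
wire_format = record_to_xml(original)
print(wire_format)
print(xml_to_record(wire_format) == original)
```

Only the XML text travels between the two sides, so the receiver could just as well be written in a different language on a different platform.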
Disadvantages of XML
- XML is generally considered unsuitable for exceptionally large data sets.
- Some data types, such as images, are not expressed well in XML, and complex documents can become difficult to read.
Types of XML Content
An XML document declaration looks like the following:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
So, what does this all mean?
<?xml identifies the file as an XML document, and version="1.0" states which version of XML it uses. The declaration also provides a place for the encoding and standalone attributes.
The most important rule is that the declaration must sit at the very beginning of the XML document, with no whitespace before it.
Next, encoding="UTF-8" is the encoding attribute. Here it is set to UTF-8, which is also the default encoding.
Lastly, the standalone attribute can be set to "no" if the document relies on external declarations, such as an external DTD. In this example it is "yes", so the document relies only on itself.
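The "no whitespace before the declaration" rule is easy to check with a parser. The following is a small sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the note element are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

DECLARATION = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'

def is_well_formed(data):
    """Return True if the parser accepts the input, False if it raises ParseError."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

# The declaration is the very first thing in the document: accepted.
at_start = DECLARATION + b"<note/>"
# A single space or newline before the declaration: rejected.
after_space = b" " + DECLARATION + b"<note/>"
after_newline = b"\n" + DECLARATION + b"<note/>"

for label, data in [("at start", at_start),
                    ("after space", after_space),
                    ("after newline", after_newline)]:
    print(label, is_well_formed(data))
```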
Tags
Tag names cannot start with a number or contain spaces, much like identifiers in JavaScript. They also cannot start with a hyphen or a period.
Acceptable element names include:
- <_Element1>
- <My.Element>
- <My-Element_Name>
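One rough way to test a tag name is to ask a parser whether it accepts an empty element with that name. The sketch below does this with Python's standard xml.etree.ElementTree module; the helper name parses_as_tag is an assumption for this example, and the check is only meant for simple candidate names like the ones shown.

```python
import xml.etree.ElementTree as ET

def parses_as_tag(name):
    """Return True if <name/> is accepted by the parser as a document."""
    try:
        ET.fromstring("<" + name + "/>")
        return True
    except ET.ParseError:
        return False

# Names from the list above: all accepted.
good_names = ["_Element1", "My.Element", "My-Element_Name"]
# Names that break the rules: leading digit, leading hyphen, leading period,
# and a space (read as a name followed by a malformed attribute).
bad_names = ["1Element", "-Element", ".Element", "My Element"]

for name in good_names + bad_names:
    print(name, parses_as_tag(name))
```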
Attributes
Attributes are specified on an element's opening tag, and their names must start with a letter or underscore.
Each attribute name can appear only once on any given element.
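A short sketch of these attribute rules, using Python's standard xml.etree.ElementTree module. The book element and its id and lang attributes are invented for this illustration, as is the helper name is_well_formed.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Attributes on the opening tag are exposed as a dictionary of strings.
book = ET.fromstring('<book id="42" lang="en"/>')
print(book.attrib)

# The same attribute name twice on one element is rejected.
print(is_well_formed('<book id="1" id="2"/>'))
# An attribute name may start with an underscore, but not with a digit.
print(is_well_formed('<book _id="1"/>'))
print(is_well_formed('<book 1id="1"/>'))
```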
Comments
Comments use the same syntax as in HTML. They can go anywhere except inside a tag's angle brackets or before the document declaration.
A comment starts with <!-- and ends with -->. Example: <!-- This is a comment -->
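The placement rules for comments can be sketched with Python's standard xml.etree.ElementTree module. The element name a and the helper name is_well_formed are assumptions for this example. Note that this parser discards comments by default, so only the surrounding text is kept.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

in_content = "<a><!-- note -->text</a>"
after_declaration = '<?xml version="1.0"?><!-- note --><a/>'
inside_tag = "<a <!-- note --> >text</a>"
before_declaration = '<!-- note --><?xml version="1.0"?><a/>'

print(is_well_formed(in_content))          # allowed
print(is_well_formed(after_declaration))   # allowed
print(is_well_formed(inside_tag))          # rejected
print(is_well_formed(before_declaration))  # rejected

# The comment is dropped by default; the element's text is unaffected.
print(ET.fromstring(in_content).text)
```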
Character Data Sections (CDATA)
While CDATA sections are part of the document, their contents are not parsed as markup by the XML parser. A CDATA section starts with
<![CDATA[ and ends with ]]>
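To illustrate, the sketch below wraps text containing < and & in a CDATA section and parses it with Python's standard xml.etree.ElementTree module. The code element and its contents are made up for this example.

```python
import xml.etree.ElementTree as ET

RAW_TEXT = "if (a < b && b < c) { run(); }"

# Inside CDATA, the characters < and & are kept as plain text.
with_cdata = "<code><![CDATA[" + RAW_TEXT + "]]></code>"
# Without CDATA, the same characters are treated as markup and break parsing.
without_cdata = "<code>" + RAW_TEXT + "</code>"

parsed_text = ET.fromstring(with_cdata).text
print(parsed_text)

try:
    ET.fromstring(without_cdata)
    rejected = False
except ET.ParseError:
    rejected = True
print(rejected)
```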
Processing Instructions
Processing instructions let you give the XML parser, or the application reading the document, special instructions on processing. For example:
<?SpellCheckMode mode="en-GB"?>
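Whether a processing instruction has any effect depends entirely on the application reading the document. As a sketch, the Python snippet below uses the standard xml.dom.minidom module to list the top-level processing instructions of a document; the letter element and the function name processing_instructions are assumptions for this example, and nothing here actually performs spell checking.

```python
from xml.dom import minidom

DOCUMENT = (
    '<?xml version="1.0"?>'
    '<?SpellCheckMode mode="en-GB"?>'
    "<letter>Colour</letter>"
)

def processing_instructions(xml_text):
    """Return (target, data) pairs for the document's top-level processing instructions."""
    doc = minidom.parseString(xml_text)
    return [
        (node.target, node.data)
        for node in doc.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    ]

instructions = processing_instructions(DOCUMENT)
print(instructions)
```

The XML declaration on the first line looks similar but is not reported as a processing instruction.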
Entities
An XML entity stands in for an item of data. Entities also help shorten XML documents and make them more flexible.
Entities can also represent characters that would otherwise be illegal in markup, for example &lt; for < and &amp; for &.
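Both uses of entities can be sketched in Python with the standard library. The first part shows the predefined entities for otherwise illegal characters; the second declares a custom entity named co in an internal DTD subset so a repeated phrase is written only once. The element names, the entity name, and the sample text are all invented for this illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Predefined entities are turned back into the characters they stand for.
math_text = ET.fromstring("<m>5 &lt; 7 &amp; 7 &gt; 5</m>").text
print(math_text)

# escape() goes the other way and makes raw text safe to place in a document.
escaped = escape("5 < 7 & 7 > 5")
print(escaped)

# A custom entity declared once and reused twice.
MEMO = (
    '<!DOCTYPE memo [<!ENTITY co "Example Corporation">]>'
    "<memo>&co; welcomes you to &co;.</memo>"
)
memo_text = ET.fromstring(MEMO).text
print(memo_text)
```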
Proper XML Syntax
- All XML documents must have a single root tag.
- An XML document must be well-formed; otherwise the parser will reject it with an error.
- Empty tags must be closed. Example: write <elem></elem> or the shorthand <elem />.
- Attribute values cannot be minimized. Example: <option selected> would be considered illegal, but <option selected="selected"> is correct.
- In addition to not being minimized, attribute values must be inside single or double quotes.
- Lastly, tags must be properly nested inside of each other.
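The rules above can be checked one by one against a parser. The sketch below uses Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the element names are assumptions for this example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Each pair is (document, whether we expect the parser to accept it).
CASES = [
    ("<a/><b/>", False),                       # two root tags
    ("<elem>", False),                         # empty tag left open
    ("<elem></elem>", True),                   # closed empty tag
    ("<elem />", True),                        # shorthand for the same thing
    ("<option selected/>", False),             # minimized attribute
    ("<option selected=selected/>", False),    # unquoted attribute value
    ('<option selected="selected"/>', True),   # double quotes
    ("<option selected='selected'/>", True),   # single quotes
    ("<a><b></a></b>", False),                 # improperly nested
    ("<a><b></b></a>", True),                  # properly nested
]

results = [(text, is_well_formed(text)) for text, _ in CASES]
for text, ok in results:
    print(ok, text)
```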
Document Type Definitions
Document type definitions (DTDs) describe the structure an XML document is allowed to have. While they are simple to use, they are not considered powerful.
XML Schema, on the other hand, is considered much more powerful and flexible than document type definitions.
Here is an example of the same text of this webpage as an XML file: XML.xml