What Is XML (Extensible Markup Language)? - TechTarget
- Peter Loshin, Former Senior Technology Editor
- David Linthicum, Deloitte Consulting
- Maxine Giza
What is XML (Extensible Markup Language)?
XML (Extensible Markup Language) is used to describe data. The XML standard is a flexible way to create information formats and electronically share structured data via the public internet, as well as via corporate networks.
XML is a markup language derived from Standard Generalized Markup Language (SGML) and, like SGML, it is used for defining markup languages.
XML's primary function is to create data formats that encode information for documentation, database records, transactions and many other purposes. The same XML data can be used to generate dissimilar types of content, including web, print and mobile content.
Like Hypertext Markup Language (HTML), which is also based on the SGML standard, XML documents are stored as plain text files (Unicode text, commonly encoded as UTF-8) and can be edited using any text editor.
What is XML used for?
XML's primary function is to provide a "simple text-based format for representing structured information," according to the World Wide Web Consortium (W3C), the standards body for the web, including for the following:
- underlying data formats for applications such as those in Microsoft Office;
- technical documentation;
- configuration options for application software;
- books;
- transactions; and
- invoices.
XML enables sharing of structured information among and between the following:
- programs and programs;
- programs and people; and
- locally and across networks.
W3C defines the XML standard and recommends its use for web content. While XML and HTML are both based on SGML, W3C has also defined the XHTML and XHTML5 document formats that mirror, respectively, the HTML and HTML5 standards for web content.
How does XML work?
XML works by providing a predictable data format. XML is strict on formatting; if the formatting is off, programs that process or display the encoded data will return an error.
An XML document is considered well-formed when it conforms to XML syntax and can therefore be read and understood by an XML parser. Well-formedness is distinct from validity: a valid document is well-formed and also conforms to a document type definition (DTD) or schema. All XML documents consist of elements; an element acts as a container for data. The beginning and end of an element are identified by opening and closing tags, with other elements or plain data within.
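As a minimal sketch of how a parser treats well-formedness, the following uses Python's standard-library xml.etree.ElementTree module. The helper name is_well_formed and the sample strings are illustrative assumptions, not part of any XML standard. Note that this checks syntax only; it does not validate a document against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # The parser rejects the whole document on the first syntax error.
        return False


good = "<note><to>Reader</to></note>"
bad = "<note><to>Reader</note>"  # <to> is never closed

print(is_well_formed(good))
print(is_well_formed(bad))
```

The second string fails because its opening and closing tags do not match, which is the kind of formatting error that causes XML-processing programs to return an error.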
XML works by providing properly formatted data that can be reliably processed by programs designed to handle XML inputs. For example, technical documentation may include a <warning> element similar to that shown in the following snippet of XML code:
<warning>
  <para>
    <emphasis type="bold">May cause serious injury</emphasis>
    Exercise extreme caution as this procedure could result in serious injury or death if precautions are not taken.
  </para>
</warning>
In this example, this data is interpreted and displayed in different ways, depending on the form factor of the technical documentation. On a webpage, this element could be displayed in the following way:
WARNING: Exercise extreme caution as this procedure could result in serious injury or death if precautions are not taken.
The same XML code is rendered differently on an appliance user interface (UI) or in print. This element could be interpreted to display the text tagged as emphasis differently, such as having it appear in red and with flashing highlights. In printed form, the content might be provided in a different font and format.
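To illustrate how one XML source can feed several outputs, here is a rough Python sketch that renders the warning element above in two ways. The function names render_html and render_plain, the CSS class name and the exact output layouts are assumptions made for illustration; real publishing pipelines typically use stylesheets or dedicated tools instead.

```python
import xml.etree.ElementTree as ET
from html import escape

WARNING_XML = """
<warning>
  <para>
    <emphasis type="bold">May cause serious injury</emphasis>
    Exercise extreme caution as this procedure could result in
    serious injury or death if precautions are not taken.
  </para>
</warning>
"""


def _clean(text):
    """Collapse runs of whitespace left over from source indentation."""
    return " ".join((text or "").split())


def render_html(warning):
    """Render the warning as an HTML paragraph (hypothetical layout)."""
    emphasis = warning.find("para").find("emphasis")
    headline = escape(_clean(emphasis.text))
    body = escape(_clean(emphasis.tail))  # text that follows </emphasis>
    return f'<p class="warning"><strong>{headline}</strong> {body}</p>'


def render_plain(warning):
    """Render the warning as a single line of plain text (hypothetical layout)."""
    emphasis = warning.find("para").find("emphasis")
    return f"WARNING: {_clean(emphasis.text).upper()}. {_clean(emphasis.tail)}"


root = ET.fromstring(WARNING_XML.strip())
print(render_html(root))
print(render_plain(root))
```

Both functions read the same parsed element; only the presentation logic differs, which is the separation of data and display that the passage describes.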
XML documents do not define presentation, and there are no default XML tags. Most XML applications use predefined sets of tags that differ, depending on the XML format. Most users rely on predefined XML formats to compose their documents, but users may also define additional XML elements as needed.
XML example
The following is an example of a simple XML file:
<?xml version="1.0" encoding="UTF-8"?>
<library>
  <book>
    <title>The Fire Next Time</title>
    <author>Baldwin, James</author>
  </book>
  <book>
    <title>Beloved</title>
    <author>Morrison, Toni</author>
  </book>
  <book>
    <title>The Messiah of Stockholm</title>
    <author>Ozick, Cynthia</author>
  </book>
</library>
The first line in this example, called the XML prolog or XML declaration, specifies the version of XML being used, as well as the character encoding scheme. This declaration, if present, must be the first line of the file.
The prolog specifies that the file contains XML version 1.0 data, encoded using Unicode Transformation Format 8 (UTF-8). UTF-8 can represent every Unicode character and is backward-compatible with ASCII, so a file that contains only ASCII characters is also a valid UTF-8 file. Other character encodings can be declared for different language character sets, but all XML files must be composed of Unicode characters. Any Unicode character can be represented in an XML file using a numeric character reference with a string in this format:
&#xXXXX;
In this format, the letters "XXXX" are substituted with a valid hexadecimal Unicode numeric code.
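A short Python sketch of numeric character references, using the standard-library xml.etree.ElementTree parser. The helper to_char_ref and the sample title element are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def to_char_ref(ch: str) -> str:
    """Build a hexadecimal numeric character reference (hypothetical helper)."""
    return f"&#x{ord(ch):X};"


# &#xE9; refers to U+00E9, the letter e with an acute accent.
doc = b'<?xml version="1.0" encoding="UTF-8"?><title>Caf&#xE9;</title>'
title = ET.fromstring(doc)

print(title.text)             # the parser resolves the reference to one character
print(to_char_ref("\u00e9"))  # builds the reference for the same character
```

The parser replaces the reference with the character itself, so programs that consume the parsed data never see the &#x...; notation.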
XML elements
The logical structure of an XML file requires that all data in the file be encapsulated within an XML element called the root element or document element. This element identifies the type of data contained in the file; in the example above, the root element is <library>.
The root element contains other elements that define the different parts of the XML document; in the example above, the root element contains <book> elements, which, in turn, consist of the two elements <title> and <author>.
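The root and child element structure described above can be explored with a few lines of Python. This is a minimal sketch using the standard-library xml.etree.ElementTree module; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

LIBRARY_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<library>
  <book>
    <title>The Fire Next Time</title>
    <author>Baldwin, James</author>
  </book>
  <book>
    <title>Beloved</title>
    <author>Morrison, Toni</author>
  </book>
  <book>
    <title>The Messiah of Stockholm</title>
    <author>Ozick, Cynthia</author>
  </book>
</library>
"""

root = ET.fromstring(LIBRARY_XML)
print(root.tag)  # name of the root (document) element

# Each <book> child holds one <title> and one <author> element.
books = [
    (book.findtext("title"), book.findtext("author"))
    for book in root.findall("book")
]
for title, author in books:
    print(f"{title} by {author}")
```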
All XML elements must be properly terminated for an XML file to be considered well-formed. This means that every element must have both an opening tag and a matching closing tag, like this paragraph element that would be a part of a document:
<para>This is an example of an XML tag for a paragraph.</para>
An element can also be empty, in which case a single tag is terminated with a forward slash. In this example, an empty self-terminating paragraph tag is used to insert an extra space in a document:
<para />
XML enables users to define their own additional elements if needed. In the library example above, an XML author might define new elements for publisher, date of publication, International Standard Book Number and any other relevant data. The elements can also be defined to enforce rules regarding the contents of the elements.
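As a rough sketch of adding author-defined elements and of how empty elements are written out, the following builds a book record with Python's xml.etree.ElementTree. The publisher and isbn element names and the "Example Press" value are placeholders invented for illustration.

```python
import xml.etree.ElementTree as ET

book = ET.Element("book")
ET.SubElement(book, "title").text = "Beloved"
ET.SubElement(book, "publisher").text = "Example Press"  # placeholder value
ET.SubElement(book, "isbn")  # no content yet, so it is written as an empty element

xml_text = ET.tostring(book, encoding="unicode")
print(xml_text)

# An element with separate open and close tags but no content is equivalent
# to the self-terminating form and is serialized that way.
print(ET.tostring(ET.fromstring("<para></para>"), encoding="unicode"))
```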
XML entities
XML elements can also contain predefined entities, which are used for special reserved XML characters. Custom entities can be defined to insert a predefined string of characters for inclusion in an XML file.
The five standard predefined XML entities are the following:
- &lt; -- The less than symbol (<), also known as the open angle bracket, is normally used in XML to indicate the start of an XML tag. This entity is used when the open angle bracket is part of the content of the XML file.
- &gt; -- The greater than symbol (>), also known as the close angle bracket, is normally used in XML to indicate the end of an XML tag. This entity is used when the close angle bracket is part of the content of the XML file.
- &amp; -- The ASCII ampersand symbol (&) is reserved in XML for indicating the start of an XML entity. This entity is used when an ampersand occurs within an XML element.
- &quot; -- The ASCII double quote character (") is used in XML element tags to delimit attribute values. For example, an <emphasis> tag might include an attribute for emphasizing some text, such as bold, italic or underline. This entity is used when a double quote character appears in the contents of an XML element or inside an attribute value delimited by double quotes.
- &apos; -- The ASCII single quote character ('), also known as an apostrophe, can also be used in XML element tags to delimit attribute values. This entity is used when a single quote or apostrophe appears in the contents of an XML element or inside an attribute value delimited by single quotes.
References to XML entities take the form &name;, beginning with the ampersand symbol and ending with a semicolon. Custom entities can be single characters or complex XML elements. For example, boilerplate language for technical documentation or legal contracts can be reduced to a single entity. However, when using entities, the XML author must ensure that inserting the entity into an XML file will produce well-formed XML data.
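The five predefined entities can be produced and resolved programmatically. This sketch uses escape and unescape from Python's standard-library xml.sax.saxutils module; the sample string and the QUOTES mapping name are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

raw = 'Use <para> & "quotes" in Tom\'s text'

# escape() always handles &, < and >; quote characters are opt-in
# through the extra mapping passed as the second argument.
QUOTES = {'"': "&quot;", "'": "&apos;"}
escaped = escape(raw, QUOTES)
print(escaped)

# A parser turns the entities back into the original characters.
note = ET.fromstring(f"<note>{escaped}</note>")
print(note.text)

# unescape() performs the reverse mapping without parsing a document.
print(unescape(escaped, {"&quot;": '"', "&apos;": "'"}))
```

Escaping content before inserting it into a document is one way to keep the result well-formed when the text contains reserved characters.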
Is XML a programming language?
XML is not a programming language. As a markup language, it is used to annotate data with tags that describe what the data is. Programming languages consist of instructions to implement algorithms, while markup languages are used to format data so that it can be processed by programs, which run the algorithms that interpret the marked-up data.
Markup language tags are considered a type of computer code because they define different elements of the markup language and because there are strict syntax rules for how to compose those elements.
What is an XML file?
An XML file is a plaintext file, typically with the .xml file extension. XML files contain Unicode text, and they can be opened with any application capable of reading text files.
XML files can be edited either with a simple text editor or specialized XML editors. An XML editor may include tools for validating the XML code, including the ability to do the following:
- parse XML code and display well-formed XML;
- flag orphaned text, which is text not enclosed within a tag; and
- identify improperly formed tags.
Different types of content can be incorporated into an XML file. For example, rich media content can be incorporated into XML through tags that identify the files in which the rich media content resides.
How to open and read XML files
Any text editor can be used to open and edit an XML file. While text editors may be sufficient for casual XML file editing, specialized XML editing software is preferred for any extensive writing or editing of XML files. XML editing programs ease editing XML files with the following features:
- syntax highlighting for tracking complex XML tags;
- XML parser for checking validity of XML code and displaying parsed data;
- expanding or collapsing XML tags and nodes;
- enhanced interface for editing multiple files at once;
- graphical UI enabling visual display of relationships between XML elements and simplified display of complex XML elements, like tables; and
- productivity tools, like macros, custom elements, and search and replace functions.
Some leading XML editing programs are the following:
- Oxygen XML Editor
- XML Notepad
- Adobe FrameMaker
- MadCap Flare
- Quark Author
- Liquid XML Studio
XML files are laid out much like program source code, with a declaration at the top that describes the contents of the file and, by convention, indentation for nested elements.
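Indentation is a readability convention rather than a requirement, and tools can add it automatically. As a small sketch, Python's xml.etree.ElementTree offers an indent function (available in Python 3.9 and later); the compact sample string is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

compact = "<library><book><title>Beloved</title></book></library>"

root = ET.fromstring(compact)
ET.indent(root, space="  ")  # adds line breaks and two-space indentation in place
pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

Each nested element ends up one level deeper than its parent, which is the layout most XML editors display.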
What are the differences between XML and HTML?
While XML and HTML share the same underlying SGML foundations, they are different and are used in different ways.
The biggest difference between XML and HTML is that XML is used to store data as structured information, while HTML is used to represent content. Because XML stores data and because it enforces strict validation, XML content can be reliably processed by programs. This is why XML is often used to create files that are used to generate HTML content.
Strict validation of XML code means that, if there are errors in the code, it will fail when it is processed for output. Users can then correct the XML code so it can be successfully processed. This is crucial for HTML content that is based on XML but also makes XML an important format for software configuration files that must be well-formed in order to be successfully processed by software.
What is the benefit of using XML for documentation?
XML is widely used for technical documentation because it can specify structural information. This document structure can then be parsed by other programs for output.
For example, in HTML, the user can create different types of lists, including numbered lists, but there is no way to explicitly tag content as being part of a step-by-step procedure. In XML, a procedure tag can be defined to represent a list of items as being the steps of a procedure, including identifying different elements for required steps, optional steps and alternate steps.
Likewise, in HTML, a string can be tagged as one of several different heading levels to indicate a headline or title, but in XML, a string can be explicitly tagged as a title, subtitle, headline or subheadline. This enables programs to process the same XML content differently for different types of output.
For example, if the content is to be output in printed book form, the book or chapter title can be incorporated into a header or footer. If the content is to be output as HTML content, the book or chapter title could be incorporated into a webpage as needed.
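The procedure example described above might look like the following in practice. This is a hedged sketch: the procedure, title and step element names, the type attribute, the sample steps and the render_steps function are all hypothetical, since XML does not predefine any of them.

```python
import xml.etree.ElementTree as ET

PROCEDURE_XML = """
<procedure>
  <title>Replace the filter</title>
  <step type="required">Switch off the appliance.</step>
  <step type="optional">Wear protective gloves.</step>
  <step type="required">Remove the old filter.</step>
</procedure>
"""


def render_steps(xml_text, include_optional=True):
    """Return the title followed by numbered steps (hypothetical output format)."""
    root = ET.fromstring(xml_text.strip())
    lines = [root.findtext("title")]
    number = 0
    for step in root.findall("step"):
        optional = step.get("type") == "optional"
        if optional and not include_optional:
            continue  # a quick-reference output might omit optional steps
        number += 1
        suffix = " (optional)" if optional else ""
        lines.append(f"{number}. {step.text}{suffix}")
    return lines


print("\n".join(render_steps(PROCEDURE_XML)))
print("\n".join(render_steps(PROCEDURE_XML, include_optional=False)))
```

Because each step is tagged with its role, the same source can produce a full procedure for one output and a shortened one for another without editing the content.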
As the foundational format for storing data of all types, XML drives much more than technical documentation. Learn about other formats that compete with XML for data representation, especially JavaScript Object Notation (JSON) and YAML Ain't Markup Language (YAML) data formats.