XML is a popular and flexible data format. Every programmer should understand it, because many technologies, including modern ones, rely on it.

Introduction

Hello, dear readers. This is the first article in a series of three. The goal of the series is to introduce every reader to XML and to give, if not a complete understanding, then at least a solid start by explaining the main points. The whole series is submitted for a single nomination, "Attention to detail"; it is split into three articles to fit the character limit for posts and to break a large amount of material into smaller, more digestible portions. This first article covers XML itself and one of the ways to define a schema for XML files: DTD.

For those who are not yet familiar with XML, a short introduction: there is no need to be intimidated. XML is not very complicated, and every programmer should understand it, since it is a flexible, efficient and popular file format for storing almost any kind of information. XML is used in Ant, Maven and Spring. Now that you have gathered strength and motivation, let's begin. I will try to present the material as simply as possible, covering only the most important points without going into the weeds.

XML

XML is easiest to explain with an example.

    <?xml version="1.0" encoding="UTF-8"?>
    <company>
        <name>IT-Heaven</name>
        <offices>
            <office floor="1" room="1">
                <employees>
                    <employee>
                        <name>Maksim</name>
                        <job>Middle Software Developer</job>
                    </employee>
                    <employee>
                        <name>Ivan</name>
                        <job>Junior Software Developer</job>
                    </employee>
                    <employee>
                        <name>Franklin</name>
                        <job>Junior Software Developer</job>
                    </employee>
                </employees>
            </office>
            <office floor="1" room="2">
                <employees>
                    <employee>
                        <name>Herald</name>
                        <job>Middle Software Developer</job>
                    </employee>
                    <employee>
                        <name>Adam</name>
                        <job>Middle Software Developer</job>
                    </employee>
                    <employee>
                        <name>Leroy</name>
                        <job>Junior Software Developer</job>
                    </employee>
                </employees>
            </office>
        </offices>
    </company>

HTML and XML are similar in syntax because they share a common parent, SGML. However, HTML has a fixed set of tags defined by its standard, while in XML you can create your own tags and attributes and store the data in whatever way suits you. In general, an XML file can be read by anyone who knows English.

This example can be depicted as a tree. The root of the tree is company. It is also the root element, from which all other elements descend. Each XML file can have only one root element. It must come after the XML declaration (the first line in the example) and contain all other elements.

A little about the declaration: it identifies the document as XML. In XML 1.0 it is optional but recommended, and when present it must be the very first thing in the file. It has three pseudo-attributes (special predefined attributes): version (1.0 in this example), encoding (the character encoding) and standalone (whether the document is self-contained: if it is yes, the document must not depend on external markup declarations; the default is no).

Elements are entities that store data using other elements and attributes. Attributes are additional information about an element, specified in its opening tag. To put it in OOP terms: we have a car, and each car has characteristics (color, capacity, brand and so on), which are its attributes; the things inside the car (doors, windows, engine, steering wheel) are other elements. You can store properties either as separate elements or as attributes, as you prefer. After all, XML is an extremely flexible format for storing information.

With these explanations in hand, it is enough to walk through the example above for everything to fall into place. The example describes a simple company structure: a company has a name and offices, and the offices have employees. The employees and offices elements are wrapper elements: they collect elements of the same kind into one set for ease of processing. The floor and room attributes deserve special attention. They are attributes of the office (its floor and room number), in other words, its properties. If we had a "picture" element, its dimensions could be stored the same way. You may notice that the company has no name attribute, but it does have a name element. You can describe structures however you like. Nobody obliges you to put all properties of an element into attributes; you can use plain elements and write the data inside them. Conversely, we can record the name and job title of our employees as attributes:

    <?xml version="1.0" encoding="UTF-8"?>
    <company>
        <name>IT-Heaven</name>
        <offices>
            <office floor="1" room="1">
                <employees>
                    <employee name="Maksim" job="Middle Software Developer"></employee>
                    <employee name="Ivan" job="Junior Software Developer"></employee>
                    <employee name="Franklin" job="Junior Software Developer"></employee>
                </employees>
            </office>
            <office floor="1" room="2">
                <employees>
                    <employee name="Herald" job="Middle Software Developer"></employee>
                    <employee name="Adam" job="Middle Software Developer"></employee>
                    <employee name="Leroy" job="Junior Software Developer"></employee>
                </employees>
            </office>
        </offices>
    </company>

As you can see, the name and job title of each employee are now attributes. There is also nothing left inside the employee tag: every employee element is empty. We can therefore make employee an empty element, closing it immediately after the attributes. This is done simply by adding a slash:

    <?xml version="1.0" encoding="UTF-8"?>
    <company>
        <name>IT-Heaven</name>
        <offices>
            <office floor="1" room="1">
                <employees>
                    <employee name="Maksim" job="Middle Software Developer" />
                    <employee name="Ivan" job="Junior Software Developer" />
                    <employee name="Franklin" job="Junior Software Developer" />
                </employees>
            </office>
            <office floor="1" room="2">
                <employees>
                    <employee name="Herald" job="Middle Software Developer" />
                    <employee name="Adam" job="Middle Software Developer" />
                    <employee name="Leroy" job="Junior Software Developer" />
                </employees>
            </office>
        </offices>
    </company>

By closing the empty elements we have kept all the information intact and shortened the file considerably, making it more concise and readable.

To add a comment (text that is skipped when the file is parsed), XML uses the following syntax:

    <!-- Ivan quit recently; he only has to work one more week.
         Don't forget to remove him from the list afterwards. -->

The last construct is CDATA, which stands for "character data". It lets you write text that will not be interpreted as XML markup. This is useful when an element inside an XML file stores XML markup as its data. Example:

    <?xml version="1.0" encoding="UTF-8"?>
    <bean>
        <information>
            <![CDATA[<name>Ivan</name><age>26</age>]]>
        </information>
    </bean>

The distinguishing feature of XML is that you can extend it however you want: use your own elements and attributes and structure the document as you wish. You can store data in attributes or in elements (as shown in the examples above). But consider what this freedom means in practice. You can invent elements and attributes on the fly, yet what if you work on a project where another programmer decides to move the name element into an attribute, while your entire program logic expects name to be an element? How do you define rules for which elements exist, which attributes they have and so on, so that you can validate XML files and be sure that these rules are the standard in your project and nobody breaks them?

There are special tools for writing down the rules of your own XML markup. The best known are DTD and XML Schema. This article covers only the first.
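To make the element/attribute distinction concrete, here is a minimal sketch in Python (chosen only for illustration; it uses the standard-library xml.etree.ElementTree module). It reads a shortened copy of the company document in its attribute-based form and shows that a CDATA section comes back as plain text. The helper name employees_by_room is hypothetical.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the article's company document (attribute-based variant).
COMPANY_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<company>
    <name>IT-Heaven</name>
    <offices>
        <office floor="1" room="1">
            <employees>
                <employee name="Maksim" job="Middle Software Developer"/>
                <employee name="Ivan" job="Junior Software Developer"/>
            </employees>
        </office>
        <office floor="1" room="2">
            <employees>
                <employee name="Herald" job="Middle Software Developer"/>
            </employees>
        </office>
    </offices>
</company>
"""

# The CDATA example: the markup inside CDATA is stored as text, not parsed.
BEAN_XML = "<bean><information><![CDATA[<name>Ivan</name>]]></information></bean>"


def employees_by_room(xml_bytes):
    """Map (floor, room) attribute pairs to the employee names in that office."""
    root = ET.fromstring(xml_bytes)
    result = {}
    for office in root.iter("office"):
        key = (office.get("floor"), office.get("room"))  # attributes
        result[key] = [e.get("name") for e in office.iter("employee")]
    return result


root = ET.fromstring(COMPANY_XML)
company_name = root.findtext("name")  # text of a child element
rooms = employees_by_room(COMPANY_XML)
information = ET.fromstring(BEAN_XML).find("information")
```

Attributes are read with get(), while element content is read with findtext(); the information element ends up with no child elements, only the literal text that was inside the CDATA section.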

DTD

DTD (Document Type Definition) is designed to describe document types. It is an older technology that is gradually being replaced by XML Schema, but many XML files still use DTD, and in general it is useful to understand it. DTD is a technology for validating XML documents. A DTD declares the rules for a document type: its elements, which elements may appear inside each element, the attributes, whether they are required or not, the number of repetitions, and also entities. As with XML, DTD is easiest to explain with an example.

    <!-- Declaration of the possible elements -->
    <!ELEMENT employee EMPTY>
    <!ELEMENT employees (employee+)>
    <!ELEMENT office (employees)>
    <!ELEMENT offices (office+)>
    <!ELEMENT name (#PCDATA)>
    <!ELEMENT company (name, offices)>

    <!-- Adding attributes for the employee and office elements -->
    <!ATTLIST employee
        name CDATA #REQUIRED
        job CDATA #REQUIRED
    >
    <!ATTLIST office
        floor CDATA #REQUIRED
        room CDATA #REQUIRED
    >

    <!-- Adding entities -->
    <!ENTITY M "Maksim">
    <!ENTITY I "Ivan">
    <!ENTITY F "Franklin">

In this simple example we have declared the whole hierarchy from the XML example: employee, employees, office, offices, name, company. DTD files use three basic constructs to describe any XML file: ELEMENT (to describe elements), ATTLIST (to describe the attributes of elements) and ENTITY (to substitute text for abbreviated forms).

ELEMENT

Describes an element. The elements that may be used inside the described element are listed in parentheses. You can use quantifiers to specify how many are allowed (they are the same as regular expression quantifiers):
  • + means one or more
  • * means zero or more
  • ? means zero or one
If no quantifier is given, exactly one element is expected. If we needed one of a group of elements, we could write:

    <!ELEMENT company ((name | offices))>

Then exactly one of the elements would be allowed, name or offices; if both appeared inside company, validation would fail. You can also see the word EMPTY on employee: it means that the element must be empty. There is also ANY (any content is allowed) and #PCDATA (text data).

ATTLIST

Adds attributes to elements. ATTLIST is followed by the name of the element and then by a list of "attribute name, attribute type" pairs, each of which can end with #IMPLIED (optional) or #REQUIRED (required). CDATA means text data. There are other types, but they are all essentially string-based.

ENTITY

ENTITY declares abbreviations and the text that will be substituted for them. In the XML we can then write, instead of the full text, just the entity name with & before it and ; after it. For example, to distinguish markup from plain characters, the left angle bracket is often escaped as &lt; (lt with & before it and ; after it). It is then treated not as markup but simply as the character <.

As you can see, everything is quite simple: you declare elements, specify which elements they may contain, add attributes to those elements and, optionally, add entities to shorten some entries. At this point you should ask: how do we use our rules in our XML file? After all, we have only declared the rules; we have not yet used them in XML.

There are two ways to use them in XML:

1. Embedding: the DTD rules are written inside the XML file itself. Write the name of the root element after the DOCTYPE keyword and enclose the DTD in square brackets.

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE company [
        <!-- Declaration of the possible elements -->
        <!ELEMENT employee EMPTY>
        <!ELEMENT employees (employee+)>
        <!ELEMENT office (employees)>
        <!ELEMENT offices (office+)>
        <!ELEMENT name (#PCDATA)>
        <!ELEMENT company (name, offices)>

        <!-- Adding attributes for the employee and office elements -->
        <!ATTLIST employee
            name CDATA #REQUIRED
            job CDATA #REQUIRED
        >
        <!ATTLIST office
            floor CDATA #REQUIRED
            room CDATA #REQUIRED
        >

        <!-- Adding entities -->
        <!ENTITY M "Maksim">
        <!ENTITY I "Ivan">
        <!ENTITY F "Franklin">
    ]>
    <company>
        <name>IT-Heaven</name>
        <!-- Ivan quit recently; he only has to work one more week.
             Don't forget to remove him from the list afterwards. -->
        <offices>
            <office floor="1" room="1">
                <employees>
                    <employee name="&M;" job="Middle Software Developer" />
                    <employee name="&I;" job="Junior Software Developer" />
                    <employee name="&F;" job="Junior Software Developer" />
                </employees>
            </office>
            <office floor="1" room="2">
                <employees>
                    <employee name="Herald" job="Middle Software Developer" />
                    <employee name="Adam" job="Middle Software Developer" />
                    <employee name="Leroy" job="Junior Software Developer" />
                </employees>
            </office>
        </offices>
    </company>

2. Import: all the rules go into a separate DTD file, and the XML file uses the DOCTYPE construct from the first method, except that instead of square brackets you write SYSTEM followed by the path to the DTD file, either absolute or relative to the location of the current file.

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE company SYSTEM "dtd_example1.dtd">
    <company>
        <name>IT-Heaven</name>
        <!-- Ivan quit recently; he only has to work one more week.
             Don't forget to remove him from the list afterwards. -->
        <offices>
            <office floor="1" room="1">
                <employees>
                    <employee name="&M;" job="Middle Software Developer" />
                    <employee name="&I;" job="Junior Software Developer" />
                    <employee name="&F;" job="Junior Software Developer" />
                </employees>
            </office>
            <office floor="1" room="2">
                <employees>
                    <employee name="Herald" job="Middle Software Developer" />
                    <employee name="Adam" job="Middle Software Developer" />
                    <employee name="Leroy" job="Junior Software Developer" />
                </employees>
            </office>
        </offices>
    </company>

You can also use the keyword PUBLIC instead of SYSTEM, but you probably won't need it. If you are interested, then you can read about it (and about SYSTEM too) in detail here: link.

Now we cannot use other elements without declaring them in the DTD, and the whole XML document obeys our rules. Try writing this code in IntelliJ IDEA in a separate file with the .xml extension, then add some new elements or remove an element from the DTD, and notice how the IDE reports an error.

However, DTD has its drawbacks:
  • It has its own syntax, different from XML syntax.
  • There is no data type checking in DTD; values can only be strings.
  • There is no namespace support in DTD.
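The second drawback in the list above can be illustrated with a short sketch. It is written in Python with the standard-library xml.etree.ElementTree module, which is an assumption of this example, not the article's tooling. That parser expands entities declared in an internal DTD but does not validate against the DTD at all. The DTD here is a reduced version of the article's, and non_numeric_floors is a hypothetical helper that performs by hand the numeric check a CDATA attribute type cannot express.

```python
import xml.etree.ElementTree as ET

# Reduced, illustrative version of the article's DTD (not the full one).
# floor is declared as CDATA, so the DTD accepts any string, even "first".
DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE company [
<!ELEMENT company (office+)>
<!ELEMENT office (employee+)>
<!ELEMENT employee EMPTY>
<!ATTLIST office floor CDATA #REQUIRED room CDATA #REQUIRED>
<!ATTLIST employee name CDATA #REQUIRED>
<!ENTITY M "Maksim">
]>
<company>
    <office floor="first" room="1">
        <employee name="&M;"/>
    </office>
</company>
"""


def non_numeric_floors(xml_bytes):
    """Return floor values that are not made up of decimal digits."""
    root = ET.fromstring(xml_bytes)
    return [
        office.get("floor")
        for office in root.iter("office")
        if not office.get("floor", "").isdecimal()
    ]


# The entity &M; from the internal DTD is expanded by the parser.
employee_name = ET.fromstring(DOC).find("office/employee").get("name")
bad_floors = non_numeric_floors(DOC)
```

The check on floor has to live in application code because the DTD itself has no way to say "this attribute must be a number".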
On the problem of its own syntax: you have to understand two syntaxes at once, XML and DTD. They are different, and this can be confusing. It also makes errors harder to track down in huge XML files paired with equally large DTD schemas. If something doesn't work, you have to check a large amount of text in two different syntaxes. It is like reading two books at the same time, one in Russian and one in English: if you know one of the languages less well, the text is harder to understand.

On the data type checking problem: attributes in a DTD do have different types, but they are all essentially string representations of something, lists or references. You cannot require that a value be a number, let alone a positive or negative one. You can forget about object types altogether.

The last problem will be discussed in the next article, which is devoted to namespaces and XML schemas, since there is no point in discussing it here.

Thank you all for your attention. I have done a great deal of work and continue to do so in order to finish the whole series on time. What remains is to work through XML schemas and come up with a clear explanation of them in order to finish the second article. Half of it is already done, so you can expect it soon. The last article will be devoted entirely to working with XML files in Java. Good luck to everyone and success in programming :)

You've probably heard of XML and you know many reasons why it should be used in your organization. But what exactly is XML? This article explains what XML is and how it works.

Marks, markup, and tags

To understand XML, it helps to recall how data can be marked up. People have created documents for centuries, and for just as long they have made marks on them. For example, teachers often mark up student papers to indicate the need to move paragraphs, make sentences clearer, correct spelling errors, and so on. Marks in a document can help define its structure, meaning, and appearance. If you have ever used the Track Changes feature in Microsoft Office Word, you are familiar with a computerized form of markup.

In information technology, the term "marks" has become "markup". Markup uses codes called tags (or sometimes tokens) to define the structure, visual presentation and, in the case of XML, the meaning of the data.

The HTML text of this article is a good example of computer markup. If you right-click this page in Microsoft Internet Explorer and select the View Source command, you will see readable text and HTML tags such as <p> and <h2>. In HTML and XML documents, tags are easy to recognize because they are enclosed in angle brackets. In the source of this article, HTML tags serve many functions, such as marking the beginning and end of each paragraph (<p> ... </p>) and the location of the figures.

Distinctive features of XML

HTML and XML documents contain tagged data, but that is where the similarities between the two languages end. In HTML, tags define how data is styled: the location of headings, the beginning of a paragraph, and so on. In XML, tags define the structure and meaning of data, that is, what it is.

By describing the structure and meaning of data, it becomes possible to reuse it in several ways. For example, if you have a block of sales data, each element in which is clearly defined, then you can load only the necessary elements into the sales report, and transfer other data to the accounting database. In other words, you can use one system to generate and tag data in XML format, and then process that data on any other system, regardless of the client platform or operating system. This interoperability makes XML the foundation of one of the most popular data exchange technologies.
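As a rough illustration of this kind of reuse, the sketch below (Python, standard library only) reads one block of sales data and feeds two different consumers from it. The sales and sale elements, their attributes, and both helper functions are hypothetical; the article does not define a sales format.

```python
import xml.etree.ElementTree as ET

# Hypothetical sales data; element and attribute names are assumptions.
SALES_XML = """<sales>
    <sale region="North" amount="1200.50" invoice="A-17"/>
    <sale region="South" amount="830.00" invoice="A-18"/>
</sales>"""

root = ET.fromstring(SALES_XML)


def sales_report(sales_root):
    """Total amount per region: only the fields a sales report needs."""
    totals = {}
    for sale in sales_root.iter("sale"):
        region = sale.get("region")
        totals[region] = totals.get(region, 0.0) + float(sale.get("amount"))
    return totals


def accounting_rows(sales_root):
    """(invoice, amount) pairs: the fields an accounting system might store."""
    return [
        (sale.get("invoice"), float(sale.get("amount")))
        for sale in sales_root.iter("sale")
    ]
```

Both functions read the same document; because every value is labeled, each consumer can pick out just the pieces it cares about.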

Consider the following when working:

    HTML cannot be used in place of XML. However, XML data can be wrapped in HTML tags and displayed on web pages.

    HTML capabilities are limited to a predefined set of tags that are common to all users.

    XML rules allow you to create any tags required to describe the data and its structure. Let's say you need to store and share information about pets. To do this, you can create the following XML:

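    <?xml version="1.0"?>
    <CAT>
        <NAME>Izzy</NAME>
        <BREED>Siamese</BREED>
        <AGE>6</AGE>
        <ALTERED>yes</ALTERED>
        <DECLAWED>no</DECLAWED>
        <LICENSE>Izz138bod</LICENSE>
        <OWNER>Colin Wilcox</OWNER>
    </CAT>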

As you can see, the XML tags tell you what data you are viewing. For example, it is clear that this is data about a cat, and you can easily determine its name, age, and so on. Thanks to the ability to create tags that define almost any data structure, XML is extensible.

But don't confuse the tags in this example with the tags in the HTML file. For example, if the above XML text is pasted into an HTML file and opened in a browser, the results will look like this:

Izzy Siamese 6 yes no Izz138bod Colin Wilcox

The web browser will ignore the XML tags and display only the data.
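The effect described above, where the tags disappear and only the data remains, can be imitated with a small Python sketch using the standard-library xml.etree.ElementTree module. The tag names follow those mentioned elsewhere in this article (CAT, ALTERED, DECLAWED and the column names of the transform example); visible_text is a hypothetical helper, and real browsers of course do much more than this.

```python
import xml.etree.ElementTree as ET

CAT_XML = """<CAT>
  <NAME>Izzy</NAME>
  <BREED>Siamese</BREED>
  <AGE>6</AGE>
  <ALTERED>yes</ALTERED>
  <DECLAWED>no</DECLAWED>
  <LICENSE>Izz138bod</LICENSE>
  <OWNER>Colin Wilcox</OWNER>
</CAT>"""


def visible_text(xml_text):
    """Join the text content of all elements, dropping the tags themselves."""
    root = ET.fromstring(xml_text)
    pieces = (text.strip() for text in root.itertext())
    return " ".join(piece for piece in pieces if piece)


shown = visible_text(CAT_XML)
```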

Well-formed data

You've probably heard some IT professional talk about a "well-formed" XML file. A well-formed XML file must follow very strict rules. If it doesn't follow these rules, XML won't work. For example, in the previous example, each start tag has a corresponding end tag, so in this example one of the rules of a well-formed XML file is met. If you delete a tag from the file and try to open it in one of the Office programs, an error message will appear and you will not be able to use such a file.

You don't need to know the rules for creating a well-formed XML file (although they are easy to understand), but remember that you can only use well-formed XML data in other applications and systems. If the XML file does not open, then it is probably not well-formed.
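For readers who do want to see what "not well-formed" means in practice, here is a minimal sketch in Python (an assumption of this example; the article itself works with Office programs). It asks the standard-library parser where it stops; first_error_position is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def first_error_position(xml_text):
    """Return (line, column) of the first well-formedness error, or None."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        return error.position
    return None


WELL_FORMED = "<CAT>\n  <NAME>Izzy</NAME>\n</CAT>"
# The end tag of NAME has been deleted, so </CAT> no longer matches.
BROKEN = "<CAT>\n  <NAME>Izzy</CAT>"

ok_position = first_error_position(WELL_FORMED)
broken_position = first_error_position(BROKEN)
```

A parser gives up at the first violation, which is why a file with a single missing tag cannot be opened at all.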

XML is platform independent, which means that any program built to use XML can read and process XML data regardless of hardware or operating system. For example, if you apply the correct XML tags, you can use a desktop program to open and process data from the mainframe. And, no matter who created the XML data, you can work with the data in a variety of Office applications. Because of its interoperability, XML has become one of the most popular technologies for exchanging data between databases and user computers.

In addition to well-formed tagged data, XML systems typically use two additional components: schemas and transformations. The following sections describe how they work.

Schemas

Don't be intimidated by the term "schema". A schema is simply an XML file that contains rules for the content of an XML data file. Schema files usually have the XSD extension, while XML data files use the XML extension.

Schemas allow programs to validate data. They define the structure of the data and make it understandable to its creator and to other people. For example, if a user enters invalid data, such as text in a date field, the program can prompt the user to correct it. If the data in the XML file follows the rules in the schema, you can use any program that supports XML to read, interpret, and process it. For example, as shown in the picture below, Excel can validate data against the CAT schema.

Schemas can be complex, and this article cannot explain how to create them. (Also, chances are your organization has IT people who know how to do this.) However, it is useful to know what schemas look like. The following schema defines the rules for a set of tags ... :

Don't worry if the example isn't clear. Just pay attention to the following:

    The line items in the above example schema are called declarations. If additional information about the animal, such as color or special characteristics, were required, the IT department would add declarations to the schema. The XML system can be modified as business needs evolve.

    Declarations are powerful tools for controlling data structure. For example, a sequence declaration means that the tags must appear in the order given above. You can also use declarations to validate the types of data entered by the user. For example, the above schema requires a positive integer for the age of the cat and Boolean values (TRUE or FALSE) for the ALTERED and DECLAWED tags.

    If the data in the XML file conforms to the schema rules, then the data is said to be valid. The process of verifying that an XML data file conforms to schema rules is (logically enough) called validation. The big advantage of using schemas is that they can prevent data corruption. Schemas also make it easier to find corrupted data, because when this problem occurs, processing of the XML file stops.
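The rules described in the points above (fixed tag order, a positive integer age, Boolean ALTERED and DECLAWED values) can be spelled out by hand. The Python sketch below is not XML Schema validation; it is a hand-written stand-in that only mirrors those three rules. The tag order is assumed from the column order of the transform example, and validate_cat, GOOD_CAT and BAD_CAT are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Assumed tag order, taken from the column list used later in the article.
EXPECTED_ORDER = ["NAME", "BREED", "AGE", "ALTERED", "DECLAWED", "LICENSE", "OWNER"]


def validate_cat(xml_text):
    """Return a list of rule violations; an empty list means the data passed."""
    errors = []
    root = ET.fromstring(xml_text)
    if [child.tag for child in root] != EXPECTED_ORDER:
        errors.append("child elements are missing or out of order")
    age = root.findtext("AGE", default="").strip()
    if not (age.isdecimal() and int(age) > 0):
        errors.append("AGE must be a positive integer")
    for tag in ("ALTERED", "DECLAWED"):
        value = root.findtext(tag, default="").strip().upper()
        if value not in ("TRUE", "FALSE"):
            errors.append(tag + " must be TRUE or FALSE")
    return errors


GOOD_CAT = (
    "<CAT><NAME>Izzy</NAME><BREED>Siamese</BREED><AGE>6</AGE>"
    "<ALTERED>TRUE</ALTERED><DECLAWED>FALSE</DECLAWED>"
    "<LICENSE>Izz138bod</LICENSE><OWNER>Colin Wilcox</OWNER></CAT>"
)
BAD_CAT = GOOD_CAT.replace("<AGE>6</AGE>", "<AGE>six</AGE>")
```

A real schema expresses the same constraints declaratively, so that any schema-aware program can enforce them without custom code like this.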

Transformations

As discussed above, XML also allows for efficient use and reuse of data. The mechanism for reusing data is called an Extensible Stylesheet Language Transformation (XSLT), or simply a transform.

You (or your IT department) can also use transformations to exchange data between back-end systems, such as between databases. Suppose database A stores the sales data in a table that is convenient for the sales department. Database B stores data on income and expenses in a table specially designed for accounting. Database B can use a transform to take data from database A and place it in the appropriate tables.

The combination of a data file, a schema, and a transform forms a basic XML system. The following figure shows the operation of such systems. The data file is checked against the schema rules and then passed in any suitable way to a transform. In this case, the transform places the data in a table on a web page.

The following example provides a transform that loads data to a table on a web page. The point of the example is not to explain how to create transformations, but to show one of the forms that they can take.

[Transform listing; its table columns are Name, Breed, Age, Altered, Declawed, License, Owner]

This example shows what one kind of transform might look like, but remember that you can limit yourself to a clear description of what you need from the data, and you can give that description in your own words. For example, you might go to the IT department and say that you want to print sales data for specific regions for the last two years, laid out in a particular way. The department can then write (or modify) a transform to fulfill your request.
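To give a feel for what such a transform does, here is a sketch of the same idea in Python. It is not XSLT: it is plain Python standing in for a transform, and cat_to_html_table is a hypothetical name. It takes the cat data and produces an HTML table with the columns listed above.

```python
import xml.etree.ElementTree as ET

COLUMNS = ["NAME", "BREED", "AGE", "ALTERED", "DECLAWED", "LICENSE", "OWNER"]

CAT_DATA = (
    "<CAT><NAME>Izzy</NAME><BREED>Siamese</BREED><AGE>6</AGE>"
    "<ALTERED>yes</ALTERED><DECLAWED>no</DECLAWED>"
    "<LICENSE>Izz138bod</LICENSE><OWNER>Colin Wilcox</OWNER></CAT>"
)


def cat_to_html_table(xml_text):
    """Build an HTML table with one header row and one data row."""
    cat = ET.fromstring(xml_text)
    table = ET.Element("table")
    header = ET.SubElement(table, "tr")
    for column in COLUMNS:
        ET.SubElement(header, "th").text = column.capitalize()
    row = ET.SubElement(table, "tr")
    for column in COLUMNS:
        ET.SubElement(row, "td").text = cat.findtext(column, default="")
    return ET.tostring(table, encoding="unicode")


html = cat_to_html_table(CAT_DATA)
```

The input data is untouched; only its presentation changes, which is the point of keeping data and transforms separate.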

Microsoft and a growing number of other companies are creating transformations for a variety of tasks to make XML even more convenient to use. In the future, you will most likely be able to download a transform that suits your needs without additional configuration or with minor changes. This means that over time, using XML will be less and less expensive.

XML in the Microsoft Office system

The professional editions of Office provide enhanced XML support. Starting with the 2007 Microsoft Office System, Microsoft Office uses XML-based file formats such as DOCX, XLSX, and PPTX. Because XML stores data in a text format rather than a proprietary binary format, customers can define their own schemas and use their data in a variety of ways without having to pay royalties. For more information on the new formats, see the article Open XML Formats and File Name Extensions. Other benefits are given below.

This is all great, but what if you have XML data without a schema? It depends on which Office program you're using. For example, when you open an XML file without a schema in Excel, Excel infers a schema and lets you load the data into an XML table. You can use XML lists and tables to sort, filter, and calculate data.

Enabling XML Tools in Office

By default, the Developer tab is not displayed. It must be added to the ribbon to use XML commands in Office.

Today we begin looking at XML, a very popular and convenient markup language. Because this data format is flexible and universal, it can be used almost everywhere. Sooner or later a novice programmer will have to deal with it, whatever the field, be it web programming or database administration: everyone uses XML, and you will use it too when solving your own tasks.

We will start, as usual, with theory: what kind of language it is, why it is good, how to use it and where it is used.

XML language definition

XML (eXtensible Markup Language) is a universal, extensible data markup language that does not depend on the operating system or processing environment. XML represents data as a structure, and you can design this structure yourself or adapt it to a particular program or service. That is why the language is called extensible, and this is its main advantage, for which it is so valued.

There are quite a few markup languages, HTML among them, but they all depend on a handler in one way or another. HTML, which is parsed by the browser, is standardized and not extensible: it has fixed tags and a syntax that cannot be broken. In XML you can create your own tags, that is, your own markup. The main difference between the two is that HTML only describes markup for displaying data, while XML is an abstract data structure that can be processed and displayed however and wherever you like. There is therefore little point in comparing these languages; they have completely different purposes.

As noted above, XML is a very common and universal language. A great many applications, both web and desktop, use it to exchange information, because it makes it easy to pass data between applications or services, even ones written in different languages. For this reason, every novice programmer, whatever the field, should understand XML. If you want to become a webmaster, you simply have to know XML; we have already covered how to become a webmaster and what you need to know for it.

For example, I once had to write a service that returned data as XML on request, in other words, to develop the server side of an application. I had no idea what the client that would process this data was written in, and it did not matter: I wrote a service that returned XML, and the application worked fine. This is just one case that I ran into myself; now imagine how many organizations cooperate, jointly develop software and exchange data. I would not be surprised if that data were in XML.
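A service of that kind might build its response roughly as follows. This is only a sketch in Python with the standard library (xml_declaration requires Python 3.8 or later); the orders data, the element names and the build_response function are all hypothetical and are not taken from the service described above.

```python
import xml.etree.ElementTree as ET


def build_response(orders):
    """Serialize a list of order dictionaries to UTF-8 encoded XML bytes."""
    root = ET.Element("orders")
    for order in orders:
        item = ET.SubElement(root, "order", id=str(order["id"]))
        ET.SubElement(item, "customer").text = order["customer"]
        ET.SubElement(item, "total").text = f'{order["total"]:.2f}'
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


response = build_response([{"id": 1, "customer": "Anna & Co", "total": 99.5}])

# Any client, in any language, can parse the bytes back into a structure.
parsed = ET.fromstring(response)
```

The library escapes special characters such as & on output, so the client receives well-formed XML regardless of what the data contains.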

I also once had to store XML data in MS SQL 2008 in order to represent the data more conveniently and exchange it between the server and the client parts of an application; we discussed this in the article "Transact-sql - working with xml".

The XML language itself is very simple, and it is hard to get confused by it. All the complexity arises in processing XML and in its interaction with other applications and technologies, that is, in everything that surrounds XML; that is where it is easy to get lost.

Today we are talking only about the basics of XML. We will not dwell on the technologies for processing and interacting with it, since that is a very large topic, but I think we will get to those related technologies in the future.

Let's get down to practice. I will write all the examples in Notepad++ simply because it is very convenient; we will not discuss it here, since we already did so in the article "What is good about Notepad ++ for a novice developer".

XML tags

XML for markup uses tags ( tags are case sensitive), but not such tags as in html, but those that you come up with yourself, but the xml document also has a clear structure, i.e. there is an opening tag and an end tag, there are nested tags and, of course, there are values ​​that are located in these tags. In other words, all it takes to get started with xml is to just stick to these rules. Together, the opening, closing tag and the value are called an element, and the entire xml document consists precisely of the elements that together form the data structure. An xml document can have only one root element, remember this, because if you write two root elements, it will be an error.

Now it is time to give an example of XML markup. The first one just shows the syntax:

<Start of element> <Start of nested element>Value of the nested element</End of nested element> </End of element>

As you can see, everything is quite simple, and there can be a lot of such nested elements.

Now let's look at an example of a real XML document:

As you can see, this is an example of a simple book catalog. I did not declare this document, i.e. I did not write the XML declaration that tells the application processing the data that this is XML and which encoding it uses. You can also write comments and attributes, so here is an example of such a document:

Book 1 Ivan Just Book 1 Book 2 Sergey Just Book 2 Book 3 Novel Just Book 3

Here the first line is the XML declaration, which states that this is an XML document and that it must be read in UTF-8 encoding.

Without any processing, this data will look as follows in a browser (for example, Mozilla Firefox):

I hope it is clear that catalog is the root element here. It consists of book elements, each of which in turn consists of name, author and comment elements. As an example, I also set several attributes on the catalog element and the book elements.
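To show how a program might read such a catalog, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The document in it is a stand-in: the attribute names (`name`, `id`) and the comment text are assumptions made for illustration, not the attributes of the original example.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog: the attributes "name" and "id" are illustrative only.
CATALOG_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<!-- Book catalog -->
<catalog name="My books">
  <book id="1">
    <name>Book 1</name>
    <author>Ivan</author>
    <comment>Just Book 1</comment>
  </book>
  <book id="2">
    <name>Book 2</name>
    <author>Sergey</author>
    <comment>Just Book 2</comment>
  </book>
</catalog>
"""

# Passing bytes lets the parser take the encoding from the XML declaration.
catalog = ET.fromstring(CATALOG_XML)

# Collect (id attribute, name, author) for every book element under the root.
books = [
    (book.get("id"), book.findtext("name"), book.findtext("author"))
    for book in catalog.findall("book")
]

print(catalog.tag, catalog.get("name"))
for book_id, name, author in books:
    print(book_id, name, author)
```

Attributes are read with `get`, while the text of child elements is read with `findtext`; this mirrors the difference between attributes and nested elements in the markup.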

For the basics, I think that's enough, because if we dive deeper and deeper into XML and all the technologies associated with it, this article will never end. So that's all for today. See you next time!

XML was created to describe data, with a focus on what the data is.

HTML was created to display data, with a focus on how the data looks.

What is XML?

  • XML stands for EXtensible Markup Language
  • XML is a markup language similar to HTML
  • XML was created to describe data
  • XML tags are not predefined. You can define your own tags
  • XML uses a Document Type Definition (DTD) or an XML Schema to describe the data
  • XML is a W3C Recommendation

The main difference between XML and HTML

XML was designed to transfer data.

XML is not a replacement for HTML.

XML and HTML were developed for different purposes:

  • XML was created to describe data and focuses on what data is transmitted
  • HTML was designed to display data and focuses on how the data looks
  • Thus, HTML is more about displaying information, while XML is more about describing information.

XML does nothing

XML was not built to take any action.

It might be tricky to understand, but XML doesn't do anything. This markup language was created for structuring, storing and transmitting information. The following example is a note from Anton to Ira, presented in XML:

<note>
  <to>Ira</to>
  <from>Anton</from>
  <heading>Reminder</heading>
  <body>Don't forget to meet this week!</body>
</note>

As you can see, XML is very concise.

The note (<note>) consists of a heading (<heading>) and the body of the letter (<body>). It also contains the sender (the <from> tag, "from whom") and the recipient (the <to> tag, "to whom"). But this letter does nothing. It is pure information wrapped in tags. In order to send, receive or display this information, someone has to write a program.
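As a minimal sketch of what "someone has to write a program" can mean, the following Python snippet reads such a note with the standard `xml.etree.ElementTree` module and turns it into a printable message. The tag names (`note`, `to`, `from`, `heading`, `body`) and the `format_note` helper are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical note document; the tag names are assumed for this sketch.
NOTE_XML = """<note>
  <to>Ira</to>
  <from>Anton</from>
  <heading>Reminder</heading>
  <body>Don't forget to meet this week!</body>
</note>
"""


def format_note(xml_text):
    """Parse a note document and return it as a one-line message."""
    note = ET.fromstring(xml_text)
    sender = note.findtext("from")
    recipient = note.findtext("to")
    heading = note.findtext("heading")
    body = note.findtext("body")
    return f"{heading} from {sender} to {recipient}: {body}"


print(format_note(NOTE_XML))
```

The XML only carries the data; everything that happens to it, here a simple formatting step, lives in the program that reads it.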

XML - Free Extensible Markup Language

XML tags are not predefined. You can define your own tags.

Tags and document structure in HTML are predefined. The creator of the html document can only use the tags defined by the standards.

XML allows the author of an XML document to define their own tags and document structure. The tags in the example above (for example, <to> and <from>) are not defined by the XML standard. They were introduced by the author of the document.

XML is complementary to HTML

XML is not a replacement for HTML.

It is important to understand that XML is not a replacement for HTML. In the future, web developers will use XML to describe data, while HTML will be used to format and display that data.

My best definition of XML is this: XML is a cross-platform, software- and hardware-agnostic communication tool.

Note: cross-platform means suitable for any operating system and any hardware.

As you may know, there are various operating systems besides the familiar Windows, such as Linux, Mac OS and others.

As for hardware, it can be ordinary PCs, laptops, PDAs, etc.

XML in Future Web Development

XML will be used everywhere.

We have witnessed the development of XML since its inception. It was amazing to see how quickly the XML standard was developed and how quickly a large number of software vendors adopted it. We strongly believe that XML will be just as important for the future of the Internet as HTML, which is its foundation, and that XML will be the most prevalent tool for all data manipulation and transmission.

Lucinda Dykes, Ed Tittel

XML is a markup language that can be used to build web pages. Before you start using XML, learn the difference between a valid document and a well-formed document, and how to write Document Type Definition (DTD) element rules and basic schema declarations for an XML document. You will also want to know the commonly used reserved characters and which web browsers best support XML and stylesheets.

Valid versus well-formed XML document

In XML, a valid document must conform to the rules in its DTD (Document Type Definition) or schema, which defines what elements can appear in the document and how elements can nest within one another. A document that is not well-formed does not get very far in the XML world, so you have to play by a few very simple rules when creating an XML document. A well-formed document must meet the following requirements:

    All start and end tags match. In other words, the opening and closing parts must always contain the same name in the same case: <tag>...</tag> or <TAG>...</TAG>, but not <tag>...</TAG>.

    Empty elements follow the special XML syntax, for example <empty/>.

    All attribute values appear in single or double quotes: <tag id="value"> or <tag id='value'>.
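A well-formedness check along the lines of the rules above can be sketched with Python's standard `xml.etree.ElementTree` parser, which rejects any input that is not well-formed. The helper name `is_well_formed` and the sample tags are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# One sample per rule: a passing case and, where useful, a failing one.
samples = {
    "matching tags": "<tag>text</tag>",
    "mismatched case": "<tag>text</TAG>",
    "empty element": "<doc><empty/></doc>",
    "double-quoted attribute": '<doc id="value"/>',
    "single-quoted attribute": "<doc id='value'/>",
    "unquoted attribute": "<doc id=value/>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {ok}")
```

Note that this only checks well-formedness. Whether a document is valid against a DTD or schema is a separate question that this helper does not answer.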

Rules for Creating a Document Type Definition or DTD, Elements

Basically, you prepare and use a Document Type Definition (DTD) to add structure and logic, which makes it easier to ensure that all the required parts are present, in the correct order, in your XML document. A DTD can contain many rules that control how elements may be used in an XML document.

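Symbol                     Meaning                                       Example
#PCDATA                    Contains parsed character data (text)         <!ELEMENT element-name (#PCDATA)>
#PCDATA, element-name      Contains text and other elements; #PCDATA     <!ELEMENT element-name (#PCDATA | child)*>
                           always appears first in the rule
, (comma)                  Elements must be used in this order           <!ELEMENT element-name (child1, child2, child3)>
| (pipe)                   Use only one of the options provided          <!ELEMENT element-name (child1 | child2 | child3)>
element-name (by itself)   Use exactly once                              <!ELEMENT element-name (child)>
element-name?              Use once or not at all                        <!ELEMENT element-name (child1, child2?, child3?)>
element-name+              Use one or more times                         <!ELEMENT element-name (child1+, child2?, child3)>
element-name*              Use once, many times, or not at all           <!ELEMENT element-name (child1*, child2+, child3)>
( )                        Indicates groups; can be nested               <!ELEMENT element-name (#PCDATA | child)*> or
                                                                         <!ELEMENT element-name ((child1*, child2+, child3)* | child4)>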
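To make the symbols concrete, here is a small sketch of a DTD embedded in a document as an internal subset and parsed with Python's standard `xml.etree.ElementTree`. The element names are hypothetical. Keep in mind that this standard-library parser is non-validating: it reads the DTD but does not enforce it, so the second document, which breaks the `(name, author?)` rule, still parses. Checking validity would need a validating parser, which is outside the scope of this sketch.

```python
import xml.etree.ElementTree as ET

# Internal DTD: a catalog holds one or more books; each book has a name
# followed by an optional author; name and author contain text only.
DTD = """<!DOCTYPE catalog [
  <!ELEMENT catalog (book+)>
  <!ELEMENT book (name, author?)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
]>
"""

# Follows the DTD: name first, then the optional author.
valid_doc = DTD + "<catalog><book><name>Book 1</name><author>Ivan</author></book></catalog>"

# Well-formed, but breaks the DTD: author appears before name.
invalid_doc = DTD + "<catalog><book><author>Ivan</author><name>Book 1</name></book></catalog>"

# Both parse, because ElementTree checks well-formedness only.
valid_children = [child.tag for child in ET.fromstring(valid_doc).find("book")]
invalid_children = [child.tag for child in ET.fromstring(invalid_doc).find("book")]

print(valid_children)
print(invalid_children)
```

This is the practical difference between well-formed and valid: both documents are well-formed, but only the first one is valid against the DTD.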

Basic XML Schema Declarations

An XML Schema document is built from a series of declarations that give very detailed information and ensure that the information contained in the XML document is in the correct form.

Declaration              Purpose                                                       Syntax
Schema                   Specifies the language that the schema uses                   <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
Element                  Defines an element                                            <xsd:element name="name">
Attribute                Defines an attribute                                          <xsd:attribute name="name" type="type">
Complex type             Defines an element that contains other elements, contains     <xsd:complexType>
                         attributes, or contains mixed content (elements and text)
Simple type              Creates a constrained data type for an element or             <xsd:simpleType>
                         attribute value
Sequence compositor      Specifies that attributes or elements in a complex type       <xsd:sequence>
                         must be listed in order
Choice compositor        Specifies that any one of the attributes or elements in a     <xsd:choice>
                         complex type can be used
All compositor           Specifies that any or all of the attributes or elements in    <xsd:all>
                         a complex type can be used
Annotation               Contains documentation and/or appInfo elements that provide   <xsd:annotation>
                         additional information and comments on the schema document
Documentation            Provides human-readable information within an annotation      <xsd:documentation>
Application information  Provides computer-readable information within an annotation   <xsd:appInfo>
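Because an XML Schema is itself an XML document, its declarations can be inspected with an ordinary parser. Below is a minimal sketch that assumes Python's standard `xml.etree.ElementTree`; the schema content (a `book` element with `name` and `author` children and an `id` attribute) is hypothetical. The sketch only reads the schema and does not validate documents against it.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace mapping used when searching the schema.
XSD_NS = {"xsd": "http://www.w3.org/2001/XMLSchema"}

# Hypothetical schema: a book has a name and an author (in that order)
# and an integer id attribute.
SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="book">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="name" type="xsd:string"/>
        <xsd:element name="author" type="xsd:string"/>
      </xsd:sequence>
      <xsd:attribute name="id" type="xsd:integer"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
"""

schema = ET.fromstring(SCHEMA)

# Names of all element declarations, in document order.
element_names = [el.get("name") for el in schema.iterfind(".//xsd:element", XSD_NS)]

# Names of all attribute declarations.
attribute_names = [at.get("name") for at in schema.iterfind(".//xsd:attribute", XSD_NS)]

print(element_names)
print(attribute_names)
```

The `xsd` prefix is just a local shorthand; what identifies a declaration as part of XML Schema is the namespace URI bound to that prefix.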

Common reserved characters in XML

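Some characters are reserved for internal use in XML and should be replaced with entity references in your content. These five commonly used entities are predefined as part of XML and are ready to use:

  • &lt; for < (less than)
  • &gt; for > (greater than)
  • &amp; for & (ampersand)
  • &quot; for " (double quote)
  • &apos; for ' (apostrophe)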
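In practice a library usually does this replacement for you. Here is a minimal sketch using `escape` and `unescape` from Python's standard `xml.sax.saxutils` module. By default `escape` handles `&`, `<` and `>`; the quote characters are passed in explicitly here. The sample text is made up.

```python
from xml.sax.saxutils import escape, unescape

# escape() always replaces &, < and >; quotes are added via the extra mapping.
QUOTES = {'"': "&quot;", "'": "&apos;"}
UNQUOTES = {"&quot;": '"', "&apos;": "'"}

raw = """Tom & Jerry said: "1 < 2" and 'b > a'"""
escaped = escape(raw, QUOTES)
restored = unescape(escaped, UNQUOTES)

print(escaped)
print(restored == raw)
```

Escaping the quote characters matters mainly inside attribute values, where an unescaped quote would end the value early.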

CSS1?

XSLT 1.0?
Yes Yes No No
Internet Explorer 6.0 Yes Yes Yes Yes
Mozilla 1.5 Yes Yes Yes Yes
Mozilla Firefox 1.0 Yes Yes Yes Yes
Netscape Navigator 7 Yes Yes Yes Yes
Opera 7 Yes Yes Yes No