Elements and Tags

Elements

Elements are the building blocks of markup languages such as XML. This is an example of a text element:

[xml] <text>This is the text</text>[/xml]

An element starts with an opening tag (here: <text>), followed by its content (here: This is the text), and ends with a closing tag (here: </text>).


Tags

Tags mark the start and the end of an element. There are three types of tags: opening tags, closing tags, and milestone tags:

  1. An opening tag starts with an opening angled bracket (<), followed by the element's name (here: text), and ends with a closing angled bracket (>).
  2. A closing tag starts with an opening angled bracket and a forward slash (</), followed by the element's name (here: text), and ends with a closing angled bracket (>).
  3. A milestone tag represents what is called an empty element: an element that does not have any content. Empty elements can be used to mark a specific point in the text, rather than a selection of text. A good example is the element that represents a page break: <pb/>. A milestone tag starts with an opening angled bracket (<), followed by the element's name (here: pb), and ends with a forward slash and a closing angled bracket (/>).

[table]
No.,type,description,example
1,opening tag,opens an element,"<text>"
2,closing tag,closes an element,"</text>"
3,milestone tag,represents an empty element,"<pb/>"
[/table]
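
To see a milestone tag in context, here is a minimal sketch of a paragraph that runs across a page break. The wording is invented for illustration, and it uses the <p> (paragraph) tag that is mentioned further down this page:

[xml] <p>The last words of one page <pb/>and the first words of the next.</p>[/xml]

Note that <pb/> does not enclose any text: it only marks the point where the new page begins.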


Nesting

Tags beget tags. In the following example, <text> is what we call the parent tag of <body> and <body> is the child tag of <text>.

[xml] <text>
<body>The body of the text</body>
</text>[/xml]

Important!

For an XML document to work, it needs to be well-formed. That means that all child tags have to be properly nested inside their parent tags, like Russian Matryoshka dolls: a child tag must be closed before its parent tag is closed.

For example:

[xml highlight="3"] <text> <body></body> </text>

is CORRECT[/xml]

while:

[xml highlight="3"] <text> <body></text> </body>

is INCORRECT[/xml]


Validation

While your XML will work (i.e. it can be displayed in a browser without producing any errors) as long as it is well-formed, it also needs to be valid if you want other people to be able to use and process it. In our case, because we work with TEI-compliant XML, this means that our code needs to comply with the TEI Guidelines.

These guidelines do not only specify which tags we can use (e.g.: that <p> describes a paragraph, or that <del> describes a deletion); they also contain a strict set of rules to specify exactly how we can use those tags. For example, they specify that the <body> tag (which represents a body of text) must always be the direct child of a <text> tag. This means that

[xml highlight="8"] <TEI>
<text>
<body>
</body>
</text>
</TEI>

is CORRECT[/xml]

while:

[xml highlight="6"] <TEI>
<body>
</body>
</TEI>

is INCORRECT[/xml]
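
Putting the tags mentioned above together, a fragment along the following lines should satisfy the rule about <body>. It is only a sketch, not a complete TEI document: the wording is invented, and whether <p> and <del> may appear exactly where they do here is something to confirm in each tag's validation information.

[xml] <TEI>
<text>
<body>
<p>A paragraph with a <del>deleted</del> word</p>
</body>
</text>
</TEI>[/xml]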

To help you follow these rules without having to look them up in the TEI Guidelines for every tag, we have decided to start the description of every tag with a table that contains that tag's validation information. Those tables should speak for themselves: every tag must be the direct child of one of the tags in its Contained by section, and every tag can only be the parent tag of the tags specified in its May Contain section. (The table also contains a list of the attributes that the tag can take, but we'll get to that in the next section, when we discuss Attributes and their Values.)

The BDMP's full validation schema can also be downloaded from our Validation page.