23 Documentation Elements

This chapter describes a module which may be used for the documentation of the XML elements and element classes which make up any markup scheme, in particular that described by the TEI Guidelines, and also for the automatic generation of schemas or DTDs conforming to that documentation. It should also be used by those wishing to customize or modify these Guidelines in a conformant manner, as further described in chapters 24.3 Customization and 24.4 Conformance. It may also be useful in the documentation of any other comparable encoding scheme, even though it contains some aspects which are specific to the TEI and may not be generally applicable.

An overview of the kind of processing environment envisaged for the module described by this chapter may be helpful. In the remainder of this chapter we refer to software which provides such a processing environment as an ODD processor. Like any other piece of XML software, an ODD processor may be instantiated in many ways: the current system uses a number of XSLT stylesheets which are freely available from the TEI, but this specification makes no particular assumptions about the tools which will be used to provide an ODD processing environment.

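As the name suggests, an ODD processor uses a single XML document to generate multiple outputs. As noted at the start of this chapter, these outputs include documentation of the markup scheme and schemas or DTDs conforming to that documentation.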

The input required to generate these outputs consists of running prose, and special purpose elements documenting the components (elements, classes, etc.) which are to be declared in the chosen schema language. All of this input is encoded in XML using elements defined in this chapter. In order to support more than one schema language, these elements constitute a comparatively high-level model which can then be mapped by an ODD processor to the specific constructs appropriate for the schema language in use. Although some modern schema languages such as RELAX NG or W3C Schema natively support self-documentary features of this kind, we have chosen to retain the ODD model, if only for reasons of compatibility with earlier versions of these Guidelines. For reasons of backwards compatibility, the ISO standard XML schema language RELAX NG (http://www.relaxng.org) may be used as a means of declaring content models and datatypes, but it is also possible to express content models using native TEI XML constructs. We also use the ISO Schematron language to define additional constraints beyond those expressed in the content model, as further discussed in 23.5.2 Additional Constraints below.

In the TEI system, a schema is built by combining element and attribute declarations, more or less as required. Each element is documented by an appropriate specification element and has an identifier unique across the whole TEI scheme. For convenience, these specifications are grouped into a number of discrete modules, which can also be combined more or less as required. Each major chapter of these Guidelines defines a distinct module. Each module declares a number of elements specific to that module, and may also populate particular classes. All classes are available globally, irrespective of the module in which they are declared; particular modules extend the meaning of a class by adding elements or attributes to it. Wherever possible, element content models are defined in terms of classes rather than in terms of specific elements. Modules can also declare particular patterns, which act as short-cuts for commonly used content models or class references.

In the present chapter, we discuss the components needed to support this system. In addition, section 23.1 Phrase Level Documentary Elements discusses some general purpose elements which may be useful in any kind of technical documentation, wherever there is need to talk about technical features of an XML encoding such as element names and attributes. Section 23.2 Modules and Schemas discusses the elements which are used to document XML modules and their high-level components. Section 23.3 Specification Elements discusses the elements which document XML elements and their attributes, element classes, and generic patterns or macros. Finally, section 23.9 Module for Documentation Elements provides a summary overview of the elements provided by this module.

23.1 Phrase Level Documentary Elements

23.1.1 Phrase Level Terms

In any kind of technical documentation, the following phrase-level elements may be found useful for marking up strings of text which need to be distinguished from the running text because they come from some formal language:

  • code contains literal code from some formal language such as a programming language.
    lang (formal language) a name identifying the formal language in which the code is expressed.
  • ident (identifier) contains an identifier or name for an object of some kind in a formal language. ident is used for tokens such as variable names, class names, type names, function names etc. in formal programming languages.

Like other phrase-level elements used to indicate the semantics of a typographically distinct string, these are members of the model.emphLike class. They are available anywhere that running prose is permitted when the module defined by this chapter is included in a schema.

The code and ident elements are intended for use when citing brief passages in some formal language such as a programming language, as in the following example:
<p>If the variable <ident>z</ident> has a value of zero, a statement such as <code>x=y/z</code> will
usually cause a fatal error.</p>

If the cited phrase is a mathematical or chemical formula, the more specific formula element defined by the figures module (15.2 Formulæ and Mathematical Expressions) may be more appropriate.

A further group of similar phrase-level elements is also defined for the special case of representing parts of an XML document:

  • att (attribute) contains the name of an attribute appearing within running text.
  • gi (element name) contains the name (generic identifier) of an element.
  • tag (tag) contains text of a complete start- or end-tag, possibly including attribute specifications, but excluding the opening and closing markup delimiter characters.
  • val (value) contains a single attribute value.

These elements constitute the model.phrase.xml class, which is also a subclass of model.phrase. They are also available anywhere that running prose is permitted when the module defined by this chapter is included in a schema.

As an example of the recommended use of these elements, we quote from an imaginary TEI working paper:
<p>The <gi>gi</gi> element is used to tag element
names when they appear in the text; the <gi>tag</gi> element however is used to show how a tag as
such might appear. So one might talk of an occurrence of the <gi>blort</gi> element which had been
tagged <tag>blort type='runcible'</tag>. The <att>type</att> attribute may take any name token as
value; the default value is <val>spqr</val>, in memory of its creator.</p>

Within technical documentation, it is also often necessary to provide more extended examples of usage or to present passages of markup for discussion. The following special elements are provided for these purposes:

  • eg (example) contains any kind of illustrative example.
  • egXML (example of XML) a single XML fragment demonstrating the use of some XML, such as elements, attributes, or processing instructions, etc., in which the egXML element functions as the root element.

Like the code element, the egXML element is used to mark strings of formal code, or passages of XML markup. The eg element may be used to enclose any kind of example, which will typically be rendered as a distinct block, possibly using particular formatting conventions, when the document is processed. It is a specialized form of the more general q element provided by the TEI core module. In documents containing examples of XML markup, the egXML element should be used for preference, as further discussed below in 23.4.2 Exemplification of Components, since the content of this element can be checked for well-formedness.

These elements are added to the class model.egLike when this module is included in a schema. That class is a part of the general model.inter class, thus permitting eg or egXML elements to appear either within or between paragraph-like elements.

23.1.2 Element and Attribute Descriptions

Within the body of a document using this module, the following elements may be used to reference parts of the specification elements discussed in section 23.3 Specification Elements, in particular the brief prose descriptions these provide for elements and attributes.

  • specList (specification list) marks where a list of descriptions is to be inserted into the prose documentation.
  • specDesc (specification description) indicates that a description of the specified element, class, or macro should be included at this point within a document.
    atts (attributes) supplies attribute names for which descriptions should additionally be obtained.
TEI practice recommends that a specList listing the elements under discussion introduce each subsection of a module's documentation. The source for the present section, for example, begins as follows:
<div>
 <head>Element and Attribute Descriptions</head>
 <p>Within the body of a document using this module, the following elements may be used to reference parts of the specification elements
   discussed in section <ptr target="#TDcrystals"/>, in particular the brief prose descriptions these provide for elements and attributes.
 <specList>
   <specDesc key="specList"/>
   <specDesc key="specDesc" atts="atts"/>
  </specList>
 </p>
 <p>TEI practice recommends that a <gi>specList</gi> listing the elements under … </p>
<!-- ... -->
</div>

When formatting the ptr element in this example, an ODD processor might simply generate the section number and title of the section referred to, perhaps additionally inserting a link to the section. In a similar way, when processing the specDesc elements, an ODD processor may recover relevant details of the elements being specified (specList and specDesc in this case) from their associated declaration elements: typically, the details recovered will include a brief description of the element and its attributes. These, and other data, will be stored in a specification element elsewhere within the current document, or they may be supplied by the ODD processor in some other way, for example from a database. For this reason, the link to the required specification element is always made using a TEI-defined key rather than an XML IDREF value. The ODD processor uses this key as a means of accessing the specification element required. There is no requirement that this be performed using the XML ID/IDREF mechanism, but there is an assumption that the identifier be unique.

A specDesc generates in the documentation the identifier, and also the contents of the desc child of whatever specification element is indicated by its key attribute, as in the example above. Documentation for any attributes specified by the atts attribute will also be generated as an associated attribute list.
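
The key-based lookup described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the behaviour of any particular ODD processor: the sample fragment is hypothetical and omits the TEI namespace for brevity, and the function name expand_spec_desc is introduced here purely for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free ODD fragment used only for illustration.
ODD = """
<div>
  <elementSpec ident="specDesc">
    <desc>indicates that a description of the specified element should be included.</desc>
    <attList>
      <attDef ident="atts"><desc>supplies attribute names.</desc></attDef>
      <attDef ident="key"><desc>supplies the identifier of the specification.</desc></attDef>
    </attList>
  </elementSpec>
  <specList>
    <specDesc key="specDesc" atts="atts"/>
  </specList>
</div>
"""

SPEC_TAGS = ("elementSpec", "classSpec", "macroSpec", "dataSpec")

def expand_spec_desc(root, spec_desc):
    """Return (identifier, description, {attribute: description}) for one specDesc."""
    key = spec_desc.get("key")
    # The key is matched against the ident of a specification element,
    # not against an XML ID.
    spec = next(e for e in root.iter() if e.tag in SPEC_TAGS and e.get("ident") == key)
    wanted = spec_desc.get("atts", "").split()
    atts = {
        att.get("ident"): att.findtext("desc")
        for att in spec.iter("attDef")
        if att.get("ident") in wanted
    }
    return key, spec.findtext("desc"), atts

root = ET.fromstring(ODD)
ident, desc, atts = expand_spec_desc(root, root.find(".//specDesc"))
```

Because the sketch matches on ident rather than on an XML ID, the specification could equally be fetched from a database or another document, as long as the identifier is unique.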

23.2 Modules and Schemas

As mentioned above, the primary purpose of this module is to facilitate the documentation and creation of an XML schema derived from the TEI Guidelines. The following elements are provided for this purpose:

  • schemaSpec (schema specification) generates a TEI-conformant schema and documentation for it.
  • moduleSpec (module specification) documents the structure, content, and purpose of a single module, i.e. a named and externally visible group of declarations.
  • moduleRef (module reference) references a module which is to be incorporated into a schema.
    include supplies a list of the elements which are to be copied from the specified module into the schema being defined.
    except supplies a list of the elements which are not to be copied from the specified module into the schema being defined.
  • specGrp (specification group) contains any convenient grouping of specifications for use within the current module.
  • specGrpRef (reference to a specification group) indicates that the declarations contained by the specGrp referenced should be inserted at this point.
  • attRef (attribute pointer) points to the definition of an attribute or group of attributes.
  • elementRef points to the specification for some element which is to be included in a schema.

A module is a convenient way of grouping together element and other declarations, and of associating an externally-visible name with the resulting group. A specification group performs essentially the same function, but the resulting group is not accessible outside the scope of the ODD document in which it is defined, whereas a module can be accessed by name from any TEI schema specification. Elements, and their attributes, element classes, and patterns are all individually documented using further elements described in section 23.3 Specification Elements below; part of that specification includes the name of the module to which the component belongs.

An ODD processor generating XML DTD or schema fragments from a document marked up according to the recommendations of this chapter will generate such fragments for each moduleSpec element found. For example, the chapter documenting the TEI module for names and dates contains a module specification like the following:
<moduleSpec ident="namesdates">
 <idno type="FPI">Names and Dates</idno>
 <desc>Additional elements for names and dates</desc>
</moduleSpec>
together with specifications for all the elements, classes, and patterns which make up that module, expressed using elementSpec, classSpec, or macroSpec elements as appropriate. (These elements are discussed in section 23.3 Specification Elements below.) Each of those specifications carries a module attribute, the value of which is namesdates. An ODD processor encountering the moduleSpec element above can thus generate a schema fragment for the TEI namesdates module that includes declarations for all the elements (etc.) which reference it.
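
As a rough illustration of how specifications might be gathered for each module, the following Python sketch groups specification elements by the value of their module attribute. The fragment is hypothetical and omits the TEI namespace, and the helper name specs_by_module is an assumption made for this example; a real ODD processor may work quite differently.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, namespace-free ODD fragment used only for illustration.
ODD = """
<body>
  <moduleSpec ident="namesdates">
    <desc>Additional elements for names and dates</desc>
  </moduleSpec>
  <elementSpec ident="persName" module="namesdates"/>
  <elementSpec ident="death" module="namesdates"/>
  <elementSpec ident="foreign" module="core"/>
</body>
"""

SPEC_TAGS = ("elementSpec", "classSpec", "macroSpec")

def specs_by_module(root):
    """Map each module name to the idents of the specifications that cite it."""
    groups = defaultdict(list)
    for spec in root.iter():
        if spec.tag in SPEC_TAGS and spec.get("module"):
            groups[spec.get("module")].append(spec.get("ident"))
    return groups

root = ET.fromstring(ODD)
groups = specs_by_module(root)
# One fragment per moduleSpec found; "core" has no moduleSpec here, so it is skipped.
fragments = {m.get("ident"): groups[m.get("ident")] for m in root.iter("moduleSpec")}
```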

In most realistic applications, it will be desirable to combine more than one module together to form a complete schema. A schema consists of references to one or more modules or specification groups, and may also contain explicit declarations or redeclarations of elements (see further 23.8.1 TEI customizations). Any combination of modules can be used to create a schema.

A schema can combine references to TEI modules with references to other (non-TEI) modules using different namespaces, for example to include mathematical markup expressed using MathML in a TEI document. By default, the effect of combining modules is to allow all of the components declared by the constituent modules to coexist (where this is syntactically possible; where it is not, for example because of name clashes, a schema cannot be generated). It is also possible to override declarations contained by a module, as further discussed in section 23.8.1 TEI customizations.
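
The effect of the include and except attributes of moduleRef, as described in the element list above, might be sketched as follows. This is a simplified illustration: the module contents are hypothetical, the function name selected_elements is invented for the example, and the sketch assumes that the module is identified by a key attribute and that include and except are not supplied together.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents of a module; a real processor would collect these from
# the specifications whose module attribute names that module.
MODULE_CONTENTS = {"core": ["abbr", "foreign", "hi", "p"]}

def selected_elements(module_ref):
    """Return the element names copied into the schema for one moduleRef."""
    names = MODULE_CONTENTS[module_ref.get("key")]
    include = module_ref.get("include")
    exclude = module_ref.get("except")  # "except" is a reserved word in Python
    if include is not None and exclude is not None:
        raise ValueError("this sketch assumes include and except are not combined")
    if include is not None:
        wanted = include.split()
        return [n for n in names if n in wanted]
    if exclude is not None:
        unwanted = exclude.split()
        return [n for n in names if n not in unwanted]
    # With neither attribute, everything in the module is copied.
    return list(names)

only_two = selected_elements(ET.fromstring('<moduleRef key="core" include="p hi"/>'))
all_but_one = selected_elements(ET.fromstring('<moduleRef key="core" except="foreign"/>'))
everything = selected_elements(ET.fromstring('<moduleRef key="core"/>'))
```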

It is often convenient to describe and operate on sets of declarations smaller than the whole, and to document them in a specific order: such collections are called specGrps (specification groups). Individual specGrp elements are identified using the global xml:id attribute, and may then be referenced from any point in an ODD document using the specGrpRef element. This is useful if, for example, it is desired to describe particular groups of elements in a specific sequence. Note however that the order in which element declarations appear within the schema code generated from an ODD file is not in general affected by the order of declarations within a specGrp.

An ODD processor will generate a piece of schema code corresponding with the declarations contained by a specGrp element in the documentation being output, and a cross-reference to such a piece of schema code when processing a specGrpRef. For example, if the input text reads
<p>This module contains three red elements: <specGrp xml:id="RED">
  <elementSpec ident="beetroot">
<!-- ... -->
  </elementSpec>
  <elementSpec ident="east">
<!-- ... -->
  </elementSpec>
  <elementSpec ident="rose">
<!-- ... -->
  </elementSpec>
 </specGrp> and two blue ones: <specGrp xml:id="BLUE">
  <elementSpec ident="sky">
<!-- ... -->
  </elementSpec>
  <elementSpec ident="bayou">
<!-- ... -->
  </elementSpec>
 </specGrp>
</p>
then the output documentation will replace the two specGrp elements above with a representation of the schema code declaring the elements <beetroot>, <east>, and <rose> and that declaring the elements <sky> and <bayou> respectively. Similarly, if the input text contains elsewhere a passage such as
<div>
 <head>An overview of the imaginary module</head>
 <p>The imaginary module contains declarations for coloured things: <specGrpRef target="#RED"/>
  <specGrpRef target="#BLUE"/>
 </p>
</div>
then the specGrpRef elements may be replaced by an appropriate piece of reference text such as ‘The RED elements were declared in section 4.2 above’, or even by a copy of the relevant declarations. As stated above, the order of declarations within the imaginary module described above will not be affected in any way. Indeed, it is possible that the imaginary module will contain declarations not present in any specification group, or that the specification groups will refer to elements that come from different modules. Specification groups are always local to the document in which they are defined, and cannot be referenced externally (unlike modules).
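
The resolution of a specGrpRef to the specGrp it targets can be sketched in Python as follows. The fragment mirrors the imaginary RED and BLUE groups above but omits the TEI namespace, and the function name resolve is an assumption made for this illustration. Since specification groups are local to the document in which they are defined, the sketch handles only same-document references.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predeclared xml: prefix to this namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical, namespace-free ODD fragment used only for illustration.
ODD = """
<body>
  <specGrp xml:id="RED">
    <elementSpec ident="beetroot"/>
    <elementSpec ident="east"/>
    <elementSpec ident="rose"/>
  </specGrp>
  <specGrp xml:id="BLUE">
    <elementSpec ident="sky"/>
    <elementSpec ident="bayou"/>
  </specGrp>
  <p><specGrpRef target="#RED"/> <specGrpRef target="#BLUE"/></p>
</body>
"""

def resolve(root, ref):
    """Return the idents declared in the specGrp that a specGrpRef points to."""
    target = ref.get("target")
    if not target.startswith("#"):
        raise ValueError("only same-document references are handled in this sketch")
    wanted = target[1:]
    group = next(g for g in root.iter("specGrp") if g.get(XML_ID) == wanted)
    return [spec.get("ident") for spec in group.iter("elementSpec")]

root = ET.fromstring(ODD)
resolved = {ref.get("target"): resolve(root, ref) for ref in root.iter("specGrpRef")}
```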

23.3 Specification Elements

The following elements are used to specify elements, classes, patterns, and datatypes:

  • elementSpec (element specification) documents the structure, content, and purpose of a single element type.
  • classSpec (class specification) contains reference information for a TEI element class; that is a group of elements which appear together in content models, or which share some common attribute, or both.
    generate indicates which alternation and sequence instantiations of a model class may be referenced. By default, all variations are permitted.
  • macroSpec (macro specification) documents the function and implementation of a pattern.
  • dataSpec (datatype specification) documents a datatype.

Unlike most elements in the TEI scheme, each of these ‘specification elements’ has a fairly rigid internal structure consisting of a large number of child elements which are always presented in the same order. Furthermore, since these elements all describe markup objects in broadly similar ways, they have several child elements in common. In the remainder of this chapter, we discuss first the elements which are common to all the specification elements, and then those which are specific to a particular type.

Specification elements may appear at any point in an ODD document, both between and within paragraphs as well as inside a specGrp element, but the specification element for any particular component may only appear once (except in the case where a modification is being defined; see further 23.8.1 TEI customizations). The order in which they appear will not affect the order in which they are presented within any schema module generated from the document. In documentation mode, however, an ODD processor will output the schema declarations corresponding with a specification element at the point in the text where they are encountered, provided that they are contained by a specGrp element, as discussed in the previous section. An ODD processor will also associate all declarations found with the nominated module, thus including them within the schema code generated for that module, and it will also generate a full reference description for the object concerned in a catalogue of markup objects. These latter two actions always occur irrespective of whether or not the declaration is included in a specGrp.

23.4 Common Elements

This section discusses the child elements common to all of the specification elements; some of these are defined in the core module (3.4.1 Terms and Glosses). These child elements are used to specify the naming, description, exemplification, and classification of the specification elements.

23.4.1 Description of Components

  • gloss (gloss) identifies a phrase or word used to provide a gloss or definition for some other word or phrase.
  • desc (description) contains a short description of the purpose, function, or use of its parent element, or when the parent is a documentation element, describes or defines the object being documented.
  • equiv (equivalent) specifies a component which is considered equivalent to the parent element, either by co-reference, or by external link.
    uri (uniform resource identifier) references the underlying concept of which the parent is a representation by means of some external identifier.
    filter references an external script which contains a method to transform instances of this element to canonical TEI.
    name a single word which follows the rules defining a legal XML name (see https://www.w3.org/TR/REC-xml/#dt-name), naming the underlying concept of which the parent is a representation.
    predicate [att.predicate] the condition under which the element bearing this attribute applies, given as an XPath predicate expression.
  • altIdent (alternate identifier) supplies the recommended XML name for an element, class, attribute, etc. in some language.
  • listRef (list of references) supplies a list of significant references in the current document or elsewhere.
  • remarks (remarks) contains any commentary or discussion about the usage of an element, attribute, class, or entity not otherwise documented within the containing element.
The gloss element may be used to provide a brief explanation for the name of the object if this is not self-explanatory. For example, the specification for the element ab used to mark arbitrary blocks of text begins as follows:
<elementSpec module="linking" ident="ab">
 <gloss>anonymous block</gloss>
<!--... -->
</elementSpec>
A gloss may also be supplied for an attribute name or an attribute value in similar circumstances:
<valList type="open">
 <valItem ident="susp">
  <gloss>suspension</gloss>
  <desc>the abbreviation provides the first letter(s) of the word or phrase, omitting the
     remainder.</desc>
 </valItem>
 <valItem ident="contr">
  <gloss>contraction</gloss>
  <desc>the abbreviation omits some letter(s) in the middle.</desc>
 </valItem>
<!--...-->
</valList>

Note that the gloss element is needed to explain the significance of the identifier for an item only when this is not apparent, for example because it is abbreviated, as in the above example. It should not be used to provide a full description of the intended meaning (this is the function of the desc element), nor to comment on equivalent values in other schemes (this is the purpose of the equiv element), nor to provide alternative versions of the ident attribute value in other languages (this is the purpose of the altIdent element).

The contents of the desc element provide a brief characterization of the intended function of the object being documented in a form that permits its quotation out of context, as in the following example:
<elementSpec module="core" ident="foreign">
<!--... -->
 <desc xml:lang="en"
  versionDate="2007-07-21">
identifies a word or phrase as belonging to some
   language other than that of the surrounding text. </desc>
<!--... -->
</elementSpec>
By convention, a desc element begins with a verb such as contains, indicates, specifies, etc. and contains a single clause.

Both the gloss and desc elements (in addition to exemplum, remarks, and valDesc) are members of att.translatable, and thus carry the versionDate attribute. Where specifications are supplied in multiple languages, these elements may be repeated as often as needed. Each such element should carry both an xml:lang and a versionDate attribute to indicate the language used and the date on which the translated text was last checked against its source.

The equiv element is used to document equivalencies between the concept represented by this object and the same concept as described in other schemes or ontologies. The uri attribute is used to supply a pointer to some location where such external concepts are defined. For example, to indicate that the TEI death element corresponds to the concept defined by the CIDOC CRM category E69, the declaration for the former might begin as follows:
<elementSpec module="namesdates"
 ident="death">

 <equiv name="E69"
  uri="http://cidoc.ics.forth.gr/"/>

 <desc>
<!--... -->
 </desc>
</elementSpec>
The equiv element may also be used to map newly-defined elements onto existing constructs in the TEI, using the filter and name attributes to point to an implementation of the mapping. This is useful when a TEI customization (see 24.3 Customization) defines ‘shortcuts’ for convenience of data entry or markup readability. For example, suppose that in some TEI customization an element <bo> has been defined which is conceptually equivalent to the standard markup construct <hi rend='bold'>. The following declarations would additionally indicate that instances of the <bo> element can be converted to canonical TEI by obtaining a filter from the URI specified, and running the procedure with the name bold. The mimeType attribute specifies the language (in this case XSL) in which the filter is written:
<elementSpec ident="bo"
 ns="http://www.example.com/ns/nonTEI">

 <equiv filter="http://www.example.com/equiv-filter.xsl"
  mimeType="text/xsl" name="bold"/>

 <gloss>bold</gloss>
 <desc>contains a sequence of characters rendered in a bold face.</desc>
<!-- ... -->
</elementSpec>
The altIdent element is used to provide an alternative name for an object, for example using a different natural language. Thus, the following might be used to indicate that the abbr element should be identified using the German word Abkürzung:
<elementSpec ident="abbr" mode="change">
 <altIdent xml:lang="de">Abkürzung</altIdent>
<!--...-->
</elementSpec>
In the same way, the following specification for the graphic element indicates that the attribute url may also be referred to using the alternate identifier href:
<elementSpec ident="graphic" mode="change">
 <attList>
  <attDef mode="change" ident="url">
   <altIdent>href</altIdent>
  </attDef>
<!-- ... -->
 </attList>
</elementSpec>

By default, the altIdent of a component is identical to the value of its ident attribute.
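
The default just described amounts to a simple fallback, which might be sketched in Python as follows. The function name output_name is hypothetical, the fragments omit the TEI namespace, and the sketch ignores the case where several altIdent elements are supplied in different languages.

```python
import xml.etree.ElementTree as ET

def output_name(spec):
    """Return the altIdent of a specification, falling back to its ident."""
    alt = (spec.findtext("altIdent") or "").strip()
    return alt or spec.get("ident")

# A specification that supplies an alternate identifier ...
renamed = ET.fromstring(
    '<elementSpec ident="abbr" mode="change">'
    '<altIdent xml:lang="de">Abkürzung</altIdent></elementSpec>'
)
# ... and one that does not, so that the ident is used unchanged.
unchanged = ET.fromstring('<elementSpec ident="abbr" mode="change"/>')

names = (output_name(renamed), output_name(unchanged))
```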

The remarks element contains any additional commentary about how the item concerned may be used, details of implementation-related issues, suggestions for other ways of treating related information etc., as in the following example:
<elementSpec module="core" ident="foreign">
<!--... -->
 <remarks>
  <p>This element is intended for use only where no other element is available to mark the phrase
     or words concerned. The global <att>xml:lang</att> attribute should be used in preference to
     this element where it is intended to mark the language of the whole of some text element.</p>
  <p>The <gi>distinct</gi> element may be used to identify phrases belonging to sublanguages or
     registers not generally regarded as true languages.</p>
 </remarks>
<!--... -->
</elementSpec>
A specification element will usually conclude with a list of references, each tagged using the standard ptr element, and grouped together into a listRef element: in the case of the foreign element discussed above, the list is as follows:
<listRef>
 <ptr target="#COHQHF"/>
</listRef>
where the value COHQHF is the identifier of the section in these Guidelines where this element is fully documented.

23.4.2 Exemplification of Components

  • exemplum (exemplum) groups an example demonstrating the use of an element along with optional paragraphs of commentary.
  • eg (example) contains any kind of illustrative example.
  • egXML (example of XML) a single XML fragment demonstrating the use of some XML, such as elements, attributes, or processing instructions, etc., in which the egXML element functions as the root element.
    valid indicates the intended validity of the example with respect to a schema.
    source [att.global.source] specifies the source from which some aspect of this element is drawn.

The exemplum element is used to combine a single illustrative example with an optional paragraph of commentary following or preceding it. The illustrative example itself may be marked up using either the eg or the egXML element.

The source attribute may be used on either element to indicate the source from which an example is taken, typically by means of a pointer to an entry in an associated bibliography, as in the following example:
<exemplum versionDate="2008-04-06" xml:lang="fr">
 <p>L'element <gi>foreign</gi> s'applique également aux termes considerés étrangers.</p>
 <egXML xmlns="http://www.tei-c.org/ns/Examples" source="#fr-ex-Queneau_Journ">
  <p>Pendant ce temps-là, dans le bureau du rez- de-chaussée, les secrétaires faisaient du
   <foreign xml:lang="en">hulla-hoop</foreign>.</p>
 </egXML>
</exemplum>

When, as here, an example contains valid XML markup, the egXML element should be used. In such a case, it will clearly be necessary to distinguish the markup within the example from the markup of the document itself. In an XML environment, this is easily done by using a different name space for the content of the egXML element. For example:

<p>The <gi>term</gi> element may be used 
to mark any technical term, thus:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
  This <term>recursion</term> is 
  giving me a headache.</egXML></p>

Alternatively, the XML tagging within an example may be ‘escaped’, either by using entity references to represent the opening angle bracket, or by wrapping the whole example in a CDATA marked section:

<p>The <gi>term</gi> element may be used 
to mark any technical term, thus:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
  This &lt;term&gt;recursion&lt;/term&gt; is 
  giving me a headache.</egXML></p>

or, equivalently:

<p>The <gi>term</gi> element may be used 
to mark any technical term, thus:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><![CDATA[
  This <term>recursion</term> is 
  giving me a headache.]]></egXML></p>

However, escaping the markup in this way will make it impossible to validate, and should therefore generally be avoided.
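
The difference between the two approaches can be seen by parsing both forms of the example. In the following Python sketch, which uses only the standard library parser and is offered purely as an illustration, the namespaced version yields a real term element that a processor could check, while the escaped version yields nothing but character data.

```python
import xml.etree.ElementTree as ET

EG_NS = "http://www.tei-c.org/ns/Examples"

# Markup placed in the Examples namespace remains markup after parsing.
namespaced = ET.fromstring(
    '<egXML xmlns="http://www.tei-c.org/ns/Examples">'
    "This <term>recursion</term> is giving me a headache.</egXML>"
)
# Escaped markup is parsed as plain text, so there is no element to validate.
escaped = ET.fromstring(
    '<egXML xmlns="http://www.tei-c.org/ns/Examples">'
    "This &lt;term&gt;recursion&lt;/term&gt; is giving me a headache.</egXML>"
)

namespaced_children = [child.tag for child in namespaced]
escaped_children = [child.tag for child in escaped]
escaped_text = escaped.text
```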

If the XML contained in an example is not well-formed then it must either be enclosed in a CDATA marked section, or ‘escaped’ as above: this applies whether the eg or egXML is used. The valid attribute on egXML may be used to indicate the XML validity of the example with respect to some schema, as being valid, invalid, or feasibly valid.

The egXML element should not be used to tag non-XML examples: the general purpose eg or q elements should be used for such purposes.

23.4.3 Classification of Components

In the TEI scheme elements are assigned to one or more classes, which may themselves have subclasses. The following elements are used to indicate class membership:

  • classes (classes) specifies all the classes of which the documented element or class is a member or subclass.
  • memberOf specifies class membership of the documented element or class.
    key specifies the identifier for a class of which the documented element or class is a member or subclass.

The classes element appears within either the elementSpec or classSpec element. It specifies the classes of which the element or class concerned is a member by means of one or more memberOf child elements. Each such element references a class by means of its key attribute. Classes themselves are defined by the classSpec element described in section 23.6 Class Specifications below.

For example, to show that the element gi is a member of the class model.phrase.xml, the elementSpec which documents this element contains the following classes element:
<classes>
 <memberOf key="model.phrase.xml"/>
</classes>
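
The same mechanism expresses subclassing, since classes may also appear within a classSpec. As a minimal sketch, assuming a hypothetical model class model.plantName which a project wishes to treat as a subclass of model.nameLike, the classSpec for the new class might carry its own classes element:

```xml
<!-- model.plantName is a hypothetical class, used only for illustration -->
<classSpec ident="model.plantName" type="model" mode="add">
 <desc>groups elements which contain the names of plants.</desc>
 <classes>
  <memberOf key="model.nameLike"/>
 </classes>
</classSpec>
```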

23.5 Element Specifications

The elementSpec element is used to document an element type, together with its associated attributes. In addition to the elements listed above, it may contain the following subcomponents:

  • content (content model) contains a declaration of the intended content model for the element (or other construct) being specified.
    autoPrefix controls whether or not pattern names generated in the corresponding RELAX NG schema source are automatically prefixed to avoid potential name clashes.
  • constraintSpec (constraint on schema) contains a formal constraint, typically expressed in a rule-based schema language, to which a construct must conform in order to be considered valid.
    scheme supplies the name of the language in which the constraints are defined. Suggested values include: 1] schematron (ISO Schematron)
  • attList (attribute list) contains documentation for all the attributes associated with this element, as a series of attDef elements.
    org (organization) specifies whether only one (choice) or all (group) of the attributes in the list are available.
  • model describes the processing intended for a specified element.
    behaviour names the process or function which this processing model uses in order to produce output. Suggested values include: 1] alternate; 2] anchor; 3] block; 4] body; 5] break; 6] cell; 7] cit; 8] document; 9] figure; 10] glyph; 11] graphic; 12] heading; 13] index; 14] inline; 15] link; 16] list; 17] listItem; 18] metadata; 19] note; 20] omit; 21] paragraph; 22] row; 23] section; 24] table; 25] text; 26] title

These subcomponents are discussed in the following sections.
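
To show how these subcomponents fit together, here is a skeletal sketch of an elementSpec. The element plantName, its namespace, the ex: prefix, and the rank attribute are all hypothetical, invented for illustration; each subcomponent is explained in the sections that follow.

```xml
<!-- hypothetical element, namespace, and attribute: for illustration only -->
<elementSpec ident="plantName" ns="http://www.example.org/ns/botany" mode="add">
 <desc>contains the name of a plant.</desc>
 <classes>
  <memberOf key="model.nameLike"/>
  <memberOf key="att.global"/>
 </classes>
 <content>
  <textNode/>
 </content>
 <!-- assumes the ex: prefix is bound to the namespace above -->
 <constraintSpec ident="plantName-not-empty" scheme="schematron">
  <constraint>
   <sch:rule context="ex:plantName">
    <sch:assert test="normalize-space(.)">a plant name must not be empty</sch:assert>
   </sch:rule>
  </constraint>
 </constraintSpec>
 <attList>
  <attDef ident="rank" usage="opt">
   <desc>indicates the taxonomic rank of the name.</desc>
   <datatype>
    <dataRef key="teidata.enumerated"/>
   </datatype>
  </attDef>
 </attList>
 <model behaviour="inline"/>
</elementSpec>
```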

23.5.1 Defining Content Models

As described in Content Models: an Example and Content Model, the content of the element being defined — that is, what elements are allowed inside it, and in what order they are permitted — is described by its content model. The content model is defined by the content child of elementSpec. There are three distinctly different ways of specifying a content model:

  • The content model can be described using TEI elements defined by this chapter, as discussed in 23.5.1.1 Defining Content Models: TEI immediately below. Two such TEI elements that may be used to define a content model are dataRef and valList. But because these are most often used to define attribute values, they are discussed separately near the beginning and towards the end of 23.5.3.2 Value Specification, respectively.
  • Alternatively, and primarily for backwards compatibility, the content model may be expressed using a RELAX NG pattern. This is discussed in 23.5.1.2 Defining Content Models: RELAX NG, below.
  • Lastly, content models may be expressed using a schema language other than TEI or RELAX NG, but no further recommendations on doing so are provided by these Guidelines.
23.5.1.1 Defining Content Models: TEI

In the simplest case, the content model of an element may be expressed using a single empty element as the only child of content. This describes the element being defined as empty, meaning that a valid instance of that element cannot have any content.98

  • empty indicates the presence of an empty node within a content model.
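
For instance, an element which never has content, such as a milestone-like pointer, might declare its content model as follows (a minimal sketch):

```xml
<content>
 <empty/>
</content>
```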

More commonly, one or more of the following elements are used to define a content model:

  • elementRef points to the specification for some element which is to be included in a schema.
  • anyElement indicates the presence of any elements in a content model.
  • classRef points to the specification for an attribute or model class which is to be included in a schema.
  • macroRef points to the specification for some pattern which is to be included in a schema.

An elementRef provides the name of an element which may appear at a certain point in a content model. An anyElement also asserts that an element may appear at a certain point in a content model, but rather than providing the name of a particular element type that may appear, any element regardless of its name may appear (and may have any attributes). A classRef provides the name of a model class, members of which may appear at a certain point in a content model.99 A macroRef provides the name of a predefined macro, the expansion of which is to be inserted at a certain point in a content model.

These elements are all members of an attribute class which provides attributes that further modify their significance as follows:

  • att.repeatable provides attributes for the elements which define component parts of a content model.
    minOccurs (minimum number of occurrences) indicates the smallest number of times this component may occur.
    maxOccurs (maximum number of occurrences) indicates the largest number of times this component may occur.
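
As a brief sketch of these references in use, the first content model below permits one or more members of the model class model.pLike, while the second simply inserts the expansion of the macro macro.phraseSeq. The choice of class and macro is illustrative only, and the sketch assumes that minOccurs and maxOccurs both default to 1 when omitted.

```xml
<!-- one or more members of a model class -->
<content>
 <classRef key="model.pLike" minOccurs="1" maxOccurs="unbounded"/>
</content>

<!-- the expansion of a predefined macro -->
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
```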
Additionally, two wrapper elements are provided to indicate whether the components listed as their children form a sequence or an alternation:
  • sequence indicates that the constructs referenced by its children form a sequence.
  • alternate indicates that the constructs referenced by its children form an alternation.
These two wrapper elements are also members of att.repeatable. References listed as children of sequence must appear in the order and cardinality specified. Only one of the references listed as children of alternate may appear, although the cardinality of the alternate itself applies. Thus the following fanciful content model permits either any number of ptr elements (except zero) or any number of ref elements (except zero); at least one element must be present, but having both a ptr and a ref would be invalid.
<content>
 <alternate>
  <elementRef key="ptr" minOccurs="1"
   maxOccurs="unbounded"/>
  <elementRef key="ref" minOccurs="1"
   maxOccurs="unbounded"/>
 </alternate>
</content>
However, the following content model permits any number of either ptr or ref elements (except zero); one element must be present, and having both ptr elements and ref elements (even intermixed) would be valid.
<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">

  <elementRef key="ptr"/>
  <elementRef key="ref"/>
 </alternate>
</content>
The sequence and alternate elements may be used in combination with great expressive power. In the following content model, for instance, which might be imagined as a clean replacement for the content of the choice element, one and only one of the element pairs sic and corr, orig and reg, or abbr and expan is allowed.
<content>
 <alternate>
  <sequence>
   <elementRef key="sic"/>
   <elementRef key="corr"/>
  </sequence>
  <sequence>
   <elementRef key="orig"/>
   <elementRef key="reg"/>
  </sequence>
  <sequence>
   <elementRef key="abbr"/>
   <elementRef key="expan"/>
  </sequence>
 </alternate>
</content>
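In the following example, which might be imagined as a clean replacement for the content of the address element, the encoder is given a choice of either a structured address (a sequence of street, placeName, and postCode, optionally followed by country) or a free-form address consisting of between two and four addrLine elements: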
<content>
 <alternate>
  <sequence>
   <elementRef key="street"/>
   <elementRef key="placeName"/>
   <elementRef key="postCode"/>
   <elementRef key="country" minOccurs="0"
    maxOccurs="1"/>
  </sequence>
  <elementRef key="addrLine" minOccurs="2"
   maxOccurs="4"/>
 </alternate>
</content>

In addition to expressing where certain elements, members of a class of elements, or constructs matching a predefined macro may occur inside an element, a content model may permit a string of zero or more Unicode characters to occur at a certain point in the content model. This is indicated by supplying the element textNode within the content element.

  • textNode indicates the presence of a text node in a content model.

If nothing but a textNode element is present inside a content element, valid instances of the element being defined may contain a sequence of zero or more Unicode characters, but may not contain any elements.100
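
When text and elements are to be freely intermixed, a textNode can be combined with other references inside a repeatable alternate. The following is a minimal sketch of such a mixed content model; the use of model.phrase is an illustrative choice rather than a recommendation.

```xml
<content>
 <!-- any number of text nodes and phrase-level elements, in any order -->
 <alternate minOccurs="0" maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.phrase"/>
 </alternate>
</content>
```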

23.5.1.2 Defining Content Models: RELAX NG
Element content models may also be defined using RELAX NG patterns. Here is a very simple example:
<content>
 <rng:text/>
</content>
The element within whose specification element this content element appears will have a content model which is expressed in RELAX NG as text, using the RELAX NG namespace. This model will be copied unchanged to the output when RELAX NG schemas are being generated. When an XML DTD is being generated, an equivalent declaration (in this case (#PCDATA)) will be output.
Here is a more complex example:
<content>
 <rng:group>
  <rng:ref name="fileDesc"/>
  <rng:zeroOrMore>
   <rng:ref name="model.teiHeaderPart"/>
  </rng:zeroOrMore>
  <rng:optional>
   <rng:ref name="revisionDesc"/>
  </rng:optional>
 </rng:group>
</content>
This is the content model for the teiHeader element, expressed in the RELAX NG syntax, which again is copied unchanged to the output during schema generation. The equivalent DTD notation generated from this is (fileDesc, (%model.teiHeaderPart;)*, revisionDesc?).

The RELAX NG language does not formally distinguish element names, attribute names, class names, or macro names: all names are patterns which are handled in the same way, as the above example shows. Within the TEI scheme, however, different naming conventions are used to distinguish amongst the objects being named. Unqualified names (fileDesc, revisionDesc) are always element names. Names prefixed with model. or att. (e.g. model.teiHeaderPart and att.typed) are always class names. In DTD language, classes are represented by parameter entities (%model.teiHeaderPart; in the above example); see further 1 The TEI Infrastructure.
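
For comparison, the teiHeader content model given above in RELAX NG might be sketched using the TEI elements described in 23.5.1.1 Defining Content Models: TEI. This is an illustrative rendering and not necessarily the form used in the TEI source. Here the distinction between an element and a class is made explicit by the choice of elementRef or classRef, rather than by naming convention alone.

```xml
<content>
 <sequence>
  <elementRef key="fileDesc"/>
  <classRef key="model.teiHeaderPart" minOccurs="0" maxOccurs="unbounded"/>
  <elementRef key="revisionDesc" minOccurs="0"/>
 </sequence>
</content>
```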

The RELAX NG pattern names generated by an ODD processor by default include a special prefix, the default value for which is set using the prefix attribute on schemaSpec. The purpose of this is to ensure that the pattern name generated is uniquely identified as belonging to a particular schema, and thus avoid name clashes. For example, in a RELAX NG schema combining the TEI element ident with another element called ident from some other vocabulary, the former will be defined by a pattern called TEI_ident rather than simply ident. Most of the time, this behaviour is entirely transparent to the user; the one occasion when it is not will be where a content model (expressed using RELAX NG syntax) needs explicitly to reference either the TEI ident or the other one. In such a situation, the autoPrefix attribute on content may be used. For example, suppose that we wish to define a content model for term which permits either a TEI ident or the ident defined by some other vocabulary. A suitable content model would be generated from the following content element:
<content autoPrefix="false">
 <rng:choice>
  <rng:ref name="TEI_ident"/>
  <rng:ref name="ident"/>
 </rng:choice>
</content>
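
The prefix itself is declared on schemaSpec. The following sketch shows where it would be set; the ident value myTEI and the selection of modules are hypothetical, chosen only to make the fragment complete.

```xml
<!-- ident and the module selection are hypothetical -->
<schemaSpec ident="myTEI" start="TEI" prefix="TEI_">
 <moduleRef key="tei"/>
 <moduleRef key="header"/>
 <moduleRef key="core"/>
 <moduleRef key="textstructure"/>
</schemaSpec>
```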

23.5.2 Additional Constraints

In addition to the content element, a set of general constraintSpec elements can be used to express rules about the validity of an element. Like some other specification elements, they are identifiable (using the ident attribute) in order that a TEI customization may override, delete, or change them individually. Each constraintSpec can be expressed in any notation which is found useful; the notation used must be recorded using the scheme attribute.

Schematron is an ISO standard (ISO/IEC 19757-3:2006) that defines a simple XML vocabulary for an ‘assertion language’ which provides a powerful way of expressing constraints on the content of any XML document in addition to those provided by other schema languages. Such constraints can be embedded within a TEI schema specification — including a TEI customization specification — using the methods exemplified in this chapter.101 An ODD processor will typically process any constraintSpec elements in a TEI specification whose scheme attribute indicates that they are expressed in Schematron to create an ISO Schematron schema which may be used to validate document instances. The ISO Schematron schema may be a free-standing document, or may be embedded in the RELAX NG schema output of the ODD processor.

Constraints are generally used to model local rules which may be outside the scope of the target schema language. For example, in earlier versions of these Guidelines several constraints on the usage of the attributes of the TEI element relation were expressed informally as follows: ‘only one of the attributes active and mutual may be supplied; the attribute passive may be supplied only if the attribute active is supplied’. In the current version of the Guidelines, constraint specifications expressed as Schematron rules have been added, as follows:
<constraintSpec ident="ref-or-key-or-name"
 scheme="schematron">

 <constraint>
  <sch:rule context="tei:relation">
   <sch:assert test="@ref or @key or @name">One of the attributes @name, @ref, or @key must be supplied</sch:assert>
  </sch:rule>
 </constraint>
</constraintSpec>
<constraintSpec ident="active-mutual"
 scheme="schematron">

 <constraint>
  <sch:rule context="tei:relation">
   <sch:report test="@active and @mutual">Only one of the attributes @active and @mutual may be supplied</sch:report>
  </sch:rule>
 </constraint>
</constraintSpec>
<constraintSpec ident="active-passive"
 scheme="schematron">

 <constraint>
  <sch:rule context="tei:relation">
   <sch:report test="@passive and not(@active)">the attribute @passive may be supplied only if the attribute @active is supplied</sch:report>
  </sch:rule>
 </constraint>
</constraintSpec>
Note that the ‘tei:’ prefix needs to be bound to the TEI namespace using the Schematron language <ns> element. This can be done using the constraintDecl element, which may appear in either the encodingDesc or schemaSpec element. A constraintDecl contains declarations specific to all the constraintSpec elements in the current TEI document whose scheme matches that of the constraintDecl.
<constraintDecl scheme="schematron"
 queryBinding="xslt3">

 <sch:ns prefix="tei"
  uri="http://www.tei-c.org/ns/1.0"/>

 <sch:ns prefix="xi"
  uri="http://www.w3.org/2001/XInclude"/>

 <sch:let name="edition"
  value="substring-before( /*/tei:teiHeader//tei:editionStmt/tei:edition/@n, '.')"/>

 <sch:let name="uses_old_encoding"
  value="$edition cast as xs:integer lt 3"/>

</constraintDecl>
In this example the Schematron query language binding is set (to xslt3) for use by a Schematron processor, two namespace prefixes are bound to namespace URIs for later use within constraint elements, and the $uses_old_encoding variable is set (to either true() or false()) so that Schematron assertions elsewhere in the TEI document can easily test whether the edition number of the document being checked is 3 or more. (Presumably in this project there are minor encoding differences between the older and newer editions which do not merit an entirely different schema, but do merit the occasional different constraint.)
The following two examples are written presuming that the constraintDecl above is in force, that is, that the Schematron processor will use an xslt3 binding, that ‘tei:’ is bound to the TEI namespace, that ‘xi:’ is bound to the XInclude namespace, and that the variable $uses_old_encoding is defined as a boolean. The first example models the constraint that a TEI div must contain either no subdivisions or at least two of them, with the added complication that larger subdivisions are provided as separate sectNN.xml files read in using XInclude:
<constraintSpec ident="subclauses"
 scheme="schematron">

 <constraint>
  <sch:rule context="tei:div">
   <sch:report test="count( tei:div | xi:include[ contains( @href, 'sect') ] ) eq 1">if it contains any
       subdivisions, a division must contain at least two of them</sch:report>
  </sch:rule>
 </constraint>
</constraintSpec>
The second example demonstrates that Schematron rules are also useful where an application needs to enforce rules on attribute values, in this case checking that various types of title are provided in the TEI header:
<constraintSpec ident="introtitle"
 scheme="schematron">

 <constraint>
  <sch:rule context="tei:teiHeader">
   <sch:assert test="tei:fileDesc/tei:titleStmt/tei:title[@type='introductory']"> an introductory component of the title is expected</sch:assert>
  </sch:rule>
 </constraint>
</constraintSpec>
<constraintSpec ident="maintitle"
 scheme="schematron">

 <constraint>
  <sch:rule context="tei:teiHeader[ $uses_old_encoding ]">
   <sch:assert test="tei:fileDesc/tei:titleStmt/tei:title[@type eq 'main']"> a main title must be supplied</sch:assert>
  </sch:rule>
  <sch:rule context="tei:teiHeader[ not( $uses_old_encoding ) ]">
   <sch:assert test="tei:fileDesc/tei:titleStmt/tei:title[ not( @type eq 'sub' ) ]"> a main title must be supplied</sch:assert>
  </sch:rule>
 </constraint>
</constraintSpec>
As a further example, Schematron may be used to enforce rules applicable to a TEI document which is going to be rendered into accessible HTML, for example to check that some sort of content is available from which the alt attribute of an HTML <img> can be created:
<constraintSpec ident="alt"
 scheme="schematron">

 <constraint>
  <sch:pattern id="altTags">
   <sch:rule context="tei:figure">
    <sch:assert test="tei:figDesc or tei:head"> You should provide information in a figure from
         which we can construct an alt attribute in HTML </sch:assert>
   </sch:rule>
  </sch:pattern>
 </constraint>
</constraintSpec>
Schematron rules can also be used to enforce other HTML accessibility rules about tables; note here the use of a report and an assertion within one pattern:
<constraintSpec ident="tables"
 scheme="schematron">

 <constraint>
  <sch:pattern id="Tables">
   <sch:rule context="tei:table">
    <sch:assert test="tei:head">A &lt;table&gt; should have a caption, using a &lt;head&gt; element</sch:assert>
    <sch:report test="parent::tei:body">Do not use tables to lay out the document body</sch:report>
   </sch:rule>
  </sch:pattern>
 </constraint>
</constraintSpec>
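Because each constraintSpec is identified by its ident attribute, a customization can address one constraint without disturbing the others. The following sketch assumes that the mode attribute is available on constraintSpec, as it is on other specification elements; it removes only the active-mutual constraint defined for relation earlier in this section, leaving the remaining constraints in force.

```xml
<elementSpec ident="relation" module="namesdates" mode="change">
 <!-- delete a single constraint, identified by its ident -->
 <constraintSpec ident="active-mutual" scheme="schematron" mode="delete"/>
</elementSpec>
```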
Constraints can be expressed using any convenient language. The following example uses a pattern matching language called SPITBOL to express the requirement that title and author should be different.
<constraintSpec ident="title_ne_author"
 scheme="SPITBOL">

 <constraint> (output = leq(title,author) "title and author cannot be the same") </constraint>
</constraintSpec>
Note that the value of scheme is SPITBOL. To constrain and document the values of scheme used in its customization file, a project may wish to create a customization that (among other things) adds and explains this value, so that the customization file itself can be validated. Using schemes other than those provided for by the TEI (currently only schematron) may therefore require somewhat more effort when creating a customization file. Implementing the constraints themselves is likely to be more demanding still, since it may require significant programming work: the TEI provides processing support only for the suggested value (schematron).

23.5.3 Attribute List Specification

The attList element is used to document information about a collection of attributes, either within an elementSpec, or within a classSpec. An attribute list can be organized either as a group of attribute definitions, all of which are understood to be available, or as a choice of attribute definitions, of which only one is understood to be available. An attribute list may thus contain nested attribute lists.

The attribute org is used to indicate whether its child attDef elements are all to be made available, or whether only one of them may be used. For example, the attribute list for the element moduleRef contains a nested attribute list to indicate that either the include or the except attribute may be supplied, but not both:
<attList>
<!-- other attribute definitions here -->
 <attList org="choice">
  <attDef ident="include">
<!-- definition for the include attribute -->
  </attDef>
  <attDef ident="except">
<!-- definition for the except attribute -->
  </attDef>
 </attList>
</attList>

The attDef element is used to document a single attribute, using an appropriate selection from the common elements already mentioned and the following:

  • attDef (attribute definition) contains the definition of a single attribute.
    usage specifies the optionality of the attribute.
  • datatype (datatype) specifies the declared value for an attribute, by referring to any datatype defined by the chosen schema language.
    minOccurs (minimum number of occurrences) indicates the minimum number of times this datatype may occur in an instance of the attribute being defined.
    maxOccurs (maximum number of occurrences) indicates the maximum number of times this datatype may occur in an instance of the attribute being defined.
  • dataRef identifies the datatype of an attribute value, either by referencing an item in an externally defined datatype library, or by pointing to a TEI-defined data specification.
  • defaultVal (default value) specifies the default declared value for an attribute.
  • valDesc (value description) specifies any semantic or syntactic constraint on the value that an attribute may take, additional to the information carried by the datatype element.
  • valList (value list) contains one or more valItem elements defining possible values.
  • valItem documents a single value in a predefined list of values.

The attList within an elementSpec is used to specify only the attributes which are specific to that particular element. Instances of the element may carry other attributes which are declared by the classes of which the element is a member. These extra attributes, which are shared by other elements, or by all elements, are specified by an attList contained within a classSpec element, as described in section 23.6 Class Specifications below.
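
As a minimal sketch of this division of labour, the following classSpec declares an attribute shared by every member of a class, and an elementSpec then claims membership of that class instead of declaring the attribute itself. The class att.projectStatus and its checked attribute are hypothetical, invented for illustration.

```xml
<!-- att.projectStatus and its attribute are hypothetical -->
<classSpec ident="att.projectStatus" type="atts" mode="add">
 <desc>provides attributes recording the editorial status of an element.</desc>
 <attList>
  <attDef ident="checked" usage="opt">
   <desc>indicates whether the encoding has been proofread.</desc>
   <datatype>
    <dataRef key="teidata.truthValue"/>
   </datatype>
  </attDef>
 </attList>
</classSpec>

<!-- an element acquires the attribute by joining the class -->
<elementSpec ident="p" mode="change">
 <classes mode="change">
  <memberOf key="att.projectStatus" mode="add"/>
 </classes>
</elementSpec>
```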

23.5.3.1 Datatypes

The ‘datatype’ (i.e. the kind of value) for an attribute may be specified using the elements datatype and dataRef. A datatype may be defined in any of the following three ways:

  • by reference to an existing TEI datatype definition, itself defined by a dataSpec;
  • by use of its name in XML Schema Part 2: Datatypes Second Edition, the widely used datatype library maintained by the W3C as part of the definition of its schema language;
  • by referencing its URI within some other datatype library.
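
The three possibilities listed above might be sketched as follows. The sketch assumes that dataRef uses key for a TEI datatype, name for a W3C Schema datatype, and ref for a URI; the URI shown is a placeholder, not a real datatype library.

```xml
<!-- 1. a TEI datatype, itself defined by a dataSpec -->
<datatype>
 <dataRef key="teidata.word"/>
</datatype>

<!-- 2. a W3C Schema datatype, referenced by name -->
<datatype>
 <dataRef name="date"/>
</datatype>

<!-- 3. a datatype in some other library, referenced by URI (placeholder) -->
<datatype>
 <dataRef ref="http://www.example.org/datatypes/isbn"/>
</datatype>
```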

The TEI defines a number of datatypes, each with an identifier beginning teidata., which are used in preference to the datatypes available natively from a target schema such as RELAX NG or W3C Schema since the facilities provided by different schema languages vary so widely. The TEI datatypes available are described in section 1.4.2 Datatype Specifications above. Note that each is, of necessity, mapped eventually to an externally defined datatype such as W3C Schema's text or name, possibly combined to give more expressivity, or constrained to a particular defined usage.

It is possible to reference a W3C schema datatype directly using name. In this case, the child dataFacet can be used instead of restriction to set W3C schema compliant restrictions on the datatype. A dataFacet is particularly useful for restrictions that can be difficult to impose and to read as a regular expression pattern.
<dataRef name="decimal">
 <dataFacet name="maxInclusive"
  value="360.0"/>

 <dataFacet name="minInclusive"
  value="-360.0"/>

</dataRef>
Note that restrictions are either expressed with restriction or dataFacet, never both.
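
For contrast, here is a sketch of the alternative form, assuming that the restriction attribute carries a regular expression which further constrains the lexical form of the value. The pattern (two capital letters followed by four digits) is an invented example.

```xml
<!-- the pattern is illustrative: two capital letters, then four digits -->
<dataRef name="token" restriction="[A-Z]{2}[0-9]{4}"/>
```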

A datatype may be used to constrain the textual content of an element, rather than the value of an attribute. But because they are intended for use in defining ranges of attribute values, datatypes may not contain elements or attributes.

The attributes minOccurs and maxOccurs are available for the case where an attribute may take more than one value of the type specified. For example, the columns attribute of the layout element can have one or two non-negative integers as its value:
<attDef ident="columns">
 <gloss xml:lang="en"
  versionDate="2007-06-12">
columns</gloss>
 <desc versionDate="2005-01-14"
  xml:lang="en">
specifies the number of columns per page</desc>
 <datatype minOccurs="1" maxOccurs="2">
  <dataRef key="teidata.count"/>
 </datatype>
 <remarks xml:lang="en"
  versionDate="2017-07-09">

  <p>If a single number is given, all pages referenced
     have this number of columns. If two numbers are given,
     the number of columns per page varies between the
     values supplied. Where <att>columns</att> is omitted
     the number is assumed to be <val>1</val>.</p>
 </remarks>
</attDef>
indicating that the columns attribute may take one or two values, each being of the same datatype, namely the TEI data specification teidata.count. As is usual in XML, multiple values for a single attribute are separated by one or more white space characters. Hence, values such as 2 or 1 2 may be supplied. In the same way, an attribute such as target, whose datatype is teidata.pointer and for which any number of values is permitted, may be given values such as #a #b #c or http://example.org http://www.tei-c.org/index.xml.
23.5.3.2 Value Specification
The valDesc element may be used to describe constraints on data content in an informal way: for example
<valDesc>must point to another <gi>align</gi>
element logically preceding this one.</valDesc>
<valDesc>Values should be Library of Congress
subject headings.</valDesc>
<valDesc>A bookseller's surname, taken from the list
in <title>Pollard and Redgrave</title>
</valDesc>
Constraints expressed in this way are purely documentary; to enforce them, the constraintSpec element described in section 23.5.2 Additional Constraints must be used. For example, to specify that an imaginary attribute ageAtDeath must take non-negative integer values less than 150, the datatype teidata.count might be used in combination with a constraintSpec such as the following:
<attDef ident="ageAtDeath">
 <desc>age in years at death</desc>
 <datatype>
  <dataRef key="teidata.count"/>
 </datatype>
 <constraintSpec ident="lessThan150"
  scheme="schematron">

  <constraint>
   <sch:rule context="@ageAtDeath">
    <sch:assert test="number(.) lt 150">age at death must be an integer less than 150</sch:assert>
   </sch:rule>
  </constraint>
 </constraintSpec>
</attDef>
The elements altIdent, equiv, gloss and desc may all be used in the same way as they are elsewhere to describe fully the meaning of a coded value, as in the following example:
<valItem ident="dub">
 <altIdent xml:lang="fr">dou</altIdent>
 <equiv name="unknown"/>
 <gloss>dubious</gloss>
 <desc>used when the application of this element is doubtful or uncertain</desc>
</valItem>
Where all the possible values for an attribute can be enumerated, the datatype teidata.enumerated should be used, together with a valList element specifying the values and their significance, as in the following example:
<valList type="closed">
 <valItem ident="req">
  <gloss>required</gloss>
 </valItem>
 <valItem ident="rec">
  <gloss>recommended</gloss>
 </valItem>
 <valItem ident="opt">
  <gloss>optional</gloss>
 </valItem>
</valList>
Note the use of the gloss element here to explain the otherwise less than obvious meaning of the codes used for these values. Since this value list specifies that it is of type closed, only the values enumerated are legal, and an ODD processor will typically enforce these constraints in the schema fragment generated.
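To show such a list in context, the following sketch places a closed value list inside a complete attDef, together with a defaultVal. The attribute status and its values are hypothetical, and the sketch assumes that the default supplied should be one of the enumerated values.

```xml
<!-- hypothetical attribute, for illustration only -->
<attDef ident="status" usage="opt">
 <desc>indicates the editorial status of the element.</desc>
 <datatype>
  <dataRef key="teidata.enumerated"/>
 </datatype>
 <defaultVal>draft</defaultVal>
 <valList type="closed">
  <valItem ident="draft">
   <gloss>not yet checked</gloss>
  </valItem>
  <valItem ident="checked">
   <gloss>proofread and approved</gloss>
  </valItem>
 </valList>
</attDef>
```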
The valList element can also be used to provide illustrative examples of the kinds of values expected without listing all of them. In such cases the type attribute will have the value open, as in the following example:
<attDef ident="type" usage="opt">
 <desc versionDate="2005-01-14"
  xml:lang="en">
characterizes the movement, for example as an
   entrance or exit.</desc>
 <desc versionDate="2007-12-20"
  xml:lang="ko">
예를 들어 입장 또는 퇴장과 같은, 이동의 특성을 기술한다.</desc>
 <datatype>
  <dataRef key="teidata.enumerated"/>
 </datatype>
 <valList type="open">
  <valItem ident="entrance">
   <desc versionDate="2007-06-27"
    xml:lang="en">
character is entering the stage.</desc>
   <desc versionDate="2007-12-20"
    xml:lang="ko">
등장인물이 무대에 등장하고 있다.</desc>
  </valItem>
  <valItem ident="exit">
   <desc versionDate="2007-06-27"
    xml:lang="en">
character is exiting the stage.</desc>
   <desc versionDate="2007-12-20"
    xml:lang="ko">
등장인물이 무대에서 퇴장하고 있다.</desc>
  </valItem>
  <valItem ident="onStage">
   <desc versionDate="2007-07-04"
    xml:lang="en">
character moves on stage</desc>
   <desc versionDate="2007-12-20"
    xml:lang="ko">
등장인물이 무대에서 이동한다.</desc>
  </valItem>
 </valList>
</attDef>
The datatype will be teidata.enumerated in either case.

The valList or dataRef elements may also be used (as a child of the content element) to put constraints on the permitted content of an element, as noted at 23.5.1.2 Defining Content Models: RELAX NG. This use is not however supported by all schema languages, and is therefore not recommended if support for non-RELAX NG systems is a consideration.
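
A minimal sketch of this usage follows: the first content model restricts the text of an element to one of two tokens, the second to a value of a TEI datatype. The values yes and no are invented for illustration.

```xml
<!-- element content limited to an enumerated list -->
<content>
 <valList type="closed">
  <valItem ident="yes"/>
  <valItem ident="no"/>
 </valList>
</content>

<!-- element content limited to a datatype -->
<content>
 <dataRef key="teidata.count"/>
</content>
```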

23.5.4 Processing Models

As far as possible, the TEI defines elements and their attributes in a way which is entirely independent of their subsequent processing, since its intention is to maximize the reusability of encoded documents and their use in multiple contexts. Nevertheless, it can be very useful to specify one or more possible models for such processing, both to clarify the intentions of the encoder, and to provide default behaviours for a software engineer to implement when documents conforming to a particular TEI customization are processed. To that end, the following elements may be used to document one or more processing models for a given element.

One or more of these elements may appear directly within an element specification to define the processing anticipated for that element, more specifically how it should be processed to produce the kind of output indicated by the output attribute. Where multiple such elements appear directly within an elementSpec, they are understood to document mutually exclusive processing models, possibly for different outputs or applicable in different contexts. Alternatively, the modelGrp element may be used to group alternative model elements intended for a single kind of output. The modelSequence element is provided for the case where a sequence of models is to be processed, functioning as a single unit.

A processing model suggests how a given element may be transformed to produce one or more outputs. The model is expressed in terms of behaviours and parameters, using high-level formatting concepts familiar to software engineers and web designers, such as ‘block’ or ‘inline’. As such, it has a different purpose from existing TEI mechanisms for documenting the appearance of source materials, such as the global attributes rend, rendition and style, described in sections 2.3.4.1 Rendition and 3.3.1 What Is Highlighting?. It does not necessarily describe anything present in the original source, nor does it necessarily represent its original structure or semantics. A processing model is a template description, which may be used to simplify the task of producing or customizing the stylesheets needed by a formatting engine or any other form of processor.
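
The grouping described above might be sketched as follows. The choice of element, predicate, and behaviours is illustrative only: two alternative models are grouped for web output, one of them applying only to headings within a figure, while a separate model is supplied for print output.

```xml
<elementSpec ident="head" mode="change">
 <!-- alternatives intended for a single kind of output -->
 <modelGrp output="web">
  <model predicate="parent::figure" behaviour="block"/>
  <model behaviour="heading"/>
 </modelGrp>
 <!-- a separate model for a different output -->
 <model output="print" behaviour="heading"/>
</elementSpec>
```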

23.5.4.1 The TEI processing model

The model element is used to document the processing model intended for a particular element in an abstract manner, independently of its implementation in whatever processing language is chosen. This is achieved by means of the following attributes and elements:

  • model describes the processing intended for a specified element.
    predicate [att.predicate] the condition under which the element bearing this attribute applies, given as an XPath predicate expression.
    behaviour names the process or function which this processing model uses in order to produce output. Suggested values include: 1] alternate; 2] anchor; 3] block; 4] body; 5] break; 6] cell; 7] cit; 8] document; 9] figure; 10] glyph; 11] graphic; 12] heading; 13] index; 14] inline; 15] link; 16] list; 17] listItem; 18] metadata; 19] note; 20] omit; 21] paragraph; 22] row; 23] section; 24] table; 25] text; 26] title
    output the intended output. Sample values include: 1] web; 2] print; 3] plain
    useSourceRendition whether to obey any rendition attribute that is present.
    cssClass the name of a CSS class which should be associated with this element.
  • outputRendition describes the rendering or appearance intended for all occurrences of an element in a specified context for a specified type of output.
    scope provides a way of defining ‘pseudo-elements’, that is, styling rules applicable to specific sub-portions of an element. Sample values include: 1] first-line; 2] first-letter; 3] before; 4] after

The mandatory behaviour attribute defines in broad terms how an element should be processed, for example as a block or as an inline element. The optional predicate attribute may be used to specify a subset of contexts in which this model should be applicable: for example, an element might be treated as a block element in some contexts, but not in others. The output attribute supplies a name for the output for which this model is intended, for example for screen display, for a printed reading copy, for a scholarly publication, etc. The way in which an element should be rendered is declared independently of its behaviour, using either the attribute useSourceRendition or the element outputRendition. These Guidelines recommend that outputRendition be expressed using the W3C Cascading Stylesheet Language (CSS), but other possibilities are not excluded. The particular language used may be documented by means of the styleDefDecl element described in 2.3.5 The Default Style Definition Language Declaration.

23.5.4.2 Output Rendition
Here is a simple example of a processing model which might be included in the specification for an element such as hi or foreign. The intent is that these elements should be presented inline using an italic font.
<model behaviour="inline">
 <outputRendition>font-style: italic;</outputRendition>
</model>
If the rendition element, or the attributes style, rend, or rendition have already been used in the source document to indicate elements that were originally rendered in italic, and we wish simply to follow this in our processing, then there is no need to include an outputRendition element, and the attribute useSourceRendition could be used as follows:
<model behaviour="inline"
 useSourceRendition="true"/>
Any rendition information present in the source document will be ignored unless the useSourceRendition attribute has the value true. If that is the case, then such information will be combined with any rendition information supplied by the outputRendition element. For example, using CSS, an element which appears in the source as follows
<hi style="font-weight:bold;">this is in bold</hi>
would appear in bold and italic if processed by the following model
<model behaviour="inline"
 useSourceRendition="true">

 <outputRendition>font-style: italic;</outputRendition>
</model>
23.5.4.3 CSS Class
In a typical workflow processing TEI documents for display on the web, a system designer will often wish to use an externally defined CSS stylesheet. The cssClass attribute simplifies the task of maintaining compatibility amongst the possibly many applications using such a stylesheet and also enables a TEI application to specify the names of classes to be used for particular processing models. For example, supposing that the associated CSS stylesheet includes a CSS class called labeled-list, the following processing model might be used to request it be used for list elements containing a child label element:
<elementSpec ident="list" mode="change">
 <model predicate="label" behaviour="list"
  cssClass="labeled-list">

<!-- ... -->
 </model>
</elementSpec>
In the following example, a table will be formatted using renditional information provided in the source if that is available, or by an external stylesheet, using one of the CSS classes specified, if it is not:
<elementSpec mode="change" ident="table">
<!-- Preserve original rendition for tables which contain @rendition hints -->
 <model predicate=".//row/@rendition or .//cell/@rendition"
  behaviour="table" useSourceRendition="true"/>

<!-- Use bootstrap for default table styling -->
 <model behaviour="table"
  useSourceRendition="true"
  cssClass="table table-hover table-bordered"/>

</elementSpec>

As discussed further below, the input data available to a processing model is by default the content of the element being processed, together with its child nodes.

23.5.4.4 Model Contexts and Outputs
Sometimes different processing models are required for the same element in different contexts. For example, we may wish to process the quote element as an inline italic element when it appears inside a p element, but as an indented block when it appears elsewhere. To achieve this, we need to change the specification for the quote element to include two model elements as follows:
<elementSpec ident="quote" mode="change">
 <model predicate="ancestor::p"
  behaviour="inline">

  <outputRendition>font-style: italic;</outputRendition>
 </model>
 <model behaviour="block">
  <outputRendition>margin-left: 2em;</outputRendition>
 </model>
</elementSpec>
As noted above, these two models are mutually exclusive. The first processing model will be used only for quote elements which match the XPath expression given as value for the predicate attribute. All other element occurrences will use the second processing model.

When, as here, multiple behaviours are required for the same element, it will often be the case that the appropriate processing will depend on the context. It may however be the case that the choice of an appropriate model will be made on the basis of the intended output. For example, we might wish to define quite different behaviours when a document is to be displayed on a mobile device and when it is to be displayed on a desktop screen. Different behaviours again might be specified for a print version intended for the general reader, and for a print version aimed at the technical specialist.

The modelGrp element can be used to group together all the processing models which have in common a particular intended output, as in the following example:
<modelGrp output="mobile">
 <model behaviour="inline"
  predicate="@rend='inline'">

  <outputRendition>font-size: 7pt;</outputRendition>
 </model>
 <model behaviour="block"
  predicate="@rend='block'">

  <outputRendition>color: red;</outputRendition>
 </model>
</modelGrp>
<modelGrp output="print">
 <model behaviour="inline"
  predicate="@rend='inline'">

  <outputRendition>font-size: 12pt;</outputRendition>
 </model>
 <model behaviour="block"
  useSourceRendition="true" predicate="@rend='block'">

  <outputRendition>text-align: center;</outputRendition>
 </model>
</modelGrp>
23.5.4.5 Behaviours and their parameters

In the examples above we have used without explanation or definition two simple behaviours: inline and block, but many other behaviours are possible. A list of recommended behaviour names forms part of the specification for the element model. A processing model can specify any named behaviour, some of which have additional parameters. The parameters of a behaviour resemble the arguments of a function in many programming languages: they provide names which can be used to distinguish particular parts of the input data available to the process used to implement the behaviour in question.

The following elements are used to represent and to define parameters:

  • param provides a parameter for a model behaviour by supplying its name and an XPath expression identifying the location of its content.
    name a name for the parameter being supplied. Suggested values include: 1] alternate; 2] default; 3] height; 4] id; 5] label; 6] level; 7] link; 8] place; 9] type; 10] url; 11] width
  • paramList list of parameter specifications.
  • paramSpec supplies specification for one parameter of a model behaviour.

By default, a processor implementing the TEI processing model for a particular element has available to it as input data the content of the element itself, and all of its children. One or more param elements may be supplied within a model element to specify parameters which modify this, either by selecting particular parts of the default input data, or by selecting data which would not otherwise be available. In either case, the value supplied for the parameter is given as an XPath expression, evaluated with respect to the element node being processed. An arbitrary name (defined in the corresponding paramSpec) is also supplied to a processor to identify each parameter.

For example, an element such as the TEI ref element will probably be associated with a processing model which treats it as a hyperlink. But a hyperlink (in most implementations) has two associated pieces of information: the address indicated, and some textual content serving to label the link. In HTML, the former is provided as the value of the href attribute, and the latter by the content of an <a> element. In the following processing model we define a behaviour called link, which will use whatever is indicated by the parameter called uri to provide the former, while the latter is provided by the content of the ref element itself:
<elementSpec ident="ref" mode="add">
 <model behaviour="link">
  <param name="uri" value="@target"/>
  <param name="content" value="."/>
 </model>
</elementSpec>
The value attribute of a param element supplies an XPath expression that indicates where the required value may be found. The context for this XPath is the element which is being processed; hence in this example, the uri parameter takes the value of the target attribute on the ref element being processed. The content parameter indicates that the content of that ref element should be provided as its value. (This parameter is not strictly necessary, since by default the whole content of the element being processed is always available to a processor, but supplying it in this way makes the procedure more explicit).

All the parameters available for a given behaviour are defined as a part of the definition of the behaviour itself, as further discussed in section 23.5.4.8 Defining a processing model below.

As a further example, the TEI choice element requires a different behaviour for which the name alternate is proposed as in the following example:
<elementSpec ident="choice" mode="change">
 <model predicate="sic and corr"
  behaviour="alternate">

  <param name="default" value="corr"/>
  <param name="alternate" value="sic"/>
 </model>
</elementSpec>
The processing model shown here will be selected for processing a choice element which has both sic and corr child elements. The names default and alternate here are provided for convenience. The default parameter provides the value of the child corr element, and the alternate parameter provides that of the child sic element. If neither param element was supplied, both elements would still be available to an application, but the application would need to distinguish them for itself.
A choice element might contain multiple corrections, each with a different value for its cert attribute. In the following processing model, we will accept as value of the default parameter only those child corr elements whose cert attribute has the value high:
<elementSpec ident="choice" mode="change">
 <model predicate="sic and corr"
  behaviour="alternate">

  <param name="default"
   value="corr[@cert='high']"/>

  <param name="alternate" value="sic"/>
 </model>
</elementSpec>
A choice element might contain several different pairs of alternate elements (abbr and expan, orig and reg, etc.) We might wish to group together a set of processing models for these, for example to determine which of the possible alternatives is displayed by default whenever a choice element is processed for output to the web:
<elementSpec ident="choice" mode="change">
 <modelGrp output="web">
  <model predicate="sic and corr"
   behaviour="alternate">

   <param name="default"
    value="corr[@cert='high']"/>

   <param name="alternate" value="sic"/>
  </model>
  <model predicate="abbr and expan"
   behaviour="alternate">

   <param name="default" value="expan[1]"/>
   <param name="alternate" value="abbr"/>
  </model>
  <model predicate="orig and reg"
   behaviour="alternate">

   <param name="default" value="reg"/>
   <param name="alternate" value="orig"/>
  </model>
 </modelGrp>
</elementSpec>

If nothing matches the XPath defining the value of a particular parameter (e.g. if in the above example there is no correction with cert=high) then the default parameter has no value. It is left to implementors to determine how null-valued parameters should be processed.
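
A customization can sometimes avoid a null-valued parameter by making the predicate as specific as the parameter. The following sketch is an illustration only, not a recommendation of these Guidelines. It relies on the assumption, stated in 23.5.4.9 Implementation of Processing Models, that only the first matching model in document order is executed, and it falls back to the first corr child when no high-certainty correction is present:
<elementSpec ident="choice" mode="change">
 <modelGrp output="web">
<!-- Selected only when a high-certainty correction exists -->
  <model predicate="sic and corr[@cert='high']"
   behaviour="alternate">

   <param name="default"
    value="corr[@cert='high']"/>

   <param name="alternate" value="sic"/>
  </model>
<!-- Illustrative fallback: any other pairing of sic and corr -->
  <model predicate="sic and corr"
   behaviour="alternate">

   <param name="default" value="corr[1]"/>
   <param name="alternate" value="sic"/>
  </model>
 </modelGrp>
</elementSpec>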

23.5.4.6 Outputs

As noted above, the output attribute is used to associate particular processing models with a specific type of output. The following example documents a range of processing intentions for the date element, intended to cope with at least the following three situations:

  1. there is text inside the element, and the output is print;
  2. there is no text inside the element but there is a when attribute, and the output is print;
  3. there is a when attribute, there is text inside the element, and the output is web
<elementSpec ident="date" mode="change">
 <modelGrp output="print">
  <model predicate="text()"
   behaviour="inline"/>

  <model predicate="@when and not(text())"
   behaviour="inline">

   <param name="content" value="@when"/>
  </model>
 </modelGrp>
 <model output="web" predicate="@when"
  behaviour="alternate">

  <param name="default" value="."/>
  <param name="alternate" value="@when"/>
 </model>
</elementSpec>

For output to print we supply two processing models, one for the simplest case where the content of the date is to be treated as an inline element, and the other for the case where there is no content and the value of the when attribute is to be used in its place. This is specified by a parameter, called content in this example. For output to web, we use the alternate behaviour discussed in the previous section to indicate that by default the content of the element will be used, while retaining access to the value of the when attribute, this time via a parameter called alternate.

23.5.4.7 Model sequence
As well as being combined to form model groups, several models may be combined to form a model sequence. All of the individual components of a model sequence are understood to be applied, rather than considered to be mutually exclusive alternatives. For example, we might wish to define two different behaviours for a note element: the inline behaviour should be used to display the value of the note number (given by its n attribute), while a different behaviour (here called footnote) should be used to display the content of the element at a specified place, given by the parameter place. Because both of these actions are required, the two models are grouped by a modelSequence element:
<elementSpec ident="note" mode="change">
 <modelSequence output="print">
  <model behaviour="inline">
   <param name="content" value="@n"/>
  </model>
  <model behaviour="footnote">
   <param name="place" value="'foot'"/>
  </model>
 </modelSequence>
</elementSpec>

The value of the parameter called place above is an XPath expression supplying an arbitrary string (‘foot’), which is therefore quoted. It is left to implementors to validate or constrain the possible values for such expressions.

23.5.4.8 Defining a processing model
The processing model for an element is defined using some combination of model, modelSequence, or modelGrp elements within the elementSpec element containing its specification. The processing to be carried out is defined by means of the behaviour specified for each model element, possibly supplying specified values for a number of named parameters. The parameters available for a given behaviour are specified using a number of paramSpec elements grouped together in a paramList element. This paramList is supplied within the valItem used to document and name the behaviour. Here for example is the valItem which defines the link behaviour presented above:
<valItem ident="link">
 <desc>create a hyperlink</desc>
 <paramList>
  <paramSpec ident="content">
   <desc>supplies the location of some content describing the link</desc>
  </paramSpec>
  <paramSpec ident="uri">
   <desc>supplies the location of the intended hyperlink</desc>
  </paramSpec>
 </paramList>
</valItem>
Similarly, the valItem which defines the behaviour named alternate includes specifications for two parameters, one also called alternate and the other called default:
<valItem ident="alternate">
 <desc versionDate="2015-08-21"
  xml:lang="en">
support display of alternative visualisations, for
   example by displaying the preferred content, by displaying both in parallel, or by toggling
   between the two.</desc>
 <paramList>
  <paramSpec ident="default">
   <desc versionDate="2015-08-21"
    xml:lang="en">
supplies the location of the preferred
       content</desc>
  </paramSpec>
  <paramSpec ident="alternate">
   <desc versionDate="2015-08-21"
    xml:lang="en">
supplies the location of the alternative
       content</desc>
  </paramSpec>
 </paramList>
</valItem>

The suggested behaviours provided by the model element are informally defined using commonly understood terminology, but specific details of how they should be implemented are left to the implementor. Such decisions may vary greatly depending on the kind of processing environment, the kind of output envisaged, etc. The intention is to reduce as far as possible any requirement for the implementor to be aware of TEI-specific rules, and to maximize the ability of the ODD to express processing intentions without fully specifying an implementation.
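
A behaviour which is not among the suggested values could be documented in the same way. The following sketch assumes a hypothetical definition for the footnote behaviour used in 23.5.4.7 Model sequence. The wording of the descriptions, and the decision to document a content parameter as well as place, are illustrative only:
<valItem ident="footnote">
 <desc>create a note to be displayed at a specified place (hypothetical example)</desc>
 <paramList>
  <paramSpec ident="content">
   <desc>supplies the location of the content of the note</desc>
  </paramSpec>
  <paramSpec ident="place">
   <desc>supplies the place at which the note is to be displayed, for example the string 'foot'</desc>
  </paramSpec>
 </paramList>
</valItem>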

23.5.4.9 Implementation of Processing Models
As elsewhere in these Guidelines, the recommendations of this section are intended to be implementation-agnostic, not favouring any particular implementation method or technology. An implementor may choose, for example, whether to pre-process processing model specifications into a free standing ‘pipeline’, or to extract and process them dynamically during document processing. However, although implementors are generally free to interpret the processing model documentation according to their own requirements, some general assumptions underlie the recommendations made here:
  • If a model element has no child param elements, the action specified by its behaviour should be applied to the whole element node concerned, including its child nodes of whatever type. If that behaviour requires distinguishing particular parts of the input, an implementation may choose either to make those distinctions itself, or to raise an error.
  • If a model element has no predicate or output attribute then it is assumed to apply to all instances of the element defined in its parent elementSpec for all outputs. Otherwise its applicability depends on these attributes.
  • Only one of the models is to be applied for a particular instance of the element, except when they appear within a modelSequence.
  • A ‘matching’ model is one where the element to be processed satisfies the XPath in the predicate attribute of the model or modelSequence and the current output method matches the method specified in the output attribute of the model, modelSequence, or a containing modelGrp. A model or modelSequence without a predicate always matches the element to be processed. A model, modelGrp, or modelSequence without an output attribute matches any output method.
  • Processing Model implementations must execute only the first matching model or modelSequence in document order.
  • If there are two or more model elements provided for an elementSpec but they have different output attributes then the implementation should choose the model appropriate to the desired output.
  • If there are two or more model elements provided for an elementSpec but they have different predicate attributes then the implementation should choose the model whose predicate provides the most specific context (where specific is understood in the same way as in XSLT).
In the following example, which shows part of the element specification for the head element, there are two model elements, one with and one without a predicate attribute:
<model behaviour="inline"
 predicate="parent::list">

 <desc versionDate="2015-03-02"
  xml:lang="en">
Model for list headings</desc>
<!-- ... -->
</model>
<model behaviour="heading">
 <desc versionDate="2016-03-02"
  xml:lang="en">
Default model for all headings.</desc>
<!-- ... -->
</model>
In this example, an implementation should use the first processing model only for head elements with a list element as parent; for all other head elements, the second processing model should be used.
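
The interaction of the predicate and output attributes with document order can be illustrated in the same way. The following sketch is illustrative only and forms no part of the actual specification for head. Its comments record which model an implementation following the assumptions listed above would be expected to select:
<elementSpec ident="head" mode="change">
<!-- Matches only for print output, and only for a head within a list -->
 <model output="print"
  predicate="parent::list"
  behaviour="inline"/>

<!-- Matches for any output, but only for a head within a list -->
 <model predicate="parent::list"
  behaviour="block"/>

<!-- Matches every head for every output; reached only if neither model above matches -->
 <model behaviour="heading"/>
</elementSpec>
For output to print, a head within a list would thus be processed by the first model; for output to web, the same head would be processed by the second; a head appearing elsewhere would be processed by the third, whatever the output.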

23.6 Class Specifications

The element classSpec is used to document either an attribute class or a ‘model class’, as defined in section 1.3 The TEI Class System. A corresponding classRef element may be used to select a specific named class from those available.

  • classSpec (class specification) contains reference information for a TEI element class; that is a group of elements which appear together in content models, or which share some common attribute, or both.
    type indicates whether this is a model class or an attribute class
  • classRef points to the specification for an attribute or model class which is to be included in a schema.
    expand indicates how references to this class within a content model should be interpreted.
  • attList (attribute list) contains documentation for all the attributes associated with this element, as a series of attDef elements.
    org (organization) specifies whether only one (choice) or all (group) of the attributes in the list are available.
A model class specification does not list all of its members. Instead, its members declare that they belong to it by means of a classes element contained within the relevant elementSpec. This will contain a memberOf element for each class of which the relevant element is a member, supplying the name of the relevant class. For example, the elementSpec for the element hi contains the following:
<classes>
 <memberOf key="model.hiLike"/>
</classes>
This indicates that the hi element is a member of the class with identifier model.hiLike. The classSpec element that documents this class contains the following declarations:
<classSpec type="model"
 ident="model.hiLike">

 <desc>groups phrase-level elements related to highlighting that have no specific semantics </desc>
 <classes>
  <memberOf key="model.highlighted"/>
 </classes>
</classSpec>
which indicate that the class model.hiLike is actually a member (or subclass) of the class model.highlighted.

The function of a model class declaration is to provide another way of referring to a group of elements. It does not confer any other properties on the elements which constitute its membership.

The attribute type is used to distinguish between ‘model’ and ‘attribute’ classes. In the case of attribute classes, the attributes provided by membership in the class are documented by an attList element contained within the classSpec. In the case of model classes, no further information is needed to define the class beyond its description, its identifier, and optionally any classes of which it is a member.

When a model class is referenced in the content model of an element (i.e. by means of a classRef element within the content of an elementSpec), its meaning will depend on the value of its expand attribute.

If this attribute is not specified, the classRef is interpreted to mean an alternated list of all the current members of the class named. For example, suppose that the members of the class model.hiLike are elements hi, <it>, and <bo>. Then a content model such as
<content>
 <classRef key="model.hiLike"/>
</content>
would be equivalent to the explicit content model:
<content>
 <alternate>
  <elementRef key="hi"/>
  <elementRef key="it"/>
  <elementRef key="bo"/>
 </alternate>
</content>
(or, to use RELAX NG compact syntax, ( hi | it | bo )). However, a content model of <classRef key="model.hiLike" expand="sequence"/> would be equivalent to the following explicit content model:
<content>
 <sequence>
  <elementRef key="hi"/>
  <elementRef key="it"/>
  <elementRef key="bo"/>
 </sequence>
</content>
(or, in RELAX NG compact syntax, ( hi, it, bo )).
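
Other values of expand modify this sequence interpretation. As a sketch, and assuming the same three hypothetical class members, a reference such as <classRef key="model.hiLike" expand="sequenceOptional"/> would be expected to correspond to a sequence in which each member is optional:
<content>
 <sequence>
  <elementRef key="hi" minOccurs="0"/>
  <elementRef key="it" minOccurs="0"/>
  <elementRef key="bo" minOccurs="0"/>
 </sequence>
</content>
(or, in RELAX NG compact syntax, ( hi?, it?, bo? )).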

An attribute class (a classSpec of type atts) contains an attList element which lists the attributes that all the members of that class inherit from it. For example, the class att.interpLike defines a small set of attributes common to all elements which are members of that class: those attributes are listed by the attList element contained by the classSpec for att.interpLike. When processing the documentation elements for elements which are members of that class, an ODD processor is required to extend the attList (or equivalent) for such elements to include any attributes defined by the classSpec elements concerned. There is a single global attribute class, att.global, to which some modules contribute additional attributes when they are included in a schema.
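
As an illustration, an attribute class might be specified along the following lines. The class name att.example.timed and its single attribute are hypothetical, invented here only to show the general shape of such a declaration; the attDef element is the one mentioned in the description of attList above:
<classSpec type="atts"
 ident="att.example.timed">

 <desc>provides an attribute for elements which have a duration (hypothetical example)</desc>
 <attList>
  <attDef ident="duration" usage="opt">
   <desc>indicates the length of time occupied by the element</desc>
   <datatype>
    <dataRef key="teidata.duration.iso"/>
   </datatype>
  </attDef>
 </attList>
</classSpec>
An element would join such a class by naming it in a memberOf element within its classes element, in the same way as for a model class.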

23.7 Macro Specifications

The macroSpec element is used to declare and document predefined strings or patterns not otherwise documented by the elements described in this section. A corresponding macroRef element may be used to select a specific named pattern from those available. Patterns are used as a shorthand chiefly to describe common content models and datatypes, but may be used for any purpose. The following elements are used to represent patterns:

  • macroSpec (macro specification) documents the function and implementation of a pattern.
  • macroRef points to the specification for some pattern which is to be included in a schema.
    key the identifier used for the required pattern within the source indicated.
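
A minimal sketch of a pattern declaration follows. The identifier macro.exampleText is hypothetical; the content shown is intended to permit any mixture of text and members of the class model.hiLike:
<macroSpec ident="macro.exampleText">
 <desc>defines a content model consisting of text and highlighting elements (hypothetical example)</desc>
 <content>
  <alternate minOccurs="0"
   maxOccurs="unbounded">

   <textNode/>
   <classRef key="model.hiLike"/>
  </alternate>
 </content>
</macroSpec>
An element specification could then refer to this pattern by supplying <macroRef key="macro.exampleText"/> within its content element.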

23.8 Building a TEI Schema

The specification elements, and some of their children, are all members of the att.identified class, from which they inherit the following attributes:

  • att.identified provides identifying attributes for elements which can be subsequently referenced by means of a key attribute.
    ident supplies the identifier by which this element may be referenced.
    predeclare says whether this object should be predeclared in the tei infrastructure module.
    module supplies a name for the module in which this object is to be declared.

This attribute class is a subclass of the att.combinable class from which it (and some other elements) inherits the following attribute:

  • att.combinable provides attributes indicating how multiple references to the same object in a schema should be combined.
    mode specifies the effect of this declaration on its parent object.

This attribute class, in turn, is a subclass of the att.deprecated class, from which it inherits the following attribute:

  • att.deprecated provides attributes indicating how a deprecated feature will be treated in future releases.
    validUntil provides a date before which the construct being defined will not be removed.

The validUntil attribute may be used to signal an intent to remove a construct from future versions of the schema being specified.
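
For example, a customization might mark one of its own elements as scheduled for removal. In the following sketch the element name, the date, and the wording are all hypothetical; the second desc, of type deprecationInfo, reflects our understanding of the practice followed in the TEI's own specifications for explaining a deprecation, and should be checked against the current release:
<elementSpec ident="oldNote"
 validUntil="2030-01-01">

 <desc>contains a note in a superseded format (hypothetical example)</desc>
 <desc type="deprecationInfo">this element will be removed; the note element should be used instead</desc>
<!-- ... -->
</elementSpec>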

The elementSpec, attDef and schemaSpec specification elements also have an attribute which determines the namespace to which the object being created will belong. In the case of schemaSpec, this namespace is inherited by all the elements created in the schema, unless they have their own ns.

  • att.namespaceable provides attributes indicating the target namespace for an object being created.

These attributes are used by an ODD processor to determine how declarations are to be combined to form a schema or DTD, as further discussed in this section.

23.8.1 TEI customizations

As noted above, a TEI schema is defined by a schemaSpec element containing an arbitrary mixture of explicit declarations for objects (i.e. elements, classes, patterns, or macro specifications) and references to other objects containing such declarations (i.e. references to specification groups, or to modules). A major purpose of this mechanism is to simplify the process of defining user customizations, by providing a formal method for the user to combine new declarations with existing ones, or to modify particular parts of existing declarations.

In the simplest case, a user-defined schema might just combine all the declarations from two nominated modules:
<schemaSpec ident="example">
 <moduleRef key="core"/>
 <moduleRef key="linking"/>
</schemaSpec>
An ODD processor, given such a document, should combine the declarations which belong to the named modules, and deliver the result as a schema of the requested type. It may also generate documentation for the elements declared by those modules. No source is specified for the modules, and the schema will therefore combine the declarations found in the most recent release version of the TEI Guidelines known to the ODD processor in use.

The value specified for the source attribute, when it is supplied as a URL, specifies any convenient location from which the relevant ODD files may be obtained. For the current release of the TEI Guidelines, a URL in the form http://www.tei-c.org/Vault/P5/x.y.z/xml/tei/odd/p5subset.xml may be used, where x.y.z represents the P5 version number, e.g. 1.3.0. Alternatively, if the ODD files are locally installed, it may be more convenient to supply a value such as ../ODDs/p5subset.xml.

The value for the source attribute may be any form of URI. A set of TEI-conformant specifications in a form directly usable by an ODD processor must be available at the location indicated. When no source value is supplied, an ODD processor may either raise an error or assume that the location of the current release of the TEI Guidelines is intended.

If the source is specified in the form of a private URI, the form recommended is aaa:x.y.z, where aaa is a prefix indicating the markup language in use, and x.y.z indicates the version number. For example, tei:1.2.1 should be used to reference release 1.2.1 of the current TEI Guidelines. When such a URI is used, it will usually be necessary to translate it before such a file can be used in blind interchange.
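
For example, the following sketch requests two modules from a specific release, using the URL form given above with the version number 1.3.0 substituted. Whether that location can actually be reached will depend on the environment in which the ODD processor is run:
<schemaSpec ident="example"
 source="http://www.tei-c.org/Vault/P5/1.3.0/xml/tei/odd/p5subset.xml">

 <moduleRef key="core"/>
 <moduleRef key="linking"/>
</schemaSpec>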

The effect of a moduleRef element is to include in the schema all declarations provided by that module. This may be modified by means of the attributes include and except, which allow the encoder to supply an explicit list of elements from the stated module which are to be included or excluded respectively. For example:
<schemaSpec ident="example">
 <moduleRef key="core"
  except="add del orig reg"/>

 <moduleRef key="linking"
  include="linkGrp link"/>

</schemaSpec>
The schema specified here will include all the elements supplied by the core module except for add, del, orig, and reg. It will also include only the linkGrp and link elements from the linking module.
Alternatively, the element elementRef may be used to indicate explicitly which elements are to be included in a schema. The same effect as the preceding example might thus be achieved by the following:
<schemaSpec ident="example">
 <moduleRef key="core"
  except="add del orig reg"/>

 <elementRef key="linkGrp"/>
 <elementRef key="link"/>
</schemaSpec>
Note that in this last case, there is no need to specify the name of the module in which the two element declarations are to be found; in the TEI scheme, element names are unique across all modules. The module is simply a convenient way of grouping together a number of related declarations.
In the same way, a schema may select a subset of the attributes available in a specific class, using the classRef element to point to an attribute class:
<schemaSpec ident="example">
 <moduleRef key="tei"/>
<!-- ... -->
 <classRef key="att.global.linking"
  include="corresp"/>

<!-- ... -->
</schemaSpec>
Here, only the corresp attribute is included; the other attributes in the class are not available. The same effect can be achieved using except:
<schemaSpec ident="example">
 <moduleRef key="tei"/>
<!-- ... -->
 <classRef key="att.global.linking"
  except="copyOf exclude next prev sameAs select synch"/>

<!-- ... -->
</schemaSpec>
A schema may also include declarations for new elements, as in the following example:
<schemaSpec ident="example">
 <moduleRef key="header"/>
 <moduleRef key="verse"/>
 <elementSpec ident="soundClip">
  <classes>
   <memberOf key="model.pPart.data"/>
  </classes>
 </elementSpec>
</schemaSpec>
A declaration for the element <soundClip>, which is not defined in the TEI scheme, will be added to the output schema. This element will also be added to the existing TEI class model.pPart.data, and will thus be available in TEI conformant documents. Attributes from existing TEI classes could be added to the new element using attRef:
<schemaSpec ident="example">
 <moduleRef key="header"/>
 <moduleRef key="verse"/>
 <elementSpec ident="soundClip">
  <classes>
   <memberOf key="model.pPart.data"/>
  </classes>
  <attList>
   <attRef class="att.global.source"
    name="source"/>

  </attList>
 </elementSpec>
</schemaSpec>
This will provide the source attribute from the att.global.source class on the new <soundClip> element.
A schema might also include re-declarations of existing elements, as in the following example:
<schemaSpec ident="example">
 <moduleRef key="header"/>
 <moduleRef key="teistructure"/>
 <elementSpec ident="head" mode="change">
  <content>
   <macroRef key="macro.xtext"/>
  </content>
 </elementSpec>
</schemaSpec>
The effect of this is to redefine the content model for the element head as plain text, by overriding the content child of the selected elementSpec. The attribute specification mode="change" has the effect of overriding only those child elements of the elementSpec which appear both in the original specification and in the new specification supplied above: content in this example. Note that if the value for mode were replace, the effect would be to replace all child elements of the original specification with the child elements of the new specification, and thus (in this example) to delete all of them except content.

A schema may not contain more than two declarations for any given component. The value of the mode attribute is used to determine exactly how the second declaration (and its constituents) should be combined with the first. The following table summarizes how a processor should resolve duplicate declarations; the term identifiable refers to those elements which can have a mode attribute:

| mode value | existing declaration | effect |
|---|---|---|
| add | no | add new declaration to schema; process its children in add mode |
| add | yes | raise error |
| replace | no | raise error |
| replace | yes | retain existing declaration; process new children in replace mode; ignore existing children |
| change | no | raise error |
| change | yes | process identifiable children according to their modes; process unidentifiable children in replace mode; retain existing children where no replacement or change is provided |
| delete | no | raise error |
| delete | yes | ignore existing declaration and its children |
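
Two rows of this table can be illustrated with a short sketch. The specification below is hypothetical and deliberately contains one erroneous declaration. It assumes that note is supplied by the core module and, as in the earlier example, that <soundClip> is not defined anywhere in the TEI scheme:

```xml
<schemaSpec ident="example">
 <moduleRef key="tei"/>
 <moduleRef key="header"/>
 <moduleRef key="core"/>
 <moduleRef key="textstructure"/>
<!-- mode="delete", existing declaration: yes.
     The declaration of note and its children are ignored. -->
 <elementSpec ident="note" mode="delete"/>
<!-- mode="change", existing declaration: no.
     There is nothing to change, so a processor should raise an error. -->
 <elementSpec ident="soundClip" mode="change">
  <classes>
   <memberOf key="model.pPart.data"/>
  </classes>
 </elementSpec>
</schemaSpec>
```

Supplying mode="add" on the second elementSpec instead would correspond to the first row of the table, and the new declaration would be added to the schema.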

23.8.2 Combining TEI and Non-TEI Modules

In the simplest case, all that is needed to include a non-TEI module in a schema is to reference its RELAX NG source using the url attribute on moduleRef. The following specification, for example, creates a schema in which declarations from the non-TEI module svg11.rng (defining Scalable Vector Graphics) are included. To avoid any risk of name clashes, the schema specifies that all TEI patterns generated should be prefixed by the string "TEI_".
<schemaSpec prefix="TEI_" ident="testsvg"
 start="TEI svg">

 <moduleRef key="header"/>
 <moduleRef key="core"/>
 <moduleRef key="tei"/>
 <moduleRef key="textstructure"/>
 <moduleRef url="svg11.rng"/>
</schemaSpec>
This specification generates a single schema which might be used to validate either a TEI document (with the root element TEI), or an SVG document (with a root element <svg:svg>), but would not validate a TEI document containing <svg:svg> or other elements from the SVG language. For that to be possible, the <svg:svg> element must become a member of a TEI model class (1.3 The TEI Class System), so that it may be referenced by other TEI elements. To achieve this, we modify the last moduleRef in the above example as follows:
<moduleRef url="svg11.rng">
 <content>
  <rng:define name="TEI_model.graphicLike"
   combine="choice">

   <rng:ref name="svg"/>
  </rng:define>
 </content>
</moduleRef>

This states that when the declarations from the svg11.rng module are combined with those from the other modules, the declaration for the model class model.graphicLike in the TEI module should be extended to include the element <svg:svg> as an alternative. This has the effect that elements in the TEI scheme which define their content model in terms of that element class (notably figure) can now include it. A RELAX NG schema generated from such a specification can be used to validate documents in which the TEI figure element contains any valid SVG representation of a graphic, embedded within an <svg:svg> element.
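
By way of illustration, a document fragment of the kind such a schema is intended to accept might look like the following sketch. The binding of the svg prefix to the SVG namespace, the circle drawn, and the wording of the figDesc are all assumptions made for the example:

```xml
<figure>
 <!-- svg:svg is now permitted here as a member of model.graphicLike -->
 <svg:svg xmlns:svg="http://www.w3.org/2000/svg"
  width="100" height="100" version="1.1">
  <svg:circle cx="50" cy="50" r="40"/>
 </svg:svg>
 <figDesc>A circle, drawn with SVG.</figDesc>
</figure>
```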

23.8.3 Linking Schemas to XML Documents

Schemas can be linked to XML documents by means of the <?xml-model?> processing instruction described in the W3C Working Group Note Associating Schemas with XML documents (https://www.w3.org/TR/xml-model/). <?xml-model?> can be used for any type of schema, and may be used for multiple schemas:

<?xml-model href="tei_tite.rng" type="application/xml" ?>
<?xml-model href="checkLinks.sch" type="application/xml" schematypens="http://purl.oclc.org/dsdl/schematron" ?>
<?xml-model href="tei_tite.odd" type="application/tei+xml" schematypens="http://www.tei-c.org/ns/1.0" ?>

This example includes a standard RELAX NG schema, a Schematron schema which might be used for checking that all pointing attributes point at existing targets, and also a link to the TEI ODD file from which the RELAX NG schema was generated. See also 2.3.10 The Schema Specification for details of another method of linking an ODD specification into your file by including a schemaSpec element in encodingDesc.
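
The W3C Note requires that such processing instructions appear in the document prolog, that is, before the root element. The following minimal sketch shows that placement, reusing the file names given above. The schematypens value on the first instruction is an optional addition not present in the example above: it is the RELAX NG namespace and identifies the schema language explicitly.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="tei_tite.rng" type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
<?xml-model href="checkLinks.sch" type="application/xml" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
<!-- teiHeader and text omitted -->
</TEI>
```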

23.9 Module for Documentation Elements

The module described in this chapter makes available the following components:

Module tagdocs: Documentation of TEI and other XML markup languages

The selection and combination of modules to form a TEI schema is described in 1.2 Defining a TEI Schema.

The elements described in this chapter are all members of one of three classes: model.oddDecl, model.oddRef, or model.phrase.xml, with the exceptions of schemaSpec (a member of model.divPart) and both eg and egXML (members of model.common and model.egLike). All of these classes are declared along with the other general TEI classes, in the basic structure module documented in 1 The TEI Infrastructure.

In addition, some elements are members of the att.identified class, which is documented in 23.8.1 TEI customizations above.

Notes
96
ODD is short for ‘One Document Does it all’, and was the name invented by the original TEI Editors for the predecessor of the system currently used for this purpose. See further Burnard and Sperberg-McQueen (1995) and Burnard and Rahtz (2004).
97
The distinction between base and additional tagsets in earlier versions of the TEI scheme has not been carried forward into P5.
98
It would still be allowed to contain comments or processing instructions, as these are not considered part of the content model.
99
The classRef element may be used to refer to attribute classes, but this should not be done within a content.
100
This content model is not used very often in the TEI scheme. Because only Unicode characters are permitted, there is no way to record characters that are not (yet) represented in Unicode. Thus in TEI instead of textNode we often use a reference to macro.xtext which permits both Unicode characters and the g element.
101
Indeed, these Guidelines themselves include many additional constraints which are expressed in the ISO Schematron language using this mechanism. A conformant TEI document should respect these constraints, although automatic validation of them may not be possible for all processors.




TEI Guidelines P5 Version 4.9.0a. Last updated on 14th January 2025, revision 3925b3613. This page generated on 2025-01-14T18:24:09Z.