Tagging Texts


  • Choosing a TEI subset
  • Starting a new document
  • Tagging an existing document
  • The TEI Header
  • Applying specific tags

Starting a new document

A variety of XML editors exist. One particularly capable SGML/XML editor is PSGML, a major mode for Emacs. Starting a new document with it involves:

  • "Finding" the new file (usually invoked with C-x C-f).
  • Letting Emacs turn on the relevant mode (xml-mode or sgml-mode), which happens automatically if the filename extension is recognized.
  • Inserting the DOCTYPE, which can be chosen from a menu for known document types.
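
As a rough illustration, a freshly started file might look like the sketch below once the DOCTYPE and the required elements are in place. It assumes TEI P4-style XML (root element TEI.2; TEI P5 uses a namespaced TEI root instead) and a TEI Lite DTD stored next to the document as teixlite.dtd. The DTD choice, the file path, and all placeholder text are assumptions made for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumption: a TEI Lite DTD is available locally as teixlite.dtd -->
<!DOCTYPE TEI.2 SYSTEM "teixlite.dtd">
<TEI.2>
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Untitled document</title>
      </titleStmt>
      <publicationStmt>
        <p>Unpublished draft.</p>
      </publicationStmt>
      <sourceDesc>
        <p>Born digital.</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>Text goes here.</p>
    </body>
  </text>
</TEI.2>
```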

With Emacs and PSGML, for example, the following SGML/XML-aware editing features are available:

  • Only possible elements are offered for insertion.
  • Required elements are inserted automatically.
  • Elements can be indented according to their nesting depth.
  • Completion is available while inserting new elements.
  • A list of possible content elements is inserted in a comment if the content model does not allow character data.
  • Tags can be colored and selectively hidden when necessary.

Tagging an existing document

  • Start a new document.
  • Insert a skeleton of markup for the header.
  • Copy the existing document into the body.
  • Split and modify the markup as required.
  • Again, PSGML offers powerful features to help with this task:
    • The keystroke C-c RET (sgml-split-element) splits the current element.
    • Regions can be selected and tagged with specific tags.
    • SGML/XML validation is invoked easily.
    • PSGML can automatically move to positions that contain an error.
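
The result of these steps might look roughly like the fragment below. The sample text is invented, and the choice of div, head and p as the elements to split into is an assumption; a real document may call for different structural elements.

```xml
<body>
  <!-- Invented sample text. After pasting, all of it sat in a single <p>.
       Splitting at each paragraph break produced the <p> elements below;
       the heading was then selected as a region and tagged as <head>. -->
  <div type="chapter">
    <head>Chapter 1</head>
    <p>The first paragraph of the pasted text.</p>
    <p>The second paragraph, created by splitting the element.</p>
    <p>The third paragraph, split off in the same way.</p>
  </div>
</body>
```

Validating after each round of splitting keeps errors local to the part of the text that was just changed.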

The TEI Header

  • The TEI Header is much more strictly regulated than any other part of TEI.
  • PSGML can greatly help with editing the TEI Header by inserting all required elements.
  • Some of the elements need to be filled in, others added.
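
A header at this stage might look like the sketch below. It assumes the TEI P4 / TEI Lite content model, in which fileDesc with its titleStmt, publicationStmt and sourceDesc is the required part. All names, dates and descriptions are invented placeholders.

```xml
<teiHeader>
  <fileDesc>
    <titleStmt>
      <!-- required element; its content has to be filled in by hand -->
      <title>Letters of an invented correspondent: an electronic edition</title>
      <!-- optional element, added by hand -->
      <author>Invented Author</author>
    </titleStmt>
    <publicationStmt>
      <!-- required element; placeholder content -->
      <publisher>Example Text Archive</publisher>
      <date>2001</date>
    </publicationStmt>
    <sourceDesc>
      <!-- required element; placeholder content -->
      <bibl>Transcribed from an invented printed edition.</bibl>
    </sourceDesc>
  </fileDesc>
  <!-- optional section, added by hand -->
  <encodingDesc>
    <projectDesc>
      <p>Encoded as a teaching example.</p>
    </projectDesc>
  </encodingDesc>
</teiHeader>
```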

Applying specific tags

Some parts of TEI markup describe the structure of a text; other parts encode features of the text. Experience has shown that the following procedure avoids many problems:

  • Don't try to tag everything you want to tag in one pass.
  • First put in all the structural tags that describe the components of the document that are clearly and indisputably visible in the text.
  • In successive passes, mark specific features where the document analysis showed they are required.
  • In some cases the markup may branch, e.g. into separate linguistic and historical markup. Consider using a version control system in these cases.
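
The difference between a structural pass and a later feature pass can be sketched as follows. The passage is invented and appears twice only so the two stages can be compared. The element and attribute names (name with type, date with value) follow TEI P4 / TEI Lite usage and are an assumption about the TEI subset in use.

```xml
<body>
  <!-- Invented passage, shown twice only for comparison. -->

  <!-- First pass: structural tags only -->
  <div type="letter" n="pass1">
    <head>Letter from Vienna</head>
    <p>Dear Anna, we arrived in Vienna on 2 March 1850.</p>
  </div>

  <!-- Later pass: features marked inside the unchanged structure -->
  <div type="letter" n="pass2">
    <head>Letter from Vienna</head>
    <p>Dear <name type="person">Anna</name>, we arrived in
      <name type="place">Vienna</name> on
      <date value="1850-03-02">2 March 1850</date>.</p>
  </div>
</body>
```

Because the second pass only adds elements inside the existing paragraphs, the structural markup from the first pass can be validated once and then left alone.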

Introduction to XML, Markup and the TEI Guidelines