Introduction to the TEI


  • where the TEI came from
  • architecture of the TEI scheme
  • main and auxiliary schemas
  • within the main schema: core, base, and additional tag sets

Where does the TEI come from?

From the research community:

Sponsors

  • ACH Association for Computers and the Humanities
  • ACL Association for Computational Linguistics
  • ALLC Association for Literary and Linguistic Computing

Funders

  • U.S. National Endowment for the Humanities
  • Mellon Foundation
  • Commission of the European Communities, DG XIII
  • Social Sciences and Humanities Research Council of Canada

TEI today

To ensure continued maintenance, the formation of a membership consortium was announced in April 1999, and development of the TEI has been transferred to it. The current hosting members are:

  • Brown University, USA
  • University of Virginia, USA
  • A group of institutions in Nancy, France
  • Oxford University, United Kingdom

Goals of the TEI

  • better interchange and integration of data
  • support for all texts, in all languages, from all periods
  • guidance for the perplexed: what to encode
  • assistance for the specialist: how to encode any information of interest

TEI Deliverables

  • A coherent set of recommendations for text encoding
    • comprising several distinct XML tag sets
    • based on existing practice
    • documented in a reference manual
  • Tutorials for general and specialized audiences (in progress)
  • Sample texts (not yet)

... but no TEI software

TEI Timeline

TEI releases are named P (for "proposal") plus a release number:

  • P1: ca. 1990, incomplete, circulated internally only
  • P2: ca. 1992, circulated internally
  • P3: May 1994, first complete public release, printed in two (green) volumes
  • P4: May 2002, first XML-aware release
  • P5: prerelease January 2005, development continuing

    Major new developments in P5:

    • Namespace for TEI, root element changed to <TEI>
    • Extension module `gaiji' for additional non-standard characters
    • xml:id and xml:lang instead of id and lang
    • XML Pointers instead of TEI Pointers
    • Tagset documentation and manuscript description modules
    • Internal representation changed from DTD fragments to RELAX NG
    • Multiple schema languages (DTD, XSD, RELAX NG)
    • Open development at tei.sourceforge.net
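
A minimal sketch of how several of these changes look in a document instance: the <TEI> root in the TEI namespace, and xml:id / xml:lang in place of id / lang. The namespace URI is the one used by released versions of P5 and may differ in the 2005 prerelease; the header content and identifier values are placeholders invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- P4 would have used <TEI.2> with no namespace, and id= / lang= attributes -->
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Sample document (placeholder)</title>
      </titleStmt>
      <publicationStmt>
        <p>Unpublished illustration.</p>
      </publicationStmt>
      <sourceDesc>
        <p>Born digital; no source.</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text xml:lang="en">
    <body>
      <!-- xml:id values must be unique within the document -->
      <p xml:id="p1">First paragraph.</p>
      <p xml:id="p2" xml:lang="fr">Deuxième paragraphe.</p>
    </body>
  </text>
</TEI>
```

Because xml:id and xml:lang are defined by XML itself, generic XML tools can interpret them without any knowledge of the TEI.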

TEI Schema Structure

  • how to make one markup scheme handle an infinite variety of requirements and interests
  • all texts are alike
  • every text is different
  • similar to the database design problem: one construct, many views
  • each view a selection from the whole

How Many Schemas?

How many schemas are necessary for a project like the TEI?

  • one (a `Prussian' schema)
  • none (a `Waterloo' schema)
  • one per document (a `Californian' schema)

The TEI schemas

  • a single main schema with many faces (a `British' schema)
  • many tags (over 400)
  • organized into tag sets
  • grouped into classes
  • several auxiliary schemas for specialized information:
    • writing system / character set
    • feature system (for feature-structure notation)
    • tag set documentation
    • independent, free-standing TEI header
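
One way to picture "grouped into classes" in DTD terms is a parameter entity that names a group of elements, so that content models refer to the class and not to its individual members. The sketch below illustrates that general technique only; the class name m.phraseLike and its membership are hypothetical and are not taken from the TEI schemas.

```xml
<!-- Hypothetical class: elements that may appear inside running text -->
<!ENTITY % m.phraseLike "emph | foreign | name">

<!-- The paragraph model refers to the class, not to each member -->
<!ELEMENT p       (#PCDATA | %m.phraseLike;)*>

<!ELEMENT emph    (#PCDATA)>
<!ELEMENT foreign (#PCDATA)>
<!ELEMENT name    (#PCDATA)>
```

Adding an element to the class then changes every content model that uses it, without editing those models one by one. Note that XML permits parameter entity references inside declarations only in the external DTD subset, so a fragment like this belongs in a separate DTD file.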

The Pizza Model (XML version)

<!-- XML DTDs have no "&" connector, so both orders are spelled out -->
<!ELEMENT pizza
  (base, ((tomatoSauce, cheese) | (cheese, tomatoSauce)), topping*) >
<!ELEMENT base (thinCrust | pan | stuffed) >
<!ELEMENT topping (mushrooms | pepperoni |
  sausage | pepper | anchovies | ...) >
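
A small instance that fits the model above may make it easier to read: exactly one base, then sauce and cheese, then any number of toppings. It assumes the leaf elements (thinCrust, tomatoSauce, cheese, mushrooms, anchovies) are declared EMPTY, which the fragment does not show.

```xml
<pizza>
  <!-- exactly one kind of base -->
  <base><thinCrust/></base>
  <!-- sauce and cheese are both required -->
  <tomatoSauce/>
  <cheese/>
  <!-- zero or more toppings, one choice per topping element -->
  <topping><mushrooms/></topping>
  <topping><anchovies/></topping>
</pizza>
```

A pizza with no topping elements at all would also be valid, since topping* allows zero occurrences; a pizza with two base elements would not.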

