TEI Lite
- a subset of the main TEI schema -- with
extensions
- small, simple
- realistic for existing texts (OTA, Virginia)
- realistic for document production (TEI technical
documents)
- A good introduction to TEI Lite is available at
http://www.tei-c.org.uk/Lite/
The Structure(s) of a TEI text
- a TEI document contains a header followed by a
text
- the header contains:
-
file description containing
bibliographic information about the machine-readable
text itself, and its source
-
encoding description explaining how the
electronic text was encoded
-
profile description containing further
information about the text
-
revision description containing version
control information about the text
- the text may be unitary or
composite
- a text contains:
- optional front matter
- a body
- optional back matter
- in a composite text, the body is a
group of texts (or nested groups)
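A minimal sketch of a unitary TEI document, assuming the P5 namespace. It shows the four parts of the header followed by a text with front matter, body and back matter. The title and all wording inside the elements are invented placeholders.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <!-- file description: the electronic text and its source -->
    <fileDesc>
      <titleStmt>
        <title>A sample text: an electronic edition</title>
      </titleStmt>
      <publicationStmt>
        <p>Unpublished demonstration file.</p>
      </publicationStmt>
      <sourceDesc>
        <p>Born digital; no pre-existing source.</p>
      </sourceDesc>
    </fileDesc>
    <!-- encoding description: how the text was encoded -->
    <encodingDesc>
      <projectDesc>
        <p>Encoded to illustrate the structure of a TEI document.</p>
      </projectDesc>
    </encodingDesc>
    <!-- profile description: further information about the text -->
    <profileDesc>
      <langUsage>
        <language ident="en">English</language>
      </langUsage>
    </profileDesc>
    <!-- revision description: version control information -->
    <revisionDesc>
      <change>First draft.</change>
    </revisionDesc>
  </teiHeader>
  <text>
    <front>
      <div type="preface"><p>Front matter goes here.</p></div>
    </front>
    <body>
      <p>The body of the text goes here.</p>
    </body>
    <back>
      <div type="appendix"><p>Back matter goes here.</p></div>
    </back>
  </text>
</TEI>
```

Only the file description is required in the header; the other three parts are included here to match the list above.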
TEI Structures Summarized
TEI :: teiHeader text
text :: front? (body|group) back?
group :: (text|group)+
teiCorpus :: teiHeader TEI+
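The group rule can be illustrated with a composite text. This is a sketch only: the header is omitted and the content is invented placeholder wording.

```xml
<text>
  <front>
    <div type="preface"><p>Preface to the whole collection.</p></div>
  </front>
  <!-- in a composite text, a group takes the place of the body -->
  <group>
    <text>
      <body><p>First text of the collection.</p></body>
    </text>
    <!-- groups may nest: group :: (text|group)+ -->
    <group>
      <text>
        <body><p>First text of the inner group.</p></body>
      </text>
      <text>
        <body><p>Second text of the inner group.</p></body>
      </text>
    </group>
  </group>
  <back>
    <div type="index"><p>Index to the whole collection.</p></div>
  </back>
</text>
```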
Text divisions
- generic, hierarchic subdivisions
-
`vanilla' (<div>) or numbered (<div1>, <div2>, ...)
-
type attribute
- associated <head> and <trailer>
elements
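A short sketch of nested `vanilla' divisions with a type attribute, headings and a trailer. The type values and all wording are invented for illustration.

```xml
<body>
  <div type="chapter" n="1">
    <head>Chapter One</head>
    <div type="section" n="1.1">
      <head>First Section</head>
      <p>Text of the first section.</p>
    </div>
    <div type="section" n="1.2">
      <head>Second Section</head>
      <p>Text of the second section.</p>
    </div>
    <trailer>End of Chapter One</trailer>
  </div>
</body>
```

With numbered divisions the outer element would be <div1> and the inner ones <div2>; the two styles should not be mixed within one text.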
Global Attributes
-
xml:id for unique identification
-
n for (non-unique) name or number
-
rend for rendition
-
xml:lang for language and hence
writing-system
Applicable to all elements in
the TEI scheme.
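A sketch showing the four global attributes in use. The identifier, numbers, rendition value and wording are invented placeholders.

```xml
<!-- xml:id must be unique within the document; n need not be -->
<div xml:id="ch2" n="2" xml:lang="en">
  <head rend="italic">The Second Chapter</head>
  <p n="1">He answered simply
    <q xml:lang="fr">je ne sais pas</q>.</p>
  <p n="2">Nothing more was said.</p>
</div>
```

The xml:lang value on the <div> is inherited by its contents unless overridden, as on the <q> element here.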
Text components
What are divisions composed of?
-
prose is mostly paragraphs
(<p>)
-
verse is mostly lines
(<l>), sometimes in hierarchic groups
(<lg>)
-
drama is mostly speeches
(<sp>) containing <p> or <l> and
interspersed with stage directions
(<stage>)
These may be mixed, and may also appear directly within
undivided texts.
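The three kinds of component can be sketched side by side. The division types, speaker names and all wording are invented for illustration.

```xml
<body>
  <div type="prose">
    <p>A paragraph of prose.</p>
  </div>
  <div type="verse">
    <lg type="stanza">
      <l>First line of a stanza,</l>
      <l>second line of the same stanza.</l>
    </lg>
  </div>
  <div type="drama">
    <stage>Enter two speakers.</stage>
    <sp>
      <speaker>First Speaker</speaker>
      <l>A line of verse spoken on stage.</l>
    </sp>
    <sp>
      <speaker>Second Speaker</speaker>
      <p>A reply in prose.</p>
    </sp>
    <stage>Exeunt.</stage>
  </div>
</body>
```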
Phrase level elements include...
-
typographically highlighted phrases
(emphasis, technical terms, foreign language matter,
titles, quoted matter, linguistically distinct matter,
etc.)
-
data-like (names, numbers, dates, times,
addresses)
- notes and cross references
-
editorial intervention (corrections,
regularizations, additions, omissions etc.)
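A sketch of several phrase level elements in one paragraph, assuming the P5 element names. The title and date of Lyrical Ballads are real; the remaining sentences, including the supposed printer's error, are invented for illustration.

```xml
<p>In <date when="1798">1798</date> the poem appeared in
  <title level="m">Lyrical Ballads</title>, with its title spelled
  <choice><orig>Ancyent Marinere</orig><reg>Ancient Mariner</reg></choice>.
  A reader might call it a <term>ballad</term>, but it is
  <emph>not</emph> a short one. The author is
  <name type="person">Coleridge</name>, and its opening words are
  <q>It is an ancient Mariner</q>.
  An invented printer's error would be recorded as
  <choice><sic>albatros</sic><corr>albatross</corr></choice>,
  and an unreadable word as <gap reason="illegible"/>.</p>
```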
Boundary Points
Texts are not always neatly hierarchic:
- page, line, and column breaks (<pb>, <lb>, and
<cb>)
- these require left-to-right processing and may not fit well
into the hierarchical model of XML and XML software
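A sketch of boundary points marked with empty elements inside a paragraph. The numbers and the wording are invented.

```xml
<p>A sentence that begins on one page
  <pb n="35"/>
  and carries on after the page break, runs past a line
  <lb n="2"/>
  break, and ends in the second column
  <cb n="b"/>
  of the source.</p>
```

Because the empty elements mark points rather than enclose content, the paragraph remains a single element even though it crosses three physical boundaries.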
Notes and Cross References
-
Notes of any kind: use <note>
- in-line or out of line (use the place
attribute to specify)
<lg>
<l>The self-same moment I could pray </l>
<l>And from my neck so free </l>
<l>The albatross fell off, and sank </l>
<l>Like lead into the sea.
<note type="auth" place="margin">
The spell begins to break.</note></l>
</lg>
- cross references: <ptr> and <ref>
See especially <ref target="#SEC12">
section 12 on page 34</ref>.
See especially <ptr target="#SEC12"/>.
- target is most conveniently a reference to an identifier
(an xml:id value preceded by #)
... see especially <ptr target="#SEC12"/>.
... <div1 xml:id="SEC12">
<head>Concerning Identifiers</head> ...
- Together, these provide simple hypertext
capability.
XML Pointers
With P5, the TEI adopts XML pointers in place of the TEI Extended Pointers used in previous versions.
The development of the XML pointer languages, including
XPath, XPointer and XLink, was strongly influenced by
the TEI Extended Pointers.
XPath, XLink and XPointer are now W3C Recommendations. Together they are much more general and
powerful than the TEI Extended Pointers, so the TEI Council decided to adopt them for P5.
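In practice most targets are plain URI references. A sketch, in which the file name other-document.xml is hypothetical:

```xml
<p>
  <!-- within the same document: a bare fragment identifier -->
  See <ptr target="#SEC12"/>.
  <!-- into another document: a relative URI plus a fragment -->
  See also <ptr target="other-document.xml#SEC12"/>.
  <!-- a whole resource, addressed by its URL -->
  See further the <ref target="http://www.tei-c.org.uk/Lite/">introduction
  to TEI Lite</ref>.
</p>
```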
Lists
- for lists of any kind (use type
attribute to distinguish)
- use <label> for two-column lists or as an
alternative to the n attribute
- may be nested as necessary
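A sketch of a numbered list containing a nested two-column list. The type values ordered and gloss are conventional suggestions rather than requirements, and the wording is invented.

```xml
<list type="ordered">
  <head>Steps in encoding a text</head>
  <item n="1">Choose a source edition.</item>
  <item n="2">Transcribe it, recording:
    <list type="gloss">
      <label>pb</label>
      <item>each page break</item>
      <label>lb</label>
      <item>each line break</item>
    </list>
  </item>
  <item n="3">Write the header.</item>
</list>
```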
Bibliographic References
Use simple <bibl> with subcomponents:
-
<respStmt> (for any kind of
responsibility)
- or <author>, <editor>, etc.
-
<title> with optional level
attribute
-
<imprint> groups publication details
-
<biblScope> adds page references etc.
The full Guidelines have the more detailed structured
elements <biblStruct> and <biblFull>.
Use <listBibl> (a specialized <list> of bibliographic elements) for a list of references.
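A sketch of a simple reference list. The work cited, the names and the press are all invented. The publication details are placed directly inside <bibl>, which P5 permits; whether <imprint> may also wrap them within a plain <bibl> depends on the schema in use, so it is left out here.

```xml
<listBibl>
  <bibl>
    <author>Author, A. N.</author>
    <title level="m">An Invented Monograph</title>.
    <respStmt>
      <resp>translated by</resp>
      <name>T. R. Anslator</name>
    </respStmt>.
    <pubPlace>Oxford</pubPlace>:
    <publisher>Example Press</publisher>,
    <date>1999</date>.
    <biblScope>pp. 12-34</biblScope>.
  </bibl>
</listBibl>
```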
The TEI Header
- mandatory
- independently interchangeable
- support for librarians
- support for corpus builders
The File Description contains...
- five ISBD (International Standard Bibliographic
Description) areas:
- titleStmt
- editionStmt
- extent
- publicationStmt
- seriesStmt
- sourceDesc
- notesStmt
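A sketch of a file description using all seven components, assuming the P5 content model, in which notesStmt comes before sourceDesc. Every name, title and figure is an invented placeholder.

```xml
<fileDesc>
  <titleStmt>
    <title>An Invented Text: a machine-readable transcription</title>
    <respStmt>
      <resp>encoded by</resp>
      <name>A. N. Encoder</name>
    </respStmt>
  </titleStmt>
  <editionStmt>
    <edition>First electronic edition</edition>
  </editionStmt>
  <extent>about 12 kilobytes</extent>
  <publicationStmt>
    <publisher>Example Text Archive</publisher>
    <availability>
      <p>Freely available for non-commercial use.</p>
    </availability>
  </publicationStmt>
  <seriesStmt>
    <title>Invented Texts Series</title>
  </seriesStmt>
  <notesStmt>
    <note>Transcribed from a single copy.</note>
  </notesStmt>
  <sourceDesc>
    <bibl>An Invented Text. Example Press, 1999.</bibl>
  </sourceDesc>
</fileDesc>
```

Only titleStmt, publicationStmt and sourceDesc are required; the rest may be omitted.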
The Encoding Description contains...
- projectDesc
- samplingDecl
- editorialDecl
- tagsDecl
- refsDecl
- classDecl
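A sketch of an encoding description with one entry for each component, assuming the P5 content models (in particular, that <tagUsage> sits inside <namespace>). The identifiers, counts and policy statements are invented.

```xml
<encodingDesc>
  <projectDesc>
    <p>Encoded as a teaching example.</p>
  </projectDesc>
  <samplingDecl>
    <p>The complete text is included; nothing was sampled.</p>
  </samplingDecl>
  <editorialDecl>
    <correction>
      <p>Obvious printing errors are corrected and marked with corr.</p>
    </correction>
    <normalization>
      <p>Original spelling is retained.</p>
    </normalization>
  </editorialDecl>
  <tagsDecl>
    <namespace name="http://www.tei-c.org/ns/1.0">
      <tagUsage gi="pb" occurs="12">Marks each page break.</tagUsage>
    </namespace>
  </tagsDecl>
  <refsDecl>
    <p>References are built from the n values of div and l elements.</p>
  </refsDecl>
  <classDecl>
    <taxonomy xml:id="genres">
      <category xml:id="genre.verse">
        <catDesc>Verse</catDesc>
      </category>
      <category xml:id="genre.prose">
        <catDesc>Prose</catDesc>
      </category>
    </taxonomy>
  </classDecl>
</encodingDesc>
```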
The Profile Description contains...
miscellaneous additional information such as
-
<creation>
-
<langUsage>
-
<textClass>
plus, when the corpus tagset is chosen,
- textDesc
- particDesc
- settingDesc
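A sketch of a profile description using the three general-purpose elements, assuming P5. The date, languages and keywords are placeholders, and some schema versions may also require a scheme attribute on <keywords>.

```xml
<profileDesc>
  <creation>
    <date when="1798">1798</date>
  </creation>
  <langUsage>
    <language ident="en">English</language>
    <language ident="la">Latin, in a few quotations</language>
  </langUsage>
  <textClass>
    <keywords>
      <term>verse</term>
      <term>ballad</term>
    </keywords>
  </textClass>
</profileDesc>
```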