The following sections detail the file formats that have been defined for the extension of the Kanseki Repository. Although they carry very different information for different purposes, they have been combined into a single schema, addressed under a single namespace, to make it easier to describe the files and process the information. The schema allows the following entry points: manifest, manifests, nexusList and tList.
The manifest definitions can be grouped together into a list of manifests, which gives the schema two entry points for manifests (manifest and manifests). The other two schemas define lists of nexus and t (token) elements respectively; in those cases the lists themselves (nexusList and tList) are the entry points.
The Manifest.xml described here contains information about a set of editions that are grouped together, usually for the purpose of further description and processing.
There are two main elements under the root element manifest:
The editions element holds information about the editions that are collected here. It contains edition elements, which give the details for each edition. These details include the type, which can be either "documentary" or "interpretative". Documentary editions strive to reproduce an existing print edition, while interpretative editions reflect the views of the editor and do not follow one single edition.
Each edition also carries an id, a unique label (or identifier) used to refer to this specific edition within the manifest and the processing systems.
The edition element may have the following children:
Both of these elements are optional. description contains a description of the edition, which could be the title, but also other information deemed relevant. divisions allows reference to divisions within the edition. This element is repeatable; when it occurs more than once, the edition is considered to be made up of the sequence of these divisions.
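The manifest structure described above can be illustrated with a small sketch. The XML below is a hypothetical instance: the ids, locations, titles and descriptions are invented, and the schema's namespace is left out because this section does not state it. The Python code only reads the file with the standard library; it does not validate against the schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical manifest; ids, locations and titles are invented for illustration.
# The schema's namespace is omitted because the text does not state it.
MANIFEST = """
<manifest>
  <title>Example work</title>
  <description>Two editions grouped for comparison.</description>
  <editions>
    <edition id="ed-A" format="xml/TEI" location="texts/ed-A" type="documentary" role="base">
      <description>Reproduces one print edition.</description>
      <divisions>
        <div label="1" sequence="1"/>
        <div label="2" sequence="2"/>
      </divisions>
    </edition>
    <edition id="ed-B" format="txt/mandoku" location="texts/ed-B" type="interpretative">
      <description>Text established by a modern editor.</description>
    </edition>
  </editions>
</manifest>
""".strip()


def list_editions(xml_text):
    """Return (id, type, number of div elements) for each edition in the manifest."""
    root = ET.fromstring(xml_text)
    return [
        (ed.get("id"), ed.get("type"), len(ed.findall("divisions/div")))
        for ed in root.findall("editions/edition")
    ]


for row in list_editions(MANIFEST):
    print(row)
```

Each edition carries its own description here, which satisfies both the prose above and the schema declaration for edition.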
The nexus files described here describe links between locations in texts. The links consist of references to a span of one or more consecutive characters in a text. Related links can be grouped together to form a nexus. This can be used for example to describe corresponding passages in different versions of a text.
The main elements under the root element nexusList are:
The nexus element holds the locationRef elements, which contain the reference information to locate the passage of the text. The reference is expressed by pointing to a sequence of one or more tokens in a token file for the edition.
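As a rough sketch of how such references might be resolved, the example below reads a hypothetical nexus file and looks up each locationRef in a stand-in token list. It rests on assumptions that this section does not confirm: that tp is the zero-based position of the first token, that tcount is the number of consecutive tokens, and that a missing tcount means one token. The edition names, file names and token contents are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical nexus file; attribute values are invented for illustration.
NEXUS_LIST = """
<nexusList ed="ed-A">
  <nexus tp="3" tcount="2">
    <locationRef ed="ed-A" tp="3" tcount="2" target="ed-A.tokens.xml">DE</locationRef>
    <locationRef ed="ed-B" tp="1" tcount="2" target="ed-B.tokens.xml">de</locationRef>
  </nexus>
</nexusList>
""".strip()

# Stand-in for the token files: one list of token strings per edition.
TOKENS = {"ed-A": list("ABCDEF"), "ed-B": list("cdefgh")}


def resolve(location_ref, tokens):
    """Return the text a locationRef points to (assumes zero-based tp)."""
    start = int(location_ref.get("tp"))
    count = int(location_ref.get("tcount", "1"))  # assumed default: one token
    return "".join(tokens[location_ref.get("ed")][start:start + count])


def resolve_nexus(xml_text, tokens):
    """Return, for each nexus, the list of passages its locationRefs resolve to."""
    root = ET.fromstring(xml_text)
    return [
        [resolve(ref, tokens) for ref in nexus.findall("locationRef")]
        for nexus in root.findall("nexus")
    ]


print(resolve_nexus(NEXUS_LIST, TOKENS))
```

Because a locationRef may optionally hold a copy of the referenced text, the resolved string can be compared with the element's own text as a consistency check.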
The token files described here serve as a shadow of other digital files that describe the texts documented there more thoroughly. This relieves the token files of the burden of describing the physical appearance, structure and transmission of the text. This information is available at any time by following the links back to these other files. The purpose of the token files is to provide a minimal description, containing only the characters of the text in a form that allows easy comparison and alignment of multiple versions. The function is similar to that of a concordance in that it provides access to the whole text, but without much of what a reader would expect to make reading (or editing) convenient, or even feasible. On the other hand, enough information should be retained to reconstruct a very basic version of the text.
The main elements under the root element <tList> are:
The tg element holds the t elements, which have the character content of the text, one token per t. The purpose of the tg element is to group related t elements. tg elements can nest and thus provide a rudimentary structure in the token files.
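The idea that a very basic version of the text can be reconstructed from a token file might look like the sketch below. The token file is hypothetical: the values of ed, n and tp are invented, the namespace is omitted, and the role value "p" is simply one of the values the schema allows, whose meaning is not explained in this section.

```python
import xml.etree.ElementTree as ET

# Hypothetical token file with nested tg elements; attribute values are invented.
TOKEN_FILE = """
<tList ed="ed-A">
  <tg n="1">
    <pb ed="ed-A" n="1a"/>
    <tg n="1.1">
      <t role="p" tp="0" n="t0">道</t>
      <t role="p" tp="1" n="t1">可</t>
      <t role="p" tp="2" n="t2">道</t>
      <lb ed="ed-A" n="2"/>
    </tg>
    <tg n="1.2">
      <t role="p" tp="3" n="t3">非</t>
      <t role="p" tp="4" n="t4">常</t>
      <t role="p" tp="5" n="t5">道</t>
    </tg>
  </tg>
</tList>
""".strip()


def reconstruct(xml_text):
    """Concatenate the character content of every t element in document order."""
    root = ET.fromstring(xml_text)
    return "".join(t.text or "" for t in root.iter("t"))


def group_texts(xml_text):
    """Return the text of each innermost tg (a tg with no tg children), keyed by n."""
    root = ET.fromstring(xml_text)
    return {
        tg.get("n"): "".join(t.text or "" for t in tg.findall("t"))
        for tg in root.iter("tg")
        if tg.find("tg") is None
    }


print(reconstruct(TOKEN_FILE))
print(group_texts(TOKEN_FILE))
```

The pb and lb elements are skipped during reconstruction, so only the characters of the text remain.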
<creation> Information about the creation | |
Module | KRXManifest |
Contained by | KRXManifest: description edition editionGroup |
May contain | KRXManifest: date resp title |
Content model | <content> |
Schema Declaration | element creation { ( krx_date | krx_resp | krx_title )* } |
<date> Date of the work | |
Module | KRXManifest |
Attributes | notbefore (string, optional); notafter (string, optional); cert ("high", "middle" or "low", optional) |
Contained by | KRXManifest: creation |
May contain | Character data only |
Content model | <content> |
Schema Declaration | element date { attribute notbefore { string }?, attribute notafter { string }?, attribute cert { "high" | "middle" | "low" }?, text } |
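A minimal sketch of reading a date element follows. The attribute values and the text content are invented, and since notbefore and notafter are declared as plain strings, no particular date format is assumed; the values are returned as they are written.

```python
import xml.etree.ElementTree as ET

# Hypothetical date element; the values are invented for illustration.
DATE = '<date notbefore="1600" notafter="1650" cert="middle">first half of the 17th century</date>'


def read_date(xml_text):
    """Return the date's text and its optional attributes (None when absent)."""
    el = ET.fromstring(xml_text)
    return {
        "text": (el.text or "").strip(),
        "notbefore": el.get("notbefore"),
        "notafter": el.get("notafter"),
        "cert": el.get("cert"),
    }


print(read_date(DATE))
```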
<description> Description of the edition or item this element is attached to. | |
Module | KRXManifest |
Contained by | KRXManifest: div edition manifest |
May contain | Character data; KRXManifest: creation note title |
Content model | <content> |
Schema Declaration | element description { ( text | krx_note? | krx_title? | krx_creation? )* } |
<div> One specific subdivision on any level. | |
Module | KRXManifest |
Attributes | label (token, optional); edition (xsd:IDREF, optional); sequence (xsd:nonNegativeInteger, optional); start (xsd:nonNegativeInteger, optional); end (xsd:nonNegativeInteger, optional); divid (token, optional) |
Contained by | KRXManifest: div divisions |
May contain | KRXManifest: description div edRef label |
Content model | <content> |
Schema Declaration | element div { attribute label { token }?, attribute edition { xsd:IDREF }?, attribute sequence { xsd:nonNegativeInteger }?, attribute start { xsd:nonNegativeInteger }?, attribute end { xsd:nonNegativeInteger }?, attribute divid { token }?, ( krx_label*, krx_description?, krx_edRef*, krx_div* ) } |
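Because div elements nest, a recursive walk is one way to turn them into an outline. The sketch below uses a hypothetical divisions fragment: the labels, the start and end values (whose unit is not stated in this section) and the edRef keys are invented, and the keys are not checked against any edition ids.

```python
import xml.etree.ElementTree as ET

# Hypothetical divisions fragment; labels, offsets and keys are invented.
DIVISIONS = """
<divisions>
  <div label="Part-1" sequence="1">
    <div label="Chapter-1" sequence="1" start="0" end="120">
      <edRef key="ed-A" start="0" end="120"/>
      <edRef key="ed-B" start="0" end="118"/>
    </div>
    <div label="Chapter-2" sequence="2" start="121" end="300"/>
  </div>
</divisions>
""".strip()


def outline(div, depth=0):
    """Return (depth, label, edRef keys) for a div and all of its nested divs."""
    rows = [(depth, div.get("label"), [ref.get("key") for ref in div.findall("edRef")])]
    for child in div.findall("div"):
        rows.extend(outline(child, depth + 1))
    return rows


def outline_divisions(xml_text):
    """Walk every top-level div of a divisions element."""
    root = ET.fromstring(xml_text)
    rows = []
    for div in root.findall("div"):
        rows.extend(outline(div))
    return rows


for depth, label, keys in outline_divisions(DIVISIONS):
    print("  " * depth + label, keys)
```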
<divisions> The internal subdivisions of the work under consideration. | |
Module | KRXManifest |
Attributes | edition (token, optional) |
Contained by | KRXManifest: edition manifest |
May contain | KRXManifest: div |
Content model | <content> |
Schema Declaration | element divisions { attribute edition { token }?, krx_div+ } |
<edition> One edition of the work. If there are multiple divisions elements, the work is made up of the sequence of these divisions. | |
Module | KRXManifest |
Attributes | xml:id (xsd:ID, optional); id (xsd:ID, required); format ("xml/TEI" or "txt/mandoku", required); location (string, required); base ("true" or "false", optional); type ("documentary" or "interpretative", required); role ("base" or "reference", optional); language (xsd:language, optional); sigle (string, optional) |
Contained by | KRXManifest: editionGroup editions |
May contain | KRXManifest: creation description divisions title tokenmap |
Content model | <content> |
Schema Declaration | element edition { attribute xml:id { xsd:ID }?, attribute id { xsd:ID }, attribute format { "xml/TEI" | "txt/mandoku" }, attribute location { string }, attribute base { "true" | "false" }?, attribute type { "documentary" | "interpretative" }, attribute role { "base" | "reference" }?, attribute language { xsd:language }?, attribute sigle { string }?, ( krx_title?, krx_creation?, krx_description, krx_tokenmap?, krx_divisions* ) } |
<editionGroup> A group of the editions representing the work under consideration. | |
Module | KRXManifest |
Attributes | type ("root", "root+annotation", "annotation", "translation" or "other", required); sigle (string, optional) |
Contained by | KRXManifest: editions |
May contain | KRXManifest: creation edition title |
Content model | <content> |
Schema Declaration | element editionGroup { attribute type { "root" | "root+annotation" | "annotation" | "translation" | "other" }, attribute sigle { string }?, ( krx_title?, krx_creation?, krx_edition+ ) } |
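As an illustration of the grouping, the sketch below collects edition ids by the type of their editionGroup. The editions fragment is hypothetical (ids, locations, sigle and descriptions are invented), and the namespace is omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical editions element that uses editionGroup children; values are invented.
EDITIONS = """
<editions>
  <editionGroup type="root" sigle="R">
    <edition id="ed-A" format="xml/TEI" location="texts/ed-A" type="documentary">
      <description>Example edition A</description>
    </edition>
    <edition id="ed-B" format="txt/mandoku" location="texts/ed-B" type="documentary">
      <description>Example edition B</description>
    </edition>
  </editionGroup>
  <editionGroup type="translation">
    <edition id="ed-C" format="xml/TEI" location="texts/ed-C" type="interpretative" language="en">
      <description>Example translation</description>
    </edition>
  </editionGroup>
</editions>
""".strip()


def editions_by_group_type(xml_text):
    """Map each editionGroup type to the ids of the editions it contains."""
    root = ET.fromstring(xml_text)
    groups = {}
    for group in root.findall("editionGroup"):
        ids = [ed.get("id") for ed in group.findall("edition")]
        groups.setdefault(group.get("type"), []).extend(ids)
    return groups


print(editions_by_group_type(EDITIONS))
```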
<editions> The editions representing the work under consideration. Work is taken in a very broad sense here. | |
Module | KRXManifest |
Contained by | KRXManifest: manifest |
May contain | KRXManifest: edition editionGroup |
Content model | <content> |
Schema Declaration | element editions { krx_editionGroup+ | krx_edition+ } |
<edRef> Reference to this subdivision in one specific edition, identified by the key attribute. | |
Module | KRXManifest |
Attributes | start (xsd:nonNegativeInteger, optional); end (xsd:nonNegativeInteger, optional); key (xsd:IDREF, optional); timestamp (xsd:dateTime, optional); label (token, optional) |
Contained by | KRXManifest: div |
May contain | Empty element |
Content model | <content> |
Schema Declaration | element edRef { attribute start { xsd:nonNegativeInteger }?, attribute end { xsd:nonNegativeInteger }?, attribute key { xsd:IDREF }?, attribute timestamp { xsd:dateTime }?, attribute label { token }?, empty } |
<label> Additional label | |
Module | KRXManifest |
Attributes | language (xsd:language, optional) |
Contained by | KRXManifest: div |
May contain | Character data only |
Content model | <content> |
Schema Declaration | element label { attribute language { xsd:language }?, text } |
<lb> This element marks the beginning of a new line or line-like section on the text-bearing surface. | |
Module | derived-module-KRX |
Attributes | ed (string, optional); n (string, optional); xml:id (ID, optional) |
Contained by | KRXToken: tg |
May contain | Empty element |
Content model | <content> |
Schema Declaration | element lb { attribute ed { string }?, attribute n { string }?, attribute xml:id { ID }?, empty } |
<locationRef> Reference to a location in the token file. It may optionally hold a copy of the referenced text as a string of characters. | |
Module | KRXNexus |
Attributes | ed (string, required); tp (xsd:nonNegativeInteger, required); tcount (xsd:nonNegativeInteger, optional); target (string, required); n (string, optional) |
Contained by | KRXNexus: nexus |
May contain | Character data only |
Content model | <content> |
Schema Declaration | element locationRef { attribute ed { string }, attribute tp { xsd:nonNegativeInteger }, attribute tcount { xsd:nonNegativeInteger }?, attribute target { string }, attribute n { string }?, text } |
<manifest> The root of the manifest. One manifest describes one work. | |
Module | KRXManifest |
Attributes | xml:id (xsd:ID, optional) |
Contained by | KRXManifest: manifests |
May contain | KRXManifest: description divisions editions title |
Note | Currently, only one work can be described per manifest file. Use cases that need multiple works still have to be considered; one option might be to use several manifest elements in a file. |
Content model | <content> |
Schema Declaration | element manifest { attribute xml:id { xsd:ID }?, ( krx_title?, krx_description, krx_editions, krx_divisions? ) } |
<manifests> Root for manifests that contain multiple manifest elements. | |
Module | KRXManifest |
Contained by | — |
May contain | KRXManifest: manifest |
Content model | <content> |
Schema Declaration | element manifests { krx_manifest+ } |
<map> Map of one textual feature to a specific token type | |
Module | derived-module-KRX |
Attributes | src (string, optional); tok ("h", "p", "n", "q" or "v", optional) |
Contained by | KRXManifest: tokenmap |
May contain | Empty element |
Content model | <content> |
Schema Declaration | element map { attribute src { string }?, attribute tok { "h" | "p" | "n" | "q" | "v" }?, empty } |
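One plausible way to consume map elements is to collect them into a lookup table from source feature to token type. This is only a sketch: the tokenmap element is referenced here but not documented in this section, so its shape is assumed to be a plain wrapper around map elements, and the src values ("head", "p", "note") are invented feature names.

```python
import xml.etree.ElementTree as ET

# Hypothetical tokenmap; the wrapper's structure and the src values are assumptions.
TOKENMAP = """
<tokenmap>
  <map src="head" tok="h"/>
  <map src="p" tok="p"/>
  <map src="note" tok="n"/>
</tokenmap>
""".strip()


def feature_to_token_type(xml_text):
    """Build a dict from each map's src value to its tok value."""
    root = ET.fromstring(xml_text)
    return {m.get("src"): m.get("tok") for m in root.findall("map")}


print(feature_to_token_type(TOKENMAP))
```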
<nexus> A group of locationRef elements. | |
Module | KRXNexus |
Attributes | xml:id (xsd:ID, optional); tp (xsd:nonNegativeInteger, required); tcount (xsd:nonNegativeInteger, optional) |
Contained by | KRXNexus: nexusList |
May contain | KRXManifest: note KRXNexus: locationRef |
Content model | <content> |
Schema Declaration | element nexus { attribute xml:id { xsd:ID }?, attribute tp { xsd:nonNegativeInteger }, attribute tcount { xsd:nonNegativeInteger }?, ( krx_note*, krx_locationRef* ) } |
<nexusList> Root of a nexus file; contains one or more nexus elements. | |
Module | KRXNexus |
Attributes | xml:id (ID, optional); ed (string, required); n (string, optional) |
Contained by | — |
May contain | KRXManifest: note KRXNexus: nexus |
Content model | <content> |
Schema Declaration | element nexusList { attribute xml:id { ID }?, attribute ed { string }, attribute n { string }?, ( krx_note?, krx_nexus+ ) } |
<note> An additional note | |
Module | KRXManifest |
Contained by | KRXManifest: description KRXNexus: nexus nexusList |
May contain | Character data only |
Content model | <content> |
Schema Declaration | element note { text } |
<pb> This element marks the beginning of a new page or page-like section on the text-bearing surface. | |
Module | derived-module-KRX |
Attributes | ed (string, optional); n (string, optional); xml:id (ID, optional) |
Contained by | KRXToken: tg |
May contain | Empty element |
Content model | <content> |
Schema Declaration | element pb { attribute ed { string }?, attribute n { string }?, attribute xml:id { ID }?, empty } |
<resp> Person responsible for some aspect of the work | |
Module | KRXManifest |
Attributes | role (string, optional); key (string, optional) |
Contained by | KRXManifest: creation |
May contain | Character data only |
Content model | <content> |
Schema Declaration | element resp { attribute role { string }?, attribute key { string }?, text } |
<t> A token. | |
Module | KRXToken |
Attributes | role ("h", "p", "s", "n", "q", "v" or "o", required); pos (xsd:nonNegativeInteger, optional); tp (xsd:nonNegativeInteger, required); f (string, optional); p (string, optional); n (string, required); cp (xsd:nonNegativeInteger, optional); position (string, optional); kundokuten (string, optional); ruby (string, optional) |
Contained by | KRXToken: tg |
May contain | Character data only |
Content model | <content> |
Schema Declaration | element t { attribute role { "h" | "p" | "s" | "n" | "q" | "v" | "o" }, attribute pos { xsd:nonNegativeInteger }?, attribute tp { xsd:nonNegativeInteger }, attribute f { string }?, attribute p { string }?, attribute n { string }, attribute cp { xsd:nonNegativeInteger }?, attribute position { string }?, attribute kundokuten { string }?, attribute ruby { string }?, text } |
<tg> A group of tokens. | |
Module | KRXToken |
Attributes | xml:id (xsd:ID, optional); n (string, optional); role ("h", "p", "s", "n", "q", "v" or "o", optional); position (string, optional); kundokuten (string, optional); ruby (string, optional) |
Contained by | KRXToken: tg tList |
May contain | derived-module-KRX: lb pb KRXToken: t tg |
Content model | <content> |
Schema Declaration | element tg { attribute xml:id { xsd:ID }?, attribute n { string }?, attribute role { "h" | "p" | "s" | "n" | "q" | "v" | "o" }?, attribute position { string }?, attribute kundokuten { string }?, attribute ruby { string }?, ( krx_tg* | krx_t* | krx_pb* | krx_lb* )* } |
<title> Title of the work. | |
Module | KRXManifest |
Contained by | KRXManifest: creation description edition editionGroup manifest |
May contain | Character data only |
Content model | <content> |
Schema Declaration | element title { text } |
<tList> Root of a token file; contains one or more tg elements. | |
Module | KRXToken |
Attributes | xml:id (ID, optional); ed (string, required); n (string, optional); fileseq (xsd:nonNegativeInteger, optional) |
Contained by | — |
May contain | KRXToken: tg |
Content model | <content> |
Schema Declaration | element tList { attribute xml:id { ID }?, attribute ed { string }, attribute n { string }?, attribute fileseq { xsd:nonNegativeInteger }?, krx_tg+ } |