|
From texts to Textbases to Knowledgebases
- TEI provides all the necessary elements for marking the
content of texts for later reuse
- Large textbases can be built using these features
- To build these textbases into Knowledgebases, a further
layer of information is needed
- This layer abstracts and condenses information items from
the textbases and makes them accessible
- One technology to build such information spaces is called
topic maps
- There exist other technologies, but topic maps are the
most powerful and adaptable, so I will concentrate on
them
Topic maps are ...
- some kind of metadata
- a way to make information about information
exchangeable
- ‘the global positioning system (GPS) of information’
(Charles Goldfarb)
- defined in ISO 13250:2000
- a member of the SGML family of standards
Topic maps have been ...
- developed for almost ten years
- accepted as an international standard in early 2000
- perceived as a tool to navigate huge collections of
information
- designed to ‘enable multiple, concurrent views of sets of
information objects’
Development of topic maps
- started out with digitization of indices, glossaries and
authority files
- consistently generalized and abstracted the underlying
concepts
- to enable set operations like merging and splitting of
topic maps, ‘published subject identifiers’
(PSI) have been introduced
- among other things, topic maps can also be used to assemble
virtual documents
- implementation of tools operating on topic maps is still in
a very early stage
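The merging operation mentioned above can be sketched roughly as follows. This is a minimal illustration, not the procedure defined in the standard: the `Topic` class, the `merge_topic_maps` function and all example.org URIs are hypothetical names invented for this sketch, and the only rule applied is "topics sharing a published subject identifier describe the same subject".

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Simplified topic (hypothetical structure, not the ISO 13250 model)."""
    names: set = field(default_factory=set)
    indicators: set = field(default_factory=set)   # PSI URIs identifying the subject
    occurrences: set = field(default_factory=set)  # addresses of relevant resources


def merge_topic_maps(*topic_maps):
    """Merge topics from several maps when they share at least one PSI."""
    merged = []
    for topic in (t for tm in topic_maps for t in tm):
        combined = Topic(set(topic.names), set(topic.indicators), set(topic.occurrences))
        remaining = []
        for other in merged:
            if combined.indicators & other.indicators:
                # Same subject: fold the earlier topic into the new one.
                combined.names |= other.names
                combined.indicators |= other.indicators
                combined.occurrences |= other.occurrences
            else:
                remaining.append(other)
        merged = remaining + [combined]
    return merged


# Two small maps built independently (all values are invented examples).
map_a = [Topic({"Goethe"}, {"http://example.org/psi/goethe"}, {"faust.xml#p1"})]
map_b = [
    Topic({"Johann Wolfgang von Goethe"}, {"http://example.org/psi/goethe"}, {"letters.xml#l7"}),
    Topic({"Weimar"}, {"http://example.org/psi/weimar"}),
]
result = merge_topic_maps(map_a, map_b)
```

Here `result` holds two topics: one for Weimar, and one for Goethe that carries both names and both occurrences, because the two source topics share the same identifier.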
Topic maps and metadata
There are important differences between topic maps and metadata
in the usual sense of the term:
- Metadata is information about a specific document
- A topic map is information about the information in a
document
- Metadata is encoded in a variety of formats, e.g. US-MARC,
Dublin Core, ...
- Topic maps are encoded as SGML (WebSGML) documents
Topic maps and SGML/XML
- In the ISO standard, topic maps are expressed as SGML and
use HyTime (ISO 10744:1997) constructs to address into
documents
- After the acceptance of ISO 13250, an informal working group of
TM vendors started working on an XML version of topic maps
- The XML version, called XTM (XML Topic Maps), was
published on Dec. 5, 2000 (see http://www.topicmaps.org)
- This version does not simply recast ISO 13250, but
introduces some changes in the underlying topic map model and
syntax and uses XPointer to address into documents
- XTM adds ‘published subject indicators’
(PSI) to enable public sharing of topic maps
- In this presentation, I will talk about topic maps as
defined in ISO 13250
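For orientation only, the following sketch shows what a single topic might look like in the XTM syntax and how it could be read with the Python standard library. The element names follow my understanding of XTM 1.0. The topic itself, the example.org indicator, the file name `faust.xml` and the helper `read_topics` are assumptions made up for this illustration.

```python
import xml.etree.ElementTree as ET

XTM = "http://www.topicmaps.org/xtm/1.0/"
XLINK = "http://www.w3.org/1999/xlink"

# Invented sample: one topic with a subject indicator, a name and an
# occurrence that uses an XPointer to address into a document.
SAMPLE = f"""<topicMap xmlns="{XTM}" xmlns:xlink="{XLINK}">
  <topic id="goethe">
    <subjectIdentity>
      <subjectIndicatorRef xlink:href="http://example.org/psi/goethe"/>
    </subjectIdentity>
    <baseName><baseNameString>Goethe</baseNameString></baseName>
    <occurrence>
      <resourceRef xlink:href="faust.xml#xpointer(id('p1'))"/>
    </occurrence>
  </topic>
</topicMap>"""


def read_topics(xtm_text):
    """Return {topic id: names, indicators, occurrences} for an XTM string."""
    root = ET.fromstring(xtm_text)
    ns = {"xtm": XTM}
    href = f"{{{XLINK}}}href"  # qualified name of the xlink:href attribute
    topics = {}
    for topic in root.findall("xtm:topic", ns):
        topics[topic.get("id")] = {
            "names": [n.text for n in topic.findall("xtm:baseName/xtm:baseNameString", ns)],
            "indicators": [r.get(href) for r in
                           topic.findall("xtm:subjectIdentity/xtm:subjectIndicatorRef", ns)],
            "occurrences": [r.get(href) for r in
                            topic.findall("xtm:occurrence/xtm:resourceRef", ns)],
        }
    return topics


topics = read_topics(SAMPLE)
```

The ISO 13250 syntax discussed in the rest of this presentation expresses the same ideas with SGML and HyTime addressing instead of XLink and XPointer.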
Topic maps and TEI
- Topic maps are an ideal complement to TEI markup: they
establish a general and exchangeable way to describe views of a
TEI-encoded document.
- TEI markup can be used to describe the features of a text,
including its structure and other noteworthy elements like
names, dates and the like.
- Topic maps can be used to link the features marked up using
TEI to resources inside and outside the text.
- It is thus possible to construct a very flexible and
powerful information architecture around the TEI-encoded
texts.
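A rough sketch of how the two layers might fit together: names marked up in a TEI text become topics, and every mention becomes an occurrence pointing back into the document. The TEI fragment, the `key` values, the document name `letters.xml` and the function `topics_from_tei` are all invented for illustration. A real topic map would be serialized in one of the syntaxes described earlier rather than kept as a Python dictionary.

```python
import xml.etree.ElementTree as ET

# Invented TEI-like fragment; <name> elements carry a type and a key.
TEI_FRAGMENT = """<text>
  <body>
    <p id="p1"><name type="person" key="goethe">Goethe</name> arrived in
      <name type="place" key="weimar">Weimar</name> in 1775.</p>
    <p id="p2">In <name type="place" key="weimar">Weimar</name> he later worked with
      <name type="person" key="schiller">Schiller</name>.</p>
  </body>
</text>"""


def topics_from_tei(tei_text, doc_uri):
    """Build one topic per distinct name key, with one occurrence per mention."""
    topics = {}
    for para in ET.fromstring(tei_text).iter("p"):
        for name in para.iter("name"):
            topic = topics.setdefault(
                name.get("key"),
                {"type": name.get("type"), "names": set(), "occurrences": []},
            )
            topic["names"].add("".join(name.itertext()).strip())
            # Address the paragraph that contains the mention.
            topic["occurrences"].append(f"{doc_uri}#{para.get('id')}")
    return topics


tei_topics = topics_from_tei(TEI_FRAGMENT, "letters.xml")
```

In this sketch the topic for Weimar ends up with two occurrences, one per paragraph, which is the kind of view across a text that a topic map is meant to provide.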
|