
Buddhist Databases and Input Projects

by Urs App


(This survey stems from the Electronic Bodhidharma No. 3 [July 93]. Some additional projects and updates are included in the Electronic Bodhidharma No. 4 [June 95]). No claim is made for comprehensive coverage, and especially the references to electronic resources on networks are very much in need of an update.


Many institutions and individuals all over the world are already engaged in inputting information related to Buddhism. Such information covers a wide range: ancient texts in a variety of languages, bibliographic information, dictionary information, lists of Buddhist masters, biographies, talks by living Buddhist teachers, books and articles by scholars, Buddhist maps, reproductions of works of art, and so on. The quality of this input varies, and much of the input data is not (or not yet) made available to the interested public. The following list reflects only what I presently know; but I am sure that all readers will be surprised to learn how much activity there is. I trust that you will return the favor by informing me of ongoing and planned projects not yet listed in this first survey. I must point out that many of the listed institutions are engaged in input of more texts than I could mention here; this list only deals with Buddhist databases.


The Thai Buddhist Canon Project

Institution: Mahidol University, Prof. T. Supachai
Address: Mahidol University Computing Center
Faculty of Science
Rama 6 Road
Bangkok 10400
Thailand
Tel:(662) 245-5410
Fax: (662) 246-7308

Content:

The entire Siam edition of the Pali canon (45 vols.; over 30 million characters), both in Thai and romanized Pali script.

70 volumes of commentary and textbook information

Buddhist Scripture Information Retrieval (BUDSIR) software (version IV)

History: This database was created to celebrate the sixtieth birthday of the King of Thailand. The King then provided funds to input seventy volumes of commentary. The text without commentary was first sold on hard disk with password protection but sold badly. The CD-ROM version will include some improved features, but the text will still be encrypted.

Data volume:

Canon: 30 MB text, 25 MB index, 7 MB dictionary.
Commentary: 53 MB text, 45 MB index, 16 MB dictionary.

Data input and correction: The data was input by two typists and then compared by machine. No overall systematic data correction by scholars reading the entire text has so far been done. Some data correction was performed with the help of an automatically generated vocabulary list. A Thai monk appointed for this task by the King made additional corrections; some of these are said to remove mistakes of the printed text. Because such corrections are not identified in the data set, one must regard this CD-ROM not simply as an electronic version of the printed text but as a new edition whose differences from the printed version are unknown.
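The double-keying procedure mentioned above (two independent typings compared by machine) can be illustrated with a short sketch. The function name and the sample strings are hypothetical; nothing is known here about the comparison program actually used at Mahidol University.

```python
from difflib import SequenceMatcher


def compare_typings(first: str, second: str) -> list[tuple[int, str, str]]:
    """Return (offset, text_in_first, text_in_second) for every disagreement.

    Offsets refer to the first typing. Wherever the two typists disagree,
    at least one of them has made an error, so each reported span has to be
    checked against the printed page.
    """
    matcher = SequenceMatcher(None, first, second, autojunk=False)
    disagreements = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            disagreements.append((i1, first[i1:i2], second[j1:j2]))
    return disagreements


if __name__ == "__main__":
    # Invented sample input: the second typist mistyped the last letter.
    typist_a = "evam me sutam ekam samayam bhagava"
    typist_b = "evam me sutam ekam samayam bhagavi"
    for offset, a, b in compare_typings(typist_a, typist_b):
        print(f"offset {offset}: {a!r} vs {b!r}")
```

A comparison of this kind cannot detect an error that both typists make in the same way, nor an error already present in the printed source.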

Environment: Will be delivered as a CD-ROM disc for use on PCs (IBM compatibles). The text data on the CD is encrypted; thus one cannot use alternative search software. The search and display software is basic; one is able to search for single terms, see them in context (at present only in full screen mode, no windows), and view some frequency statistics.
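The three functions named above (searching for a single term, showing it in context, and counting frequencies) can be sketched in a few lines. This is only an illustration under simple assumptions (plain romanized text, words separated by spaces); the function names are hypothetical and say nothing about how BUDSIR is implemented.

```python
from collections import Counter


def keyword_in_context(text: str, term: str, width: int = 20) -> list[str]:
    """Return each occurrence of term with up to `width` characters of context on either side."""
    if not term:
        return []
    lines = []
    start = text.find(term)
    while start != -1:
        end = start + len(term)
        left = text[max(0, start - width):start]
        right = text[end:end + width]
        lines.append(f"{left}[{term}]{right}")
        start = text.find(term, end)  # continue after this occurrence
    return lines


def word_frequencies(text: str) -> Counter:
    """Count whitespace-separated words, ignoring case."""
    return Counter(text.lower().split())
```

Because the text data on the CD is encrypted, tools of this kind cannot be applied to it directly.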

Distribution: Release planned for autumn of 1993. Distribution in Thailand by Mahidol University, in other countries by the American Academy of Religion (AAR; e-mail contact: SCHOLARS@EMORYU1.BITNET, or fax (404) 727-7959). Approx. price: US $ 500 for institutions, US $ 300 for individuals.

Prospects: The BUDSIR V software now in development will run on a proprietary Mahidol windows system which allows for two windows and appears to be incompatible with Microsoft Windows. Prof. Supachai appears to have no plans to accommodate users of other computers or software environments.


The Burmese Buddhist Canon (1)

Person in Charge: Mr. S.N. Goenka

Content: The data represents the devanagari edition of the Pali canon published in India.

History: The input of nearly all of the Burmese tipitaka plus commentaries and some sub-commentaries has been completed in India by Mr. S.N. Goenka. The input was done in order to publish a cheap devanagari edition. Mr. Goenka is willing to release the electronic data in romanized form.

Data input and correction: Not known.

Distribution: Some months ago, I heard that this database might be romanized and published through the AAR. I do not know how this and the following project are connected, if they are at all.


The Burmese Buddhist Canon (2)


Institution: Dhammachakka Meditation Center
Person in Charge: Ven. U. Silananda
Address: 68, Woodrow St.
Daly City, CA
94014 U.S.A.
Tel: (415) 994-8272
Fax: (415) 239-4245

Content: The Rangoon Council edition (1954-56) of the Burmese Buddhist Canon in Roman script. Having the Burmese script edition scanned in analogue form is under consideration.

Distribution: Ven. Silananda would like to make the database available without any copyright protection or restriction.


The Pali Text Society Database


Institution: Dhammakaya Foundation
Person in Charge: Ven. Mettanando
Address U.S.: Dhammakaya Foundation
11 Peabody Terrace No. 1301
Cambridge, Massachusetts
02138 U.S.A.
Fax: (617) 864-0096


Person in Charge: Ven. Dattajivo
Thai Address: Dhammakaya Foundation
23-2 Moo 7
Khlong Sam
Khlong Luang
Patumthani 12120
Thailand
Fax: (2) 561-1326

Content: The Dhammakaya Foundation in Bangkok has reportedly input almost all of the Pali materials and also the English translations of the Pali Text Society series. Data is in the process of being proofread, but no date for completion has been announced nor a form of distribution agreed upon.


The Tibetan Buddhist Text Database


Institution: Asian Classics Input Project
Person in Charge: M. Roach, R. Taylor
Address: ACIP Washington Area Office
Robert Taylor
11911 Marmary Road
Gaithersburg, Maryland
U.S.A. 20878-5569
Tel and Fax: (301) 948-5569

ACIP New York Area Office
Michael Roach
c/o The Princeton Club of New York
Box 57, 15 West 43rd Street
New York, N.Y.
U.S.A. 10036
Tel and Fax: (908) 364-1824

Linguistics Information Research Inc.
Hakumyo Niisaku
4-9-15 Koyama
Shinagawa-ku
Tokyo
Tel: (03) 3783-9428
Fax: (03) 3788-6180

Content: Though named "Asian Classics Input Project," this project focuses on Tibetan Buddhist texts and images. Its initial goal is the input of the 4,500 works of the Kangyur and Tengyur collections (Tibetan translations of Sanskrit Buddhist texts). Along with electronic texts, other research tools (dictionaries, bibliographies) are to be published in electronic form at the cost of duplication and mailing only. The various activities of this project and a list of input texts are well described in a recent publication: The Asian Classics Input Project: Release Three (ACIP, 1993). This brochure also contains much valuable information on the handling of Tibetan electronic text on different hardware and software platforms. The database now contains about 40 megabytes of data, some of it uncorrected. Over fifty Tibetan Buddhist texts, many lists and catalogues, and some textbooks have been corrected. Most of this material is available on floppy disks. In addition, some graphics files as well as various computer programs and tools are distributed for use on IBM compatibles and other platforms.

History: The project was begun with a grant from the Packard Humanities Institute and the David and Lucile Packard Foundation and has received the support of various organizations, most recently also of the United States Endowment for the Humanities. First release of data in 1990, second in 1991, and third in May 1993.

Data Input and Correction: Input of these texts is now for the most part taking place at the Sera Mey monastery in South India, but ten additional input centers (belonging to all four scholastic traditions of Tibet) are planned in Tibetan refugee communities. At present, input of the Madhyamika section of the Derge Tengyur is under way (using Tokyo University's edition). Uncorrected data is about 98% accurate; the last stages of the correction are handled by Tibetan specialists. Scholars can participate in the correction process.

Environment: The data are in standard ASCII format and can be delivered both for IBM and Apple machines. NEC users in Japan can also use the data but have much less choice in search and other utility software.

Distribution: The data are distributed for the cost of copying and mailing through the addresses listed above.


The Korean Buddhist Canon Project


Institution: Haein Monastery, Korea
Person in Charge: Rev. Chonglim
Address: Haein Son Monastery
Kaya-Myon, Chini 10
Hapchon-Gun
Kyongnam
Korea 678-860

Content: Input of the entire Chinese Buddhist canon stored on more than 80,000 wooden plates at Haein monastery in Korea is planned.

History: The initial input work was organized by Prof. Lancaster of Berkeley with the support of Mr. Park Wan-il, the former president of the Lay Buddhist Association of the Chogye Order of Korea. When the Haein monastery decided to take over the project, all input data were handed over. The continuation of input and correction activities is now being organized.

Data Input and Correction: The first two volumes were input in Shanghai in 1992 (Big-5 code using ETEN and additional self-made characters). The data were proofread twice but will have to be corrected several more times once revised input guidelines are worked out at Haeinsa. It is likely that the later stages of data correction will be handled by scholar-monks at Haeinsa.


The Beihai Database


Institution: Foguangshan Beihai Training Center
Person in Charge: Rev. Hui Chuan
Address: Shih-men hsiang nei
Shih-men Ling-shan lu 106
Taipei Prefecture, R.O.C.
Tel:(02) 6382511
Fax: (02) 6381293

Content: Novice monks at the Beihai monastery at the northern tip of Taiwan have input a catalogue of all titles of texts contained in the Taisho edition of the Chinese Buddhist canon. About forty sutras (among them the Lotus, Vimalakirti, Diamond, and Surangama sutras as well as some major Madhyamika and Tiantai texts) have already been input in a format that is easy to read on screen; the punctuation and text stem mostly from editions other than the Taisho.

History: Originally, the Beihai monks did some work on the Qisha edition, a partial edition of the Chinese canon (around 5000 scrolls) that was found in the 1930s and is very close to an exact copy of the first block print edition, the Kaipao edition of Chengdu (10th century). The plan was to use the data from the Korean canon project as basis for this edition; at the moment this project appears to be on hold, but it could well pick up again when the Korean canon project continues. However, the monks did not rest idle and input the texts listed below.

Data volume: Input sutras amount to over half a million characters. Scriptures input as of May 1993 (in Taisho numbers): nos. 99 (part), 102, 124, 209, 235, 251, 262, 276, 353, 361, 365, 366, 389, 412, 450, 475, 492, 600, 676, 684, 685, 707, 779, 784, 842, 945, 1497, 1558 (part), 1564, 1568, 1569, 1573, 1613, 1614, 1630, 1666, 1915, 1917, 1944, and 2010. Input is progressing at a fast pace; some monks can type faster than any professionals I have ever seen (well over 100 characters per minute). Additionally, a list of all text titles (and translators, etc.) has been input. As an example of the kind of international cooperation the Electronic Buddhist Text Initiative initiates and promotes, this list was given to Hanazono University's International Research Institute for Zen Buddhism, where Pinyin readings were automatically generated for all Chinese characters in the Beihai file. The resulting file is now being corrected and supplemented with Japanese readings, Pinyin readings, Sanskrit titles, etc. at the Hobogirin research institute in Kyoto (see below) and will eventually be sent back to Taiwan and other interested parties.

Environment: The data are unformatted text files (Big-5 code) and can be used on IBM or Apple equipment with the necessary system software.
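The automatic generation of Pinyin readings for the title list can be pictured as a table lookup, which also shows why the resulting file still needs correction by hand. The following is a minimal sketch, not the program used at Hanazono University; the four-entry table, the function name, and the use of tone numbers are assumptions made for illustration.

```python
# Hypothetical miniature reading table; a real table would cover thousands of
# characters. Characters with more than one reading list all of them.
READINGS = {
    "金": ["jin1"],
    "剛": ["gang1"],
    "經": ["jing1"],
    "行": ["xing2", "hang2"],
}


def annotate_title(title: str) -> tuple[str, list[str]]:
    """Return a Pinyin string for title and the characters that need manual review."""
    syllables = []
    needs_review = []
    for char in title:
        candidates = READINGS.get(char)
        if candidates is None:
            syllables.append("?")            # no reading in the table
            needs_review.append(char)
        else:
            syllables.append(candidates[0])  # take the first listed reading
            if len(candidates) > 1:
                needs_review.append(char)    # ambiguous: a human must confirm
    return " ".join(syllables), needs_review


if __name__ == "__main__":
    print(annotate_title("金剛經"))
```

Characters with several readings, and characters missing from the table, are exactly the cases that a later correction stage has to resolve.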

Data Input and Correction: The Beihai monks input data using ETEN on IBM compatibles; the data are in BIG-5 code with some added characters. The Taisho format was abandoned in favor of a format conducive to reading texts on screen, and often modern style punctuation was added in order to make reading easier. The data correction process and status is not yet known.


Chinese Academy of Social Sciences Database


Institution: Chinese Academy of Social Sciences
Person in Charge: LUAN Guiming
Address: Computer Center
Chinese Academy of Social Sciences
5 Jianguomennei St.
Beijing, China 100732
Tel: 5129614 ext. 2412
Fax: 5135025

Content: Unbeknownst to the outside world, the Academy has begun inputting the entire Taisho edition of the Chinese Buddhist canon. In May of 1993, the first ten volumes were already input (correction status unknown), and input is reportedly progressing at the rate of one million characters a week.

History: The Academy has been involved in input and handling of Classical Chinese data since the mid-eighties; it has thus much experience with full-form characters. It has already input vast amounts of data (Chinese classics, entire poetry collections, the Complete Tang Prose, etc.) and published a whole series of concordances. When we visited the Academy in May of 1993, we were very surprised to learn about this project and its rapid progress. The director told us that they are financing the input by themselves, and we were unable to find out more about plans to publish or distribute these data.

Environment: IBM compatibles; classical Chinese data are usually input in the Academy's proprietary 45,000 character code and then proofread once by the input personnel; for customers, input is also done in Big-5 or GB code. Master data (proprietary code) and user data (Big-5 or GB code with the inevitable loss of information) are clearly distinguished. The proprietary character code requires a modified DOS environment and a PC card which the Academy apparently is selling; but we did not get a clear answer about the price tag nor about the system's compatibility.

Distribution: It is not clear how the Taisho data will be used and whether distribution is planned.


The Jinbun Kagaku Kenkyusho Database


Institution: Institute for Humanistic Studies, Kyoto University.
Persons in Charge: T. Takada
Address: Kyoto University
Jinbun kagaku kenkyusho
Higashi-ogura cho 47
Kitashirakawa, Sakyo-ku
Kyoto 606
Japan
Tel: (075) 753-7531

History: This is one of the pioneering institutions for the handling of multilingual data. In the eighties, it started printing its renowned yearly bibliography of Asian Studies on Kyoto University's mainframe computer, and several members of the institute are involved in their own database projects. The largest text input is the Taiping yulan in 1000 fascicles (Prof. T. Katsumura), but various other materials (such as some Daoist texts) are also used in electronic form by individual members of the institute. Input of Taisho texts was started in 1992. A concordance is planned for a text now being studied at the Jinbun, Taisho no. 2085.

Content: The input of volumes 49 to 52 (historical section) of the Taisho edition of the Chinese Buddhist canon is planned. So far, the following texts have been input: Taisho nos. 2059, 2085, 2087, 2088, 2089 (parts 1 and 2), and 2092. These data have passed the basic proofreading stage. Data are in JIS code with placeholders for characters not contained in JIS.

Data Input and Correction: Data are input in JIS code by scanner on NEC equipment using the method described in the Electronic Bodhidharma No. 2. The data are then proofread and corrected by Chinese literature students of Kyoto University.

Environment: NEC 9801-type equipment. For printing purposes, non-JIS characters are created on a Macintosh using the Fontographer program.

Distribution: Distribution has not yet been discussed, but in keeping with the Jinbun tradition one can expect the data to be released to specialists.


The Academia Sinica Database


Institution: Academia Sinica
Person in Charge: DING Zy-kaan
Address: Computing Center
Academia Sinica
Nankang, Taipei
11529 Taiwan R.O.C.
Tel:(886) 2-789-9257
Fax: (886) 2-783-6444

History: The Academia Sinica, best known for its 25-history database (40 million characters), is also inputting vast amounts of other data, for example Dunhuang materials collections and Chinese stone inscriptions (some already available on the Academia's online network), a historical place name dictionary, persons' names and book titles from the Yongle dictionary, etc. Also already available online inside the institute are about nine million characters worth of classics (Thirteen Classics with commentaries, Baopuzi, Zhuangzi and commentaries, Mozi, Liezi, Laozi, Huainanzi, etc.). Texts already input and now being proofread by a team of eight full-time employees include the Luoyang qielanji, the Taiping yulan, the Shishuo xinyu as well as the Gaosengzhuan and Xugaosengzhuan (Taisho nos. 2059 and 2060). In the process of input are now, among many other texts, Zhuxi's yulei, the Wenxuan, the Dunhuang bianwen collection, the forty-two chapter sutra (Taisho no. 784), and a whole list of An Shigao's works (Taisho nos. 13, 14, 31, 32, 36, 48, 57, 91, 92, 98, 105, 109, 112). Further Taisho texts on the Academia's internal use input list: nos. 23, 46, 114, 115, 131, 137, 140, 149, 150a, 150b, 151, 152, 154, 157, 167, 184, 186, 190, 196, 197, 198, 202, 204, 208, 209, 210, 211, 212, 223, 224, 225, 280, 313, 322, 348, 350, 356, 361, 417, 418, 458, 492, 525, 526, 551, 553, 554, 602, 603, 604, 605, 607, 608, 621, 622, 624, 626, 630, 684, 701, 724, 729, 730, 731, 732, 733, 735, 778, 779, 791, 792, 807, 1467, 1470, 1492, 1508, 1557, and 2027.

Environment: The Academia is using a UNIX network. Data are input in BIG-5 code by a company (70 $ NT per 1000 characters); characters that do not exist in this code are entered as an empty box. The texts are then proofread and edited in the Academia by a team of eight employees. The search procedures are not very flexible yet, and output options are limited. The dynastic history CD is for use on DOS equipment, presumably with ETEN. On the Academia's own computers, non-Big-5 characters appeared as "@", but it appears that about five thousand such characters are defined on the CD. I have not yet seen the CD work in its intended environment.

Distribution: The 25 dynastic histories are on sale in CD-ROM format.


The Chinese University of Hong Kong Database


Institution: Institute of Chinese Studies, Ancient Chinese Texts Database Project.
Person in Charge: HO Che-Wah
Address: Institute of Chinese Studies
Chinese University of Hong Kong
Shatin, New Territories
Hong Kong
Tel:609-7376 / 695-2958
Fax: (852)6035149

Content: The first stage of this large project consists of the input of about eight million characters of Chinese classics. About six million will be input by the summer of 1993. Between 1993 and 1995, 36 Daoist texts (ca. 500,000 characters) from the Six Dynasties will be input, among many other materials. In a subsequent stage (starting in 1995), 47 Buddhist texts (ca. 4.2 million characters) from the Six Dynasties will be input. Buddhist texts the institute plans to input include: Taisho nos. 154, 223, 235, 262, 278, 360, 366, 374, 388, 397, 426, 453, 475, 618, 670, 816, 1238, 1422, 1478, 1509, 1521, 1524, 1568, 1646, 1659, 1666, 1668, 1819, 1856, 1857, 1957, 1978, 2046, 2064, 2047, 2048, 2059, 2102, and 2145.

History: This is essentially a concordance project, and many of the problems the team has so far addressed have to do with producing printed concordances on computer rather than producing electronic text.

The first series of twelve concordances was printed by Commercial Press in Hong Kong and sold out (500 each printed, 250 each sold to Japan). The first part of this large project consists of the publication of 102 Chinese texts in concordance form. At present the staff consists of one researcher who edits the texts and writes the footnotes, one computer officer, and three assistants. A rhythm of about one concordance per month is planned.

Environment: Data are input in BIG-5 code, then competently edited and punctuated by one person at the institute, seven times proofread by university students, and finally once more proofread by a Chinese scholar. This process makes one expect high data accuracy; almost fifty percent of the Institute's total expenses go into text editing and data proofreading.

Distribution: Once the printed concordances are sold out, the Institute plans to release the text data and search software (each text separately). It is not yet clear when the Institute will start releasing such text data; but the price for such data is projected to amount to two and a half times that of the printed book (average US $ 40); thus data of a single text might cost as much as US $ 100. Such pricing policies might encourage multiple input and may not make much sense if, as the institute plans, a CD-ROM will be released at a later stage. The data will be provided in a format close to the printed version (including footnotes in pop-up windows and custom non-Big-5 characters) and with dedicated search software for use on IBM compatible PCs.


The Zen Knowledgebase


Institution: International Research Institute for Zen Buddhism, Hanazono University.
Person in Charge: Urs App
Address: International Research Institute for Zen Buddhism
Hanazono University
8-1 Tsubonouchi-cho
Nishinokyo, Nakagyo-ku
Kyoto, Japan 604
Tel:(075) 811-5181 ext. 280
Fax: (075) 811-9664

History: This project was founded in 1990 and consists of four stages: 1. Basic research and software development (1990-93), 2. Main data input phase (1993-1996), 3. Data linkage and software development, and 4. Publication of the entire data set. A variety of software products have been released since 1990 (see list and description in article below); the first thoroughly corrected Chan texts on floppy are about to be released (summer of 1993).

Content: The aim of this project is the creation of an encyclopaedic base of knowledge for Chan / Son / Zen research centered on primary source materials.

Electronic Chan/Son/Zen texts, mostly from the Taisho or Zokuzokyo canons.

Research software (Indices to relevant dictionaries, lists of text names and authors, text descriptions, etc.)

Electronic tools (automatic concordance generator, text file arranger, search tools, code conversion programs, etc.)

In its initial experimentation phase with manual input and OCR input in JIS code, the institute has input about forty (mainly Zen) texts from the Taisho canon (nos. 475, 842, 1857, 1881, 1985, 1986AB, 1987AB, 1988, 1991, 1997, 1998, 2004, 2005, 2007, 2009, 2010, 2012A, 2012B, 2013, 2014, 2015, 2016, 2017, 2018, 2019A, 2021, 2022, 2023, 2024, 2025, 2831, 2832, 2834, 2833, 2835, 2836, 2837, 2887, and 2901). Most of these data are not yet proofread. The bad convertibility of JIS code was one of the reasons to choose another approach for massive data input (the Zokuzokyo's entire Chan/Zen section [about 15 million Chinese characters] and the historical and biographical section [about 18 million Chinese characters]): input in Big-5 code with strict guidelines as to treatment of variant characters, etc.; mastering of data in CCCII code; and distribution in any code according to user wishes (JIS, KS, Big-5, GB, CCCII, Unicode).

Environment: Software utilities released so far (for example the kana-kanji conversion tool for Zen terminology) are in JIS for use on NEC-9801, IBM, and Macintosh. A program in a new set of utilities (fall of 1993) specifically addresses the code environment question: a code-conversion program from JIS to Big-5 or Korean KS codes and vice-versa. Using expert code conversions, electronic versions of Zen texts will be converted from CCCII master data into a variety of codes depending on the user's needs. All major hardware and software platforms for personal computers are in use at the institute.
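The principle of keeping master data in one comprehensive code and deriving user versions in other codes can be sketched as follows. This is an illustration under stated assumptions, not the institute's conversion program: Python has no CCCII codec, so ordinary Unicode strings stand in for the master data; `shift_jis` is used as one common encoding of the JIS character set; and the placeholder format is invented for this example.

```python
# Python codec names for four of the target codes mentioned in the text.
TARGET_CODECS = {
    "Big-5": "big5",
    "JIS": "shift_jis",
    "KS": "euc_kr",
    "GB": "gb2312",
}


def export_text(master: str, target: str) -> tuple[bytes, list[str]]:
    """Convert master text to the target code.

    Characters the target code cannot represent are written as an ASCII
    placeholder of the form &U+XXXX; (a hypothetical convention) and are
    also returned as a list, so that the loss is documented, not silent.
    """
    codec = TARGET_CODECS[target]
    output = bytearray()
    missing = []
    for char in master:
        try:
            output += char.encode(codec)
        except UnicodeEncodeError:
            output += f"&U+{ord(char):04X};".encode("ascii")
            missing.append(char)
    return bytes(output), missing


if __name__ == "__main__":
    # U+20000 is a rare character that none of the four target codes contains.
    data, missing = export_text("山\U00020000", "Big-5")
    print(data, missing)
```

Because each placeholder names the character it replaces, the user version can still be searched and later converted back, which is not possible when every missing character is replaced by the same symbol.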

Distribution: Thoroughly proofread electronic text data will be continuously released from 1993. Such data will be sent to the academic institutions listed in the concordance list below and can be copied there. Uncorrected data are made available to people willing to contribute to the proofreading process.


The Korean Buddhist Collection


Institution: Paeng Nyon Buddhist Cultural Foundation
Person in Charge: Rev. Won Young
Address: 1047 Sadang-dong
Dongjak-gu
Jeong am Jeong Sa
Seoul, Korea
678-895 Korea
Tel:(2) 523-8005

Content: The Foundation plans to input the ten-volume Collection of Korean Buddhist Writings. At present, the input of the works of Won-hyo (approx. 700,000 characters) has been completed.

Distribution: The Foundation is considering the AAR as one potential distributor.


The Zenbunka Kenkyusho Database


Institution: Zenbunka kenkyusho
Person in Charge: K. Yoshizawa
Address: Institute for Zen Studies
8-1 Tsubonouchi-cho
Nishinokyo, Nakagyo-ku
Kyoto, Japan 604
Tel: (075) 811-5189
Fax: (075) 811-1432

History: Since around 1986, this institute has had many texts input by companies in Japan and China. The objective was first to produce indices; a series of such indices was published. Now the institute is also having other Chinese data (such as the 1000-fascicle Taiping yulan) input; the orientation now includes research of vernacular Chinese. The institute concentrates on the production of printed concordances rather than electronic text. Thus code decisions and data correction are dependent on the hardware used at the institute (NEC-9801 type equipment and Fujitsu printers). No electronic texts or computer programs have so far been published.

Content: The majority of data was input by various companies in Japan and China using several methods (some using JIS, some simplified Chinese characters, etc.); data quality thus varies greatly. Only data used for the publication of printed indices have been proofread. Proofread text data (in JIS, with more than 1000 two-letter alphabetical placeholders): the Zutangji (Chodang jip), the Dongchansi edition of the Jingde chuandenglu, and Taisho nos. 2003 and 2004. Uncorrected data: Taisho nos. 2003, 2004, 2005, 2015, 2017, 2025, 2059, 2060, 2061, 2062, 2548, 2551, 2566, 2574, and a number of texts from the Manji-Zokuzokyo collections (see list below). Additionally, some indices of reference works and repertories of Chan sayings were input.

Environment: All data are in Japanese JIS code. The many characters not present in JIS are now input in form of alphabetical placeholders, but earlier data use just one placeholder for all such characters, preventing search.
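Why a single placeholder prevents search while distinct placeholders do not can be shown with a small sketch. Everything in it is hypothetical: the three-character repertoire merely stands in for the JIS character set, and the two-letter codes and the angle-bracket notation are invented for illustration.

```python
REPERTOIRE = set("坐定入")                 # hypothetical stand-in for the JIS character set
PLACEHOLDERS = {"禪": "AA", "燄": "AB"}     # hypothetical two-letter codes


def with_distinct_placeholders(text: str) -> str:
    """Replace each character outside the repertoire with its own code."""
    return "".join(c if c in REPERTOIRE else f"<{PLACEHOLDERS[c]}>" for c in text)


def with_single_placeholder(text: str) -> str:
    """Replace every character outside the repertoire with the same symbol."""
    return "".join(c if c in REPERTOIRE else "■" for c in text)


if __name__ == "__main__":
    source = "坐禪燄"
    query = "禪"
    # Distinct codes: the encoded query matches only its own character.
    print(with_distinct_placeholders(source).count(with_distinct_placeholders(query)))
    # Single symbol: the encoded query matches every missing character.
    print(with_single_placeholder(source).count(with_single_placeholder(query)))
```

With distinct codes, a query is encoded the same way as the text and finds only the intended character; with one shared symbol, every missing character produces a hit.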

Distribution: The Zenbunka allows use of some of its data on site but has no plans to make any of its electronic text data available. However, there are exceptions.


The Hobogirin Database


Institution: Hobogirin Institute
Person in Charge: Hubert Durt
Address: Hobogirin Institute
Shokokuji Rinkoin
Kamigyo-ku
Kyoto, 602 Japan
Tel:(075) 256-4179

Content: The following projects are now in the input stage:

Electronic Edition of the Supplement of the Hobogirin (Catalogue of the Taisho Canon with Japanese and Chinese Readings)

Multilingual Electronic Indices to the Hobogirin Encyclopedia

History: Driven by the foresight of the late Anna Seidel, the Hobogirin research institute has since the late eighties continuously expanded its computer-related activities. The sixth issue of the institute's French/English journal Cahiers d'Extreme-Asie was completely set and printed on a Macintosh system, and so will the next volume of the Hobogirin encyclopedia. In the course of the institute's encyclopaedic research activities, many useful research tools were created. The transfer of such tools to the electronic medium as well as the creation of multilingual electronic indices promises to become another major contribution of this institute to international Buddhist research.


The Rissho Nichiren Database


Person in Charge: K. Mitomo
Address: Rissho University
2-16, 4-chome Osaki
Shinagawa-ku
Tokyo
Japan 141
Tel: (03) 3492-8528
Fax: (03) 5487-3352

Content: Rissho University in Tokyo has produced a CD with a photographic reproduction of a Kamakura edition of Nichiren shonin ibun. Because text materials are stored as graphic information, one cannot search for words within the text. In addition to this graphic file, a dictionary (Nichiren shonin goibun jiten) is included; this lets users find some words or themes indirectly and informs about the meaning of terms used in the text.

Environment: This CD requires a Hitachi proprietary system and can neither be used on ordinary Japanese PC equipment nor on IBM or Apple type machines.

Distribution: Since much proprietary hardware is required, use at other sites than Rissho is probably impractical.


The Bukkyo University Library Catalogue


Address: Bukkyo University
Hana-no-bo cho 96
Murasakino, Kita-ku
Kyoto 603
Japan
Tel: (075) 491-2141
Fax:(075) 493-9042

Content: Bukkyo University has published a CD-ROM catalogue of some of their library holdings with dedicated software for use on NEC-9801 type equipment. This is essentially a library system, but it lists articles separately. Searches are possible by title, keywords, author, classification number, type of article, date of publication, and language. The first edition was published in 1990; since then, no update has been produced.


The Shugaku kenkyusho Database


Person in Charge: Mr. Ozaki
Address: Shugaku kenkyusho
Komazawa University
Setagaya-ku, Komazawa 1-23-1
Tokyo 154
Japan
Tel:

Content: The Shugaku kenkyusho, a Soto-Zen research institution at Komazawa University in Tokyo, has begun input of various Soto-Zen related data and plans to expand such activities in the future. Dogen's Sanbyakusoku, the twelve-fascicle edition of the Shobogenzo, and the first three fascicles of the Eihei koroku have already been input.

History: So far, data input was mainly done for the purpose of producing indices or concordances (the first of which is that of the Sanbyakusoku), but it is to be expected that with the input of more materials from the Sotoshu zensho, the emphasis will shift over to electronic text and online search. The institute has plans to input materials by Dogen as well as other important Soto exponents (such as Keizan).

Environment: Data is input in JIS code, mostly using the OCR method described in the second number of the Electronic Bodhidharma.


The Jodo Shinshu Kyogaku Kenkyusho Database


Person in Charge: Mr. S. Naito
Address: Jodo Shinshu Kyogaku Kenkyusho
Abura no koji, Shomen kado
Shimogyo-ku
Kyoto 600, Japan
Tel:(075) 371-9244
E-mail: Nifty-Serve MHA 01674

Content: The Kyogaku Kenkyusho has input various texts from the Taisho (nos. 1521, 1749, 1876, 1958, 1978, 1963, 2646, 2682).

Environment: Data is input in JIS code, mostly using the OCR method described in the second number of the Electronic Bodhidharma.


Individual Projects at Japanese Universities

Input projects at Japanese universities are mushrooming. Partly driven by the institutions themselves, partly by individual researchers, teachers, or students, data are being input in various ways. By far the most popular input method is the OCR method with our institute's kanji shape file for the Taisho (included in our OCR Toolset). This method is not only used by many individual researchers but also by a number of institutions (for example Kyoto University, Hiroshima University, Ryukoku University).

Usually, texts of interest to the main promoter are input. There is little overall planning involved; as is natural at the beginning of such a revolution, the promoters cater to their own interests rather than to a wider audience. Thus they tend to make compromises in data quality that are acceptable for individual use but unacceptable for use by other researchers and institutions. This also accounts for the fact that hardly anybody seems even to consider inputting data in codes more comprehensive than JIS; there is still no awareness of the difference between data input for a small group of insiders (and possibly a computer-generated concordance) and data input for the purpose of electronic publication. Characters not present in JIS are often simply replaced by a black blob.


I am aware of individual input efforts at the following Japanese institutions: Prof. G. Nagao in Kyoto had Taisho nos. 1592, 1593, 1594, and 1596 input (OCR, JIS). Jonathan Silk (Univ. of Michigan) also input by OCR Taisho nos. 18 (part), 23, 31, 43, 44, 46, 220 (no. 7), 232, 233, 350, 351, 352, 659 (juan 7), and 1469. Data of the last five texts have also been proofread. Students of Prof. Katsura (Hiroshima Univ.) and Prof. Hayashima (Kyushu Univ.) have input the entire 100-fascicle Yogacarabhumishastra (Taisho no. 1579) and reportedly also some other texts such as Candrakirti's Prasannapada and Asanga's Mahayana sutralamkara (T. 1604). Prof. Okimoto of Hanazono University, who earlier had students input some texts by hand (Taisho nos. 273, 670, 1484, 1558, 1666, 2883, 2901), has recently been using the OCR method with our institute's dictionary for inputting some additional texts (Taisho nos. 7, 842). Mr. Nonin of Ryukoku University is using the same OCR setup to input the Dazhidulun (no. 1509; 80 fascicles already input) and has already finished input of Taisho nos. 643, 360, 366, 1749, and part of 417. Some professors of Taisho University in Tokyo are planning the input of a variety of Pure-Land materials. I am sure that many more such individual input projects are taking place; just recently I heard that Prof. Sema from Notre-Dame Seishin Women's University in Okayama (Japan) has input the entire 50-fascicle Jinglü yixiang (Taisho vol. 53, no. 2121).

Environment: JIS code with various individual strategies to accommodate lacking characters. Most data were produced on NEC-9801 type equipment; some are text files with undefined lacking characters; these can also be used on Macintosh or IBM equipment. Others are formatted word processor files (Ichitaro, Matsu, etc.) which often use proprietary definitions of non-JIS characters and are thus of no use outside of the environment in which they were created.
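The problem of lacking characters can be illustrated with a short Python sketch. It is not the tooling of any project listed here; the function names are hypothetical, and Python's "shift_jis" codec is assumed as a stand-in for "JIS code". The sketch lists the characters of a text that the code cannot represent and, instead of replacing them with a blob or a private definition, writes each one as a numeric reference that other users can still resolve.

```python
# Illustrative sketch only; the function names below are hypothetical.
# Assumption: Python's "shift_jis" codec stands in for "JIS code".

def find_missing(text, codec="shift_jis"):
    """Return the distinct characters of `text` that `codec` cannot
    encode, in order of first appearance."""
    missing = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


def encode_with_references(text, codec="shift_jis"):
    """Encode `text`; every unencodable character is written as a
    decimal numeric reference (&#NNNN;) rather than being discarded."""
    return text.encode(codec, errors="xmlcharrefreplace")


if __name__ == "__main__":
    # Two common kanji followed by one rare character (U+2000B)
    # that lies outside the assumed JIS repertoire.
    sample = "仏教\U0002000B"
    print([hex(ord(c)) for c in find_missing(sample)])
    print(encode_with_references(sample).decode("shift_jis"))
```

A file encoded this way remains plain text that opens on any equipment able to read the code, and the list returned by find_missing shows exactly which characters still need a shared definition before the data are passed on.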


Individual Input Projects in the U.S.

There are a number of Buddhism-related input projects financed and taken care of by individuals, for example: Jamie Hubbard (e-mail JHUBBARD@SMITH.bitnet) plans the input of Three-Stage Sect materials; Paul Hahn at the Dept. of East Asian Languages, U.C. Berkeley plans the input of the Korean Samguk Sagi and Samguk Yusa histories.

Environment: Input of these texts will be done in Big-5 using ETEN.


Buddhist Iconography Database


Person in Charge: John Huntington
Address: Ohio State University
Dept. of the History of Art
100 Hayes Hall
108 North Oval Mall
Columbus, Ohio
U.S.A. 43210-1318
Fax: (614) 292-4401

Content: Database of Buddhist Iconography. Present volume: about 3 gigabytes.

History: Prof. Huntington has collected a great many slides of Buddhist art all over Asia and classified them according to themes and particular elements. He is developing methods to access this large pool of information in new ways; for example, access should be possible by iconographic elements and criteria of visual form rather than only words.

Environment: Macintosh; Laserdisc.


Japanese Buddhist Journal Literature


Institution: Indogaku bukkyogaku kenkyukai
Person in Charge: Mr. Ejima
Address: Indogaku Bukkyogaku Database Center
7th floor, Nihon Shinpan Building
Hongo 3-33-5, Bunkyo-ku
Tokyo, Japan 113

Content: The Indogaku bukkyogaku kenkyukai has been compiling a database on Japanese secondary literature on Buddhism since 1988. The information will cover all articles published in 85 scholarly journals since 1868. Input of titles, authors, journal information, and keywords is in progress at twenty-six Japanese universities.

Environment: Japanese NEC 9801-type equipment. A device driver incompatible with most available software is used for display of diacritics. On IBMs and compatibles, some of the information (letters with diacritics) is lost.

Distribution: A data set of titles of articles that appeared in the society's own journal (Indogaku bukkyogaku kenkyu) has been on sale since 1989. It is designed as a relational database and is hard to make use of in text format.


The Coombspapers Database


Person in Charge: Matthew Ciolek
Address: Coombs Computing Unit
Research School of Social Sciences
Australian National University
Canberra ACT 0200
Australia
Tel:(6) 249-2214
Fax:(6) 257-1893
e-mail: tmciolek@coombs.anu.edu.au

History: The Coombspapers data bank (e-mail address coombspapers@coombs.anu.edu.au) was established at the end of 1991 and is expanding.

Content: This is a system linking electronic archives and databases on Buddhism and other Asian religions. It is a Special Projects part of the wider information system called Coombsquest. Among the social science & humanities material, there is a fair amount of information about oriental religion in general and Buddhism in particular (Electronic Buddhist Archives). The database contains bibliographical information, address lists, papers about Buddhist themes, and talks and interviews (mostly with Zen teachers). Coombsquest also allows searching a Chinese Buddhist Text Archive at National Central University in Taiwan and a Buddhist Text Archive in Washington, USA.


Dharmafarers Database


Institution: Community of Dharmafarers
Person in Charge: Dh Vidyananda
Address: Community of Dharmafarers
P.O. Box 388
Jalan Sultan
46740 Petaling Jaya
Malaysia
Tel/Fax:(6-06) 611-489

Content: Database on Buddhist Institutions and Communities in South-East Asia.


Buddhist Information on Bulletin Boards

Content: Various Buddhism-related materials (texts, translations, bibliographies, discussions, papers, etc.) are posted on electronic bulletin boards around the world.

Addresses:

Japan: The Vihara interest group of the Nifty-Serve network has a directory with Buddhist texts that were input by individuals. These texts include, for example, the Awakening of Faith, the Vimalakirti sutra, and some Pure Land materials such as the Tannisho, which Japanese monks and laypeople have input on a computer instead of writing them on paper with a brush. The Orient interest group of the PC-VAN network has no posted texts but other materials of interest to Buddhologists.

Australia: At the address coombs.anu.edu.au in the directory /coombspapers/otherwork/electronic-buddhist-archives/ there is a variety of materials related to Buddhism including bibliographies.

Bitnet: At buddhist@jpntuvm0 there is a free, unmoderated forum on Buddhism. It has been organized by Prof. Kawazoe of the Computer center of Tohoku University in Sendai, Japan. Prof. Kawazoe has been involved in a variety of computer projects in the field of Buddhism.

U.S.A.: Dharmanet (Berkeley), managed by Gary Ray (GaryRay@f658.n125.z1.fidonet.org). The telephone number is (510) 268-0102. This network features conferences on various forms of Buddhism, meditation, Tibet news, etc. and has a hookup to the Buddha-L electronic discussion group.

The Buddha-L (Buddha-L@ULKYVM) is a scholarly electronic discussion group about Buddhism and related themes. It also furnishes news on Buddhism and some translations of Buddhist texts.

Indology: blackbox.hacc.washington.edu in the directory /pub/indic/ (includes electronic versions of a number of important Buddhist texts such as the Buddhacarita).

Other Contacts: Helga Dyck (e-mail address UMIH@CCU.UMANITOBA.CA) has information on an upcoming conference on electronic journals. The head of the University of Chicago computing center, C.M. Sperberg-McQueen (e-mail U35395@UICVM.UIC.EDU), is knowledgeable about SGML and the Text Encoding Initiative, an attempt to standardize electronic text formats and procedures.


American Academy of Religion Database


Institution: American Academy of Religion
Person in Charge: L. Lancaster (chairman of the Electronics Publication Committee)
Address: Department of East Asian Languages
104 Durant Hall
Berkeley, CA
94720 U.S.A.
Tel:(510) 642-3480
Fax: (510) 642-6031

History: The Academy's publication committee has established a subsection for electronic publications. This committee devotes efforts to the promotion of databases concerning religion, their distribution, and their use (training and user support).

Content: The AAR will distribute Mahidol University's Pali CD. It also plans to publish a CD containing all back numbers of the AAR's 14 journals and an electronic dissertation series.


Materials by Individual Researchers

With the increasing use of personal computers, a number of individual researchers make their own work (dissertations, translations of Buddhist texts, papers, bibliographies) available to fellow researchers and students in electronic form. Access to electronic versions can be extremely helpful since it allows searching for any word or concept and eliminates the need to retype quotes. In order to facilitate such exchange and also future publication in electronic form (particularly of translations in connection with the original texts), authors should retain the rights for electronic text.


Oxford Text Archive


Oxford University Computing Services
13, Banbury Road
Oxford OX2 6NN
England
Tel.: (865) 273-238
e-mail: archive@vax.oxford.ac.uk

Content: Although this archive holds a great deal of Western literature, at present there seems to be only one Buddhist text available, namely the Mahanirvana sutra in Pali (input by Lance Cousins from Manchester University). However, it is possible that more Buddhist materials will find their way into this database.


Author:Urs APP
Last updated: 95.4.16