A Solution to the Problem of Updating Encyclopedias

Eric M. Hammer and Edward N. Zalta

Center for the Study of Language and Information, Stanford University

(ehammer,zalta)@csli.stanford.edu

Abstract

This paper describes a way of creating and maintaining a `dynamic encyclopedia', i.e., an encyclopedia whose entries can be improved and updated on a continual basis without requiring the production of an entire new edition. Such an encyclopedia is therefore responsive to new developments and new research. We discuss our implementation of a dynamic encyclopedia and the problems that we had to solve along the way. We also discuss ways of automating the administration of the encyclopedia.

The greatest problem with encyclopedias is that they tend to go out of date. Various solutions to this problem have been tried. One is to produce new editions in rapid succession.[1] Another is to publish supplements or yearbooks on a regular basis.[2] A third is to publish the encyclopedia in loose-leaf format.[3] In this paper, we propose a solution to this problem, namely, a `dynamic' encyclopedia that is published on the Internet.[4] Unlike static encyclopedias (i.e., encyclopedias that become fixed in print or on CD-ROM), the dynamic encyclopedia allows entries to be improved and refined, thereby becoming responsive to new research and advances in the field. Though there are Internet encyclopedias which are being updated on a regular basis, these projects typically do not give the authors direct access to the material being published. However, we have developed a dynamic encyclopedia which gives the authors direct access to their entries and the means to update them whenever needed, and which does so without sacrificing the quality of the entries. In the effort to produce a dynamic encyclopedia of high quality, we discovered that numerous problems had to be solved and that routine editorial and administrative functions could be automated. By reporting on our project, we hope to facilitate the creation of such reference works in other fields.

This paper was published in Computers and the Humanities, 31/1 (1997): 47–60. The authors would like to thank David Barker-Plummer, Mark Greaves, Andrew Irvine, Emma Pease, Susanne Riehemann, and Nathan Tawil for critical suggestions which often led to improvements in the Encyclopedia's design. We would also like to thank the anonymous referees for their suggestions on how to improve the paper.

[1] For example, Louis Moréri tried this solution with his Grand Dictionnaire Historique of 1674, as did Arnold Brockhaus, in his Konversations-Lexikon, 1796–1811.

[2] So, for example, there were 11 supplementary volumes to the ninth edition of the Encyclopaedia Britannica (1875–1889). These constituted the `tenth edition'.

[3] For example, the second edition of Nelson's Perpetual Loose Leaf Encyclopaedia of 1920. The Encyclopédie française is still available in loose-leaf format.

Basic Description of Dynamic Encyclopedias

We have recently developed the Stanford Encyclopedia of Philosophy (URL = ). The principal innovative feature of this dynamic encyclopedia is that authors have an ftp (`file transfer protocol') account on the multi-user computer that runs the encyclopedia's World Wide Web server. This feature not only enables the encyclopedia to become functional quickly, but also gives the authors of the entries the ability to revise, expand, and update their entries whenever needed.

Traditionally, encyclopedias have not been very responsive to new research and developments in the field: it is just too expensive to publish new editions regularly in a fixed medium such as print or CD-ROM. A dynamic encyclopedia, however, simply evolves and quickly adapts to reflect advances in research. We believe that the process of updating individual entries never ceases, and that any encyclopedia which takes account of this fact will necessarily be more useful in the long run than those which don't.

Authors who have a strong interest in and commitment to the topics on which they write will be motivated to keep their entries abreast of the latest advances in research. Indeed, dynamic encyclopedias may speed up the dissemination of new ideas. Of course, there may come a time when an author wants to transfer responsibility for maintaining the entry to someone else. In such cases, there is the possibility of having multiple entries on a single topic, and this is one of the new possibilities that can be explored in a dynamic encyclopedia.

[4] We conceived of this solution in our effort to implement John Perry's suggestion that the Center for the Study of Language and Information develop an Internet encyclopedia of philosophy.

Here is how we implemented our dynamic encyclopedia. We connected a multi-user (UNIX) workstation to the Internet and installed a World Wide Web server. We then created a cover page, a table of contents, an editorial page, and a directory in webspace entitled entries. We recruited Editorial Board members for the job of identifying topics, soliciting authors, and reviewing the entries and updates when they are received. Once an Editorial Board member decides on a topic and has found an author to write it, he or she passes the information on to the Editor of the encyclopedia, who creates an ftp account and home directory for the author on the workstation and then sends the author the information on how to ftp the entries and updates when they are ready. When authors ftp an entry or an update to their home directory, it becomes part of the encyclopedia[5] and the Board member responsible for that entry is automatically notified. It is then his or her responsibility to evaluate the (modified) entry and inform the author of any changes that should be made.

The innovative features of a dynamic encyclopedia that has been organized on the above plan are:

1. It can be expanded indefinitely; there is no limit to its inclusiveness or size. New or previously unrecognized topics within a given discipline can be included as they are discovered or judged to be important.

2. It eliminates the lag time between the writing and publication of the entries.

3. It eliminates many of the expenses of producing a printed document or CD-ROM: typesetting, copy-editing, printing, and distribution expenses are no longer necessary.

4. It can adapt to new technology, such as new tools, languages, and techniques, as that technology develops.

[5] The way we have set things up, each entry is given its own subdirectory in the entries directory, and that subdirectory is then linked into the author's home directory. So any files that the author transfers into that subdirectory can be accessed over the World Wide Web.

In addition, statistics software can process the information in the access log of the encyclopedia web server and identify which sites users access it from, which entries they access most, which topics they search for, etc. Such information can help inform decisions about which additional entries to solicit, which authors to recruit to write them, etc.
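As a rough illustration of the kind of statistics described here, the following Python sketch counts successful requests per entry and per requesting host. It is not the statistics software referred to above. It assumes that the server writes its access log in Common Log Format and that entries are served under a path beginning with `/entries/`; the function name, the prefix, and the sample log lines (with made-up hosts and entry names) are all hypothetical.

```python
import re
from collections import Counter

# Assumed Common Log Format:
#   host ident user [time] "METHOD path PROTOCOL" status size
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|HEAD) (\S+)[^"]*" (\d{3}) \S+'
)


def entry_statistics(lines, prefix="/entries/"):
    """Count successful requests per entry and per requesting host."""
    by_entry = Counter()
    by_host = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        host, path, status = match.groups()
        if status != "200" or not path.startswith(prefix):
            continue
        # The entry name is the first path component after the prefix.
        entry = path[len(prefix):].split("/", 1)[0]
        if entry:
            by_entry[entry] += 1
            by_host[host] += 1
    return by_entry, by_host


# Invented sample lines, for illustration only.
sample = [
    'a.example.edu - - [12/May/1997:10:00:00 -0700] "GET /entries/russell/russell.html HTTP/1.0" 200 5120',
    'b.example.org - - [12/May/1997:10:05:00 -0700] "GET /entries/russell/russell.html HTTP/1.0" 200 5120',
    'a.example.edu - - [12/May/1997:10:06:00 -0700] "GET /entries/frege/frege.html HTTP/1.0" 200 4096',
    'c.example.com - - [12/May/1997:10:07:00 -0700] "GET /contents.html HTTP/1.0" 200 2048',
]
by_entry, by_host = entry_statistics(sample)
```

Calling `by_entry.most_common()` then lists the entries in order of popularity, which is the kind of information that could inform decisions about which further entries to solicit.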

An important motivating feature of using the Internet as a medium is that the encyclopedia can reach a wider audience than is possible with traditional academic journals and books. Because of this, we are recruiting authors capable of writing articles that are of interest not only to specialists.

Computer Supported Collaborative Work

Encyclopedias are, in some sense, a collaborative effort. It seems natural, therefore, to analyze the task of building a dynamic encyclopedia in terms of `computer supported collaborative work' (CSCW).[6] For example, since both the Editor and the author will have write access to an entry, the place on the disk where the entry is stored constitutes a `group workspace'.[7] Thus version control may seem necessary to prevent simultaneous editing by different `group members'.

Version control could prove useful on those rare occasions when the Editor, as opposed to the author, changes an entry to repair a typographical error or fix some problematic HTML code. Although the Editor will typically leave such tasks to the authors, there may be times when quick action by the Editor is necessary. On such occasions, author and Editor could find themselves attempting to modify the entry simultaneously. To avoid such conflicts, we instruct our authors to follow a protocol for revising their work, namely, to begin both by notifying the Editor of their intentions and by downloading the current version of their entry from the Encyclopedia. Such a procedure will prevent author and Editor from overwriting each other's modifications.[8]

[6] See Baecker [1993], Baecker et al. [1995], Greenberg [1991], and Greif [1988].

[7] Only the principal author of coauthored entries will have ftp access to an entry.

[8] To be absolutely safe, the Editor can always invoke superuser privileges and prevent the author from further altering the file until the editing process is complete and a local backup is made.

Coauthored entries will obviously be highly collaborative, but these constitute only a very small percentage of the entries. If we ignore coauthored entries, it is striking that some of the distinguishing features of CSCW are absent. For example, no member of the group of authors requires information on the current status of the work being done by other group members.[9] Moreover, no member of the group of authors requires information about the history of other authors' collaborative activities. Nor do members of the group of authors require information about the process of collaboration (e.g., the roles and responsibilities of other members, and which group members fit into which roles).

These features of CSCW, however, do apply to the Editor, who requires information on the current status of the work by the authors, on aspects of the history of the authors' activities, and on the process of collaboration. In addition, members of the Board of Editors will need information about the history of the activities of those authors writing on topics under their editorial control; for example, a board member needs to know as soon as such an author has updated an entry. And, finally, if the encyclopedia project has the financial resources to maintain a large central staff, then such CSCW concepts as conferencing, bulletin boards, structured messaging, meeting schedulers, and organizational memory could play a role in the design of administrative procedures.

Since we are operating on a much smaller scale, these last CSCW concepts will play almost no role in what follows. The CSCW features that do apply will become features of the central administrative control of the encyclopedia and can be managed by properly defined databases and updating procedures. Thus, the CSCW concept most relevant to our enterprise is `work flow management'. By analyzing the way in which the Encyclopedia would typically function (i.e., the sequence of tasks of the parties involved and the sequence of transactions among the parties), one can predict and address many of the problems that would affect the smooth operation of the Encyclopedia. These will be discussed in the next two sections. Even the choice of technologies was to some extent dictated by this analysis of work-flow. For example, we investigated SGML as a possible markup language for the Encyclopedia entries and we created a Document Type Definition for a typical encyclopedia entry (thereby defining tags that the authors would use to mark up their entries). Although SGML is superior in many respects, several factors prompted us to choose standard HTML, including (i) the availability of HTML editors and guides (which makes it easy for authors to produce entries in the proper format without extensive training), and (ii) the availability of good, free HTML search engines. Many other choices about the construction of the encyclopedia were made on the basis of such work-flow considerations.

[9] If an author needs information about what topics the encyclopedia will include, this can be obtained directly by examining the Encyclopedia website or by asking the Editor.

It should be clear from our brief description that a dynamic encyclopedia poses very interesting questions concerning work-flow management. With adequate financial resources, a project of this type might consider buying, adapting, and/or modifying some off-the-shelf commercial workflow management system.[10] But few of the systems available seem to be designed to solve the specific problems of the dynamic encyclopedia concept that we wanted to implement. We therefore decided to develop our own solution to the problems of work-flow, one tailored to our specific needs. Having UNIX and perl as resources, we have been able to address the special problems that arise in working out the idea of a dynamic encyclopedia.

Problems Facing Dynamic Encyclopedias

First and foremost is the problem of quality control. Whereas all encyclopedias face the problem of choosing high quality board members and authors and the problem of editing entries, the dynamic encyclopedia has the further problem of evaluating changes to entries because authors have the right to access and change their entries when the occasion arises. In a static encyclopedia, once board members and authors are chosen, there is a single further step of quality control which involves the careful editing of submitted entries, so that errors are not published in the fixed medium. In contrast, a dynamic encyclopedia needs a systematic method of evaluating both the new entries posted to the encyclopedia and the subsequent changes made to those entries.

Second, there are the problems involved in producing an electronic work, such as maintaining a uniform entry style and familiarizing authors with markup languages and electronic file transfer.

[10] See, for example, Medina-Mora et al. [1992]. It is unclear to us whether such software as the freely-distributed Egret ( csdl/egret/) or the commercial Lotus `Notes' () would be helpful in this regard.

Third, there are the problems of automating routine editorial and administrative tasks so that the encyclopedia can be set up and maintained without a large staff. For example, the following processes can be automated: creating accounts for the authors, sending them email about their accounts and the ftp commands they might need, monitoring changes to the content of entries, updating the table of contents, cross-referencing entries, modifying the email aliases (such as the list of the authors' email addresses), notifying the board members that entries for which they are responsible have been changed, etc.

Fourth, there are the issues of copyright. Who should own the copyright to individual entries? Who has the responsibility for obtaining permission to display photographs? What rights do the authors have over their entries? What rights does the encyclopedia have to republish entries in altered form?

Fifth, there are the problems of maintaining the encyclopedia. How often should authors be expected to update their entries? What happens when an author no longer wants to be responsible for updating his or her entry? How do we turn over an entry to a new author? Under what conditions should the encyclopedia allow multiple entries for a single topic?

Sixth, there are the problems of site security. How does one prevent authors or anyone else from gaining access to other parts of the encyclopedia? What if an article is accidentally deleted or damaged?

Finally, there are the issues of citation and digital preservation. How should people using the Encyclopedia cite the articles? What happens if the cited material is subsequently deleted when an author updates or modifies the entry? How will the Encyclopedia be preserved so that the material will always be available for scholarly research in the same way that the citations to current and past encyclopedias are available?

Solutions to the Problems

Quality Control

As with other high-quality reference works, the authors of entries will be nominated and/or approved by a carefully selected board of editors and the entries themselves will be subject to critical evaluation. But given that the authors have the right to access and change their entries at will, the dynamic encyclopedia has the special problem of how to evaluate updates to entries. Our solution is to monitor changes to each entry and to notify both the Editor and the editorial board member responsible for that particular entry. When notified of a change, the Editor immediately verifies that the entry has not been accidentally or maliciously damaged. More importantly, however, we have written a script that will send out email notices to the relevant board member automatically, not only when the entry is first transferred to the encyclopedia, but also when any changes are made thereafter.[11] A problem with this procedure is that Board members will be notified even if there have been only trivial modifications to entries. Though we have configured our script so that changes that the Editor makes to an entry (to fix typographical errors, HTML formatting errors, etc.) are not reported, we are planning to make our script `smarter', so that it reports to the Board member only significant changes to content made by the author.[12]

Given that entries in the dynamic encyclopedia can be modified, the authors can improve their entries not only in response to comments from the relevant Board member, but also in response to comments received from colleagues in the field. The latter may also be aware of relevant research not mentioned in the article. However, this introduces a controversial element, since commentators might not be satisfied by the modifications, if any, that authors make in response to their comments and may therefore write to the Editors to make their case. So the Editors and Board members of a dynamic encyclopedia must be prepared to moderate between authors and such commentators.

[11] We have taken advantage of the UNIX `find' program; it is invoked in a script (`modifications') that runs each night and makes note of which entries have been changed in the past 24 hours. The `find' command is invoked with the following flags:

find entries -ctime -1 -name '*.html' -print

This causes `find' to print a list of all the HTML files in the `entries' directory that were altered in the last day. For each HTML file in the list, the `modifications' script then determines which Board member is responsible for the entry and places a timestamped line in that Board member's log file (the log file is simply a list of entries along with the date they were modified and the author of the entry). On a fixed schedule, another script (`send-notifications') then sends the log file to the Board member in an email message. This notifies the Board member that he or she should evaluate the modified entries.
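The nightly check and the logging step just described can be sketched as follows. This is a minimal illustration in Python rather than the UNIX and perl scripts themselves. It assumes a UNIX system (where `st_ctime` is the status-change time that `-ctime` inspects), assumes that each entry lives in a subdirectory named after the entry, and uses a hypothetical dictionary `responsible` that maps an entry name to a (Board member, author) pair; the log line layout is likewise an assumption.

```python
import os
import time


def recently_changed(root, max_age_seconds=24 * 60 * 60, now=None):
    """Return the .html files under root changed within the time window.

    Roughly mirrors: find root -ctime -1 -name '*.html' -print
    """
    now = time.time() if now is None else now
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".html"):
                continue
            path = os.path.join(dirpath, name)
            if now - os.stat(path).st_ctime < max_age_seconds:
                changed.append(path)
    return sorted(changed)


def log_modifications(paths, responsible, log_dir, stamp):
    """Append one timestamped line per changed entry to a Board member's log."""
    for path in paths:
        # Assumed layout: entries/<entryname>/<entryname>.html
        entry = os.path.basename(os.path.dirname(path))
        info = responsible.get(entry)
        if info is None:
            continue  # entry is not in the (hypothetical) database
        member, author = info
        with open(os.path.join(log_dir, member + ".log"), "a") as log:
            log.write(f"{stamp}\t{entry}\t{author}\n")
```

A separate step, corresponding to `send-notifications', would then mail each log file to its Board member on a fixed schedule; that step is omitted here because it depends on the local mail setup.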

[12] For example, we are considering ways to use the UNIX `diff' command to tell us which lines in the file are different from the most recent backup copy. The problem with `diff' is the output, which is difficult to read. But there may be a way to convert the output into a more readable format.
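One possible way to approach this idea is sketched below, using Python's standard `difflib` module in place of the UNIX `diff' command. It compares the backup copy with the current entry, counts added and removed lines, and produces a unified-style report. The rule for what counts as a `significant' change (a simple line-count threshold) is purely an assumption made for illustration, as are the function and parameter names.

```python
import difflib


def summarize_changes(old_text, new_text, threshold=3):
    """Compare a backup copy of an entry with its current text.

    Returns (added, removed, significant, report), where `significant`
    is True when the number of changed lines reaches the assumed threshold.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()

    # Count changed lines from the matcher's edit operations.
    added = removed = 0
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1
        if tag in ("replace", "insert"):
            added += j2 - j1

    # Human-readable report with one line of context around each change.
    report = "\n".join(
        difflib.unified_diff(
            old_lines, new_lines,
            fromfile="backup", tofile="current",
            lineterm="", n=1,
        )
    )
    significant = added + removed >= threshold
    return added, removed, significant, report
```

A line-count threshold cannot distinguish a reworded argument from a reflowed paragraph, so a report of this kind would at best help decide which changes to bring to a Board member's attention.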

As a final resort, the Editors can always remove entries should the authors fail to respond to valid criticism, from whatever source.

Production

To solve the problems of production, we have created an annotated HTML sourcefile of a sample entry. The authors may use this sourcefile as a model, replacing its content with their own content.[13] We created a list of HTML manuals available on the World Wide Web and linked this list into the Editorial Information page of the Encyclopedia. For those authors with HTML experience, we created an empty template sourcefile defining the basic entry format, which they can download and simply fill in with their content. Recently, however, a wide variety of HTML editors have become available and we have created a special page containing links directly to the download archives containing these editors. So the simplest way for authors with no HTML experience to create an entry would be to download Netscape Navigator Gold from the archive, download our HTML template from the Encyclopedia, load the template into Navigator Gold, and then complete the entry simply by selecting text that they have entered and using menu items provided by Navigator Gold to format the text automatically.

Instructions which explain these options are automatically sent to the authors when we set up their accounts. These instructions also explain to the authors how to ftp their entry to our machine and get it into webspace once they have created the HTML sourcefile for their entry and tested it locally on their own computer. We have organized the author accounts in such a way that files transferred into the author's home directory immediately become a part of the encyclopedia.[14]

[13] The annotations in the sourcefile consist of both instructions and comments. The instructions tell the authors how to eliminate the dummy content and replace it (by cutting and pasting) with the genuine content of their entries. The comments serve to indicate what the special HTML formatting commands are doing.

[14] We have things arranged so that the author of the entry `entryname.html' will ftp that entry not just to his or her home directory, but to the special subdirectory of his or her home directory entitled `entryname'. This latter directory is created by our new-author script (see below) as a subdirectory of the entries directory and then linked into the author's home directory. Thus, any files the author ftp's into this special subdirectory are available to the httpd server.
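The directory arrangement just described can be illustrated with a short Python sketch. It assumes that `linked' means a symbolic link on a UNIX filesystem, which the text does not state explicitly; the function name and its parameters are hypothetical, and the creation of the account and the setting of group permissions are left out.

```python
import os


def create_entry_directory(entries_dir, home_dir, entry_name):
    """Create entries/<entry_name> and link it into the author's home directory."""
    entry_dir = os.path.join(entries_dir, entry_name)
    os.makedirs(entry_dir, exist_ok=True)

    # Assumption: the link is a symbolic link named after the entry.
    link = os.path.join(home_dir, entry_name)
    if not os.path.islink(link):
        os.symlink(os.path.abspath(entry_dir), link)
    return entry_dir, link
```

With this layout, a file that the author transfers to `entryname/entryname.html' under the home directory is physically stored under the entries directory, where the web server can find it.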

Automation

We have automated many of the routine editorial tasks so that the encyclopedia can be administered without a large staff. We have written UNIX and perl scripts to do the following: create accounts for the authors (from keyboard input by the Editors), send the authors email about their account and the ftp commands they might need, take notice of newly submitted entries, monitor changes to the content of entries, manage the cross-referencing between encyclopedia entries by linking keywords of new entries to other entries, modify the email aliases such as `authors' (which contains a list of the email addresses of all the authors), and notify the board members that entries for which they are responsible have been changed. Here is a more detailed description of some of the scripts that have been written:

new-author script: This script will perform the system tasks necessary to add a new author to the encyclopedia. The script automatically sets up an account and home directory for the author with the proper access privileges (i.e., `write' privileges for the author and the editors only), updates the encyclopedia databases (containing information about authors and their entries), and mails customized information to the author about how to prepare his or her entry, access his or her account, and transfer the new entry to the encyclopedia's machine.

asterisks script: When an entry is assigned but not yet written, the name of the entry in the table of contents is marked with an asterisk. The `asterisks' script notices when an author has ftp'd a new entry to the encyclopedia and then removes the asterisk from the table of contents.
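The behaviour of the asterisks script can be sketched in Python as follows. The real table of contents is an HTML page whose exact markup is not described here, so this sketch works on a simplified, hypothetical format in which each line holds one entry name, followed by an asterisk if the entry is assigned but not yet written. The file location `entries/<name>/<name>.html` follows the directory arrangement described for author accounts.

```python
import os


def remove_asterisks(toc_lines, entries_dir):
    """Drop the asterisk from lines whose entry file has been submitted.

    Assumed (hypothetical) line format: "<entryname>*" while the entry is
    unwritten, "<entryname>" once its HTML file exists.
    """
    updated = []
    for line in toc_lines:
        name = line.rstrip()
        if name.endswith("*"):
            entry = name[:-1]
            path = os.path.join(entries_dir, entry, entry + ".html")
            if os.path.isfile(path):
                name = entry  # the entry has arrived: remove the marker
        updated.append(name)
    return updated
```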

modifications script: This script sends email on a regular schedule to the Editorial Board members indicating which entries have been modified on which date. It determines which Board member is in charge of the entry and updates that Board member's log file with the filename, author, and date the file was modified.

encyclopedia script: This script is a database manager. It extracts and modifies information in the encyclopedia's databases. Among the tasks it performs are: (a) provide information about an author, (b) provide information about a board member, (c) provide information about an entry, (d) list authors by last name, (e) list keywords to be used for cross-referencing completed entries, (f) add a keyword to the database, (g) remove a keyword from the database, (h) list the entry associated with a keyword, and (i) list all keywords for a given entry.

keyword script: This script verifies and, if necessary, updates the keyword cross-referencing links between entries. When a new entry is submitted, the script verifies that keywords for which authors have included links are linked to the correct entries. Moreover, any keyword references to the new entry in previously existing entries are automatically linked to the new entry by the script. The script also notifies the Editor if the author has included keywords for which there are no entries in the table of contents. The Editor can then decide either to add the entry to the encyclopedia (or associate the keyword with an existing entry) or to remove the keyword.
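The verification step of the keyword script might look roughly like the following Python sketch. It assumes that keyword links are ordinary HTML anchors with relative targets and that the keyword database can be represented as a dictionary from lower-case keyword to the expected link target; both are assumptions, and the names `check_keyword_links` and `keyword_to_entry` are hypothetical. Automatic relinking of older entries is not shown.

```python
import re

# Matches simple anchors of the form <a href="target">text</a>.
LINK = re.compile(r'<a\s+href="([^"]*)"[^>]*>(.*?)</a>',
                  re.IGNORECASE | re.DOTALL)


def check_keyword_links(html, keyword_to_entry):
    """Classify the relative links in an entry against the keyword database.

    Returns (wrong, unknown): links whose target differs from the database,
    and linked keywords for which the database has no entry.
    """
    wrong, unknown = [], []
    for href, text in LINK.findall(html):
        if "://" in href:
            continue  # assumed: absolute URLs are external, not keyword links
        keyword = " ".join(text.split()).lower()
        expected = keyword_to_entry.get(keyword)
        if expected is None:
            unknown.append(keyword)
        elif href != expected:
            wrong.append((keyword, href, expected))
    return wrong, unknown
```

The `unknown` list corresponds to the case in which the Editor is notified and must decide whether to add an entry, associate the keyword with an existing entry, or remove the keyword.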

It should be mentioned that the selection of keywords is, in the first instance, carried out by the members of the Board of Editors at the stage when they identify topics for inclusion in the Encyclopedia. Since each board member will be chosen for his or her expertise in a philosophy subspecialty, the selection of topics and their corresponding keywords will be driven initially by the perspective that the board members have on their fields. However, the authors will also determine and list the concepts that are essential to understanding the entry they have contributed. When there are discrepancies between the concepts listed by the author and the topics identified by the board member, it will be the job of the Editor to work with these individuals and find the best way to organize the Encyclopedia. These judgements cannot always be made a priori and the keyword script identifies when such judgements have to be made.

Copyright Protection

Authors are instructed to read the encyclopedia's copyright notice before transferring their entry to the encyclopedia. The transfer of their entry constitutes an implicit acceptance of the copyright terms stated. The notice has three parts:[15]

Copyright Notice. All authors and contributors to the Encyclopedia retain copyright over their work. All rights not expressly granted to the Encyclopedia are retained by the authors. Copyright of the Encyclopedia itself is held by the University. All rights are reserved. No part of the Encyclopedia may be reprinted, reproduced, stored, or utilized in any form, by any electronic, mechanical, or other means, now known or hereafter invented, including printing, photocopying, saving, broadcasting or recording, or in any information storage or retrieval system, other than for purposes of fair use, without written permission from the Editors.

[15] We would like to thank Andrew Irvine, a Stanford Encyclopedia Board member, for his assistance in the formulation of the three parts to this Statement of Copyright.

This part gives authors copyright over their entries. Note that to view an entry, the web browser accessing it makes a complete copy of the entry somewhere in the user's machine. We are assuming that such copying of entries qualifies as fair use, and is not ruled out by this portion of the copyright notice.

Licensing Agreement. By contributing to the Encyclopedia authors grant to the Encyclopedia a perpetual, non-exclusive, worldwide right to copy, distribute, transmit and publish their contribution, as well as any and all derivative works prepared or modified by the Editors from the original contribution, in whole or in part, by any variety of methods on all types of publication and broadcast media, now known or hereafter invented. Authors also grant to the Encyclopedia a perpetual, non-exclusive, worldwide right to translate their contribution, as well as any modified or derivative works, into any and all languages for the same purposes of copying, distributing, transmitting and publishing their work.

This part gives the Editors a license to use and modify submitted entries. The license gives the Editors the right to publish the entry on the Internet, using whatever technology is currently available. It also gives the Editors the right to publish portions of an entry. For example, if someone searches the encyclopedia, a search engine will return only those portions of an entry relevant to the search keyword(s). The Editors may also wish to include a portion of an entry in an advertisement for the encyclopedia or in a description of the encyclopedia. Finally, it gives the Editors the right to modify entries, for example, to add links in the sourcefile to other entries or change the way entries are formatted.

Statement of Liability. By contributing to the Encyclopedia authors grant to the Encyclopedia immunity from all liability arising from their work. All authors are responsible for securing permission to use any copyrighted material, including graphics, quotations, and photographs, within their articles. The University and the Editors of the Encyclopedia therefore disclaim any and all responsibility for copyright violations and any other form of liability arising from the content of the Encyclopedia or from any material linked to the Encyclopedia.

Because authors have access to their entries, they could include copyrighted material in an entry without the Editor's knowledge. Moreover, there is an interval between the time when an entry is modified and the time when it is checked. This clause protects the encyclopedia and its Editors from any problems with entries arising from these situations.

Maintenance

Dynamic encyclopedias require infrequent but regular maintenance by the authors and Board members, and require only moderate maintenance by the Editor. Once the Board and authors have been selected and the entries have been written, maintenance of the encyclopedia will primarily involve revisions by authors and examinations of the revisions by the board members. The Editor will only need to handle activities that are not automated, such as communicating with authors and the board concerning any problems that arise, troubleshooting the operation of the encyclopedia, and commissioning new entries as new concepts become important.

We suggest that authors update their entries at least once every year. When an author no longer wishes to maintain his or her entry, the Editors and author have several options. One is to leave it in the encyclopedia, indicating that no further revisions will be made. It may come to be of historical interest. The Editor will then have to commission another author to write a second entry on the same topic. A second option is to transfer maintenance of the original entry to someone else, with the details to be worked out between the original author and the new author.

Security

For the most part, the security problems of a dynamic encyclopedia are the usual security problems of system administration. We have given our authors an `ftp account' on our machine rather than setting up an anonymous ftp server.[16] So only authors and the Editor can submit or modify entries. Moreover, an author can only modify entries in his or her own home directory.

The only way to protect against malicious and unauthorized access to the machine is to back it up on a regular basis. This also protects the encyclopedia against machine failures. We back up our encyclopedia onto tape and onto an external hard drive.[17] This external hard drive has been configured as a boot disk and contains all the system software necessary to run the Encyclopedia. In case the machine that runs the Encyclopedia experiences catastrophic failure, we can install the external hard drive into one of our backup UNIX workstations and reboot, a process that takes fifteen minutes.

Citation and Digital Preservation

We propose that citations to our Encyclopedia conform to the Modern Language Association style for the citation of electronic sources. The `MLA-style' format for citation is:18

Author's Lastname, Author's Firstname. "Title of Document." Title of Complete Work (if applicable). Version or File Number, if applicable. Document date or date of last revision (if different from access date). Protocol and address, access path or directories (date of access).

16To be precise, we gave each author a login account with a home directory but made it impossible for the author to actually telnet, log on, and run processes on our machine. We did this by assigning a nonexistent UNIX shell `/bin/nosh' as their login shell. When an author ftp's to the machine, the ftp daemon checks to make sure that he or she has been assigned a login shell, but it doesn't require that the shell be a serviceable one. Thus, authors have ftp privileges to and from their home directories, but no login privileges, thereby reducing the load on our server and increasing security. Furthermore, each author's name not only serves to identify his or her home directory but also serves to identify a UNIX `group' (of users), of which only the author and the Editor are members. The author's home directory is assigned to this group, thus allowing only the author and the Editor write privileges to the author's home directory. Even if a password is stolen, at most one entry could be damaged.
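The account arrangement described in this footnote can be illustrated with a minimal Python sketch. It only formats the two lines that would describe one author in passwd-style and group-style files and models the resulting write rule; it does not touch any system files and is not the procedure we actually used. The login names `irvine' and `editor', the numeric IDs, the home directory path, the comment field, and the `x' password placeholder are all hypothetical; only the shell `/bin/nosh' comes from the text.

```python
from dataclasses import dataclass

NOSH = "/bin/nosh"  # nonexistent shell named in the text; blocks interactive logins


@dataclass
class AuthorAccount:
    login: str  # hypothetical author login name, also used as the group name
    uid: int    # hypothetical numeric user ID
    gid: int    # hypothetical numeric ID of the author's own group
    home: str   # hypothetical home directory path


def passwd_line(acct: AuthorAccount) -> str:
    """Format a passwd-style entry: name:password:UID:GID:comment:home:shell."""
    return (f"{acct.login}:x:{acct.uid}:{acct.gid}:"
            f"Encyclopedia author:{acct.home}:{NOSH}")


def group_line(acct: AuthorAccount, editor_login: str) -> str:
    """Format a group-style entry whose only members are the author and the Editor."""
    return f"{acct.login}:x:{acct.gid}:{acct.login},{editor_login}"


def can_write(user: str, acct: AuthorAccount, editor_login: str) -> bool:
    """Model the rule in the text: only the author and the Editor may write."""
    return user in (acct.login, editor_login)
```

Because each group contains exactly one author plus the Editor, a stolen password exposes only the directory owned by that one group, which is the point made at the end of the footnote.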

17The tape backup is on an incremental dump schedule, with a full dump occurring every two weeks. The daily backup onto the external drive makes a new copy of the users' home directories, the HTML source files of the encyclopedia entries, and the various programs and support data needed to run a web server.
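The tape schedule in this footnote can be sketched as a small Python function that decides, for a given day, whether the nightly tape run is a full or an incremental dump. This is an illustration under stated assumptions, not our actual backup script: the function name and the `cycle_start' parameter (the date of some earlier full dump) are hypothetical, and only the two-week interval comes from the text.

```python
from datetime import date

FULL_DUMP_INTERVAL_DAYS = 14  # "every two weeks", from the text


def tape_backup_kind(day: date, cycle_start: date) -> str:
    """Return 'full' on the first day of each two-week cycle, else 'incremental'.

    cycle_start is a hypothetical anchor: the date of any past full dump.
    """
    elapsed = (day - cycle_start).days
    if elapsed < 0:
        raise ValueError("day precedes the start of the backup cycle")
    return "full" if elapsed % FULL_DUMP_INTERVAL_DAYS == 0 else "incremental"
```

For example, with a cycle anchored on January 6, 1997, the run on January 7 would be incremental and the run on January 20 would again be full.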

18See Walker, Janice. "MLA-Style Citations of Electronic Sources." Version 1.1. January, 1995 (Rev. 8/96). (May 12, 1997).


So, for example, a citation to our entry on Bertrand Russell would look like this:

Irvine, Andrew. "Bertrand Russell." Stanford Encyclopedia of Philosophy. January 28, 1997. (October 12, 1997)

So that cited material does not disappear when entries are revised, we have decided to fix a quarterly edition of the Encyclopedia and store those editions online on a special `Archive Page' of the Encyclopedia. By checking and citing the most recent quarterly edition, one can be sure that the material being cited won't disappear. Thus, the citation to the entry on Bertrand Russell becomes:

Irvine, Andrew. "Bertrand Russell." Stanford Encyclopedia of Philosophy. Fall 1997 Edition. (October 12, 1997)
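The citation pattern can be made mechanical. The following Python sketch assembles a citation from its parts in the order given by the format quoted earlier in this section. The function name and parameter names are hypothetical, and the optional `address' parameter stands for the protocol and address, which the caller must supply; the sketch does not attempt to construct it.

```python
from typing import Optional


def format_citation(last: str, first: str, title: str, work: str,
                    version_or_date: str, access_date: str,
                    address: Optional[str] = None) -> str:
    """Assemble a citation: author, quoted title, work, version or date,
    optional protocol and address, and the access date in parentheses."""
    parts = [
        f"{last}, {first}.",
        f'"{title}."',
        f"{work}.",
        f"{version_or_date}.",
    ]
    if address:
        parts.append(address)  # supplied by the caller; never guessed here
    parts.append(f"({access_date})")
    return " ".join(parts)


citation = format_citation(
    "Irvine", "Andrew", "Bertrand Russell",
    "Stanford Encyclopedia of Philosophy",
    "Fall 1997 Edition", "October 12, 1997",
)
```

Passing the edition name rather than the revision date in the fourth field is what ties the citation to a fixed quarterly edition.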

We are currently exploring whether there are any other alternatives to fixing a quarterly edition.19

Long-term preservation of digital information is a somewhat more global problem than secure backup. From the previous section, it should be clear that on any given day, there exist three copies of the Encyclopedia (one on the principal computer, one on the external hard drive, and one recoverable from the backup tapes).20 We maintain an archive of the backup tapes of the Encyclopedia in a separate building. We also have several similar UNIX workstations in the lab housing the main Encyclopedia workstation, and each of these computers could serve as a backup machine. As long as we maintain the present edition and past quarterly editions on three separate hardware devices (transferring the data to new technology as it becomes available) and follow the security measures outlined above (employing whatever new backup systems become available), we will have adequately safeguarded the material that appears in our Encyclopedia for scholarly research far into the future.

19The idea of fixing a quarterly edition has the added virtue of providing quarterly deadlines for the authors. This might help the Editors set specific goals for the authors and timetables for completing certain sections of the Encyclopedia.

20Actually, there are four copies, for a second copy of each entry is kept in the Editor's home directory on the principal computer. Whenever the Editor makes any modifications to an entry, a copy is immediately placed in this directory. By contrast, the backups on the external drive and tape drive are made once a day, in the early morning hours.

Conclusion

A dynamic encyclopedia following the above plan, therefore, needs the following administrative staff: an Editor, a computer consultant, and an Editorial Board. The Editor will coordinate the activities of the encyclopedia and maintain the encyclopedia's host machine. The latter may involve some general UNIX system administration, such as updating the httpd installation and search engines, preparing a sample entry that demonstrates entry style, and maintaining the authors' accounts. A computer consultant will write the scripts described above, oversee the technical development of the project, and apprise the Editor of new developments taking place on the Internet.21 Though an advisory board is not necessary, we have one to help us choose the members of our Editorial Board. The Editorial Board will be responsible for soliciting qualified authors to write entries on appropriate topics, and also for evaluating the entries contributed by the authors they solicit.

With a larger budget and support staff, a complete `work-flow' analysis could be developed, noting and recording the various kinds of transactions between the Editor and authors and between the Editor and Board members. The Encyclopedia database should keep track of more information about the state of an entry than ours does.22 At some point, we plan to develop a program which automatically sends out notices when it is time for the author of a particular entry to update the entry or its bibliography. No doubt there are other ways to automate administrative tasks, and when time and money permit, we plan to implement them.
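The notice program mentioned above has not been written; the following Python sketch only suggests what its core check might look like. It assumes a hypothetical mapping from entry titles to the date of last revision and uses the one-year update interval suggested earlier in the paper. The function names and the wording of the notice are illustrative assumptions, and no mail is actually sent.

```python
from datetime import date, timedelta
from typing import Dict, List

UPDATE_INTERVAL = timedelta(days=365)  # "at least once every year", from the text


def entries_due(last_updated: Dict[str, date], today: date) -> List[str]:
    """Return titles whose last revision is at least one interval old, oldest first."""
    due = [title for title, revised in last_updated.items()
           if today - revised >= UPDATE_INTERVAL]
    return sorted(due, key=lambda title: last_updated[title])


def notice_text(title: str, revised: date) -> str:
    """Compose a hypothetical reminder; delivery is left to the caller."""
    return (f'Your entry "{title}" was last revised on {revised.isoformat()}; '
            "it is time to update the entry or its bibliography.")
```

Run once a day alongside the backups, a check of this kind would let the Editor see at a glance which authors are due for a reminder.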

Although we have designed our dynamic encyclopedia principally with an eye toward solving the update problem, such an encyclopedia has other advantages. One is that there are no constraints on the length or number

21If the Editor has no interest or skills in UNIX system administration, the computer consultant could be assigned these tasks as well.

22For example, we don't currently record when an entry is first put online, whether the last update was a substantive update to the content or an editorial update to fix poorly written HTML code, the amount of time elapsed since the entry was commissioned, how frequently the entry has been updated, when the Board member responsible for the entry last commented on it, etc. Given our limited budget, we have relied on our email record and calendar to keep track of many of these transactions.
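The items listed in this footnote suggest the shape of a richer per-entry record. The Python sketch below is one hypothetical layout, not a description of our database: every field and method name is an assumption chosen to mirror the list above, and a tie between update dates is resolved in favor of `substantive' purely for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class EntryRecord:
    title: str
    commissioned: date                       # when the entry was commissioned
    first_online: Optional[date] = None      # when the entry was first put online
    substantive_updates: List[date] = field(default_factory=list)
    editorial_updates: List[date] = field(default_factory=list)  # e.g. HTML fixes
    last_board_comment: Optional[date] = None

    def days_since_commissioned(self, today: date) -> int:
        """Time elapsed since the entry was commissioned, in days."""
        return (today - self.commissioned).days

    def update_count(self) -> int:
        """How many times the entry has been updated, of either kind."""
        return len(self.substantive_updates) + len(self.editorial_updates)

    def last_update_kind(self) -> Optional[str]:
        """Return 'substantive' or 'editorial' for the latest update, or None."""
        latest_sub = max(self.substantive_updates, default=None)
        latest_ed = max(self.editorial_updates, default=None)
        if latest_sub is None and latest_ed is None:
            return None
        if latest_ed is None or (latest_sub is not None and latest_sub >= latest_ed):
            return "substantive"
        return "editorial"
```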
