A Guide to Using Digital Object Identifiers

For Creators, Publishers, and Information Providers

Table of Contents

1. Introduction
2. Terms and Definitions
3. Implementing DOIs
   3.1 Planning Phase
       3.1.1 Securing corporate commitment
       3.1.2 Allocating personnel
       3.1.3 Identifying equipment needs
       3.1.4 Planning the implementation
       3.1.5 Requesting a prefix
       3.1.6 Identifying digital objects to tag
       3.1.7 Determining a numbering scheme
   3.2 Implementation Phase
       3.2.1 Designing response pages
       3.2.2 Depositing DOIs
       3.2.3 Ongoing DOI maintenance
4. The DOI System
   4.1 The Digital Object Identifier
       4.1.1 Prefix
       4.1.2 Suffix
       4.1.3 Check Digits
       4.1.4 Syntax
       4.1.5 Granularity
   4.2 The DOI Directory
       4.2.1 The Resolution Process
       4.2.2 The Deposit Process
   4.3 Associating a DOI with an Object
       4.3.1 Embedding the DOI in an object
       4.3.2 Using the DOI in a secure envelope
       4.3.3 Using the DOI as a descriptive field
       4.3.4 Using the DOIs in watermarks
       4.3.5 Using the DOI in other contexts
Appendix A - One Publisher's Approach to Assigning DOIs Using the SICI
Appendix B - Prototype Participants

1. Introduction

The Internet represents a totally new environment for delivering and accessing information. As such, it requires new enabling technologies to protect both customer and publisher. Systems will have to be developed to authenticate content to ensure that what the customer is requesting is what is being delivered. At the same time, the creator of the information must be sure that the copyright in the content is respected and protected.

In considering the required new systems, international book and journal publishers realized that a first step would be the development of a new identification system to be used for all digital content. This Digital Object Identifier (DOI) system not only provides unique identification for that content, but also a way to link users of the materials to the rights holders themselves, to facilitate automated digital commerce in the new digital environment.

Just as traditional printed texts such as books and journals provide a title page or a cover for specific identifying information, digital content needs its own form of unique identification. This is important for both internal management of content within a publishing house and for dissemination on electronic networks. Also, in the fast-changing world of electronic publishing, there is the added problem that ownership of information changes, and the location of an electronic file may change frequently over the life of a work. The DOI system has the technology necessary to permit an identifier to remain constant when the actual rights holder changes.

The Internet environment creates an expectation among users that resources can be linked, and that these links should be stable. The DOI system is seen as a way to link the reader or user of content to related materials such as classroom exercises, supporting data, videos, or sound clips. In the future, the unique and persistent identifier is envisioned as an enabler for the electronic processing of routine transactions over the Internet, such as document retrieval, clearinghouse payments, and licensing. As noted above, the DOI system can be used in both internal digital content management within a publishing house, and in the evolving world of Internet commerce.

After development and testing during the past year, the DOI System has entered Phase II of the Prototype. Based on the experiences of about a dozen publishers who participated in the first phase of the Prototype, this booklet has been developed to describe the steps required for successful implementation of the DOI system during this phase of the Prototype. It also describes the DOI system, its components, and its features.

2. Terms and Definitions

• A Digital Object Identifier (DOI) is a character string that uniquely identifies an object. DOIs are stored in a DOI Directory along with an associated URL. The DOI Directory is managed currently by the Corporation for National Research Initiatives, referred to in this guidebook as the Directory Manager. In subsequent phases of the Prototype, more Directory Managers and Directories may be added.

• Although publishers will form the bulk of new DOI system participants (and this guidebook uses the term Publishers in many explanations and examples), a variety of businesses closely related to the publishing industry will also be requesting prefixes and using the DOI system. Therefore, the terms Organization and Information Provider are used in this document as well. The term Registrant is used in specific examples to refer to an entity which has secured a prefix and is actively depositing DOIs into the system.

• The Directory Manager Liaison is designated by a registrant to be the point of contact for all DOI-related activities. The liaison will most likely be the member of the publisher's technical staff who will be directly involved in the organization's day-to-day DOI implementation. An organization may also wish to designate a Technical Team Leader for technical oversight and related tasks such as long range planning.

• It will be the liaison's responsibility to request a Prefix for the organization from the Directory Manager. The prefix forms the basis for the DOI, and is covered in depth in this guidebook. DOIs are described as being deposited in the DOI system at the time the liaison creates a record in the system associating the DOI number with an object. When used in conjunction with the World Wide Web, DOIs will be associated with Response Pages, a term used here to describe the Web page returned to a customer when that customer clicks on a DOI.

3. Implementing DOIs

3.1 Planning Phase

After an organization has decided to use the DOI system, there are some steps to follow that will ease the way. Although listed below in sequence, many of these steps can be done in parallel.

3.1.1 Securing corporate commitment. As with the launch of any new product, publishers have found it important to explain the DOI system throughout the organization and build a commitment to its use by sharing information about the many benefits. Senior management should be aware of the decision to use DOIs, and key personnel in all departments should also be involved because the DOI system provides benefits to production, marketing, sales, editorial, royalty tracking, fulfillment, inventory and the like. Letting everyone know about the system will help gain their commitment and their suggestions for use, which will ensure maximum benefits early on. A team approach works well.

3.1.2 Allocating personnel. To use the DOI system, an organization must designate a point of contact for the Directory Manager. Many people in an organization can be given access to the system and be authorized to assign DOIs and make changes, but a single Directory Manager Liaison will ensure that the system works smoothly and that the Directory Manager is able to contact an organization when there is a problem. This one individual will register an organization, obtain the prefix and password (see below), and be the focal point for communication with the Directory Manager. The contact will also authorize other users within the organization to deposit DOIs and make changes.

In addition to the Directory Manager Liaison, there should be a Technical Team Leader responsible for project planning and development. Typically, this person will conduct analyses and consult within the organization for systems integration and long term planning. There is also a need for technical oversight which involves coordination with local IT staff to maintain the publisher's server and response pages, to develop a gateway to legacy data, and to perform configuration management functions.

3.1.3 Identifying equipment needs. Each publisher needs a Web server to respond to DOI requests. There is no DOI-specific hardware or software required. The registration and updating of DOIs in the directory can be done using normal Web browsers, and the DOIs can point back to standard Web documents or scripts. Many organizations use existing Web servers, while others designate a special computer for this purpose. Specific configurations depend entirely on the number and size of documents and other digital objects to be served, and the desired performance. This analysis is no different from that which would be done for Web servers in general. And of course, the hardware can be changed at any time without affecting the DOIs, which is one of the arguments for using them.

3.1.4 Planning the implementation. Using DOIs requires a different type of planning than using print products does. If an organization uses the DOI system for internal digital object management, it must consider how to integrate DOIs into its current and planned information management systems. If an organization is using DOIs as enabling tools for Internet commerce or marketing, it must consider the vehicle it will use for communicating about the product to the public. This could be an existing home page, or it could be a separate product-offering screen. An organization might consider how to publicize the Web site where access to all of the DOIs begins. And it may consider various other options such as including DOIs in abstracting and indexing services. These "product offerings" or initial screens must be a part of the implementation plan. DOIs can be used instead of URLs (or other identification schemes) in current on-line catalogs, tables of contents, or internal digital libraries. Note that all of this can be changed over time; one of the main advantages of the DOI is the ability to change the location and organization of online material without changing the public references to it.

3.1.5 Requesting a prefix. The publisher's prefix forms the basis for the DOI number; one should be requested as soon as the commitment to use DOIs is made. Registering to get a prefix is easy. The liaison completes the online registration process available at , which includes the option for electronic payment, the quickest way to get up and running. The process requires that registrants provide the Directory Manager with some basic corporate information, and pay an initial registration fee ($1000 US during Phase II of the Prototype). The Directory Manager will assign a prefix, validate the information provided by the Liaison, and provide the Liaison with a DOI system login and password. [A full description of the prefix registration and payment process can be found at .]

3.1.6 Identifying digital objects to tag. The variety of digital objects to which publishers may assign DOIs is virtually unlimited; however, members of other communities have indicated that DOIs would be more useful to them if some guidelines were followed for DOI assignment. With this goal in mind, representatives from the publishing, library, and academic communities will meet during Phase II of the Prototype to discuss guidelines.

Digital objects can be existing electronic products, newly planned digital offerings, or even digital product announcements, online services, or internal digital archives. DOIs can be assigned to an entire series of objects (all the digital articles, abstracts, photographs in a collection) or a few items which may be used to test the market (a new release, an online book, a special video or multimedia work). When assigning DOIs, abide by the copyright law.

It is possible to have more than one DOI point to a single object. In the case of a virtual object, for example, there might be a DOI pointing to it as the "latest in a series," and a second DOI which would stay with the object, even after it has ceased to be "the latest." We recommend that publishers be cautious about assigning more than one DOI to an object, however, and do so only in special cases.
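The "latest in a series" case can be sketched in a few lines. This is only an illustration: the dictionary stands in for the DOI Directory, and the `deposit` and `update` helpers, the DOIs, and the URLs are all invented for the example rather than taken from the DOI system itself.

```python
# Hypothetical in-memory stand-in for a DOI directory: DOI -> current URL.
# All DOIs and URLs below are invented for illustration.
directory = {}


def deposit(doi, url):
    """Record a new DOI; refuse to reuse one that already exists."""
    if doi in directory:
        raise ValueError(f"DOI already deposited: {doi}")
    directory[doi] = url


def update(doi, url):
    """Re-point an existing DOI at a new URL."""
    if doi not in directory:
        raise KeyError(f"unknown DOI: {doi}")
    directory[doi] = url


# The March issue gets a permanent DOI, and a second DOI marks it as the latest.
deposit("10.1234/newsletter-1997-03", "http://www.example.com/news/1997-03.html")
deposit("10.1234/newsletter-latest", "http://www.example.com/news/1997-03.html")

# When the April issue appears it gets its own DOI, and "latest" moves to it.
deposit("10.1234/newsletter-1997-04", "http://www.example.com/news/1997-04.html")
update("10.1234/newsletter-latest", "http://www.example.com/news/1997-04.html")
```

The permanent DOI for the March issue never changes; only the "latest" DOI is re-pointed.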

To gain experience, it may be wise to begin with a subset of the digital content an organization plans to "tag", perhaps 15 to 25 items. Assigning a single DOI, and seeing how easy it is, can quickly lead to widespread adoption of the system.

3.1.7 Determining a numbering scheme. The DOI is a set of numbers, letters and other characters, constructed of two components. The first element is the prefix, assigned by the Directory Manager. The second element is the suffix, which is assigned by the registrant to the specific content being identified. The DOI is intended to be a permanent identifier for a specific object, and its construction should be given careful consideration. The registrant must decide what numbering scheme will be used, and if a number with some type of structure or meaning is most appropriate for the digital objects being identified. For example, a registrant may choose to use an existing internal numbering scheme, or a descriptor such as the ISO standard SICI code, or another image or product number. The information provider is responsible for maintaining uniqueness within the suffix numbering system, although the directory will automatically check for uniqueness when the new DOI is deposited. [See Section 4.1 for a discussion of the components of the DOI number.]
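A registrant that wants to guard against duplicate suffixes before deposit might keep a small local check. The sketch below assumes a hypothetical `SuffixRegistry` helper and an example prefix; it is not part of the DOI system, which performs its own uniqueness check at deposit time.

```python
def make_doi(prefix, suffix):
    """Join a Directory-assigned prefix and a registrant-chosen suffix."""
    if not prefix or not suffix:
        raise ValueError("prefix and suffix must both be non-empty")
    return f"{prefix}/{suffix}"


class SuffixRegistry:
    """Tracks suffixes already issued under one prefix (hypothetical helper)."""

    def __init__(self, prefix):
        self.prefix = prefix
        self._issued = set()

    def assign(self, suffix):
        """Reserve a suffix and return the full DOI, rejecting duplicates."""
        if suffix in self._issued:
            raise ValueError(f"suffix already in use: {suffix}")
        self._issued.add(suffix)
        return make_doi(self.prefix, suffix)


registry = SuffixRegistry("10.1234")  # example prefix
first = registry.assign("5678")       # an existing internal item number
```

Any suffix scheme can be passed to `assign`, whether an internal number or a standard identifier, because the check only compares strings.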

3.2 Implementation Phase

3.2.1 Response Pages. Current implementations of the DOI System involve the World Wide Web. Response pages are the Web pages that are returned to the user when the user clicks on a DOI. Typically, the structure of a Web site that includes DOIs is the same as a site that does not include DOIs. However, in a Web implementation of DOIs, the URL associated with a DOI is added to the DOI Directory at the time the DOI is deposited. As illustrated in Figure 1, this requires that the material to be retrieved by a DOI (web page, sound bite, photograph, etc.) be positioned in a Web site file structure prior to depositing the DOI in the Directory. Both Web site design and content need to be addressed early in the DOI process.

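[Figure 1: diagram of a response page. The DOI, sent through the proxy address, resolves to a table of contents (journal/apr97_toc.html); relative links from that page lead to an article (/articles/article.pdf), a sound file (figs/sound.avi), and a movie clip (figs/film.mov).]

Figure 1 - Response Page Uses DOI to Link to Table of Contents and Relative Links For Additional Links to Articles, Sound Files and Movie Clips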

Using DOIs with Web pages requires decisions as to how the DOIs will appear on the pages. Current Web design allows for a variety of ways. A word or phrase or title can be designated to appear in a specific color to signal that it is a DOI that will result in an action when "clicked." In many of the Prototype applications of the DOI, copyright holders chose to signal the use of DOIs by providing an icon or a distinctive button that would tell the user that the link was in fact a DOI and could be trusted. But for design purposes this is not necessary. Conventional hypertext links will work as well.

Although to the user the DOI will appear as a hyperlink, a button, or an icon, in the source file the DOI will appear as a URL in an anchor tag. In the example below,

the URL "" points to a proxy server that takes DOIs in URL format, as shown above, and sends them on to the DOI system for resolution. This is one of several ways to resolve DOIs, but for now, it is the only way that will work with unenhanced browsers.
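As a rough illustration of a DOI written as a URL in an anchor tag, the sketch below builds such a tag from a DOI and a proxy address. The proxy address `http://proxy.example.org/` is a placeholder assumed for the example, not the actual proxy server, and `doi_anchor` is a hypothetical helper.

```python
from html import escape
from urllib.parse import quote

# Hypothetical proxy address; the real one is not reproduced here.
PROXY_BASE = "http://proxy.example.org/"


def doi_anchor(doi, link_text):
    """Return an HTML anchor whose href sends the DOI to the proxy."""
    # Keep "/" literal; percent-encode characters that are unsafe in a URL path.
    href = PROXY_BASE + quote(doi, safe="/")
    return f'<a href="{escape(href, quote=True)}">{escape(link_text)}</a>'


anchor = doi_anchor("10.1234/5678", "Table of Contents, April 1997")
print(anchor)
```

The link text is what the user sees; the DOI itself appears only inside the `href` attribute.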

Response screens may vary depending on the type of objects to be retrieved. When DOIs are used in an internal tracking system, clicking on a DOI might return the actual digital object. A commerce system might offer a response page that identifies the object and tells the user how to obtain further information, including purchase instructions. Such a page might include bibliographic data or an abstract of an article. A response page for digital photographs might provide descriptors of the image or an actual thumbnail version with instructions on how to obtain the high resolution digital object. A movie system might use a brief synopsis of the movie as its response page, with instructions on how to order the full digital rendition for online viewing or downloading for single or multiple plays, or possibly a QuickTime clip of the movie. The DOI system is infinitely flexible. Response pages should, at a minimum, verify that the item found is the item requested.

Web pages featuring DOIs may still need to include URLs. DOIs are ideal for resources whose locations might change or in instances when the reference to the resource moves beyond the control of the copyright holder. Journal articles, for example, are cited in other research. The citations have a life of their own, and their proliferation is beyond the control of the entity that holds copyright in the journal. Hence, DOIs provide assurance that wherever the citations turn up, they will remain accurate and reliable links even years after the content has been relocated multiple times to various archives and digital repositories, assuming of course that the content owners keep the appropriate directory entries current. For content that will remain constant or for administrative links, URLs might be preferred.

3.2.2 Depositing DOIs and Their Associated Data. The Directory Manager maintains full administrative support for DOI registration at the Web site . Simple forms are available for entering a small number of DOIs one at a time, and a batch mode deposit method may be selected for depositing large numbers of DOIs. DOIs entered one at a time can be tested for accuracy during the deposit process. Using the DOI system may be made easier for a publisher if the organization's existing internal management tools and relational databases are modified to automate the process of assembling the DOIs and their associated URLs. In fact, many organizations have systems for inventory control, internal databases for content, and internal tracking systems for works in progress from manuscript receipt to editorial functions, and production through to release. DOIs can be added to the contents of these systems. [See also Section 4.2.2 The Deposit Process.]
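Assembling DOIs and URLs from an internal database might look something like the sketch below. The catalog rows, the site root, and above all the pipe-delimited output are assumptions made for illustration; the actual batch deposit format should be taken from the Directory Manager's documentation.

```python
import csv
import io

# Hypothetical catalog rows as they might come from an internal database.
catalog = [
    {"item_id": "5678", "path": "journal/apr97_toc.html"},
    {"item_id": "5679", "path": "articles/article.pdf"},
]

PREFIX = "10.1234"                     # example prefix
SITE_ROOT = "http://www.example.com/"  # hypothetical Web server


def build_batch(rows):
    """Return one 'DOI|URL' line per row (the delimiter is an assumption)."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="|", lineterminator="\n")
    for row in rows:
        writer.writerow([f"{PREFIX}/{row['item_id']}", SITE_ROOT + row["path"]])
    return out.getvalue()


batch_text = build_batch(catalog)
```

Generating the pairs from the same system that tracks the content keeps the deposited URLs in step with where the files actually live.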

The DOIs are available for access immediately after deposit in the system, so the DOIs are "live" as soon as the corresponding response pages are enabled. Figure 2 shows a simple list of five DOIs and their related Web sites. Figure 3 is an excerpt of a batch load file that deposited over 50,000 DOIs, and shows the format used. Note that the current URL for a given DOI can be altered at any time by the owner of the DOI.

DOIs           URLs

10.1234/5678 |
10.1234/5679 |
10.1234/4356 |
10.1234/4357 |

Figure 2 - DOIs and Related URLs

10.1002/002-8231(199702)48:22.3.TX;2-Q
10.1002/002-8231(199702)48:22.3.TX;2-V
10.1002/002-8231(199702)48:22.3.TX;2-Z
10.1002/002-8231(199702)48:22.3.TX;2-Y

Figure 3 - Batch Load File Format

3.2.3 Ongoing DOI maintenance. Publishers are responsible for the ongoing quality of data associated with each DOI, to ensure that DOIs continue to resolve to valid URLs regardless of hardware, software or personnel changes occurring within an organization. In the future, tools may be developed by technology providers to assist with DOI maintenance; additionally, publishers may wish to develop tools of their own.
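A publisher-built maintenance tool could be as simple as a periodic check that each DOI's URL still answers. The sketch below is one possible shape for such a tool, not an existing one: `find_broken` is a hypothetical helper, and the liveness check is passed in as a function so that the example runs without a network connection.

```python
def find_broken(doi_to_url, url_is_live):
    """Return the DOIs whose current URL fails the supplied liveness check."""
    return sorted(doi for doi, url in doi_to_url.items() if not url_is_live(url))


# Stand-in check for the example; a real tool would request each URL
# from the Web server and inspect the response.
live_urls = {"http://www.example.com/a.html"}

records = {
    "10.1234/5678": "http://www.example.com/a.html",
    "10.1234/5679": "http://www.example.com/moved.html",
}

broken = find_broken(records, lambda url: url in live_urls)
```

Each DOI reported as broken needs its directory entry updated with the object's new URL.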

4. The DOI System

4.1 The Digital Object Identifier

The DOI is a simple set of numbers and letters that has no intrinsic meaning. It is a "dumb number" which will always be unique and never reused. It is composed of a series of parts comprising a prefix and a suffix, as illustrated in Figure 4. Use of the code is optional, and is discussed below.

Prefix      /   Suffix
10.15678    /   [SICI]0002-8231(199601)47:102,3.TX;2-B

Directory: "10"
Registrant Prefix (numeric): 15678
Code (optional): [SICI]
Item ID (any format): 0002-8231(199601)47:102,3.TX;2-B

Figure 4 - DOI Number

4.1.1 Prefix. The first component of the prefix, currently a "10," identifies the Directory which maintains the current record for that DOI. The second component, the "registrant" prefix, is a sequential number assigned to organizations by the Directory Manager. Organizations may choose to register a prefix for each major brand or product line. The registrant prefix is joined to the directory number by a period.

4.1.2 Suffix. The suffix is the last component, separated from the prefix by a slash. Registrants assign the suffix using a numbering system of their own choosing. Several industries have come to rely on internationally recognized standards for identifying products. For example, in the sound recording industry, producers use the International Standard Recording Code (ISRC). The motion picture industry is working on the development of an International Standard Audiovisual Number (ISAN). Authors, editors, and composers use the International Standard Work Code (ISWC). Journal publishers use standards recognized by the International Organization for Standardization (ISO): the International Standard Serial Number (ISSN) for journals; the Serial Item and Contribution Identifier (SICI) for items and contributions in those journals; and the International Standard Book Number (ISBN) for books. Under development is the Book Item and Contribution Identifier (BICI).

If a standard numbering scheme such as those listed above is used, it is recommended that the suffix begin with the bracketed character set which is the recognized code for that standard, known as an International Standard Digital Identifier (ISDI). This character set is identified as the "code" in Figure 4. This will indicate that the object identifier is a standard, and may also be useful for automated systems set up to interpret ISDI tag sets. [See Appendix A for a description of the approach taken by one publisher for assigning DOIs using the SICI and ISSN codes. Note that the work was performed early in the prototype before the use of optional ISDI codes was introduced.]
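An automated system that interprets these parts might split a DOI roughly as follows. The regular expression is an assumption drawn from Figure 4 (numeric directory and registrant prefix, an optional bracketed code, then an item ID in any format); `parse_doi` and `ParsedDOI` are hypothetical names, not part of the DOI system.

```python
import re
from typing import NamedTuple, Optional


class ParsedDOI(NamedTuple):
    directory: str       # currently "10"
    registrant: str      # numeric registrant prefix
    code: Optional[str]  # optional bracketed code such as "SICI"
    item_id: str         # registrant-assigned item ID, any format


# directory "." registrant "/" optional "[CODE]" then the item ID
_DOI_RE = re.compile(r"^(\d+)\.(\d+)/(?:\[([^\]]+)\])?(.+)$")


def parse_doi(doi):
    """Split a DOI into the components labeled in Figure 4."""
    match = _DOI_RE.match(doi)
    if match is None:
        raise ValueError(f"not a DOI in prefix/suffix form: {doi!r}")
    return ParsedDOI(*match.groups())


parsed = parse_doi("10.15678/[SICI]0002-8231(199601)47:102,3.TX;2-B")
plain = parse_doi("10.1234/5678")  # no bracketed code, so plain.code is None
```

Because the code is optional, callers should treat a missing code as "registrant's own scheme" rather than as an error.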

4.1.3 Check Digits. Information providers are free to use check digits in the suffix numbering scheme, although most publishers have concluded that this is unnecessary. The DOI system does not calculate the check digit or otherwise attempt to determine the validity of a DOI, and the general conclusion is that check digits are unnecessary because DOIs are rarely entered by hand, and there is adequate confirmation and feedback whenever an invalid DOI is used.

4.1.4 Syntax. The DOI's underlying technology is designed to accept any unique string. DOIs are currently limited to 128 characters, a limit that is expected to increase significantly during Phase II of the Prototype. Because the DOI is currently being used in the context of the Web, there is a small set of characters that cannot be used inside DOIs without special encoding. They are % (percent), # (hash mark), " (double quote), space, and tab. [See for a full explanation of DOI character encoding.]
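One way to handle the five listed characters is sketched below. It assumes that the "special encoding" is ordinary URL percent-encoding and that the 128-character limit applies to the DOI before encoding; neither point is confirmed here, and `encode_doi` is a hypothetical helper.

```python
# The five characters the text lists as needing special encoding,
# mapped to their URL percent-encoded forms (an assumed encoding).
_ENCODE = {"%": "%25", "#": "%23", '"': "%22", " ": "%20", "\t": "%09"}

MAX_DOI_LENGTH = 128  # current limit stated in the text


def encode_doi(doi):
    """Return the DOI with the five restricted characters percent-encoded."""
    if len(doi) > MAX_DOI_LENGTH:
        raise ValueError(f"DOI exceeds {MAX_DOI_LENGTH} characters")
    # Each character is replaced independently, so a literal "%" is
    # encoded exactly once and never double-encoded.
    return "".join(_ENCODE.get(ch, ch) for ch in doi)


encoded = encode_doi("10.1234/annual report#3")
```

Suffixes that avoid these five characters altogether need no encoding step.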

4.1.5 Granularity. A DOI may be assigned to any object, large or small, and objects with DOIs may include other objects with DOIs. For example, this means that an issue of a journal might have a DOI, and so might each article in the journal. Each article may also have a DOI for the abstract if it is made available separately, as well as DOIs for charts and images, for supporting data and for related elements which might not have been included with the article in the printed journal. We recommend that information providers create DOIs for objects that are separately sold, and more granular DOIs be created when and as the market for those objects emerges.
