

HARMONICA

Harmonised Access to Music and Music Information in Libraries

Libraries Project: PROLIB/HARMONICA 10453

Commission of the European Communities

LIBRARIES PROGRAMME

Remote Access and Transfer of Audio Recordings

Deliverable Number: D3.6.3

Version: 1.0

Date: 30.11.1999

Authors: Werner A. Deutsch, Siegbert Herla, Werner Kriechbaum

Confidentiality: Public

Status: Final


CONTENTS

1. Introduction

1.1 Purpose and Scope

1.2 Applicability

1.3 Acronyms and Abbreviations

2. Analogue Audio

2.1 Analogue Tape-Formats

2.2 Digital Audio Formats

2.3 DVD - A Breakthrough in Digital Audio?

3. Digitisation

3.1 General

3.2 Digital Audio Workstations (DAW)

3.2.1 Analogue to Digital Conversion – Dynamic Range

3.2.2 Sample Rates and the 24 – 20 – 16 bit Story

3.2.3 Linear PCM Audio 16-bit is not obsolete!

3.3 New Digital Audio Formats

3.3.1 Archive File Formats

3.3.2 Broadcast WAVE Format (BWF)

3.3.3 ‘Unique’ Source Identifier (USID)

3.4 BWF - A Music Library Audio Format?

3.5 De-Facto/Industry Digital Audio Compression Standards

3.6 Digitisation Procedure Quality Control

3.7 Signal Enhancement and Signal Restoration

4. Audio Segmentation and Content Description

4.1 Query by Audio Content (QBAC)

4.1.1 Sound File Segmentation

4.1.2 Links Referring to Segments

4.2 Content Description

4.3 Visualisation of Music Signals

4.4 Future Development of Content Driven Approaches: Audio Description Schemes

4.4.1 The MPEG-7 Audio Descriptor Scheme (tentative)

5. A Modular Archive Model

5.1 Acquisition of documents and information

5.2 Archival Storage

5.3 Data Management

5.4 Administration

5.5 Access

6. Audio-Networking

6.1 The Role of Libraries and Archives on the Internet

6.1.1 General

6.1.2 A Glimpse at Copyrights

6.2 Library and Archive Services on the Internet

6.2.1 Connectivity

6.2.2 Delivery Model

6.3 MP3 – An Evolving Digital Music Delivery Sector

6.4 Frequently Used Bit Rates

7. Appendix

7.1 A Sample List of Audio Players

1. Introduction

1.1 Purpose and Scope

The purpose of this document is to provide an overview and references to information on transfer of analogue audio recordings into digital formats (digitisation), local and remote storage of data as well as access and retrieval in a library or archive environment. A description of current state of the art data acquisition workstations, quality control and archive reference models is given. New developments in network capabilities providing better service to selected network traffic over various technologies (QoS: Quality of Service networking) are in discussion.

• Analogue Audio: formats still working

• Digital Audio Workstations (DAW): data acquisition, local storage, segmentation, technical metadata generation, access issues

• Archive Reference Model: ingest station, archival storage, data management, administration, access

• Local area network (LAN) configuration: Gigabit, ATM, QoS, tailored services, special requirements of streaming media

• Internet connectivity: FTP, MP3, Real Audio

1.2 Applicability

The issues described in this document may be applicable to any sound storage, archive or collection. They are applicable to organisations with the responsibility of providing information on a temporary basis as well as for the long term. When taking the rapid pace of technology changes or possible changes in a Designated Community into consideration, there is the likelihood that facilities thought to be holding information on a temporary basis will in fact find that some or a lot of their holdings will need the same kind of attention as that given by permanent archives.

The deliverable is not intended to offer a de facto standard for digital sound document engineering, but rather to help with digitisation, editing, tagging, indexing and storage. It is aimed at library and archive institutions that already have, or are building up, the equipment and expertise to digitise sound documents in-house. It addresses the more standard formats of sound. If it is planned to digitise primarily historic or fragile and rare sound documents, or materials in non-standard formats and sizes, it might be worth considering outsourcing the digitisation of these materials to specially equipped laboratories or institutions[1].

1.3 Acronyms and Abbreviations

A/D - Analogue/Digital

AES - Audio Engineering Society

AIC - Archival Information Collection

AIP - Archival Information Package

AIU - Archival Information Unit

ASCII - American Standard Code for Information Interchange

BEXT - Broadcast Extension Chunk (BWF)

BWF - Broadcast WAVE file format

CAD - Computer-Automated Design

CAR - Computer-Aided Radio

CCSDS - Consultative Committee for Space Data Systems

CD-ROM - Compact Disk - Read Only Memory

CIP - Catalog Inter-operability Protocol

CRC - Cyclic Redundancy Check

D/A - Digital/Analogue

DAPA - Digital Audio Production and Archiving (EBU working group)

dB FS - Decibel (relative to full scale value)

dB m - Decibel (relative to 1 mW)

dB r - Decibel (relative to an absolute reference)

dB SPL - Decibel (relative to 20 μPa)

dB u - Decibel (relative to 0.7746 V) equiv. to dB m at 600 Ω

dB v - Decibel (relative to 1 V)

dB - Decibel (1/10 Bel)

DBMS - Data Base Management System

DDL - Data Description Language

DED - Data Entity Dictionary

DFAS - Distributed Finding Aid Server

DIP - Dissemination Information Package

DLI - Digital Libraries Initiative

DR - Dynamic Range

DSD - Direct Stream Digital (1-bit delta sigma technology: SACD)

DTD - Document Type Definition

DVD - Digital Versatile Disk

DVD-Audio - DVD working group specification audio

EAD - Encoded Archival Description

EBCDIC - Extended Binary Coded Decimal Interchange Code

EBU - European Broadcasting Union

ERL - Electronic Reference Library

FITS - Flexible Image Transport System

GIF - Graphics Interchange Format

HDCD - High Definition Compatible Digital format (Pacific Microsonics)

HFMS - Hierarchical File Management System

HFS - Hierarchical File Server

HTML - Hypertext Markup Language

ICS - Interoperable Catalogue System

IEEE - Institute of Electrical and Electronic Engineers

IMS - Information Management System

ISBN - International Standard Book Number

ISO - International Organisation for Standardisation

LSB - Least Significant Bit

MPEG - Moving Picture Experts Group

MPEG 1 - Coding for Moving Pictures and Associated Audio for Digital Storage Media up to 1.5 Mbit/s

MPEG 2 - Generic Coding for Moving Pictures and Associated Audio (multichannel audio)

MPEG 1 Audio Layer 1 - Audio Coding Scheme (compression ratio 1:4)

MPEG 1 Audio Layer 2 - Audio Coding Scheme (compatible with Layer 1)

MPEG 1 Audio Layer 3 - Audio Coding Scheme (compatible with Layers 1 & 2)

MP3 - syn. MPEG-1 Audio Layer 3 (compression ratio 1:10...12)

NARA - National Archives and Records Administration

NASA - National Aeronautics and Space Administration

NSF - National Science Foundation

OAIS - Open Archival Information System

OCR - Optical Character Recognition

ODL - Object Description Language

ODLS - Oxford Digital Library Services

OPAC - On-Line Public Access Catalogue

PCI - Periodicals Contents Index

PDI - Preservation Description Information

PDMP - Project Data Management Plan

QBAC - Query By Audio Content

QBIC - Query By Image Content

QoS - Quality of Service

RIFF - Resource Interchange File Format

RLG - Research Libraries Group

SACD - Super Audio CD (Sony, Philips format, uses DSD)

SGML - Standard Generalised Markup Language

SIP - Submission Information Package

Super CD - DVD-based audio formats, superior to audio CDs.

TEI - Text Encoding Initiative

UML - Unified Modelling Language

UNICODE - Universal character encoding standard

WAV - Windows Wave File

WWW - World-Wide Web

2. Analogue Audio

The majority of historic sound documents and many recent digital remakes are still based on analogue audio formats. Analogue audio technology has matured to a high level of sound quality, from the early beginnings of Thomas Alva Edison's non-linear horn recordings to the almost linear transmission chain of today. Generations of audio engineers have optimised the linear transfer function of tape recorders, amplifiers and disc cutting machines in order to produce noiseless and distortion-free recordings. It was left to our generation to re-introduce nonlinearity when perceptual coders such as MP3 are applied in broadcasting and network environments. Perceptual coders perform lossy coding, removing the possibility to reconstruct the original signal. Nevertheless, MP3 sounds acceptable, allows low bandwidth connections and has developed into one of the most frequently used formats for audio transmission over the Internet.

The European Broadcasting Union (EBU) lists considerably fewer analogue audio tape formats than digital ones as being referenced in the broadcasting area. This list (4/1997) is remarkably conservative, and several additional digital formats have been introduced in the meantime (see also Section 3 of this document):

2.1 Analogue Tape-Formats

A01 6.3 mm analogue audio Full track

A02 6.3 mm analogue audio 2 channel

A08 12.5 mm analogue audio 8 channel

A16 25.4 mm analogue audio 16 channel

A32 25.4 mm analogue audio 32 channel

AS2 6.3 mm analogue audio 2 channel stereo

AT2 6.3 mm analogue audio 2 channel stereo & TC

CCA Compact Cassette audio

2.2 Digital Audio Formats

CDA Compact Disc Audio

D24 25.4 mm digital audio DASH 24 track

D32 25.4 mm digital audio PD 32 channel

D48 25.4 mm digital audio DASH 48 track

DA2 DAT format digital audio 2 channel

DAT DAT format digital audio Stereo

DD2 6.3 mm digital audio DASH 2 channel

DP2 6.3 mm digital audio PD 2 channel

3.5" data diskette - FD5

5.25" data diskette - FD8

8" data diskette - H8A

Hi-8 digital audio 8 channel

MO disk 600 MBytes capacity

M12 MO disk 1 200 Mbytes capacity

M13 MO disk 1 300 Mbytes capacity

NAB NAB audio cartridge -S16

A-DAT digital audio 8 channel

2.3 DVD - A Breakthrough in Digital Audio?

Although people have always been full of expectation for the improved sound quality available from higher sample rates and greater resolution, technical limitations have made it impractical to implement a new standard until now. The arrival of DVD (the Digital Versatile Disk) has the potential to transform digital sound. A DVD offers over seven times the storage capacity of a CD. This additional capacity allows for an improvement in the resolution of the audio signal as well as in sample rates. The DVD specification allows for – and manufacturers of audio equipment and programme material (originally a group of 10 members was holding more than 4000 DVD-related patents) use – many different resolution levels; but the dominant standard for high quality audio samples at 96 kHz with a resolution of 24 bits. This translates to 16,777,216 (2^24) different possible amplitude levels at a theoretical dynamic range of 144 dB. Combined with a more than doubled sampling frequency, DVD audio offers over 500 times the resolution available from CD. DVD audio provides, for the first time in audio engineering, the technical potential to produce and distribute music at a sound quality considerably better than human listeners can hear.

To probe further:

For a survey on historic sound formats and preservation issues see Harmonica Deliverable D3.1 “Analogue Documents, Carriers and Formats”

3. Digitisation

3.1 General

Although many digitisation projects are undertaken primarily to increase access to sound archives and library collections, preservation is often a natural by-product. Digitisation should therefore be performed with a "preservation mindset." This mindset implies[2]:

1. Performing analogue to digital (A/D) conversion and digital to analogue (D/A) conversion at the highest sample rate appropriate to the nature and the informative content of the originals

2. Performing analogue to digital conversion at an appropriate dynamic range and sound quality to avoid re-doing the transfer and re-handling of the originals in the future - digitise once only

3. Creating and storing a linear coded master sound file that can be used to produce derivative and compressed sound files in order to serve a variety of current and future user needs (i.e. data reduced and perceptual coded copies for browsing and Internet access)

4. Using system components that are non-proprietary

5. Using sound file formats, editing systems and data compression techniques that conform with industry standards

6. Creating backup copies of all files on a stable medium

7. Creating meaningful metadata for sound files and associated documents including cataloguing issues (if appropriate)

8. Monitoring and recopying data if necessary

9. Outlining a migration strategy for transferring data across generations of archive and access technology (plan for obsolescence of current hard- and software technology)

10. Anticipating and planning for future usage and technological developments

This document occasionally suggests minimum hardware standards as well as high-end professional audio equipment, but libraries and archives should not just "do the minimum." Analogue to digital conversion at a sample rate and resolution higher than the minimum required is encouraged. One plausible argument in favour of adopting a higher technical standard from the beginning is the development of costs during a digitisation project: the cost ratio of equipment to staff is usually about 1:1 at the start of a digitisation project and drops to 1:4 after a four-year period.

3.2 Digital Audio Workstations (DAW)

3.2.1 Analogue to Digital Conversion – Dynamic Range

The main function of a digital audio workstation is to perform high quality analogue to digital conversion of a continuous audio stream generated from an analogue signal source, as well as to convert the digital audio stream back to an analogue signal. The quality of the conversion is determined by the resolution available from the analogue to digital converter (A/D) and digital to analogue converter (D/A). An n-bit converter system resolves the full-scale input voltage into 2^n quantisation levels.

The dynamic range of an A/D - D/A subsystem can conveniently be expressed in dB. An n-bit system provides

DR = 20 · log10(2^n) ≈ 6.02 · n dB

i.e. ≈ 96 dB at n=16 bits, ≈ 108 dB at n=18 bits, ≈ 120 dB at n=20 bits and ≈ 144 dB at n=24 bits.
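
This relation is easy to check numerically; a minimal sketch in Python:

    import math

    def dynamic_range_db(n_bits):
        """Theoretical dynamic range of an n-bit linear PCM system in dB."""
        return 20 * math.log10(2 ** n_bits)

    for n in (16, 18, 20, 24):
        print("%d bits: %.1f dB" % (n, dynamic_range_db(n)))
    # 16 bits: 96.3 dB / 18 bits: 108.4 dB / 20 bits: 120.4 dB / 24 bits: 144.5 dB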

3.2.2 Sample Rates and the 24 – 20 – 16 bit Story

Sample rates in professional audio and video environments range from 32 kHz to 96 kHz and 192 kHz (DVD audio), with 44.1 kHz and 48 kHz being used most frequently. Other sample rates occasionally used are: 44.056 kHz, 47.952 kHz, 64 kHz, 88.112 kHz, 88.2 kHz, 95.904 kHz and 176.4 kHz.

It is advisable for sound archives and collections to maintain audio workstations capable of selecting sample rates ±10% off the nominal values. At least one unit serving arbitrary sample rates between 5 kHz and 48 kHz should be available in order to handle non-standard recordings from several sources; otherwise a sample rate converter as a separate functional unit is necessary. An overview of popular sample rates and sound file formats is given in HARMONICA deliverable D3.2.

High-end DAWs use accurate, discrete, multi-bit A/D converters. The A/D converters operate at a sample frequency of 192 or 176.4 kHz and employ sophisticated digitally subtracted dither to produce noise and distortion components below -120 dB FS, or less than one part per million. The 192/176.4 kHz signal is decimated to 96 or 88.2 kHz, 24 bits, using optimised filtering.

In spite of all technological progress, CDs still carry 16-bit audio. Several solutions for handling both the 24- and 16-bit domains have been proposed. In order to reduce the 44.1 kHz, 24-bit signal to 16 bits while retaining many 24-bit audio benefits, soft limiters are applied which allow an increase of the peak signal level of up to 6 dB without overloading. The peaks are reconstructed when decoded, increasing dynamic range by 6 dB. For undecoded playback the units work as a standard limiter. Some DAW units provide a low-level range extension which gradually increases the gain on low-level signals (starting at approx. -45 dB FS) by 4 dB over a 20 dB range.

As one of several possibilities, the final step in the reduction to 16 bits is to add high-frequency weighted dither and round the signal to 16-bit precision. The dither can be applied to the frequency range of 16 kHz to 22.05 kHz, leaving the noise floor flat below 16 kHz without influencing the psychoacoustically relevant frequency range for the perception of tonal signals.
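
For illustration, a minimal sketch of the word-length reduction step in Python; it uses flat (unweighted) TPDF dither for brevity, whereas the scheme described above would additionally shape the dither into the 16 kHz to 22.05 kHz range:

    import random

    def reduce_24_to_16_bit(samples):
        """Requantise 24-bit integer samples to 16 bits with TPDF dither.
        Flat dither of +/-1 LSB(16) for brevity; the high-frequency
        weighted dither described above would be spectrally shaped."""
        out = []
        for s in samples:
            dither = (random.random() - random.random()) * 256  # 1 LSB(16) = 256 LSB(24)
            q = int(round((s + dither) / 256.0))                # requantise to 16 bits
            out.append(max(-32768, min(32767, q)))              # clip to the 16-bit range
        return out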

Psychoacoustically designed noise shaping filters are controlled by the spectral range of the time-varying audio signal. Some audio systems introduce, as part of the final quantisation, a pseudo-random noise hidden code as needed into the LSB of the audio data. The hidden code carries the decimation filter selection as well as the peak detection and low-level range parameters. The hidden code is completely inaudible and is only inserted 2-5% of the time, effectively producing 16-bit undecoded playback resolution. The result is an industry-standard 44.1 kHz, 16-bit recording which should be compatible with all CD replication equipment and consumer CD players.

Although DAW producers advertise their signal processing as being compatible with CD standards, careful verification of the format against the CD Red Book specification is necessary, which calls for linear PCM audio at a 16-bit word length and a 44.1 kHz sample rate. Special care has to be taken if sound documents from unknown sources are not accompanied by the appropriate digitisation side information. Chaining different noise shaping or dithering concepts can produce distortions well above the hearing threshold, even though each of them is inaudible on its own.

3.2.3 Linear PCM Audio 16-bit is not obsolete!

Audio technology has to be seen as transitional today. With both DVD-Audio and Super Audio CD on the horizon, libraries and archives will face new challenges before the classic digital audio formats have been sufficiently acquired. A psychological turning point will probably come when the CD is no longer seen by the public as the format with the best available sound quality. Just as the CD did on its way up against the LP and the cassette, the new formats will gain consumer acceptance rapidly. Nevertheless, the CD format - linear PCM audio at a 16-bit word length and a 44.1 kHz sample rate - is not obsolete. It will remain the only viable consumer digital delivery format, building the mainstream in consumer audio electronics for quite a while. The estimated millions of CD recorders sold per year represent a lot of hardware in favour of the survival of the format. Recordable DVD is still under discussion with competing formats (and sizes: 12 cm, 8 cm?); the prospect that one DVD may not run on two different laptops confuses salespeople as well as users. DVD as Digital Video Disk will even depend on the manufacturer's choice to make it playable in more than one geographical region.

3.3 New Digital Audio Formats

3.3.1 Archive File Formats

In the early 90s, computer-aided radio (CAR) systems became digital audio islands in broadcasting houses. These CAR systems used proprietary file formats. Music and radio programme exchange between different islands took a lot of processing time just for file conversion. Out of a variety of sound file formats, two evolved as de facto standards: AIFF, used in the Mac/UNIX world, and RIFF/WAVE in the PC domain. This was the scenario found by the EBU project group P/DAPA (Digital Audio Production and Archiving) when negotiating with industry in order to propose a common file format for linear audio quality serving the AES/EBU hardware interface standard. In order to generate and process descriptive information conveniently, metadata should also be included in the file format. The group decided to select the widespread RIFF/WAVE format as the proposal for a standard. One major advantage of the WAVE file format: WAVE files are native files on all PC platforms worldwide, and every PC is able to play and edit them. WAVE files are also used for audio data import and export on several other computer platforms. In order to enable standardised audio programme exchange, the group developed the so-called Broadcast Wave Format. The main issue consisted in the agreement on a specially designed 'Broadcast Extension Chunk' (BEXT chunk) for the storage of additional metadata and descriptive information in sound files.

3.3.2 Broadcast WAVE Format (BWF)

As libraries and archives receive music documents from quite different sources and their users are occasionally located in a broadcasting environment, in the future many sound files stored in BWF format may appear. Although libraries could consider BWF a useful library standard, it seems questionable to convert all digital audio holdings into BWF as long as audio file transfer with broadcasters is not needed often. BWF stands for a comprehensive method of including metadata and links to additional descriptive information in sound files, but alternative solutions taking advantage of the possibility to define user chunks in the RIFF/WAVE format can be expected to arise. It should be emphasised that the sound data of standard linear coded WAVE files remain playable with any wave-player available whether or not additional chunks are included; whereas the content of user chunks needs to be managed by special application software components.

Fig. 1: the Broadcast WAVE Format (from EBU Technical document 3285).

The Broadcast Wave Format (BWF) has been defined in EBU standard N22-1997. The full specification of BWF, a description of the Broadcast Extension Chunk and basic information on the Microsoft RIFF format are given in EBU standard document Tech. 3285. BWF incorporates ISO/MPEG-2 Layer II, which is intended to be used as browse quality in sound archives. A description of MPEG support in BWF is given in Supplement 1 of Tech. 3285. Further information on BWF as well as the documentation described can be obtained from the EBU and from the Swedish Radio Corporation. In addition to the BWF specification, the project group (P/DAPA) published the following recommendation R85 for programme exchange of audio data files:

• Sample rate: 48 kHz

• Resolution: minimum 16 bit / linear

• Alignment level: according to EBU R68 (headroom of 9 dB)

• Preemphasis: none

• Channel formats: mono, 2 channel stereo

• Signal formats: multi channel >> MPEG, linear PCM

• BEXT chunk: transfer ahead of the audio data

BWF does not support all types of RIFF chunks; nevertheless it is compatible with the ISO OSI layer model for information interchange. BWF files can be used independently of the transport layer for data exchange in real time and for file transfer over networks, as well as for signal storage on data carriers such as disks or tapes. A single BWF file can hold about 4 GBytes of data, corresponding to approximately 6 hours of linear stereo sound signal (16 bit / 48 kHz) or 4 hours (24 bit / 48 kHz). This data volume is sufficient for the storage and reproduction of almost all analogue sound carrier volumes commonly used.

The following BWF file structure is currently supported:

The BEXT Chunk contains general metadata such as title, originator, archive number etc.

A Coding History (part of the BEXT chunk) describes the transmission chain of the current sound signal, providing information such as sound carrier material, recording and playback equipment, analogue-to-digital converter and the digital I/O interface card of the PC.

The Format Chunk is used to specify format information such as linear PCM, stereo, sample rate and resolution (16...24 bits).

The Quality Chunk contains information obtained from the digitisation procedure, such as a protocol of defects in the analogue recording and transmission chain, tape drop-outs, clicks, thumps, hiss, print-through and additional notes (in preparation).

The Cue Sheet Chunk provides cue points, tags and segment data such as offset, start time and duration of a specific content in the file (in preparation; see also Audio Segmentation and Content Description).

The Wave Chunk contains the audio samples of the digitised sound signal.
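
Because a BWF file keeps the RIFF container structure, its chunks can be enumerated with a generic RIFF walker. A minimal sketch in Python (standard library only; the file name is illustrative):

    import struct

    def list_chunks(path):
        """Print the identifier and size of each top-level RIFF chunk,
        e.g. 'bext', 'fmt ' and 'data' for a Broadcast WAVE file."""
        with open(path, "rb") as f:
            riff, size, wave = struct.unpack("<4sI4s", f.read(12))
            assert riff == b"RIFF" and wave == b"WAVE", "not a RIFF/WAVE file"
            while True:
                header = f.read(8)
                if len(header) < 8:
                    break
                ckid, cksize = struct.unpack("<4sI", header)
                print(ckid.decode("ascii"), cksize)
                f.seek(cksize + (cksize & 1), 1)  # chunks are word aligned

    list_chunks("archive_master.wav")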

3.3.3 ‘Unique’ Source Identifier (USID)

In order to identify BWF source sound files, a unique identifier has been proposed which serves as a primary link between sound files and associated data in a database system. Applications can use the identifier instead of the file name for reference.

The EBU proposed the following structure:

OriginatorReference

The field is a sequence of 32 ASCII characters (not a string) provided in the BWF to contain a unique identifier of the file. The organisation originating the BWF file is responsible for the allocation of the USID.

Country code: (2 characters) is based on the ISO 3166 standard.

Company code: (3 characters) is based on the EBU Technical information I30-1996.

Serial number: (12 characters) This should identify the machine's type and serial number.

OriginationTime (6 characters, HHMMSS) This should be sufficient to identify a particular recording in a human-useful form in conjunction with other sources of information, formal and informal.

Random Number (9 characters 0-9) Generated locally by the recorder using some reasonably random algorithm. This number serves to separate files made at the same time, such as stereo channels, or tracks within multitrack recordings.

Example of a USID

Generated by a Tascam DA88, S/N 396FG347A, operated by Radiotelevisione Italiana, at time 12:53:24

USID format: CCOOOSSSSSSSSSSSSHHMMSSRRRRRRRRR

USID example: ITRAIDA88396FG347125324098748726
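
The field layout above translates directly into a small generator; a minimal sketch in Python (the random number source is simplistic, a real recorder would use a better one):

    import random

    def make_usid(country, company, serial, hhmmss):
        """Assemble a 32-character USID: CC + OOO + 12-character serial
        + HHMMSS + 9 random digits, following the structure above."""
        rnd = "".join(random.choice("0123456789") for _ in range(9))
        usid = country[:2] + company[:3] + serial[:12].ljust(12) + hhmmss[:6] + rnd
        assert len(usid) == 32
        return usid

    # the example from the text: RAI, Tascam DA88, S/N 396FG347A, 12:53:24
    print(make_usid("IT", "RAI", "DA88396FG347", "125324"))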

3.4 BWF - A Music Library Audio Format?

As has been pointed out, BWF is an advanced real-life example of combining sound and (technical) metadata in a comprehensive file format for broadcast storage and retrieval applications as well as for programme exchange between different partners. One of the foremost advantages of transporting all necessary metadata within the sound files themselves is easy management of sound files and metadata in the LAN and across different hard- and software platforms. For an archive or library starting from scratch, a medium-sized digitisation project can evolve by learning by doing, just by implementing a minimum standard such as BWF requires. For up to several tens of thousands of sound files in a homogeneous collection, general purpose computer systems and file archive servers provide file management and browsing tools convenient for use.

Larger collections will not succeed without additional data base management and a specified file system structure. Links from (or to) an existing catalogue have to be updated at regular intervals and sound and metadata should be “frozen”. In any case, before digitising, libraries and archives should think very carefully about implementing appropriate file naming conventions for sound files and associated documents along content related classification or indexing schemes in order to support effective data retrieval later on.

The current BWF structure is certainly superior when the percentage of metadata associated with the sound is small in comparison to the data size of the sound itself. It is applicable when search and browsing are primarily done by listening to sound data and not by cruising through large volumes of content-related metadata. One very useful extension of the BWF concept could be to provide appropriate links to recently evolving content description standards such as MPEG-7 promises to become.

3.5 De-Facto/Industry Digital Audio Compression Standards

Although sound archives and music libraries should not even think about selecting a perceptual (lossy) coded sound format as an internal archive standard, they will certainly have to deal with many sound documents encoded in different industry audio compression schemes. Archives and libraries should therefore be prepared to read all formats on which relevant content will be stored, and could provide a user service to convert non-standard formats into generally readable ones. Today audio formats are dictated by the music content the consumer appreciates rather than by technical quality criteria which would support technically optimised solutions at reasonable costs. A list of audio coders and links to useful pages on this issue is always incomplete at the point in time it is written (for further information see Section 6 of this deliverable).

3.6 Digitisation Procedure Quality Control

Real time sound analysis of the analogue and digitized audio stream is performed during digitisation in order to control the analogue to digital conversion process. For this reason, digital audio workstations provide several useful features such as:

• automatic detection of start and end position of the audio signal

• automatic detection of pauses during the recording

• automatic detection of the noise floor level

• automatic detection of clicks and impulsive distortions

• automatic detection of analogue media drop-outs

• average value of signal to noise ratio of the audio signal

• average value of frequency bandwidth of the signal

• average value of stereo correlation

• average value of level dynamics

Signal parameters extracted by means of digital signal processing, as listed above, are used to control the sound quality of the digitised waveform. Any errors occurring during digitisation of analogue recordings should be detected on the fly. Standard quality criteria, obtained from long-term statistics, are matched against the actual values measured. A transfer (quality) protocol is usually added to the technical metadata set, as in the sketch below.
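
A minimal sketch of such a protocol check in Python; the criteria names and threshold values are illustrative, not standard values:

    # Illustrative quality criteria derived from long-term statistics
    QUALITY_CRITERIA = {"snr_db": 45.0, "bandwidth_hz": 10000.0}

    def transfer_protocol(measured):
        """Match measured signal parameters against the standard criteria
        and report every deviation for the technical metadata set."""
        report = []
        for name, required in QUALITY_CRITERIA.items():
            value = measured.get(name)
            if value is not None and value < required:
                report.append("%s: measured %s, required %s" % (name, value, required))
        return report

    print(transfer_protocol({"snr_db": 38.2, "bandwidth_hz": 12500.0}))
    # ['snr_db: measured 38.2, required 45.0']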

3.7 Signal Enhancement and Signal Restoration

Although Quality Control may report signal degradations which could be repaired by means of digital signal enhancement algorithms, a strictly linear archive copy should be produced in any case. This advice is justified because any signal enhancement to be considered can be performed on digital copies of the linear archive document with no loss of generality. As digital copies are 100% exact replicas of the original, no information loss takes place; whereas signal filtering or any other processing carried out already at the time of digitisation would inevitably introduce nonlinear transfer functions which frequently cannot be inverted later on. As a general rule, nonlinear manipulations of the audio signal, such as lossy data reduction, compression algorithms, filtering and signal restoration, should be carried out at the end of the transmission chain only and never on archive copies.

Some manufacturers of digital audio workstations and audio software packages provide typical audio signal processing algorithms, working either in the time or frequency domain, such as:

• DeNoiser for the reduction of broadband noise (hiss)

• RepairFilter for elimination of quasi-static noise (mains hum, hum from dimmers, stereo pilot tones)

• DeScratcher for elimination of scratches on vinyl record and shellac recordings

• DeClicker for automatic elimination of clicks

• DeCrackler for automatic elimination of crackles

• DeClipper for automatic elimination of digital clipping

• DropOuter for automatic drop-out restoration

• VPIs for remastering and sweetening

(parametric EQ, 1/3 octave EQ, linear phase EQ)

The list above can be completed by additional sound conditioning functions which belong to the standard equipment of a sound recording studio and are usually appreciated by the professional sound engineer; these are, among others: compressors, limiters, loudness maximisers, de-essers, free shapers (redithering / noise shaping), stereo-basis manipulators etc. For exact metering (level control), tools such as PhaseScopes (stereoscope), FFT analysers, 1/3 octave analysers, real-time spectrograms, MatrixScopes etc. are applied.

4. Audio Segmentation and Content Description

4.1 Query by Audio Content (QBAC)[3]

With the increasing acceptance of digital libraries and archives as storage systems for music and multimedia data, efficient architectures for content-oriented search of non-textual data become a pressing need. Whereas both automatic shot detection and query by image content (QBIC) have opened inroads to characterise and search for visual information, no equivalent methods exist for streaming audio data. Much audio data has a common property which still images and video material do not have: it can be expressed in two corresponding forms, either as a 'textual' representation (music score, transcript) or as a realisation (sound recording).

Working with streaming media raises new issues related to collecting, storing, annotating, indexing, browsing, use of metadata, and retrieval interfaces for libraries and audio/video archives. While in some cases the reference linking of an entire audio file to a score or a text file might be sufficient just for "listening to", locating the correspondences of a search on the score or text file requires a fine-grained audio data segmentation. Once such a fine-grained linkage between textual representation (narrow transcription) and acoustic realisation is established, the textual representation can be used to facilitate QBAC (Query By Audio Content). A language-aware search engine locates the desired elements in the score or text. The results of the query are segments of audio material (audio objects), identified by fine-grained links between the score or text and the sound recording.

4.1.1 Sound File Segmentation

Sound segments (audio objects) are addressed sample by sample from the beginning of a recording. Usually, the sample number offset and the duration of the segment are referenced. Segment identifiers are created by automatic segmentation procedures or by programme-supported manual editing. Cue-in and cue-out points, edit decision lists and play lists can be linked to segment identifiers and collected in sound file directory tables. The segment structure should not be limited to a single segmentation layer, and relative addressing of segments should be supported. In order to facilitate context-related queries, overlapping segmentation should be implemented. Sound file directories may grow to considerable size (several thousand entries for each file) in the course of cumulative segmentation work sessions. Furthermore, as one and the same audio signal has to be segmented differently according to special user requirements, it seems appropriate to store the segmentation metadata in separate files which serve as input to an audio content related database.

Archived sound data usually do not change anymore after digitising. What has to be updated at regular intervals are metadata and metadata links, the location of suitable cue-in points, segment sequence procedures for rapid browsing, the creation of clips and several further functions accessible to archive staff and users. The concept of managing the metadata separately from the sound files enables fast and easy access and virtual (non-destructive) processing of sound[4].

Fig. 2: segmentation of sound files in multiple layers. Sound segment addresses, segment identifier, optional links and content description are stored in a sound file directory which is separated from the sound.
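
A minimal sketch of one entry of such a sound file directory in Python; the field names are illustrative:

    class Segment:
        """One sound file directory entry, stored separately from the sound."""
        def __init__(self, sound_file, offset, duration, layer=0,
                     links=None, description=""):
            self.sound_file = sound_file      # the unchanged archived file
            self.offset = offset              # first sample of the segment
            self.duration = duration          # length in samples
            self.layer = layer                # segmentation layer
            self.links = links or []          # e.g. locators into score or text
            self.description = description

    directory = [
        Segment("sym8.wav", offset=0, duration=2646000, layer=0,
                description="first movement"),
        Segment("sym8.wav", offset=88200, duration=441000, layer=1,
                description="first theme"),   # overlaps the entry above
    ]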

4.1.2 Links Referring to Segments

Several standard, industry-standard, and proprietary mechanisms can be used to express the links needed to refer to segments. However, since in many cases neither the audio recording nor the description or other audio segment linked to it can or should be changed, a simple hyperlink scheme as in HTML is not sufficient. Instead it is necessary to use so-called 'independent' hyperlinks, which are external to the files they link. In addition, these hyperlinks must be bi-directional (description to audio and vice versa) to allow applications like querying the text and playing back the speech, as well as playing the audio and switching, e.g., to the display of the score at an arbitrary point in time.

Currently the most advanced linking mechanism available is the HyTime ilink [HyTime97][5]. Similar functionality can be provided by other implementations and is likely to be provided by XML linking mechanisms. In the following discussion SGML and HyTime will be used as an example. An independent link consists of three components:

Anchors, which are regions or points in a text or audio document;

Locators, which locate or address anchors;

Links, which link or connect locators.


Figure 3: The canonical bi-directional independent link from HyTime.

The ISO/IEC HyTime standard offers locators to uniquely address elements (sub trees) within SGML documents. This mechanism assigns a list of integer values to each node of an SGML tree. The list of integer values is the 'road map' to get from the root of the SGML document (tree locator '1') to the specific SGML element using several 'traffic rules' to generate the tree locator integer list. The rules are:

1. The 'journey' starts at the root element and adds one integer for each horizontal level below on the way down the SGML tree.

2. The root element of the SGML tree has the tree locator '1'.

3. Each integer stands for one horizontal level of the SGML tree.

4. Each integer value is generated by counting the number of nodes from left to right. Only the children of the node above are taken into account.

5. The left most node (left most child) of a node above is assigned the integer value '1'.

Starting at the root element (tree locator '1') and taking all the above rules to generate the tree locator into account, the nodes (elements) of the following abstract tree will be addressed by the tree locators listed in the table below.

[Figure: abstract SGML tree, root A with children B and C; B has children D and E, C has children F and G]

|Element |Tree locator |

|A |1 |

|B |1 1 |

|C |1 2 |

|D |1 1 1 |

|E |1 1 2 |

|F |1 2 1 |

|G |1 2 2 |

Table 1: Tree locators for the abstract SGML tree.
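
The five rules translate directly into a short recursive procedure; a minimal sketch in Python that reproduces Table 1 (the tree is given as nested (name, children) pairs):

    def tree_locators(node, locator=(1,)):
        """Yield (element, tree locator) pairs following the counting rules:
        the root is '1', children are numbered 1..n from left to right."""
        name, children = node
        yield name, " ".join(str(i) for i in locator)
        for i, child in enumerate(children, 1):
            for pair in tree_locators(child, locator + (i,)):
                yield pair

    tree = ("A", [("B", [("D", []), ("E", [])]),
                  ("C", [("F", []), ("G", [])])])

    for element, loc in tree_locators(tree):
        print(element, loc)   # A 1, B 1 1, D 1 1 1, E 1 1 2, C 1 2, F 1 2 1, G 1 2 2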

In SGML, the DTD fragments needed for a link between audio and text would be expressed in a form similar to:

...

...

...

...

Using the above definitions, the link itself would be expressed in a form similar to:

...

>file=test.wav start=588 end=24703 unit=ms

1 1 2 1 1 1

...
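
The structure of this example can be restated outside SGML; a hypothetical sketch in Python of the same bi-directional independent link, with one locator addressing the audio span (test.wav, 588-24703 ms) and one addressing the SGML element (tree locator 1 1 2 1 1 1):

    class AudioLocator:
        def __init__(self, file, start, end, unit="ms"):
            self.file, self.start, self.end, self.unit = file, start, end, unit

    class TreeLocator:
        def __init__(self, path):
            self.path = path                  # e.g. (1, 1, 2, 1, 1, 1)

    class IndependentLink:
        """External to the files it connects; traversable in both directions."""
        def __init__(self, anchor_a, anchor_b):
            self.anchors = (anchor_a, anchor_b)

    link = IndependentLink(AudioLocator("test.wav", 588, 24703),
                           TreeLocator((1, 1, 2, 1, 1, 1)))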

4.2 Content Description

In the previous section it has been shown that content description of sound is closely related to narrow segmentation of sound files. Whereas the creation of video objects[6] by means of semi-automatic analysis tools has already been successful, nothing comparable exists for audio. The limitations experienced in the content description of streaming video equally apply, even to a larger extent, to audio data. Further development is needed for:

• segmentation tools requiring fully automatic, real-time analysis

• applications which may allow some convenient level of user guidance

• means to exploit the complementary skills of user and machine in solving complex analysis and interpretation problems

• tools that assist in the integration of the different solution fragments

• means to establish consistency of the final description, especially in correspondence to video sequences in the case of multimedia documents

• applications for the integration of audio content description in an existing catalogue framework

Some open issues have been seriously addressed by the CUIDAD[7] group, contributing to MPEG-7, building a bridge between low level audio descriptors based on the signal (amplitude spectra and parameters extracted from the acoustic waveform), music descriptors (music scores on a symbolic level) and semantic descriptors on the perceptual level.

Fig. 4: functional diagram of audio content processing (from CUIDAD).

A working model of a description scheme for sound clips and sound effects has been developed by the American company Musclefish.

The most comprehensive attempt to standardise, among others, the description of audio content is the MPEG-7 effort of the ISO/IEC JTC1/SC29/WG11 working group. MPEG-7 is still an ongoing process and the first version of the standard is expected late in 2001. The following list gives an overview of the most significant milestones met and the timeline to the completion of the standard:

• October 16, 1998: Issued formal Call For Proposals

• February 1, 1999: Deadline for submission of MPEG-7 proposals.

• February 15-19, 1999: MPEG-7 Evaluation Ad Hoc group meeting.

• March 15-19, 1999: Developed the first Experimental Model (XM 1.0) and Core Experiments.

• July 6-10, 1999: the first Audio Core Experiments, in Speech Recognition, Sound Effects, Instrument timbre, and Melody, were initiated.

• December, 1999: MPEG-7 Working Draft established.

• October, 2000: MPEG-7 Committee Draft

• February, 2001: MPEG-7 Final Committee Draft

• July, 2001: MPEG-7 Draft International Standard

• November, 2001: MPEG-7 International Standard

Additional information about MPEG in general and the upcoming MPEG-7 standard can be found on the official MPEG web pages.

Although certain details proposed by MPEG-7 may not yet be applicable for libraries and archives without appropriate applications ready for use, general guidelines for audio document classification can already be derived. Among them are:

• Speech, Speech Recognition Systems

• Singing voice

• Timbre or Instrument

• Instrument Description

• Melody, Melody Description

• pitch or note (spectrum description)

• tempo or rhythm (temporal description)

• Surround sound

• Sound Effect Classification

Worth mentioning are classification and content description systems originating from musicology as well as from the music industry, the latter being increasingly present in the commercial download business on the Internet.

4.3 Visualisation of Music Signals

Visualisation of audio signals by so-called spectrograms is employed whenever music signals cannot be represented in 'textual' formats such as music score or transcription. This happens frequently in ethnomusicology, or when acoustic and perceptual differences between individual interpretations of the same piece have to be documented. Visualisation of music is performed in real time so that the visual and audio representations of the music can be observed synchronously. Spectrograms can be read similarly to piano rolls, comprising the running time axis on the abscissa and the frequency scale on the ordinate. The strength (level) of spectral components is coded in an appropriate colour scale. Spectrogram icons (sound thumbnails) are used in order to provide fast access to a large number of sound files and segments stored in a database.

Fig. 5: structuring of audio by visualisation (narrow band spectrogram) of an extract (11 min) from Bruckner's 8th symphony.

4.4 Future Development of Content Driven Approaches: Audio Description Schemes[8]

Within the structure MPEG-7 has already created for "obvious audio descriptors (Ds)", four types of audio Ds have been considered:

1. media based Ds,

2. non-perceptual low-level audio characteristics,

3. perceptual low-level audio characteristics and

4. high-level audio characteristics.

In order to best meet the requirements for the description of audio in key applications, the group abandoned the demand for generality and concentrated on the following areas and audio content sets:

1. pure music,

2. pure speech,

3. pure sound effects and

4. arbitrary soundtracks applications.

For each of the four application areas the following has been provided:

• a typical application scenario in order to prove its relevance,

• a statement of the effort required to implement it,

• and a statement whether there is a chance to automatically determine the values under consideration.

Currently MPEG-7 is considering the following Audio Content Set:

Radio A1 Radio news broadcast

Music A2 "Two Ton Shoe" rock album

      A3 Bruckner's Te Deum and Mozart's Requiem

      A4 Original composition, a cappella, voice only

Audio A5 Short sequences of solo instrument and other sounds

      A6 Pop song based on an A-A-C motif

From this pragmatic point of view, a step-by-step creation of useful tools for automatic segmentation and tagging of sound for a large number of audio applications can be expected. The currently considered descriptors, among many further possible ones, are the following:

4.4.1 The MPEG-7 Audio Descriptor Scheme (tentative)

Descriptors for pure music:

Archiving music

Descriptors for musical genres

Descriptors for a composer

Descriptors for an artist

Descriptors for an artist group (e.g. band, ensemble, orchestra, choir)

Descriptors for single pieces of music

Searching music collections

Structuring music, descriptors to capture musical structures

Filtering music broadcasts

Music education / teaching

Music editing

Composition

Manipulating musical content

Music production

Descriptors for pure speech data:

Searching speech collections

Structuring speech collections

Rhetoric education

Descriptors for sound effects:

Searching sound effects collections

Movie synchronisation

Descriptors for arbitrary soundtracks:

Searching Radio program collections

Searching TV program or Movie collections

Filtering Radio programs

Filtering TV programs

Film production / editing

Film education

5. A Modular Archive Model

Long-term preservation of digital data involves issues of physical storage, software and data standards as well as migration plans and disaster management. In addition, the digital archive involves technology required for global, multimedia, object-oriented databases with emphasis on adding value along dimensions such as real-time behaviour, fault tolerance, security, and Quality of Service (QoS). The technologies and standards to be applied for the archiving, preservation and retrieval of digital documents are currently a major concern in the archives community. The lifetime of digital storage media and systems is extremely short in comparison to the analogue sound and multimedia carriers libraries and archives have been used to for many years. The question is whether the computer industry will actually provide storage solutions tailored to small and medium-sized individual libraries in the near future. As an alternative, backup and archive systems located at large computer centres and data farms frequently provide secure digital storage containers and rental storage which could also be used for small and medium volumes of digital data. Librarians and archivists have to overcome the psychological barrier of not seeing a digital document, not handling it physically anymore, not even having it in-house - while it is still available, just because it is stored in a secure digital container!

In any case, whether the digital storage system is located and managed in-house by the institution or remotely by a professional computer centre, the following key functions have to be provided by the archive system (for a Reference Model for an Open Archival Information System see OAIS[9]):

5.1 Acquisition of documents and information

An entity, which provides the services and functions to accept new documents and adjunct information from external or from internal acquisition units under Administration control and prepare the contents for storage and management within the archive. Acquisition functions include receiving sound documents and adjunct information, performing quality assurance on the document package, generating an Archival Information Package (AIP) which complies with the archive’s data formatting and documentation standards, extracting Descriptive Information from the AIPs for inclusion in the archive database, and coordinating updates to Archival Storage and Data Management. Different collections may have different description schemes fitting into the archive’s data formatting and documentation standards.
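
In code, the package produced by acquisition could be pictured as follows; a minimal sketch in Python (the field names are illustrative, not the normative OAIS data model):

    class ArchivalInformationPackage:
        """An AIP complying with the archive's formatting and
        documentation standards, as produced by acquisition."""
        def __init__(self, package_id, content_files,
                     descriptive_info, preservation_info):
            self.package_id = package_id              # e.g. archive number or USID
            self.content_files = content_files        # e.g. the BWF master file
            self.descriptive_info = descriptive_info  # extracted for the database
            self.preservation_info = preservation_info  # provenance, fixity, context

    aip = ArchivalInformationPackage(
        package_id="ITRAIDA88396FG347125324098748726",
        content_files=["master.wav"],
        descriptive_info={"title": "...", "originator": "..."},
        preservation_info={"coding_history": "...", "crc": "..."},
    )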

5.2 Archival Storage

An entity, which provides the services and functions for the storage, maintenance and retrieval of AIPs. Archival Storage functions include receiving AIPs from the acquisition unit and adding them to permanent storage, managing the storage hierarchy, refreshing the media on which archive holdings are stored, performing routine and special error checking, providing disaster recovery capabilities, and providing AIPs to Access to fulfill user requests.

5.3 Data Management

An entity, which provides the services and functions for populating, maintaining, and accessing both Descriptive Information - which identifies and documents archive holdings - and administrative data used to manage the archive. Data Management functions include administering the archive database functions (maintaining schema and view definitions, and referential integrity), performing database updates (loading new descriptive information or archive administrative data), performing queries on the data management data to generate result and query sets, and producing reports from these sets.

Fig. 6: Outline of a functional diagram of a modular archive system.

5.4 Administration

An entity, which manages the overall operation of the archive system. Administration functions include soliciting and negotiating acquisition and access agreements with document providers and IPR owners, auditing acquisition material in order to ensure that they meet archive standards, maintaining configuration management of system hardware and software, evaluating the contents of the archive and periodically requesting archival information updates, providing system engineering functions to monitor and improve archive operations, developing and maintaining archive standards and policies, providing user support, monitoring changes in the Designated User Communities, interacting with library and archive Management, and activating stored requests.

5.5 Access

This entity supports users in determining the existence, description, location and availability of information stored in the archive and allowing users to request and receive documents and information products. Access functions include communicating with users in order to receive requests, applying controls to limit access to specially protected data and information, coordinating the execution of requests to successful completion, generating responses (Dissemination Information Packages, result sets, reports) and delivering the responses to users.

6. Audio-Networking

6.1 The Role of Libraries and Archives on the Internet

6.1.1 General

The Internet has developed into a mass communications system. By the end of 1998, the Internet had more than 100 million users located throughout the world, and that number is growing rapidly. More than 100 countries are linked into exchanges of data, news and opinions and more than 1 million servers are sending information within the net. Obtaining access to information from the net is open to all users who have a personal computer or other access device, the appropriate software and the ability to gain access to the system (referred to as “obtaining connectivity”), usually provided by an Internet Service Provider (ISP).

Increased availability of bandwidth, faster modems and improved, scalable audio coding schemes are supporting the development of library music information services. These include the implementation of technology that allows the digital conversion (digitisation) and storage of mass amounts of data as described in previous sections. Future developments might not include a significant change of network structure, but rather a substantial increase in the capabilities of access devices to download large quantities of data, and the development of higher bandwidth distribution systems for real-time access and streaming media. The latest research on building, understanding and using digital archives and digital libraries indicates the development of sophisticated routers that transmit information, the advent of user-friendly software allowing access to information stored on any connected computer (search engines and intelligent agents) etc. in the near future.

Libraries and archives have to decide - depending on their regional policies - whether or not they will use the capability to implement

• online services for streaming media,

• services for file transfer or

• services for traditional catalogue access only.

What are the benefits for the library user and what will the consequences be for the library organisation and its management? Which are the necessary tools and protocols on the Internet and what are the provisions to protect the Intellectual Property Rights (IPRs)?

6.1.2 A Glimpse at Copyrights

As libraries and archives can play many different roles for various types of transmissions over the Internet, it is important to examine these roles separately in determining which activities may give rise to which liability. The question whether an Internet transmission is to be classified as

• a communication to the public or as

• a communication by telecommunication or as

• a process involving reproduction of data, which takes place between two identifiable partners,

might develop into a central legal issue. Several legislative bodies in different countries have raised a number of issues, including: is there a communication by telecommunication to the public as soon as a musical sound document or music information is electronically transmitted, made available, uploaded, downloaded or browsed? Is a communication over a network to which access is restricted a communication to the public? IPR organisations will certainly argue that a communication to the public already occurs as soon as the end user can access a library document from a computer connected to a network.

The role of a library or an archive as content provider is given as soon as content is assembled and placed as a collection of files on a server to allow the files to be accessed.

Usually, the library or archive organisation which has overall responsibility for the content of the site (the site owner) also operates and maintains the server on which the site is located. This model is normally followed by larger and medium-sized libraries and archives (for a detailed description of the role of content providers on the Internet, associated legal issues and business arrangements see e.g. Public Performance of Musical Works, published by the Copyright Board Canada). For small volumes of digital data it is recommended to rent a server as well as storage capacity in a secure digital storage container provided by large computer centres.

6.2 Library and Archive Services on the Internet

6.2.1 Connectivity

Digital library services use the Internet as a network of local and remote computers and computer networks designed to receive and forward bytes of data grouped into packets between end nodes (the source and destination computers). The basic communication service of the Internet consists of two components:

• the Internet addressing structure and

• the Internet delivery model.

The addressing of computers on the Internet is managed by unique Internet Protocol addresses (IP addresses). The format of an IP address is a combination of four integer numbers, each in the range 0 to 255, separated by dots.

The unique IP address allows the control of World Wide Web traffic. Host names can be allocated according to their geographical location and access patterns, provided geographical as well as temporal data are obtained from each connection. In order to make addressing easier, more user-friendly domain names are generally used instead of IP addresses. These names are translated automatically back to their associated IP addresses by means of the Domain Name System (DNS), operated by all IAPs for use by their subscribers, as in the sketch below. Changes of domain names and IP addresses have to be carried out in co-ordination with the IAP. The domain names together constitute the Internet's addressing structure. Once the connection is established, the service can be initiated, provided the appropriate software is running on the participating computers.

Fig. 7: installing an IP address on a PC under MS Windows. This is not a valid IP address!
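
The translation performed by the DNS can be observed with a few lines of Python, using only the standard library (the host name is illustrative):

    import socket

    # Ask the Domain Name System for the IP address behind a domain name,
    # as done transparently whenever a connection is established.
    host = "www.example.org"
    print(host, "->", socket.gethostbyname(host))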

6.2.2 Delivery Model

The delivery model contains several delivery modes for the transmission of music and music information with the aim of sending and requesting information over the Internet. Originally, the network providers tried their best to deliver data but would not provide commitments as to the quality of the service (QoS, e.g., commitments as to bandwidth or reliability). Generally, the user requests the document required in a unicast pull mode during a connection. Although it is now possible to request a minimum of required bandwidth and a maximum delay (e.g., that packets will be transferred within a specified period of time; see Harmonica deliverables 3.6.1 & 3.6.2, RSVP Reservation Protocol), participants in the professional music business still consider the Internet inadequate in performance. Moving from streaming analogue-based audio transmission to full audio bandwidth packetised digital delivery systems, it is evident that in spite of significant progress in network technology there are many difficulties yet to overcome. Libraries and archives have to decide which of the possibilities they accept as applicable for their service.

Alternative modes of delivering music and music information over the Internet involve streaming media as well as multicasting. Real Audio, currently the market leader for delivering real-time audio and video (including music)[10], provides servers capable of broadcasting multiple streams; nevertheless, its delivery system for multicast is almost the same as for unicast, because each recipient still receives an individual copy. Further key players in the real-time and music download business are, among many others, Liquid Audio, WinAmp and the MP3 community. The selection of a specific delivery mode for a music information service depends on the nature of the service to be implemented: whether a direct relationship between the library or archive and the end user can be established or not, and in particular whether the end user or the library organisation pays for the usage of the information or the work provided.
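
True IP multicast, by contrast, addresses a single packet stream to a group address which the network replicates to all subscribed receivers. The sketch below shows a minimal multicast sender in Python (the group address, port and packet size are illustrative choices only; a real streaming service would add pacing and a transport protocol such as RTP):

    import socket

    # IP multicast: one packet stream is addressed to a group address and
    # replicated by the network to all subscribed receivers, instead of
    # one copy per recipient as in unicast delivery.
    GROUP, PORT = "224.1.1.1", 5004            # illustrative group and port

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)  # LAN only

    with open("recording.wav", "rb") as audio:
        while chunk := audio.read(1316):       # roughly one packet payload
            sock.sendto(chunk, (GROUP, PORT))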

3 MP3 – An Evolving Digital Music Delivery Sector

WinAmp, Sonique, MusicMatch, RealJukebox, Liquid Audio, Lycos MP3, 2look4, Audiofind, MP3friend and Emusic are just a small selection of the sites, search engines and tools for playing, encoding, searching, browsing and downloading a huge amount of music and music information over the Internet.

A couple of years ago, MP3 was just an audio compression format, originally developed by Fraunhofer IIS-A mainly for broadcasters' programme transfer. Today, MPEG Layer-3 ranks among the most advanced audio coding schemes, alongside MPEG-2 AAC (Advanced Audio Coding). In the meantime, MP3 has become a growing technology standard for storing and distributing audio on a much broader basis, and is revolutionising the way audio is transmitted over the Internet. Several manufacturers of audio processing tools and signal processing workstations support MP3 by implementing user-friendly export options. These feature high-quality audio transmission and support constant bit rate (CBR) as well as variable bit rate (VBR) encoding at bit rates of up to 320 kbit/s.
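
As an illustration of CBR and VBR encoding, the following sketch drives the widely used open-source LAME command-line encoder from Python (it assumes the lame tool is installed and that master.wav exists; the file names are placeholders):

    import subprocess

    # Constant bit rate (CBR) at 320 kbit/s, the Layer-3 maximum:
    subprocess.run(["lame", "-b", "320", "master.wav", "master_cbr.mp3"],
                   check=True)

    # Variable bit rate (VBR): the encoder adapts the rate to the signal;
    # -V ranges from 0 (highest quality) to 9 (smallest files).
    subprocess.run(["lame", "-V", "2", "master.wav", "master_vbr.mp3"],
                   check=True)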

MP3 has become a Net phenomenon that is currently at the centre of an enormous controversy, because MP3 allows people with an Internet connection to bypass record stores (and cashiers) and download CD-quality music by their favourite artists - for free. MP3 is not welcomed by musicians and record companies, who expect their sales figures to drop. However, record companies and music publishing houses themselves are adopting the format for promotion purposes and for music-on-demand over the Internet. The music industry is still debating which of the partners in the game will be the losers, which the winners, and who will dominate the market.

The home production of MP3 files is easily performed by means of CD rippers. CD rippers are programs that extract - or rip - music tracks from a CD and save them onto the hard drive (e.g. Audiograbber). Once the tracks are on the hard drive, they are converted to the MP3 format. Many CD rippers have MP3 encoders built in (such as MusicMatch Jukebox); otherwise a separate encoder utility, such as MP3Enc, is needed. Rippers are used to store a music programme of one's current choice on handy hard drives.
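
A ripping-plus-encoding pipeline of this kind can also be scripted; the sketch below uses the open-source cdparanoia ripper and the LAME encoder (both assumed to be installed; the bit rate and file naming are illustrative):

    import glob
    import subprocess

    # Batch-rip all tracks of an audio CD to WAV files
    # (cdparanoia writes track01.cdda.wav, track02.cdda.wav, ...).
    subprocess.run(["cdparanoia", "-B"], check=True)

    # Encode every ripped track to MP3 at 192 kbit/s.
    for wav in sorted(glob.glob("track*.cdda.wav")):
        mp3 = wav.replace(".cdda.wav", ".mp3")
        subprocess.run(["lame", "-b", "192", wav, mp3], check=True)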

For further interesting music sites see: DMX - Digital Music Express; TCI cable service (95 music programs).

4 Frequently Used Bit Rates

|Service |Bit rate |Remarks |
|CA*net3 (CANARIE - Bell Canada) |40 Gbit/s |optical Internet, Oct. 1998 |
|Internet2 (US, new national backbone) |9.6 Gbit/s |network for end of 1999 |
|Internet2 (US, new national backbone) |2.4 Gbit/s |network for end of 1998 |
|Internet2 (US, national backbone) |22 Mbit/s |infrastructure, April 1998 |
|ATM OC-12 |622 Mbit/s | |
|ATM OC-3 |155 Mbit/s | |
|100-BaseT / FDDI LAN |100 Mbit/s | |
|T3 |45 Mbit/s | |
|10-BaseT Ethernet LAN |10 Mbit/s | |
|T1 |1.5 Mbit/s | |
|Digital HDTV |40-60 Mbit/s |5.1 audio uncompressed |
|Next-generation DVD (blue laser) |23 Mbit/s | |
|DVD-ROM |11.08 Mbit/s | |
|Digital DVD-Audio (uncompressed) |9.6 Mbit/s |max. 6 channels, 96 kHz, 24 bit |
|Digital TV, DVD-Video (NTSC) |6-10 Mbit/s |5.1 audio, NTSC video compressed |
|Multichannel audio, compressed |224-640 kbit/s |5.1 channels, Dolby Digital |
|Compact disc |1.41 Mbit/s |44.1 kHz, 16 bit, stereo |
|Stereo audio, uncompressed |1.536 Mbit/s |48 kHz, 16 bit, stereo |
|Stereo audio, compressed ("MP3") |20-128 kbit/s |MPEG Layer 3 |
|Normal telephone channel |64 kbit/s |mono, limited bandwidth |
|Telephone modem |14.4-56 kbit/s |ITU V.90 modem: 56 kbit/s |
|Cable modem (with Ethernet card) |50-200 kbit/s |up to 10 Mbit/s theoretical |
|ISDN |64-128 kbit/s |FM stereo quality |
|ISDB (Integrated Services Digital Broadcasting) |150 Mbit/s |NHK transmission, 21 GHz per channel |
|ADSL (a new telephone service) |512 kbit/s |uses standard wires |
|ADSL high-speed modem |1 Mbit/s | |
|Program data |10 kbit/s | |
|Facsimile (fax) |20 kbit/s | |
|Still picture |70 kbit/s | |
|Teletext |100-200 kbit/s | |
|Audio graphics |800 kbit/s per channel | |
|MIDI |31.25 kbit/s |per 16 channels |

Table 2: Bit rates currently in use for different audio and multimedia services (from AES WP-1001 Technology Report TC-NAS 98/1: Networking Audio and Music Using Internet2 and Next Generation Internet Capabilities).
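
The uncompressed rates in Table 2 follow directly from sample rate, resolution and number of channels, as this small Python check illustrates:

    def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int,
                     channels: int) -> float:
        """Uncompressed linear PCM bit rate in Mbit/s."""
        return sample_rate_hz * bits_per_sample * channels / 1e6

    print(pcm_bit_rate(44100, 16, 2))   # compact disc: 1.4112 Mbit/s
    print(pcm_bit_rate(48000, 16, 2))   # studio stereo: 1.536 Mbit/s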

Appendix

1 A Sample List of Audio Players

a2b Music Player 1.00 for Macintosh



a2b Music Player 1.00b8 for Macintosh



a2b Music Player 2.0 for Windows 95/98/NT



ARIES Breathe MP3 player 0.91 for Windows 95



Aries Mod Player 1.0 for Windows 95, NT 4.0



Audio CD Player for Windows 3.1



Audioactive Player 1.2a for Macintosh PPC



Audioactive Player 1.3 for Windows 3.1



Audioactive Player 1.9 Beta for Windows 95/NT



Beatnik Player 2.03 for Macintosh



Beatnik Player 2.03 for Windows 95/98/NT



CD player Maximus 3.3 for Windows 95/98/NT



Cubic Player for DOS



Destiny Media Player 1.31 for Windows 95/98/NT



Digital Music Player for OS/2



DSM Player 1.04 for Macintosh



Dual Module Player for OS/2



Hyperprism H-PPC Player for Macintosh

iNERTiA PLAYER for DOS


Liquid Player 5.0 Preview for Macintosh



Liquid Player 5.0 Preview for Windows 95/98/NT



Melody Player 2.0 for Windows 95



Microsoft Windows Media Player 6.4 for Windows 95/98/NT



MIDI Player V1.55 for Windows 95



Midi Synthi Player 5.8 for Windows 95/98/3.1



Midisoft Internet Media Player v3.08 for Windows 95/98/NT



Mikey Player 98 4.1 beta for Windows 95



Mikeys mp3 Player for Windows 95, 98



Mini MIDI Player v1.12 for Windows 95



MM Player 4.02 for Windows 95/98/NT



MM Player Pro 4.02 for Windows 95/98/NT



MODPlug Player 1.40 for Windows 95/98/NT



Mpeg Audio Player 1.21 for Macintosh



Multi Module Music Player 1.00b4a for Windows 95



Musician's CD Player for Windows 95



NoteWorthy Player 1.50 for Windows 3.1



NoteWorthy Player 1.55a 16 bit for Win 3.x



NoteWorthy Player 1.55a for Windows 95/NT



PH Player for Atari



Player PRO Direct-To-Disk 0.1b for Macintosh



PROcessu CD Player 2.02 for Windows 95/98/NT



Real Player G2 Update 1 for Windows 95



RealPlayer 5.0 for Windows 3.1



RealPlayer G2 for Windows 95



Shockwave 6 Flash Player for 68K for Macintosh



Simple CD Player 2.3 for Windows 95/98/NT



SSEYO Koan File Player V2.2 for Windows 95/98/NT



Streaming Audio Player 0.8 beta for Windows 95/98/NT



ThrottleBox Player 1.2 for Windows 95/98/NT



TitleTrack CD Player v2.1 for Macintosh PPC



True Speech Audio Player for Mac (PPC) for Macintosh



True Speech Audio Player for Mac 68k for Macintosh



Ugly CD Player 2.1 for Macintosh



Unreal Player Max 2.02 for Windows 95/98/NT



Upscale Pro MIDI Player for Windows 95



Variable Speed CD Player for Windows 95



Wired Planet player for Windows 95/98/NT

Xing MP3Player



ya cd player 2.5 for Macintosh



Yo!MPEG Player v1.0.2.79 for Windows 95/98



-----------------------

[1] See D3.6.4: in-house pros and cons; outsourcing pros and cons.

[2] See: Colorado Digitization Project, General Guidelines for Scanning, which apply correspondingly to sound documents.

[3] AC308 ACTS-DICEMAN: Distributed Internet Content Exchange using MPEG-7 and Agent Negotiations.

[4]

[5] [HyTime97] ISO/IEC JTC 1/SC18 WG8 N1920rev, “Information-Processing - Hypermedia/Time-based Structuring Language (HyTime) 2nd edition,” ed. Charles F. Goldfarb, Steven R. Newcomb, W. Eliot Kimber, Peter J. Newcomb. May 1997.

[6] ACTS-MoMuSys (Mobile Multimedia Systems)

[7] CUIDAD is a European Working Group coordinated by Ircam - Centre Georges Pompidou, gathering all institutions, industry partners and users interested in the field. ESPRIT project 28793.

[8] This section refers to: Obvious Audio Descriptors / Description Schemes; Source: MPEG-7 Audio Reflector; Nov. 1999.

[9] Reference Model for an Open Archival Information System (OAIS): Consultative Committee for Space Data Systems CCSDS 650.0-R-1 RED BOOK May 1999.

[10] For a demonstrator system of a digital sound archive see:

-----------------------

[Figure: audio streams - three sound files, each beginning at sample 0, with links between segments of different sound files; each segment is described by offset, duration, segment ID, optional links and a content description.]

[Figure: the modular archive model. Between the Music and Music Information Producer and the User Community the archive comprises five functional blocks: ACQUISITION (receiving information, quality control, descriptive information, digital audio workstations, co-ordination of updates); ARCHIVAL STORAGE (receive data, provide data, migrate media, management of storage, disaster recovery; storage media, backup media); DATA MANAGEMENT (database updates, database administration, query management, reports, cataloguing; database); ACCESS (dissemination, delivery, data transfer control, reports; user interface); ADMINISTRATION (acquisition and access agreements, IPR management, archive standards, system configuration, physical access control, archival information updates, user support, user monitoring).]
