The Challenges to Designing Viable Digital Libraries

Michael Ribaudo[1], Colette Wagner[1], Michael Kress[2], and Bernard Rous[3]

[1] Office of Instructional Technology, The City University of New York (CUNY), 555 West 57th Street--16th Floor, New York, New York, USA, 10019, {RIBBH, CAWBH}@CUNYVM.CUNY.EDU

[2] Computer Science Department, College of Staten Island/CUNY, 2800 Victory Boulevard, Staten Island, New York, New York, USA, 10314, MKRSI@CUNYVM.CUNY.EDU

[3] The Association for Computing Machinery (ACM), 1515 Broadway, New York, New York, USA, 10036, ROUS@ACM.ORG

Abstract

The education and information communities in the United States are facing a crisis in the distribution and archiving of information. The creation of the electronic superhighway promises an infrastructure allowing for solutions to the present crisis. This promise must be put to a rigorous test to determine whether the digital storage and dissemination of information along the Internet can cost-effectively provide access to information for the exponentially growing user community. Beyond the technological research and implementation that need to be accomplished, the challenge is to master the business of information distribution just as we have begun to tackle the technical issues involved. Six specific areas of research critical to the design of viable digital libraries are identified:

(1) Designing a Technical Infrastructure and Establishing the Viability of the Internet as a Delivery Medium;

(2) Making Electronic Documents Available to People with Disabilities;

(3) Developing a Business Model for Electronic Publishing;

(4) Building a Production Model for Electronic Publishing;

(5) Developing Prototype Electronic Publications and Access Tools; and

(6) Resolving Intellectual Property Issues.

Keywords: Digital libraries -- issues, digital libraries -- access, digital libraries and electronic publishing.

Introduction

The education and information communities in the United States are facing a crisis in the distribution and archiving of information. This crisis is being generated by a combination of evolutionary effects: the explosion of technological advances, and the economic and procedural constraints that limit the access to information that is now being demanded by a growing proportion of the American population. Our education and library systems have traditionally been in the forefront of providing information access to the nation's students and to the general public at large; it is these systems that are currently facing involution as they straddle the large gap between the print-based distribution system and the growth of the new information industry.

At academic and library institutions nationally, journal subscriptions are being canceled and collections are quickly becoming outdated as economic realities strain the delicate balance between expenditures on the rising costs of printed material and the necessary technological upgrading now expected by the user community. While printed materials remain the mainstay of academic endeavors, much research now requires dissemination in formats that include graphics and visual representations of variable processes or theoretical constructs [5]. As a result, and as a study commissioned by the Mellon Foundation confirms, "the traditional library's mission of creating and maintaining large self-sufficient collections for their users is being threatened" [2].

The creation of the electronic superhighway promises an infrastructure allowing for solutions to the present crisis. However, this promise must be put to a rigorous test to determine whether the digital storage and dissemination of information along the Internet can cost-effectively provide access to information for the exponentially growing user community. Advances are proceeding quickly, and we must take the time to evaluate them carefully. The World Wide Web, for example, already links computers all over the world in a hypermedia environment that allows for the instantaneous global dissemination of text, graphics, animation, video, and software. The academic community and the general public are embracing this wealth of information and aggressively seeking more information and new ways to obtain and use it. What is now needed is a comprehensive discipline-based technical information service that will serve to knit professional communities together and provide solid directories of available resources.

Yet electronic publishing and dissemination are in their infancy. In 1992, Jul [4] noted that while there were 100,000 or more print journals world-wide, there were only about 30 electronic journals and about 60 newsletters and digests published over the Internet. While CICNet reports that there were approximately 700 electronic journals in 1994, there can be no doubt that electronic publishing is still in the cradle.

Beyond the technological research and implementation that need to be accomplished, clearly the major stumbling block is economic. To date, the electronic universe has operated on the model of a cost-free public library. Users can browse through stacks and take home what they choose for little or no money. This model works well when the amount of information available is relatively small and the number of people taking advantage of the resource is also small. This same model, however, has limited the amount of information adapted to the Internet. Costs cannot be recovered if the information is free or dependent on the goodwill of users to reimburse production costs. The problem is not primarily technical. It is, instead, a problem of determining sound business practices and the needs and expectations of users. The challenge is now to master the business of information distribution just as we have begun to tackle the technical issues involved. As Hawkins [3] notes, the electronic libraries that we develop must provide universal access to information on a cost-effective basis, with the cost of access borne by institutions, not individuals.

The primary components of the distribution puzzle are production, distribution and marketing. Also involved are the behavioral and legal issues that arise in the new context of the information superhighway. Data storage versus on-line delivery, file formats, network topology and protocols, load-leveling, authentication of documents and quality maintenance, archiving, user interfaces and the various needs of user communities are all questions that need to be addressed. In addition, the collection of payments, use monitoring, pricing/tariffing, and the creation of an economic model need to be considered alongside the creation of available information. As Jul [4] notes, these are not new questions: "Traditional library services and functions--acquiring, cataloging, storing, and retrieving documents and providing reference services--are sure to arise within the Internet, albeit in new and different ways. When editors, authors and readers can access and use electronic publications with greater ease, this form of publication is sure to find greater acceptance and use. Until then, the brave and the hearty among us continue to reduce the barriers that, ironically, accompany electronic publication on the Internet."

To ensure viable digital libraries, comprehensive projects that address these issues in a grounded environment are essential. As the information available on the Internet expands, solutions to the nagging problems that have limited the usefulness of the electronic highway should be sought.

We suggest the following areas of activity for determining the critical design aspects of viable digital libraries:

(1) Designing a Technical Infrastructure and Establishing the Viability of the Internet as a Delivery Medium;

(2) Making Electronic Documents Available to People with Disabilities;

(3) Developing a Business Model for Electronic Publishing;

(4) Building a Production Model for Electronic Publishing;

(5) Developing Prototype Electronic Publications and Access Tools; and

(6) Resolving Intellectual Property Issues.

Activity 1

Designing a Technical Infrastructure and Establishing the Viability of the Internet as a Delivery Medium

As a prerequisite to the growth of digital library collections and the electronic publishing industry on the Internet, the viability of the Internet as a delivery medium must be determined. An effective technical infrastructure for the optimized use of networked resources must be designed and built in multiple testbed environments. Four areas of research are essential to this design:

1. constructing a model of network topologies that support the use of digital library resources;

2. assessing and developing algorithms for load-leveling and dynamic binding in a truly distributed network environment;

3. investigating compression technologies and storage media that can relieve network traffic, enhance application performance and strengthen the economic viability of the electronic publishing process; and

4. determining to what extent the Internet provides necessary network speeds and bandwidth to support user access to large electronic publication databases and exploring alternative or supplementary solutions.
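As a rough illustration of the load-leveling question in item 2, the sketch below binds each incoming request to whichever replica currently carries the fewest active requests. It is only one simple policy among many, not a method proposed in this paper; the replica names, the starting loads, and the helper functions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """A hypothetical server holding a copy of a publication database."""
    name: str
    active_requests: int = 0


def pick_replica(replicas):
    """Return the replica with the fewest active requests.

    Ties go to the replica listed first, because min() keeps the
    first minimal element it encounters.
    """
    return min(replicas, key=lambda r: r.active_requests)


def dispatch(replicas):
    """Bind a new request to the least-loaded replica and record the load."""
    chosen = pick_replica(replicas)
    chosen.active_requests += 1
    return chosen.name


# Invented starting loads, for illustration only.
replicas = [Replica("site-a", 3), Replica("site-b", 1), Replica("site-c", 2)]
assignments = [dispatch(replicas) for _ in range(4)]
```

A testbed study would replace the request counter with measured quantities such as response time or link utilization, and would compare this policy against alternatives.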

Activity 2

Making Electronic Documents Available to People with Disabilities

In general, much research must be done on designing publications and services to meet user needs. People with disabilities are a sector of the user population who require specially designed peripherals and interfaces in order to utilize the wealth of text, sound, graphics and motion that can be accessed electronically over the Internet. Guidelines for providing access tools that will serve this population are essential.

Much work has been done in making computer information available to people with disabilities [1]; what is needed is to integrate the disparate research activities and create a model user interface to electronic documents for persons with disabilities that can be implemented at libraries, schools, colleges and research centers. Many of the problems associated with the delivery of information to persons with visual impairments can be solved with existing technology:

*scanners with optical character recognition software, Braille printers, and speech access systems for those who do not read Braille can be used to read conventional printed or displayed text;

*enlarged display screens are available for those with lesser degrees of impairment;

*graphics and text files can be prepared and programmed beforehand for home or office processing of graphics for use on audio-tactile tablets; and

*hypermedia techniques can be used to give easy access to text information.

However, the delivery of motion video to blind and visually impaired individuals presents special challenges. Descriptive video techniques should be investigated and a prototype system incorporating electronic documents using audio, voice, music, and sound descriptions must be designed.

Graphic display is a major presentation element in the publication of mathematical and scientific formulas in print. Because no existing optical character recognition program is capable of reading formulas accurately, individuals with visual impairments need specially prepared software to reduce their dependence on sighted readers.

The efforts of the International Committee for Accessible Document Design to establish standards for document preparation independent of eventual publication formats must be embraced and supported so that electronic information can be disseminated simultaneously in a number of different text formats: conventional print, Braille, and voice, to name but a few. Also, graphics will have at least these formats: conventional print, embossed plates, computer screen, and descriptive voice explanations.

Material and information based on visual perception, such as text, graphics, and visuals, do not present a problem to people who are deaf or hearing impaired. However, information based on sound requires special attention to ensure equal access by providing textual description of the voice and sound presented in the multimedia components of a document. Research on voice recognition technology as a seamless method of voice-to-text conversion is also required.

Activity 3

Developing a Business Model for Electronic Publishing

It is imperative to understand how traditionally published information sources will be incorporated into the digital library collection. Yet, the publishing industry, like the rest of society, is struggling to formulate its own agenda. The effects of moving from print to electronic media must be studied and the effect of electronic sources on scholarship, learning, and reading habits must be explored. We must understand changes in the way people take in and use electronic information in order to devise workable business models for electronic publishing and for the development of digital library collections that include documents by newly empowered author/publishers as well as by the traditional publishing establishment.

A critical first step is to study these user patterns and preferences in both the business and scientific communities and to identify clear user interfaces and appropriate data representation formats that allow for effective packaging and presentation of information. Through focus group research, we will be able to identify the direction needed to develop appropriate access and delivery models. A carefully constructed set of focus groups representing five important constituencies is essential. These constituent focus groups include:

*A Scholars' Focus Group that will explore such issues as: establishing criteria to measure the shifting patterns of electronic information used by scholars and others; the impact of electronic publishing on author/publisher relationships and on the peer review process; and changes in the nature and use of scholarly publications.

*A Corporate Group that will examine: changing patterns in the corporate use of scientific and technical information; the management of information systems for corporate researchers, and what is desired from publishers in terms of format delivered, rights desired, and preferred charging method; collaborative use of academic and technical information; and shared resources.

*An Advertising Group that will explore the potential of electronic advertising and the impact of interactive delivery of advertisements and promotions. These are important areas of exploration because key revenue streams from print are threatened in electronic publications, and new advertising services must be investigated.

*A Librarians' Group that will explore issues of data access and retrieval; the relationship between libraries and campus computing centers; and the impact of these new methods of information storage and retrieval on budget.

*A Students' Group that will explore issues related to enhanced learning and growth through more efficient use of technology and state-of-the-art scientific models; the type of browsing, search, and navigation tools desired; and what sort of publications or information collections are most useful.

Activity 4

Building a Production Model for Electronic Publishing

A crucial factor in the existence of digital libraries is the creation of a strong and adaptive technical infrastructure to support the production of electronic publications. Developing models of seamless electronic publishing processes that can be widely adopted is a necessary condition to ensure that digital libraries are sustainable over time. The goal must be a seamless "womb-to-tomb" production environment--i.e., a single digital stream that flows naturally from author origination through electronic peer review and electronic notification of disposition, through on-line editing and composition for optical print or display, to final archiving in a digital library with a variety of distribution and access options.
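The single digital stream described above can be pictured as an ordered sequence of stages through which a manuscript passes without leaving electronic form. The following is a minimal sketch under that reading; the stage names paraphrase the paragraph, and the functions are hypothetical rather than part of any existing system.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Stages of the production stream, in the order the text lists them."""
    AUTHOR_ORIGINATION = 1
    PEER_REVIEW = 2
    DISPOSITION_NOTICE = 3
    EDITING_AND_COMPOSITION = 4
    ARCHIVED = 5


def advance(stage):
    """Move a manuscript to the next stage; archiving is the final stage."""
    if stage is Stage.ARCHIVED:
        raise ValueError("an archived manuscript has no further stage")
    return Stage(stage + 1)


def run_pipeline(start=Stage.AUTHOR_ORIGINATION):
    """Return every stage a manuscript visits on its way to the archive."""
    history = [start]
    while history[-1] is not Stage.ARCHIVED:
        history.append(advance(history[-1]))
    return history


history = run_pipeline()
```

A real production model would add branches that this linear sketch omits, such as rejection or a request for revision after peer review.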

Activity 5

Developing Prototype Electronic Publications and Access Tools

In order to develop electronic publications that meet user needs, in vitro studies that examine user patterns, actual costs, market demand and relative elasticity must be conducted and widely reported. These studies must be designed to provide a testbed for solving existing problems in the traditional publication process--e.g., print backlogs, providing more timely access to scholarly information by accelerating the peer review process, etc. Testing and evaluation of easy tools for search, retrieval, and display of information must also be undertaken in conjunction with these in vitro studies.
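To make the notion of elasticity concrete, the sketch below computes the midpoint (arc) price elasticity of demand from two observations of price and usage. This is the standard textbook formula, offered only as one way such a study might summarize its data; the prices and request counts are invented for illustration.

```python
def arc_elasticity(price_before, quantity_before, price_after, quantity_after):
    """Midpoint (arc) price elasticity of demand between two observations.

    Each percentage change is measured against the average of the two
    values, so the result does not depend on which observation is
    treated as the starting point.
    """
    pct_quantity = (quantity_after - quantity_before) / (
        (quantity_after + quantity_before) / 2
    )
    pct_price = (price_after - price_before) / ((price_after + price_before) / 2)
    return pct_quantity / pct_price


# Invented figures: a per-article fee rises from 10.00 to 12.00 and
# monthly requests fall from 1200 to 1000.
elasticity = arc_elasticity(10.0, 1200, 12.0, 1000)
```

With these invented figures the two percentage changes are equal in size and opposite in sign, so the elasticity is approximately -1 (unit elastic): the fee increase is offset almost exactly by the drop in use.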

Activity 6

Resolving Intellectual Property Issues

Paramount to the viability of digital library collections is placing the resolution of intellectual property issues as they relate to information contained in electronic format high on our national agenda. Until and unless we resolve these issues with a set of clear guidelines for ownership and use of electronic information (including the relationship between electronic information and information contained in other, more traditional formats), the development of digital libraries will be hobbled and the users of information on the nation's electronic superhighway will be the ultimate victims.

Conclusion

The facts are simple. As a society, we are enamored with the electronic superhighway and its shining promises. Our appetite for electronic information is voracious, and we are striving to sate it by creating a melange of glorious new sources, scoring exciting new technological advances along the way. The challenge to sustaining this new world is as exciting as its discovery. The frontier that is the electronic superhighway must be tamed and shaped in order to achieve the collective vision of sustainable digital libraries--growth must be channeled and planned, chaos must be checked by the creation of standards, and limits must be identified and overcome.

References

[1] Brown, Carl, 1992. Assistive Technology Computers and Persons With Disabilities, Communications of the ACM, Vol. 35, Number 5.

[2] Cummings, Anthony M., et al., 1992. University Libraries and Scholarly Communication: A Study Prepared for the Andrew W. Mellon Foundation.

[3] Hawkins, Brian L., 1994. Planning for the National Electronic Library. EDUCOM Review, May/June, 19-29.

[4] Jul, Erik, 1992. Of Barriers and Breakthroughs (Electronic Publishing), Computers in Libraries, March.

[5] Yavarkovsky, Jerome, 1990. A University-based Electronic Publishing Network, EDUCOM Review, 25.