KEYWORDS: Digital libraries, user-centered system design, evaluation, needs assessment.
By work-centered, we mean a digital library designed to support the work of the users, in this case, environmental planning. These services are distinguished from those required of other forms of digital libraries, such as those found in education or entertainment, in that the key decisions about the system, and evaluation of it, are based on supporting the users' work tasks. In our case, the users are from diverse organizations but are united by their goal of environmental planning. This project is unusual among digital library research projects in its focus on information in support of public policymaking. The potential users and uses of the system are extremely varied and the applications are complex and of substantial practical significance.
Our primary goals are to provide a coherent, content-based view of a diverse distributed collection which will scale to very large collections and large numbers of clients and servers, and to improve data acquisition technology. We are addressing these problems by research focussing on:
The testbed is being implemented on two tracks. A low-tech system consists of a search engine and data available over the World Wide Web. The corpus at this writing consists of photographs and both ASCII and OCRed versions of scanned documents, primarily state publications. The combination of OCRed text and scanned page images allows users both to manipulate the text and to view the original document, complete with images, tables, and other data that do not translate easily during OCR.
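The pairing of OCRed text with scanned page images can be pictured with a minimal sketch. It is an illustration only, not the testbed's implementation: the `Page` record, its field names, the `find_pages` helper, and the sample file paths are all hypothetical. The search runs over the OCR text, and each hit carries a pointer back to the page image so that the original layout can be viewed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """One scanned page: the original image plus its OCR transcription."""
    doc_id: str       # hypothetical document identifier
    number: int       # page number within the document
    image_path: str   # scanned image, preserving tables, maps, and figures
    ocr_text: str     # machine-readable text, which may contain OCR errors


def find_pages(pages, term):
    """Return the pages whose OCR text contains the term (case-insensitive)."""
    needle = term.lower()
    return [p for p in pages if needle in p.ocr_text.lower()]


# Invented sample records, for illustration only.
pages = [
    Page("bulletin-a", 1, "bulletin-a/0001.tif", "Delta smelt habitat and water quality"),
    Page("bulletin-a", 2, "bulletin-a/0002.tif", "Levee maintenance schedule"),
]

# Search the text, then hand back the images of the matching pages.
hits = find_pages(pages, "SMELT")
image_paths = [p.image_path for p in hits]
```

A simple substring match stands in here for whatever search engine is actually used; the point is only that the text supports searching while the image preserves the original page.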
The current search engine for textual materials is based on the Dienst protocol, with full-text searching to be added shortly. The photos are accessed via a geographical browser. This version is publicly available at http://elib.cs.berkeley.edu.
The second track is a high-tech system implemented on an Illustra database, accessed via a geographical browser and innovative text searching based on natural language processing. The data includes diverse georeferenced (that is, referring to a geographical area) datasets as well as text and still images of several different kinds, and ultimately video as well. Functions of the high-tech version will migrate to the low-tech version as feasible.
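As a rough illustration of what georeferencing makes possible, the sketch below attaches a bounding box to each item and selects the items whose extent overlaps the area shown in a map view. It is a simplification under stated assumptions, not the Illustra-based design: the `BBox` and `Item` types, the `items_in_view` helper, and all titles and coordinates are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Axis-aligned bounding box in decimal degrees (assumed representation)."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other):
        """True when the two boxes overlap or touch."""
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)


@dataclass(frozen=True)
class Item:
    title: str
    kind: str      # for example "dataset", "text", or "image"
    extent: BBox   # the geographical area the item refers to


def items_in_view(items, view):
    """Return the items whose extent overlaps the current map view."""
    return [item for item in items if item.extent.intersects(view)]


# Invented items and coordinates, for illustration only.
collection = [
    Item("Dataset A", "dataset", BBox(-122.0, 37.6, -121.2, 38.6)),
    Item("Photo set B", "image", BBox(-120.2, 38.9, -119.9, 39.3)),
]

view = BBox(-121.8, 37.9, -121.4, 38.2)
visible = items_in_view(collection, view)
```

Because every kind of object carries the same extent field, one geographical query can return datasets, text, and images together, which is the appeal of a geographical browser over a heterogeneous collection.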
The project is proceeding by means of an iterative design process with substantial attention to the needs of users in the areas of content, resource discovery and retrieval methods, document analysis, interface design, and browsing.
This paper discusses the user needs assessment and evaluation component of the project: its underlying premises, methods, and initial findings. We are approximately six months into a four-year project, so these are the early findings of a developing project.
The San Francisco Bay Delta is a linchpin in the state water system. Two-thirds of the state's population get their water via the SF Bay Delta, which consists of over 50 man-made islands, a thousand miles of levees, and hundreds of miles of meandering waterways where fresh river water and salt water from the Bay come together. The Delta is also a fragile ecosystem that provides habitat for hundreds of species of fish, waterfowl, mammals, and plants while supporting extensive agriculture and recreation.
Environmental planning is complex and so are environmental documents. An environmental report for a part of the Delta may address waterways, water quality, endangered species, recreation, economic development, land use, agriculture, soil, transportation and utility infrastructures, shipping, flood control, flood insurance programs, political structure, legislation, ecosystem protection, and history. It will include text, maps, charts, graphs, tables, and photos.
Issues recur over time and across geographical areas and therefore across planning initiatives and documents. The Delta smelt, for example, an endangered species of fish, spends different parts of its lifecycle in different places and so will turn up in planning documents covering several areas. For the smelt to survive, these analyses have to be shared and the plans have to be coordinated.
The California Department of Water Resources (DWR) is the state agency with major responsibility for water planning. The DWR coordinates its efforts with a host of other agencies, ranging from the federal Bureau of Reclamation to the Environmental Protection Agency; other state agencies; and local agencies, ranging from water districts to mosquito abatement districts. Other key water planning stakeholder groups include environmental, agricultural, and industry groups.
DWR is also engaged in public education. California suffers from a chronic, mammoth, and growing undersupply of water, even in flood years. DWR's mission includes educating the public about the importance of water and its proper use by such means as developing curriculum materials for schools.
Public policymaking and planning, especially on an issue as critical as water, is information-intense. Water planning requires forecasting supply and demand; developing and modelling alternatives for the management of water supply and demand; and forecasting their costs and impacts. It must take into account the underlying science and the environmental context, past conditions and outcomes, sophisticated modelling of complex systems, and public policy priorities.
Water planning in California is a highly consultative process. It includes extensive analysis by state agency staff and others; exhaustive review and analysis by other stakeholders; and protracted public discussion.
State agencies are not the only producers of environmental information. Local and federal agencies also produce it, and many environmental and industry groups monitor and critique the state's work and develop their own data, analyses, and interpretations.
A major goal of the UC Berkeley Electronic Environmental Library project is to support this planning process. Its goals include providing effective access to existing information, reducing the duplication of effort in data collection and analysis, improving the coordination of planning across projects, enhancing the interagency and public review process, and aiding in the dissemination of the large amount of information generated as part of the planning process.
In summary, some key aspects of California water planning that impact digital libraries:
A second goal is to explore how the methods of user-centered system design, usability testing, and library evaluation can be combined, adapted, and extended to the design and evaluation of digital libraries. It is our contention that the digital library combines the characteristics of its precursors -- libraries, electronic information retrieval systems, and computer systems that support work -- to create new and interesting problems of design and evaluation which require new methods for design and evaluation, or at least a rethinking of existing methods.
Usability has been defined as "[a system's] capability in human functional terms to be used easily and effectively by the specified range of users, given specified training and support, to fulfill a specified range of tasks, within the specified range of environmental scenarios" (Shackel quoted in [dillon94], p. 14). This definition stresses the importance of context. Information-seeking can best be understood as a means toward a user-defined end of performing some higher-order task. A digital library must ultimately be evaluated according to how well it supports the users' tasks, and so users are key partners in making that assessment.
The usual approach to user-centered system design is an iterative process similar to the following, derived from Dillon [dillon94]:
User requirements analysis for a work-centered digital library serving a group much larger and more diverse than a single organization or work group must look well beyond task analysis. We have defined four levels of analysis at which user needs assessment and evaluation must operate:
Different data collection methods are appropriate for different levels of analysis; for example, to understand the environmental planning process we are using unstructured interviews; sense-making approaches [dervin83] are useful for understanding information acts; and many standard usability methods are appropriate for evaluating system use.
Some early findings from our interviews reinforce our initial focus on the larger context rather than on information system preferences. Our interviews indicate that information search and retrieval are not particularly salient to many of our subjects. Their primary focus is their tasks. They reflect little on their own information seeking and use: they use the tools to which they are accustomed, and rely largely on interpersonal information channels. However, when they see the potential of new tools to improve their work, some are eager to adopt them.
Users' evaluations of services are based on expectations, which are in turn based on prior experience [parasur90]. The library evaluation literature has found a low level of expectation among naive users [vanhouse93] and a tendency toward uncritical acceptance of what expert evaluators rate to be low levels of service. User evaluations of digital libraries or their ideas about possible information products and services, therefore, while useful, are an inadequate basis for the design of digital libraries. Hence evaluators and designers must consider but go beyond users' expectations and suggestions.
Our data collection methods include:
Professional and disciplinary communities and government agencies each have their own schemas for understanding the world and their task domains [elio90]. And the public is likely to differ from these experts. Because water planning is a complex, on-going process, regular participants are likely to engage in a joint sense-making process that results in a shared view of the world and their tasks that differs from the public's [harris94]. How do we design digital libraries to accommodate these differences? Whose schemas are reflected in the system's design?
Design is always a political process, with differing interests and priorities [greenbaum91]. Evaluation is likewise political [childers93]. Environmental planning is of course intensely political. Multiple constituency models of organizational effectiveness [zammuto84] may provide some guidance, but there is no easy solution for adjudicating among different groups' needs. Can we design digital libraries to satisfy both subject experts and novices? Organizational insiders and outsiders?
The standard approach of setting performance targets and evaluating the system against them is also problematic. Neither the applications for which the system will be used nor the level of performance deemed acceptable is fixed. Users' needs and expectations develop along with the system, and their experience with this and other systems will affect their expectations and evaluations [parasur90]. The technology of digital libraries is continually changing. Evaluation needs to be fluid and dynamic. And yet the system builders need design targets.
In the technical and political context of environmental information, conflicts arise over the appropriateness of including some information. The quality of some data may be challenged. Interest groups' analyses may differ substantially from those of government agencies. Individuals with varying levels of scientific and political legitimacy may seek to contribute to the corpus as a means of airing their views. The decisions about what information to include and what to exclude give rise to debates over ownership, control, censorship, and public participation in policymaking.
Current trends in planning in California will probably make the digital library even more useful. The state is working deliberately to make the process of water planning more open, with earlier public involvement, and with more emphasis on coordination across many smaller projects rather than large state initiatives. The digital library is a potentially valuable tool in making this process more open, dynamic, responsive, and transparent.
We may see subtle and complex changes in tasks and work processes as well as the higher-level planning functions. Ruhleder [ruhleder94] warns that information systems researchers must understand the effects of information systems on the codification of data used to accomplish tasks and the relations between users and their tools, techniques or systems for accessing and interpreting data. She points out that media, thought, artifacts, and work processes are deeply intertwined. Because environmental planning is information-intense, we expect changes in tools will change work practices. For example, environmental planning is profoundly influenced by, on the one hand, the existence of many complex time-series datasets, and, on the other, the logistical complexities of acquiring and using data from different sources and in different formats. Improved access to these data will likely change how they are used.
One reason for this behavior appears to be the nature of the information need. Workplace users, at least these users of environmental information, want to retrieve information rather than documents per se. For example, they may want to know the total demands on water of a given river. This information may be in the files of an individual, a dataset, or a document. Their need is not for a document but for an answer to a question. In library terms, it is a reference question, not a document request. Users need powerful, complex retrieval and analysis of heterogeneous objects. Our users are enthusiastic about Textiles [hearst93] because it allows analysis and searching at the level of topic rather than document.
Planning and analytical work consists of a continuum or flow, with reports as products that instantiate the work at a point in time. A work group needs to be able to integrate existing internal bodies of multi-datatype documents with external sources. They are continually creating new materials, requiring differing degrees and functionality of external access. Flexible authoring, structuring, and delivery mechanisms are required.
Finally, the digital library needs to be integrated into and augment the users' established work practices. It must interoperate with the work groups' other systems.
The digital library combines the characteristics of its precursors -- libraries, electronic information retrieval systems, and computer systems that support work -- to create new and interesting problems of design and evaluation which require new methods for design and evaluation, or at least a rethinking of existing methods.
This paper draws on the work and insights of a multidisciplinary group that includes Prof. Robert Twiss of the UC Department of Landscape Architecture, and Mark Butler, Lisa Schiff, and Gloria Stockton of the School of Library and Information Studies. Gary Darling of the California Resources Agency, Ray MacDowell of the California Department of Water Resources, and numerous employees of the Dept. of Water Resources have been invaluable in helping us to understand their work. The many other members of the UC Berkeley Digital Libraries project have also contributed to the work described here.
[adler92] Adler, Paul S., and Terry A. Winograd. The usability challenge. In Adler, Paul S., and Terry A. Winograd, eds. Usability: Turning Technologies into Tools. Oxford University Press, NY, 1992, 3-14.
[brown92] Brown, John Seely and Paul Duguid. Enacting design for the workplace. In [adler92], 164-197.
[childers93] Childers, Thomas, and Nancy A. Van House. What's Good: Describing Your Public Library's Effectiveness. American Library Association, Chicago, 1993.
[dervin83] Dervin, Brenda. An overview of sense-making research: concepts, methods, and results to date. International Communication Association Annual Meeting, Dallas, May, 1983.
[dillon94] Dillon, Andrew. Designing Usable Electronic Text: Ergonomic Aspects of Human Information Usage. Taylor and Francis, Inc., Bristol, PA, 1994.
[elio90] Elio, Renee and Peternela B. Scharf. Modeling novice-to-expert shifts in problem solving strategy and knowledge organization. Cognitive Science 14 (1990), 579-639.
[greenbaum91] Greenbaum, Joan and Morten Kyng. Design at Work: Cooperative Design of Computer Systems. Lawrence Erlbaum Associates, Hillsdale, NJ, 1991.
[harris94] Harris, Stanley G. Organizational culture and individual sensemaking. Organization Science 5,3 (August 1994), 309-321.
[hearst93] Hearst, M. and Plaunt, C. Subtopic structuring for full-length document access. In the 16th Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, Pittsburgh, 1993, 59-69.
[nielsen94] Nielsen, Jakob. As they may work. Interactions 1.4 (1994), 419-24.
[parasur90] Parasuraman, A., Leonard L. Berry, and Valarie A. Zeithaml. An Empirical Examination of Relationships in an Extended Service Quality Model. Marketing Science Institute Working Paper, Report 90-122, Cambridge, MA, 1990.
[ruhleder94] Ruhleder, Karen. Rich and lean representations of information for knowledge work: the role of computing packages in the work of classical scholars. ACM Transactions on Information Systems 12, 2 (1994), 208-30.
[suchman91] Suchman, Lucy A. and Randall H. Trigg. Understanding practice: video as a medium for reflection and design. In [greenbaum91], 65-90.
[sweeney93] Sweeney, M., M. Maguire, and B. Shackel. Evaluating user-computer interaction: a framework. International Journal of Man-Machine Studies 38 (1993), 689-711.
[vanhouse93] Van House, Nancy, and Childers, Thomas. The Public Library Effectiveness Study. American Library Association, Chicago, 1993.
[zammuto84] Zammuto, Raymond F. A comparison of multiple constituency models of organizational effectiveness. Academy of Management Review 9 (1984), 606-616.