
STUDENT RATE AVAILABLE for a limited number of hotel rooms! First come, first served; $79/night plus tax. Email Cathy Larson to request a room.

ONLINE REGISTRATION IS CLOSED. PLEASE REGISTER ONSITE!

Are you planning to schedule a committee or other meeting during JCDL 2004?
To ensure meeting room availability, please contact Cathy Larson as soon as possible. Space is available on a first-come, first-served basis and is limited, so reserve early!


Tutorials & Workshops

JCDL 2004 offers a range of fresh, relevant Tutorials and Workshops from
which participants may choose. Tutorials present a single topic in detail
over either a half-day or a full day. Workshops are intended to draw
together communities of interest in a new or emerging issue and provide a
forum for discussion and exploration, over a half or full day.

 

  • Tutorial 1: Introduction to Digital Libraries
  • Tutorial 2A: Thesauri and ontologies in digital libraries (Part I):
    Structure and use in knowledge-based assistance to users
  • Tutorial 2B: Thesauri and ontologies in digital libraries (Part II):
    Design, evaluation, and development
  • Tutorial 3: Data Grids and workflows
  • Tutorial 4: Introduction to Fedora and Its Applications
  • Tutorial 5: Building Digital Library Collections with the Greenstone Librarian Interface
  • Tutorial 6: Evaluating Digital Libraries
  • Tutorial 7: Qualitative User Studies - Understanding Users in Action (CANCELLED)

 

  • Workshop 1: The Second Symposium on Intelligence and Security Informatics
  • Workshop 2: Global Reach and Impact of Digital Libraries Workshop
  • Workshop 3: How can reusable design guidelines improve the usefulness of educational digital libraries and collections?
  • Workshop 4: Designing for Diverse Audiences (CANCELLED)

 


Tutorial 1: Monday, June 7, 9:00 a.m. – 12:00 p.m. (half day)


Introduction to Digital Libraries

by Edward Fox

Abstract

This tutorial broadly explains, covers, and illustrates key aspects of the DL field. It uses the 5S framework (societies, scenarios, spaces, structures, streams – see paper in April 2004 ACM TOIS) to provide an intuitive but formally sound basis for understanding. It provides an overview of the field as used in the Fall 2003 Digital Libraries course taught by the instructor.

Objectives

After completing this tutorial, attendees will be prepared to benefit from the presentations at JCDL, including other tutorials. They will be able to explain “digital library”, to distinguish it from related fields, to describe key DL concepts using the 5S framework, and to appreciate the history and key results of the field. They will be aware of many well-known projects and systems.

Outline

This tutorial will start with an overview of definitions, foundations, scenarios and perspectives. It will cover a variety of issues, including:

• search, retrieval and resource discovery;
• multimedia/hypermedia;
• metadata (e.g., Dublin Core);
• electronic publishing; SGML and XML;
• document models and representations;
• database approaches;
• 2D and 3D interfaces and visualizations;
• architectures and interoperability (e.g., OAI); metrics;
• educational (e.g., CITIDEL, NSDL, NDLTD) and social concerns.

Case studies of projects, initiatives, and systems will illustrate key concepts, including:

• Computing and Information Technology Interactive Digital Educational Library (www.citidel.org)
• National Science Digital Library (www.nsdl.org)
• Networked Digital Library of Theses and Dissertations (www.ndltd.org)
• Open Archives Initiative (www.openarchives.org, www.dlib.vt.edu/projects/OAI/)
• Systems and approaches to building digital libraries (e.g., Open Digital Libraries)

Duration: Half day

Target Audience

All attending JCDL for the first time, as well as those interested in a refresher or different perspective on the field of digital libraries. Level of experience required: introductory. Those at intermediate or advanced levels could benefit as well, since the 5S framework has broad applicability for planners, designers, implementers, and evaluators.

Proposer’s Information

Dept. of Computer Science
Virginia Tech, Blacksburg, VA 24061 USA;
Tel: +1-540-231-5113 [direct], 6931 [dept.]; 230-6266 [mobile];
Fax: +1-540-231-6075 [CS]
Email: fox@vt.edu
URL: http://fox.cs.vt.edu/

Biography

Dr. Edward A. Fox holds a Ph.D. and M.S. in Computer Science from Cornell University, and a B.S. from M.I.T. Since 1983 he has been at Virginia Tech, where he serves as Professor of Computer Science. He is Chairman of the IEEE-CS Technical Committee on Digital Libraries. He directs the Digital Library Research Laboratory, Networked Digital Library of Theses and Dissertations, and Computing and Information Technology Interactive Digital Educational Library (CITIDEL). Fox is editor for the Morgan Kaufmann Publishers book series on Multimedia Information and Systems. He is co-editor-in-chief for ACM Journal of Educational Resources in Computing and was General Chair for JCDL'2001. He served as Program Chair for ACM DL'99, ACM DL'96, and ACM SIGIR'95. He was lead guest editor for Communications of the ACM special issues July 1989, April 1991, April 1995, April 1998, and May 2001. He has been (co)PI on over 85 research and development projects. In addition to his courses at Virginia Tech, Dr. Fox has taught over 55 tutorials in more than 20 countries. He has given more than 45 keynote/banquet/international invited/distinguished speaker presentations, over 90 refereed conference/workshop papers, and over 400 additional presentations.

 

Tutorial 2: Monday, June 7, 9:00 a.m. – 4:30 p.m. (full day)

Participants may sign up for Tutorial 2 for the full day, or may choose to sign up for either the morning session only (Tutorial 2A) or the afternoon session only (Tutorial 2B).

Tutorial 2A: Monday, June 7, 9:00 a.m. – 12:00 p.m. (half day)

2A: Thesauri and ontologies in digital libraries (Part I):
Structure and use in knowledge-based assistance to users


by Dagobert Soergel

Abstract:

This introductory tutorial is intended for anyone concerned with subject access to digital libraries. It provides a bridge by presenting methods of subject access as treated in an information studies program for those coming to digital libraries from other fields. It will elucidate through examples the conceptual and vocabulary problems users face when searching digital libraries. It will then show how a well-structured thesaurus / ontology can be used as the knowledge base for an interface that can assist users with search topic clarification (for example through browsing well-structured hierarchies and guided facet analysis) and with finding good search terms (through query term mapping and query term expansion using synonyms and hierarchic inclusion). It will touch on cross-database and cross-language searching as natural extensions of these functions. The tutorial will cover the thesaurus structure needed to support these functions: concept-term relationships for vocabulary control and synonym expansion, and conceptual structure (semantic analysis, facets, and hierarchy) for topic clarification and hierarchic query term expansion. It will introduce a few sample thesauri and some thesaurus-supported digital libraries and Web sites to illustrate these principles.
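As a rough illustration of the query term mapping and query term expansion mentioned above, the following Python sketch maps an entry term to its preferred concept and then expands it with synonyms and all narrower concepts. The toy vocabulary, the dictionary names, and the function expand_query are hypothetical and are not taken from the tutorial or from any particular thesaurus.

```python
from collections import deque

# Hypothetical toy thesaurus; every term and relationship here is illustrative.
PREFERRED = {  # query term mapping: entry term -> preferred concept
    "alcoholism": "alcohol dependence",
    "drug addiction": "substance dependence",
}
SYNONYMS = {  # concept -> synonymous terms
    "alcohol dependence": ["alcoholism"],
    "substance dependence": ["drug addiction"],
}
NARROWER = {  # concept -> directly narrower concepts
    "substance dependence": ["alcohol dependence", "opioid dependence"],
}


def expand_query(term):
    """Map a term to its preferred concept, then add synonyms and narrower concepts."""
    start = PREFERRED.get(term, term)

    # Hierarchic inclusion: walk the narrower-concept links breadth first.
    concepts = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current in concepts:
            continue
        concepts.append(current)
        queue.extend(NARROWER.get(current, []))

    # Synonym expansion: add every synonym of every collected concept.
    terms = []
    for concept in concepts:
        for candidate in [concept] + SYNONYMS.get(concept, []):
            if candidate not in terms:
                terms.append(candidate)
    return terms


print(expand_query("drug addiction"))
```

A search interface could then OR the returned terms together, so that a user who types an entry term also retrieves documents indexed under the preferred term or any narrower concept.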


Objectives:

Participants should appreciate the complexity of subject access and understand the problems that a thesaurus/ontology can help solve.

Participants should understand the principles of thesaurus/ontology structure.

Participants should be able to apply thesaurus/ontology structure to solving subject access problems.

 

Outline:

Thesaurus functions
Introduction. Challenges for digital libraries
Why thesauri: a first look with examples
What is a thesaurus? A first look with examples
Thesaurus functions
Thesaurus structure
Concept-term relationships
Conceptual structure: Semantic analysis and facets. Hierarchy
Implementing thesaurus functions
Evaluation of thesauri
Resources

Examples of classifications and thesauri
Alcohol and Other Drug Thesaurus (AOD Thesaurus)
Medical Subject Headings (MeSH) and Unified Medical Language System (UMLS)
Art and Architecture Thesaurus (AAT). (Getty Foundation)
Dewey Decimal Classification. (US Library of Congress & OCLC/Forest Pr)
WordNet (Princeton University, George Miller)
CYC Ontology

Duration: Half-day (morning)

Proposer Information:

Dagobert Soergel
College of Information Studies
Univ. of Maryland, College Park, MD 20742
Office: (301) 405-2037
Fax: (301) 314-9145
Mobile: 703-585-2840
Email: dsoergel@umd.edu
URL: www.clis.umd.edu/faculty/soergel/

Biography:

Dagobert Soergel holds an MS equivalent in mathematics and physics (1964) and a PhD in political science (1970), both from the University of Freiburg, Germany. He is Professor of Information Studies at the University of Maryland, where he teaches courses in information retrieval, thesaurus development, expert systems, and information technology, and he also works as an information systems consultant. He has been a visiting professor at the universities of Western Ontario, Chicago, and Konstanz, Germany. He has authored numerous papers and several books, including Organizing Information (1985), which received the American Society of Information Science Best Book Award, and Indexing Languages and Thesauri: Construction and Maintenance (1974). He has designed several thesauri, most recently the Alcohol and Other Drug Thesaurus http://etoh.niaaa.nih.gov/AODVol1/Aodthome.htm (for which he chairs the advisory committee) and the Harvard-Stanford Business Thesaurus (under development). He is developing TermMaster, a thesaurus management software package. In 1997 he received the American Society of Information Science Award of Merit.


Tutorial 2B: Monday, June 7, 1:30 p.m. – 4:30 p.m. (half day)

2B: Thesauri and ontologies in digital libraries (Part II):
Design, evaluation, and development


by Dagobert Soergel


Abstract:

This tutorial is intended for people who have a basic familiarity with the function and structure of thesauri and ontologies. It will introduce criteria for the design and evaluation of thesauri and ontologies and then deal with methods and tools for their development: locating sources; collecting concepts, terms, and relationships to reuse existing knowledge; developing and refining thesaurus/ontology structure; software and database structure for the development and maintenance of thesauri and ontologies; standards such as RDF and TopicMaps; collaborative development of thesauri and ontologies; developing crosswalks / mappings between thesauri/ontologies. In summing up, the tutorial will address the question of the amount of resources needed to develop and maintain a thesaurus or ontology.


Objectives:

Participants should have a good grasp of what is involved in developing a thesaurus or ontology so they can judge or supervise thesaurus/ontology development projects and so they have a basis for the further development of skills to actually do thesaurus/ontology development.

Participants should be able to design and evaluate thesauri and ontologies, applying proper criteria and methods.

Participants should be able to locate pertinent existing thesauri and ontologies.

Participants should be able to extract pertinent information (terms, concepts, relationships) from existing thesauri and ontologies.

Participants should be able to develop a systematic structure for the domain of the thesaurus/ontology.

Participants should be familiar with methods and tools for developing thesauri and ontologies as the basis for acquiring the skills of using these tools.


Outline:

Introduction and overview
The process of thesaurus construction
The overall process of thesaurus construction
Sources of concepts, terms, relationships, definitions
Methods of data collection
Merging data from many sources

Developing the conceptual structure
Facet analysis 1: Education (starting with classes from DDC)
More facet examples: Yahoo Education, job titles (10 min)
Developing the conceptual structure, continued
Hands-on facet exercise (in pairs)
Principles for meaningful arrangement
Rules for selection of concepts as descriptors. Rules for selection of terms
The structure and processing of thesaurus data
Interoperability of thesauri/ontologies. Crosswalks
The structure of a thesaurus/ontology database
The many forms of Knowledge Organization Systems (KOS) and their standards
Thesaurus software and its evaluation


Duration: Half-day (afternoon)

Proposer Information:

Dagobert Soergel
College of Information Studies
Univ. of Maryland, College Park, MD 20742
Office: (301) 405-2037
Fax: (301) 314-9145
Mobile: 703-585-2840
Email: dsoergel@umd.edu
URL: www.clis.umd.edu/faculty/soergel/

Biography:

Dagobert Soergel holds an MS equivalent in mathematics and physics (1964) and a PhD in political science (1970), both from the University of Freiburg, Germany. He is Professor of Information Studies at the University of Maryland, where he teaches courses in information retrieval, thesaurus development, expert systems, and information technology, and he also works as an information systems consultant. He has been a visiting professor at the universities of Western Ontario, Chicago, and Konstanz, Germany. He has authored numerous papers and several books, including Organizing Information (1985), which received the American Society of Information Science Best Book Award, and Indexing Languages and Thesauri: Construction and Maintenance (1974). He has designed several thesauri, most recently the Alcohol and Other Drug Thesaurus http://etoh.niaaa.nih.gov/AODVol1/Aodthome.htm (for which he chairs the advisory committee) and the Harvard-Stanford Business Thesaurus (under development). He is developing TermMaster, a thesaurus management software package. In 1997 he received the American Society of Information Science Award of Merit.


Tutorial 3: Monday, June 7, 1:30 p.m. – 4:30 p.m. (half day)


Data Grids and Workflows

by Arun Jagatheesan, Reagan Moore

Abstract:

The “Grid” is an emerging infrastructure for coordinating access across autonomous organizations to distributed, heterogeneous computation and data resources. Data grids are being built around the world as the next generation data handling systems for sharing, publishing, and preserving data residing on storage systems located in multiple administrative domains. A data grid provides logical namespaces for users, digital entities and storage resources to create persistent identifiers for controlling access, enabling discovery, and managing wide area latencies. Data Grid Technologies could be used to build and manage global digital libraries and archives across multiple administrative domains. The data flow pipelines associated with maintaining digital libraries and archives could be mapped as Data Grid Workflow services.
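To make the idea of a logical namespace more concrete, the following Python sketch keeps a catalog that maps a persistent logical name to replicas held on storage resources in different administrative domains, and resolves the name to whichever replica is currently reachable. The class names, method names, resource names, and paths are all hypothetical and do not reflect the API of any actual data grid software.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    resource: str       # logical name of a storage resource
    physical_path: str  # where the bytes live on that resource


@dataclass
class LogicalNamespace:
    # resource name -> {"domain": administrative domain, "online": bool}
    resources: dict = field(default_factory=dict)
    # persistent logical name -> list of Replica
    catalog: dict = field(default_factory=dict)

    def add_resource(self, name, domain, online=True):
        self.resources[name] = {"domain": domain, "online": online}

    def register(self, logical_name, resource, physical_path):
        """Record one more physical copy of a digital entity."""
        if resource not in self.resources:
            raise KeyError(f"unknown storage resource: {resource}")
        replicas = self.catalog.setdefault(logical_name, [])
        replicas.append(Replica(resource, physical_path))

    def resolve(self, logical_name):
        """Return the first replica whose storage resource is online."""
        for replica in self.catalog.get(logical_name, []):
            if self.resources[replica.resource]["online"]:
                return replica
        raise LookupError(f"no reachable replica for {logical_name}")


# Illustrative usage with made-up resources and paths.
grid = LogicalNamespace()
grid.add_resource("site-a-disk", domain="site-a.example", online=False)
grid.add_resource("site-b-tape", domain="site-b.example")
grid.register("/library/theses/t1.pdf", "site-a-disk", "/vol3/a91.pdf")
grid.register("/library/theses/t1.pdf", "site-b-tape", "/archive/00017")
print(grid.resolve("/library/theses/t1.pdf"))
```

Because users refer only to the logical name, a replica can be moved or a site can go offline without changing the identifier that a digital library publishes.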

Objectives:

The tutorial’s objective is to introduce emerging Grid technologies and their relevance to digital libraries and persistent archives. Both novices and experts should benefit. The tutorial covers an introduction, use cases in large projects, design philosophies, existing technologies, open research issues, and demonstrations. Hands-on sessions in which participants can use and “feel” the existing technologies may be provided, depending on the availability of wireless internet connections.

Outline:

  • Introduction to Data Grids (What?)
    • Data Intensive Computing Environments
    • Data grids, digital libraries, persistent archives, workflows
    • Data Grids deployed or under development (as of the tutorial date)
  • Requirement for Data Grids (Why?)
    • Grids (Digital Libraries, Persistent Archives, Dataflow Pipelines)
    • Data sharing, data publication, data preservation, and data analysis
  • Some design philosophies in Data Grids (How?)
    • Transparencies
    • Storage resource transparency
    • Storage location transparency
    • Data identifier transparency
    • Data replica transparency
    • Virtual data abstraction
    • Authentication transparency
    • Virtual Organization transparency (zones)
    • Logical Architecture
    • Logical name space
    • Collection hierarchy
    • Implementation Architecture
    • Storage repository abstraction
    • Information repository abstraction
    • Access abstraction
    • Latency Management
    • Grid security, authentication and authorization
    • Consistency Management
  • Use Cases (Where?)
    • SDSC Storage Resource Broker (SRB)
    • BIRN
    • California Digital Library
    • TeraGrid
    • National Virtual Observatory
  • Data Grid Services
    • Grid Middleware for end to end services
    • Sagas
    • Data Management and Data Access in Grid
    • Data Grid Language
    • Context based workflow
    • SDSC Matrix Project
    • OGSA, Persistent Archives, GGF Grid File System
  • Related Technologies
  • Open Research Issues
  • SRB hands-on session and demonstration

Duration: Half-day

Content Level: Introductory to Intermediate

Target Audience:

Because the tutorial covers basics, open research issues, and hands-on sessions, it is intended for a wide audience. Based on experience with similar tutorials, the following groups would benefit:

  • Beginners, Students: The introduction to related topics, including data grids, digital libraries, persistent archives, and grid-based workflows, would be useful for beginners.
  • Investigators: Researchers who already know about digital libraries and very large data handling can update themselves on new research challenges from real-life projects.
  • System Managers and Consultants: Information provided about existing technologies and best practices would be useful for people interested in production systems.
  • Commercial companies: The case studies of existing projects will give them an idea of how similar problems in the commercial world could be solved, and how those approaches could be applied to collaborative data environments in global digital libraries and archives.

Proposers’ Information:

Arun Jagatheesan arun@sdsc.edu
Reagan Moore moore@sdsc.edu

San Diego Supercomputer Center,
University of California at San Diego
9500 Gilman Drive, MC-0505
La Jolla, CA 92093-0505

Biographies:

Dr. Reagan Moore is Co-Program Director for Data and Knowledge Systems at the San Diego Supercomputer Center. He coordinates research efforts on digital libraries, data grids, and persistent archives for projects with NSF, NASA, DOE, NARA, NHPRC, and the Library of Congress. Moore has a Ph.D. in plasma physics from the University of California, San Diego, (1978) and a B.S. in physics from the California Institute of Technology (1967). Moore has worked at the San Diego Supercomputer Center since 1986, as manager of the Cray Time Sharing System, and then as manager of all production software services. Moore currently is co-PI for SDSC participation on 13 research grants ranging from the NSF National Virtual Observatory, to the NSF Southern California Earthquake Center, to the DOE Particle Physics Data Grid, and the NARA Prototype Persistent Archive.

Arun swaran Jagatheesan ("Arun") is an Adjunct Assistant Researcher (OPS Faculty member) at the University of Florida, and a Visiting Scholar at the San Diego Supercomputer Center (SDSC) at the University of California, San Diego. His research interests include Data Grid Management, Peer-to-peer Computing, and Workflow Management Systems. He is the founder and technical lead of the SDSC Matrix Project on Gridflow Management Systems. He is a co-chair of the Grid File System Working Group at the Global Grid Forum, and is involved in research and development of multiple datagrid projects at the San Diego Supercomputer Center.

Locations where similar tutorials have been presented:

- VLDB 2003 – 29th International Conference on Very Large Databases, Berlin, Germany. http://www.vldb.informatik.hu-berlin.de/progr_tutorial.html - Tutorial4

- MSST 2004 – 12th NASA Goddard / 21st IEEE Conference on Mass Storage Systems and Technologies, April 13-16, 2004, College Park, Maryland, USA.


Tutorial 4: Monday, June 7, 1:30 p.m. – 4:30 p.m. (half day)


Introduction to Fedora and Its Applications

by Ronda Grizzle, Ross Wayland, Chris Wilper

Abstract:

This tutorial is designed to introduce the concepts behind the Fedora digital repository system architecture and to show the architecture's capabilities via a live demo of the software. The concept of Fedora content models will be introduced with specific examples shown of how to apply Fedora content models to different types of data. Working content models will be demonstrated.

Objectives:

After completion of the Introduction to the Flexible Extensible Digital Object Repository Architecture and Its Application to Digital Libraries tutorial, participants will have a basic understanding of the concepts behind the Fedora digital repository architecture and Fedora content models, and a basic working knowledge of the fundamental features of the Fedora software and the application of Fedora content models to specific data types. The tutorial is appropriate for all levels of expertise from beginner to expert.


Outline:

Introduction
Fedora Demo: current functionality, including a discussion of content versioning
Fedora Content Models: definition of Fedora content models (e.g. images, electronic texts, simple and complex documents, collections, and metadata); application of Fedora content models to specific data types
Content Model demonstration: the content models currently at work in Fedora repositories
Future Development Plans
Q&A

Duration: Half day

Target Audience:

Digital repository developers, administrators, and content specialists
Level of experience required: Introductory

Proposers’ Information:

Ronda Grizzle: rgrizzle@virginia.edu, 434-924-3965

Ross Wayland: rlw@virginia.edu, 434-924-0746

Chris Wilper: cwilper@cs.cornell.edu, 607-254-8623


Biographies:

Ross Wayland is Associate Director of the Digital Library Research and Development Group at the University of Virginia Library where for the past five years he has served in a variety of capacities including technical manager, software developer, database administrator, and technical consultant. He is currently the Lead Researcher/Programmer at the University of Virginia Library on the Fedora Project (2001-2004, funded by the Andrew W. Mellon Foundation), a joint effort involving the University of Virginia Library and Cornell University to develop a general-purpose digital object repository system. He is also actively involved in the implementation of Fedora at the University of Virginia Library and has led efforts in developing Fedora Content Models for images, TEI-encoded texts, and EAD-encoded finding aids. Prior to joining the library, he had over fifteen years of experience in the computer field including the areas of applications programming, database administration, system analysis, humanities computing, and clinical data repositories.

Chris Wilper is a research programmer/analyst in the Digital Library Research group in Cornell University's Information Science Department, with particular interests in information management and architecture. He has been working on the Fedora Project (and braving Ithaca's winters) for two years. Prior to coming to Cornell, Chris worked as a Documentum consultant to Sun Microsystems, assisting with a custom publishing solution for www.sun.com. In the early days of the web, he worked as a software engineer at Hewlett-Packard, developing their first engine for publishing customer support documents and downloads to www.hp.com. Chris is excited about the open-source nature of the Fedora software and thinks the ideas in the architecture will have a long-lasting effect on how we think about information management and publishing.

Ronda Grizzle, the Technical Coordinator for the Fedora Project in the Digital Library Research and Development Group at the University of Virginia Library, joined the Fedora project two years ago. After receiving her MSIS from the University of North Carolina at Chapel Hill’s School of Information and Library Science, she worked in a variety of library settings as a systems librarian and for a library automation vendor as a customer service manager. She brings ten years of experience in library systems, database and user interface design, customer service, and end-user training to the Fedora development project.


Tutorial 5: Monday, June 7, 9:00 a.m. – 4:30 p.m. (full day)


Building Digital Library Collections with the Greenstone Librarian Interface

by Ian H. Witten

Abstract:

This tutorial is a practical, hands-on, laboratory-style workshop in which attendees build their own digital library collections using the Greenstone digital library software, a comprehensive, open-source system for constructing, presenting, and maintaining information collections.

It is a highly compressed version of a 3-day UNESCO-sponsored workshop to promote the development and sharing of digital library collections using Greenstone. The intensive 1-day format is feasible because the UNESCO workshop is aimed at librarians in developing countries who have limited experience of advanced technology. UNESCO’s aim is to train trainers, and their participants are expected to promote digital library collection development by conducting similar programs in their countries. Attendees of this tutorial will be given access to a full set of teaching material for the 3-day version of the course that they are encouraged to adapt for local courses back home.

Background:

The course is entirely based around Greenstone’s “librarian” interface. This facility allows users to gather together sets of documents, import or assign metadata, build them into a Greenstone collection, and serve it from their web site. It supports six basic activities: opening an existing collection or defining a new one; copying documents into it, with metadata attached (if any); enriching the documents by adding further metadata to individual documents or groups; designing the collection by determining its appearance and the access facilities it will support; building it using Greenstone; and previewing the newly created collection from the Greenstone home page.

The interface explicitly supports four levels of user: Library Assistants, who can add documents and metadata to collections, and create new ones whose structure mirrors that of existing collections; Librarians, who can, in addition, design new collections, but cannot use specialist IT features (e.g. regular expressions); Library Systems Specialists, who can use all design features, but cannot perform troubleshooting tasks (e.g. interpreting debugging output from Perl scripts); and Experts, who can perform all functions.
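The four user levels described above are cumulative, which can be modelled as an ordered set of levels with a minimum level per action. The Python sketch below is only an illustration of that idea: the enum, the action names, and the function can are hypothetical and are not part of Greenstone.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical encoding of the four user levels, lowest to highest."""
    LIBRARY_ASSISTANT = 1
    LIBRARIAN = 2
    LIBRARY_SYSTEMS_SPECIALIST = 3
    EXPERT = 4


# Minimum level assumed for each (illustrative) action name.
REQUIRED_LEVEL = {
    "add_documents_and_metadata": Level.LIBRARY_ASSISTANT,
    "create_collection_mirroring_existing": Level.LIBRARY_ASSISTANT,
    "design_new_collection": Level.LIBRARIAN,
    "use_regular_expressions": Level.LIBRARY_SYSTEMS_SPECIALIST,
    "interpret_debugging_output": Level.EXPERT,
}


def can(level, action):
    """Return True if a user at this level may perform the action."""
    return level >= REQUIRED_LEVEL[action]


print(can(Level.LIBRARIAN, "design_new_collection"))
print(can(Level.LIBRARIAN, "use_regular_expressions"))
```

Using an ordered enum means each higher level automatically inherits everything the lower levels can do, which matches the way the levels are described.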

Collections built with Greenstone automatically include effective full-text searching and metadata-based browsing facilities