1 Department of Electrical Engineering, Princeton University; 2 Woodrow Wilson School, Princeton University; 3 Vice President of Computing and Information Technology, Princeton University; 4 Siemens Corporate Research, Princeton NJ; 5 Advance, Inc., Arlington VA
Political scientists use video as a primary source material. Today, video collections are hard to access--either the scholar must travel to the site of the collection or pay large copying fees and wait several weeks after selecting material from a brief textual description. Even after the material is acquired, it is hard to manipulate: it is difficult to search through large volumes of material to find the relevant items; there are no mechanisms for browsing through a particular clip to find the most important material; and it is difficult or impossible to annotate the material with scholarly notes. Video library techniques developed for political science studies will also be directly useful in other scholarly disciplines as well as a variety of commercial applications.
We are designing the Princeton Video Library of Politics (PVLP) as a testbed for digital libraries which provide video-intensive, multimedia source material to library patrons. PVLP is a collaborative effort both within and beyond Princeton. Within Princeton, participants include the Departments of Electrical Engineering and Politics, the Woodrow Wilson School of Public and International Affairs, and the Center for Computing and Information Technology, with consultation by the Princeton University Libraries. Nationally, Siemens Corporate Research and Advance, Inc. are research collaborators.
Scholarly research, such as that done by political scientists, provides a challenging domain for digital video library research because scholars are very demanding users of libraries. Scholars require libraries to collect and catalog large amounts of material, since each individual scholar needs different pieces of material. Libraries need to collect not just well-documented material--encyclopedias and books in the textual domain, network news programs in video--but also less-structured primary sources--manuscripts and notes in text, outtakes and internal White House recordings in video. Since scholarship seeks to find evidence to support new ideas and to make new connections between ideas, scholars need to be able to navigate through large collections. Scholars must sift through large amounts of material to find potentially interesting items, then scan preliminary selections more thoroughly; both searching through large amounts of material and browsing through smaller selections are very difficult using traditional video techniques. Political scientists consider the relationship between image and sound to be a central topic of study; any video library must provide tools which give weight to both the audio and picture tracks of a video and provide fusion tools which help library patrons navigate through multimedia material.
Video is an extremely challenging source material for digital libraries: the material is not in textual form, requiring new techniques to free both librarians and scholars from the need to watch moving image material in real time; and video data requires extremely high bandwidth which must be delivered at the deadlines imposed by the video frame rate, requiring careful consideration of algorithmic and architectural efficiencies to be able to support large collections and user populations. Video also provides new opportunities for scholars: automated search and browsing schemes will help scholars make much more effective use of video sources, helping them find new material and make connections between pieces of the collection which they could not make using present video techniques.
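To make the bandwidth and deadline requirements concrete, the following minimal sketch computes the raw (uncompressed) bit rate of a video stream and the time available to deliver each frame. The function names and the sample format (640x480 pixels, 24 bits per pixel, 30 frames per second) are illustrative assumptions and are not taken from the PVLP collection.

```python
def raw_video_bit_rate(width: int, height: int,
                       bits_per_pixel: int, frames_per_second: int) -> int:
    """Bits per second needed to deliver uncompressed video."""
    return width * height * bits_per_pixel * frames_per_second


def frame_deadline_ms(frames_per_second: int) -> float:
    """Time budget, in milliseconds, for delivering each frame."""
    return 1000.0 / frames_per_second


# Assumed sample format, chosen only for illustration.
rate = raw_video_bit_rate(640, 480, 24, 30)
deadline = frame_deadline_ms(30)
print(f"raw bit rate: {rate / 1e6:.1f} Mbit/s")
print(f"per-frame deadline: {deadline:.1f} ms")
```

Under these assumed parameters the raw stream needs roughly 221 Mbit/s, with a new frame due about every 33 ms, which is why compression and efficient delivery matter for large collections and user populations.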
Our project addresses new techniques to solve these critical video library problems:
Computer-assisted cataloging--Cataloging of moving image material is time-consuming and expensive because the material must be viewed in real time. We are working on browsing techniques for catalogers, including elimination of redundant key frames and fusion of audio and picture track information. We will also study feature extraction algorithms which can identify faces, backgrounds, etc.
Indexed search--Scholars must be able to search through sequences identified from the on-line catalog as potentially interesting to narrow the search to topics which may not be index terms in the general catalog. Since video sequences in their raw form--a series of pixels--are not sufficiently abstract for library operations, appropriate descriptions must be extracted to help organize, search, and navigate the data. We propose to study descriptions which are extracted from the picture track and augmented with audio track information and available textual information, such as the closed captioning track or separate synopses. Video database queries will be formulated in terms of both syntactic features, such as edges, corners, or region shapes, and semantic features, including faces, backgrounds, and geometrical relationships among the features. Such data must be modeled in the database so that it can be queried. In this project, we propose to develop query formulation mechanisms to support a variety of video attributes, from alphanumeric data to visual and aural features. We will also develop access methods which can support the retrieval of voluminous video data and search over the audio and picture tracks as well as over annotations.
Browsing--Browsing is a critical element of scholarly study. New browsing mechanisms are needed to reduce the time required by scholars to evaluate material during library searches. Existing browsers do not sufficiently integrate information from the audio and picture tracks. Since scholars must often navigate through long clips or large quantities of material, we will develop new techniques for hierarchical browsing, which uses key frame classifications (generated automatically or provided by the patron) to cluster related frames. We will also use existing speech recognition algorithms as components of new audio-visual browsers.
Distribution--High-quality video requires both high-performance network connections and terminals capable of decoding compressed video at the required rates. We believe that an important aspect of scholarly video library research is the development of low bit rate delivery techniques which provide adequate service to political scientists who, unlike Cold War-era nuclear weapons labs, do not have unlimited funds for communication and computational equipment. Even if the present goal of providing high-speed network connections to every school within ten years is met, there will still be a large window of inequity between institutions based on funding. Current video libraries encourage inequities in access through large fees or the need to travel to the library site; digital video libraries should discourage, not encourage, that trend. Existing low bit rate coding techniques, such as model-based coding, were developed for talking-heads style videoconferencing and are not well suited to the range of images encountered in our testbed collection.
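As an illustration of the redundant key frame elimination mentioned under computer-assisted cataloging, the sketch below drops any key frame whose intensity histogram is close to that of the last frame kept. This is one simple approach chosen for illustration, not necessarily the method used in PVLP. The frame representation (a flat sequence of 8-bit grayscale pixel values), the function names, the bin count, and the threshold are all assumptions.

```python
from typing import List, Optional, Sequence

# Assumed representation: a flat sequence of 8-bit grayscale pixel values.
Frame = Sequence[int]


def histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Normalized intensity histogram of an 8-bit grayscale frame."""
    if not frame:
        raise ValueError("frame must contain at least one pixel")
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // 256] += 1
    return [count / len(frame) for count in counts]


def histogram_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 distance between two normalized histograms (ranges from 0 to 2)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def drop_redundant_key_frames(frames: Sequence[Frame],
                              threshold: float = 0.3) -> List[int]:
    """Return the indices of the key frames to keep.

    A frame is kept only if its histogram differs from that of the
    most recently kept frame by more than `threshold` (hypothetical value).
    """
    kept: List[int] = []
    last_hist: Optional[List[float]] = None
    for index, frame in enumerate(frames):
        hist = histogram(frame)
        if last_hist is None or histogram_distance(hist, last_hist) > threshold:
            kept.append(index)
            last_hist = hist
    return kept


# Toy key frames: two dark, two bright, then dark again.
key_frames = [[0] * 64, [2] * 64, [200] * 64, [201] * 64, [0] * 64]
print(drop_redundant_key_frames(key_frames))
```

Comparing each frame against the last kept frame, rather than against its immediate predecessor, prevents a slow drift across many similar frames from passing unnoticed. A cataloger would then review only the retained frames instead of watching the sequence in real time.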