MediaWeaver -- A Distributed Media Authoring System


Sha Xin Wei
Academic Software Development
Stanford University
Stanford, CA 94305-3090

Abstract

This paper describes MediaWeaver -- a distributed media management system: a network-based toolkit that developers can use to organize, describe, and link arbitrary collections of network-based media. MediaWeaver has been in use at Stanford University since 1993 to create network-based multimedia archives across disciplines in the Humanities and Sciences. The system integrates networked Unix workstations with front ends built for the World Wide Web, Macintosh and NeXTSTEP.

Introduction

A major challenge facing designers of networked computing environments today is to fashion scholarly workspaces which are simultaneously coherent, easily reconfigurable, expressive, and above all, useful.

MediaWeaver is designed to support the construction of such workspaces -- as models of human systems which are both conceptually rich and data rich. MediaWeaver mediates between coherent, customizable interfaces and an open set of network services, such as database engines, WWW servers, fulltext search engines, and media conversion facilities. (See the Gallery of applications and Current Projects.)

MediaWeaver was conceived as a framework to accelerate multimedia designers' work in creating rich complexes of media supported by relational data models. MediaWeaver can also be used by non-programmers, to publish their own materials on the network.

History

After about five years of making interactive multimedia titles, we took stock of our work process to see where the bottlenecks were, and also what were the greatest defects in the interactive titles which we produced. The MediaWeaver was designed to address all of these problems. Its various frameworks were designed to be used by faculty and student authors and by designers of multimedia simulations; it was designed explicitly to support members of academic disciplines outside traditional programming communities. And it had to leverage tiny application programming resources.

We started with two prototype projects in 1993-1994: a history of Renaissance (Elizabethan) theater, and a study of high technology in the Silicon Valley. The first was chosen from a pool of faculty projects which required some management of art images and associated music or text on the network. The second presented the challenge of dealing with a significant, changing body of structured text in a complex, evolving research model. In addition, we wanted to lay the foundation for general relational modeling of human systems as such data became available in the course of the research. In both cases, we could not assume a fixed interface or conceptual model. Indeed, the only surety was change.

This genealogy strongly influenced the design principles which we will outline in the following section.

Since then we've continued with the SiliconBase [Lenoir], as the Silicon Valley History project is called, and have added several other communities and mediabases: a prototype for an archive of electro-acoustic music; a Chicana/o artists database; and most recently, the Information Map Project, which aims to serve as both a learning center and a clearinghouse of Latin American conservation issues and organizations.

Design Principles and Corollaries

Make it immediately useful.

Bread-and-butter reasons, but also participatory design principles, suggested that we should let composers start working right away with their own media, and conduct seminars and write papers using our system, instead of waiting for the Holy Grail. To enable significant scholarly work, whatever we built had to exchange data transparently with commercial applications and databases, and inter-operate transparently with distributed services. Authors were encouraged to use whatever commercial editors they already had on their personal computers (Macintosh, some IBM PC): eg. MS Word, WordPerfect, Adobe Photoshop, Adobe Premiere, Omnipage, DeskScan. Our frameworks synthesize commercial, public and custom software. Our authors work in a heterogeneous network where UNIX and Mac clients see a common filesystem, and can apply user tools from Mac, UNIX/X and UNIX/NS to shared mediabases.
 
 

Factor, factor, factor.

 The architecture reflects a separation between (1) persistent storage in the filesystem (eg. ASCII or AIFF blob bytes) and in databases (eg. blob metadata in Sybase tables); (2) model (eg. hypermedia topological structure, bibliography); and (3) presentation/interaction (eg. WWW/Mosaic document, Hypercard simulation, custom disposable apps). By decoupling models from media, we can sidestep the question of data ownership and allow complex research models to be constructed on existing corpora or proxy media.[1]

Since MediaWeaver stores topological information in databases, it can generate HTML documents on the fly rather than keep source media in HTML files -- a simple version of dynamic documents. Factorization gives us the option of interposing even more expressive and nuanced means of forming constellations of media or mediastreams on the fly.

Maintain user interface metaphor neutrality.

 We wish to allow multiple views on shared media, which means that rather than building a single interface application or layout protocol (a la HTML forms), we provide an API supporting multiple, concurrent, and most importantly, reconfigurable interfaces. The MediaWeaver does not assume that views must look like word-processors. Word-processor-like document viewers like MS Word or Mosaic present essentially a unidimensional rebus, a stream of generalized characters, some of which are ordinary letters, some of which are raisins of media like an embedded graphic. In general, a simulation can have quite a different structure, such as a map, timeline, multi-track score, vivarium, video VR, soundspace etc. MediaWeaver user interface kits do not assume documents, windows, chunks, or links. But the MediaWeaver does deliver documents as a special case. For example, ordinary word-processor documents may be catalogued in indigenous formats.
 
 

Broadcast rather than publish.

MediaWeaver is designed to deliver information over networks, rather than in detached forms such as CD ROM. The CD ROM (and videodiscs etc.) distribution model is in a sense a natural relic of the traditional publishing model which requires a physical commodity in order to function. From the point of view of a university library, most if not all of the same problems encountered in acquiring, preserving, cataloguing and circulating paper books or journals recur in dealing with CD ROMs and videodiscs. Some of these library issues are even thornier in the new formats.
 
 

Fine-grained network distribution of software, even of single computing objects, offers quite a different paradigm which may be more akin to a broadcast model than to the publishing model. This also gives us the flexibility we need to support live research projects in which the primary source media as well as the secondary literature and even the conceptual models are in flux. In any case, MediaWeaver's factorization allows us to build templates to which we can download a subset of a project's model and data at any moment. In this way, we can print a standalone version of simulations like T. Gieryn's Cornell Biotechnology Lab or G. Crane's Perseus by downloading data and models from the network into local templates.
 
 

Even more interesting are the new genres of publication now made possible by online mediabases. MediaWeaver provides a scheme in which progressively more formal or public compositions can arise organically from flexible, personal or project-specific research collections. For example, collections of source material can be acquired and edited according to the research agenda. This demand-driven model efficiently allocates human and system attention. New scholarly articles or pedagogical presentations can be made in situ and catalogued back into the mediabase. For example, the SiliconBase seminar's reader is an entirely online hypermedia structure which can be modified at any moment by the instructors. Lectures can be composed, presented in conferences, and revised online. Over time, well-critiqued articles can then be given more public status by relaxing their access locks. Such research reports become a virtual professional journal with the addition of a suitable editorial board and digital signatures. Design issues such as the social conventions around periodicity and cost recovery mechanisms would be interesting to investigate using such a framework.

Maintain model neutrality.

To allow multiple conceptualizations requires that authors be able to rapidly build several models over the same media. This came from a practical need to reconcile the very different time-scales involved in designing provisional research schema of annotations and associations vs. designing a MARC-quality archival description of the same set of media. Again, by factorization and abstraction MediaWeaver allows very different communities to work with media, represented when necessary by proxies, using their own models. Consequently, instead of binding to one particular database, MediaWeaver uses a data access framework which allows us to connect to any of several standard types of RDBMS engines over the net, including Sybase and Oracle. MediaWeaver provides an object-oriented abstraction so that its clients need not deal with dialects of RDBMSs. Clients can store arbitrary objects like bitmaps or serialized Objective-C objects as meta-data via MediaWeaver's object-oriented database access framework. In practice, (large) media are kept as source media in ordinary distributed filesystems, and (small) meta-data -- annotations, references, links, abstracts, etc. -- are kept in RDBMSs.
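The division of labor described above (large media left in the filesystem, small meta-data kept in a relational engine behind an object-oriented abstraction) can be illustrated with a minimal sketch. It is written in Python with the standard sqlite3 module purely for illustration; the MediaWeaver core itself is written in Objective-C. The names MetadataStore, SqliteStore, annotate and lookup, the table layout, and the sample path are all hypothetical and are not MediaWeaver's actual API.

```python
import sqlite3
from abc import ABC, abstractmethod


class MetadataStore(ABC):
    """Hypothetical interface: clients see objects, never SQL dialects."""

    @abstractmethod
    def annotate(self, media_path, attr_name, attr_value):
        ...

    @abstractmethod
    def lookup(self, media_path):
        ...


class SqliteStore(MetadataStore):
    """One possible backend (an assumption for this sketch).

    A backend for another engine would implement the same two methods.
    """

    def __init__(self, dsn=":memory:"):
        self.db = sqlite3.connect(dsn)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS meta "
            "(path TEXT, attr_name TEXT, attr_value TEXT)"
        )

    def annotate(self, media_path, attr_name, attr_value):
        # Only the small meta-data goes into the database; the media
        # file itself stays where it is in the filesystem.
        self.db.execute(
            "INSERT INTO meta VALUES (?, ?, ?)",
            (media_path, attr_name, attr_value),
        )
        self.db.commit()

    def lookup(self, media_path):
        rows = self.db.execute(
            "SELECT attr_name, attr_value FROM meta WHERE path = ?",
            (media_path,),
        )
        return dict(rows.fetchall())


store = SqliteStore()
store.annotate("/media/theater/globe.tiff", "title", "Globe Theatre, exterior")
store.annotate("/media/theater/globe.tiff", "format", "TIFF")
record = store.lookup("/media/theater/globe.tiff")
```

Because the client code touches only annotate and lookup, swapping the engine underneath would not change it, which is the point of the abstraction.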
 
 

Expect evolution.

Perhaps the key to making a scholarly workspace worth using is to ensure that intellectual content survives across change in technology. This is as much an institutional commitment as a technological issue. Aside from the obvious requirement of a modularized architecture whose components may be replaced without breaking service, the following principles guided our work:
 
 

Assume no single data representation.

We do not need to spend resources converting media systematically to a single format like HTML or SGML. This is perhaps the most important technical feature of MediaWeaver. By making no assumption about the internal structure of a media entity (a blob), and not even requiring that a media entity exist as bytes in a filesystem, MediaWeaver allows authors to compose with any computable or renderable medium whatsoever. This way, MediaWeaver can accommodate currently unknown data types and interactions. Moreover, MediaWeaver can deal with opaque or pre-recorded media (eg. TIFF, MPEG, AIFF, TeX, Renderman), performable scripts (eg. NS scorefiles, Mathematica notebooks, Applescripts), executables (eg. a UNIX tool, Hypercard stack, Netscape application), and data streams (eg. live video channel) with equal ease or difficulty.
 
 

How is this feasible? The general principle here is to

Focus on the space of transforms more than the base space.

Converting all the authors' source media into some standard structure (such as SGML) is neither cost-effective nor strategic in our context because of the diversity of the material (some conversions would lose too much information), the large human cost (editorial, programmer, administrative), and the constantly changing substance. Moreover, we are not convinced that a universal permanent (on the scale of decades) document structure exists which can deal with all the structures we have in hand. Therefore, we decided that it is wiser to build a filter service which MediaWeaver core objects as well as clients could invoke on foreign platforms.
 
 

Assume nothing about the internal structure of a media entity.

A media entity may be a programmatically generated stream of data, a file of any renderable data type, an executable, or may even exist only as a virtual object in a meta-data record. This allows authors to reason with proxy objects even when, for legal or technical reasons, primary media are not available. Conversely, multiple versions of a logical media entity can be tracked. The front end, not the MediaWeaver core, decides how to interpret multiple versions of a blob. For example, a movie clip may exist in MPEG as well as a QuickTime Mac proprietary format. The front end asks for the locally renderable version, but authors deal only with a single logical entity.
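The MPEG/QuickTime example can be sketched as follows. This is only an illustration in Python of the idea that the front end, not the core, chooses among versions; the class LogicalMedia, the function pick_renderable, and the file locations are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalMedia:
    """One logical media entity with any number of concrete versions."""
    media_id: str
    versions: dict = field(default_factory=dict)  # format name -> location

    def add_version(self, fmt, location):
        self.versions[fmt] = location


def pick_renderable(entity, renderable):
    """Front-end policy: return the first version this client can render.

    `renderable` lists formats in the client's order of preference.
    Returns (format, location), or None when no version is renderable,
    for instance when the entity exists only as a proxy record.
    """
    for fmt in renderable:
        if fmt in entity.versions:
            return fmt, entity.versions[fmt]
    return None


clip = LogicalMedia("clip-001")
clip.add_version("MPEG", "/media/clips/clip-001.mpg")
clip.add_version("QuickTime", "/media/clips/clip-001.mov")

mac_choice = pick_renderable(clip, ["QuickTime", "MPEG"])
unix_choice = pick_renderable(clip, ["MPEG"])
proxy_only = pick_renderable(LogicalMedia("clip-002"), ["MPEG"])
```

Authors refer only to clip-001; which bytes are delivered depends on the preference list each front end supplies.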
 
 

Architecture

Figure 1 Media Model.

The basic media object model is described in Figure 1. There are two ways to conceive of media objects: (1) as data, with associated meta-data, or (2) as webs of data. The first way is typical of large databases or well-organized archives, while the second is typical of ad hoc collections like the World Wide Web. MediaWeaver supports both models, using annotations and links.

The basic media entity can be simply a meta-data record in a MediaWeaver database. It consists of at minimum a unique id. The meta-data record can have zero or more meta-data fields (or attributes). It may or may not point to (annotate) a piece of data, for example an image or a text document or a Java applet. By allowing virtual blobs which refer to no "real" data in persistent storage, we can construct compound structures quite easily. This is analogous to a UNIX directory or Macintosh folder structure. The advantage is that such a structure is quite independent of any one operating system, and can be delivered to MediaWeaver clients in different formats (eg. as HTML pages). Also, unlike file systems, the schema for the meta-data can be easily designed and extended by the authors to suit their evolving descriptions of their media collections.
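A minimal sketch of such a record, in Python, might look like the following. The field names (attributes, data_ref, members) and the way a virtual blob groups other records by id are assumptions made for illustration; the source specifies only that a record has a unique id, zero or more attributes, and an optional pointer to data.

```python
import itertools
from dataclasses import dataclass, field
from typing import Optional

_next_id = itertools.count(1)  # toy id generator for this sketch


@dataclass
class MediaRecord:
    """A meta-data record: a unique id, optional attributes, optional data."""
    attributes: dict = field(default_factory=dict)
    data_ref: Optional[str] = None               # None marks a virtual blob
    members: list = field(default_factory=list)  # ids of grouped records
    id: int = field(default_factory=lambda: next(_next_id))

    @property
    def is_virtual(self):
        return self.data_ref is None


image = MediaRecord({"title": "Stage plan"}, data_ref="/media/theater/plan.tiff")
essay = MediaRecord({"title": "Notes on staging"}, data_ref="/media/theater/notes.txt")

# A virtual blob: no data of its own, used here like a folder.
folder = MediaRecord({"title": "Elizabethan theater"})
folder.members.extend([image.id, essay.id])

# Authors can extend the description at any time with new attributes.
image.attributes["period"] = "Renaissance"
```

Nothing in the grouping depends on a filesystem path, so the same compound structure could be rendered as an HTML page or in any other client format.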

The MediaWeaver architecture can be broken into three layers: the layer of Front-End Applications, the Core Management layer, and the layer of Network Services, based on networked persistent data storage. (Figure 2)

 Figure 2 Architecture.

Front-End Applications Layer:

WWW browsers/CGI Library

Currently, MediaWeaver documents can be searched and displayed on the World Wide Web through a library of Common Gateway Interface (CGI) code. This code provides access to data stored on the net or in commercial databases via the MediaWeaver (see API below), and the ability to develop customized web-based applications which compose HTML on the fly.
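Composing HTML on the fly from stored descriptions, rather than serving stored HTML files, can be sketched roughly as below. This is an illustrative Python function, not the MediaWeaver CGI library; render_page, the record fields, and the sample URL are hypothetical, and the record and links stand in for rows that a real CGI program would fetch from the database.

```python
from html import escape


def render_page(record, links):
    """Compose an HTML page from a meta-data record and its outgoing links.

    `record` is a dict with 'title' and 'abstract'; `links` is a list of
    (label, url) pairs. Nothing is read from a stored .html file.
    """
    items = "\n".join(
        f'<li><a href="{escape(url, quote=True)}">{escape(label)}</a></li>'
        for label, url in links
    )
    return (
        f"<html><head><title>{escape(record['title'])}</title></head>\n"
        f"<body><h1>{escape(record['title'])}</h1>\n"
        f"<p>{escape(record['abstract'])}</p>\n"
        f"<ul>\n{items}\n</ul></body></html>"
    )


# Made-up sample data for the sketch.
page = render_page(
    {"title": "Transistors & circuits", "abstract": "Seminar notes."},
    [("Oral history transcript", "/cgi-bin/show?id=42")],
)
```

Because the page is generated at request time, a change to a record or a link in the database shows up in the next page view without any HTML file being edited.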

Network Application Program Interface (API)

The API is a set of commands used by front-end developers to connect to the MediaWeaver system via the network. With the API, developers can extend the functionality of commercially available applications like Netscape and Hypercard, and build customized applications which can run on any platform.

Core Management Layer:

The MediaWeaver core is written in Objective-C and uses NeXTSTEP's Application and Database Access Kits.

Media Manager

A software component which allows users to work with multiple databases and media formats in a transparent, user-friendly environment. It provides access to media in any format, including images, video, audio, simulations, and even applications; concurrent network access to database engines including Sybase, Oracle, Informix and others; and the ability to store descriptions of media in database tables.

Search Manager

A plug-board software component which can leverage a number of search engines through a single user interface. Current searchers include a database searcher and a full-text searcher. Several image-based search engines are currently being incorporated. Developers may incorporate their own searching techniques.
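The plug-board idea (several searchers behind one entry point, with room for developers to add their own) might be sketched like this. The SearchManager class below is a hypothetical Python illustration, and the two toy searchers run over made-up in-memory data rather than a real database or full-text index.

```python
class SearchManager:
    """Plug-board: searchers register under a name and share one entry point."""

    def __init__(self):
        self._searchers = {}

    def register(self, name, searcher):
        """`searcher` is any callable taking a query string and returning ids."""
        self._searchers[name] = searcher

    def search(self, query, engines=None):
        """Run the query on the named engines (default: all registered)."""
        names = engines if engines is not None else list(self._searchers)
        return {name: self._searchers[name](query) for name in names}


# Toy corpora standing in for a database table and a full-text index.
TITLES = {1: "Globe Theatre plan", 2: "Valley oral history"}
FULLTEXT = {1: "timber stage and galleries", 2: "the theatre of innovation"}


def database_searcher(query):
    return sorted(i for i, t in TITLES.items() if query.lower() in t.lower())


def fulltext_searcher(query):
    return sorted(i for i, t in FULLTEXT.items() if query.lower() in t.lower())


manager = SearchManager()
manager.register("database", database_searcher)
manager.register("fulltext", fulltext_searcher)
hits = manager.search("theatre")
```

An image-based searcher would plug in the same way: any callable that maps a query to a list of ids can be registered without changing the front end.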

Authorization Manager

A system which controls database access through username/password security. This system can be extended by developers or replaced by other security systems.

Link Manager

The link manager allows the MediaWeaver system to link related documents across both databases and media types. Links can be developed, updated, and removed without creating specialized HTML code; links can persist across applications and changing standards; and users themselves can make their own links amongst documents.
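Keeping links as records of their own, outside the documents they connect, is what makes it possible to add or remove them without editing any HTML. A minimal sketch in Python with the standard sqlite3 module follows; the table layout, the function names, and the sample ids are assumptions for illustration and do not describe MediaWeaver's actual schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE links (src TEXT, dst TEXT, relation TEXT, author TEXT)"
)


def add_link(src, dst, relation, author):
    """Record a link between two media ids; no document is modified."""
    db.execute(
        "INSERT INTO links VALUES (?, ?, ?, ?)", (src, dst, relation, author)
    )


def remove_link(src, dst):
    db.execute("DELETE FROM links WHERE src = ? AND dst = ?", (src, dst))


def links_from(src):
    """Return (destination, relation) pairs for one source, sorted by id."""
    rows = db.execute(
        "SELECT dst, relation FROM links WHERE src = ? ORDER BY dst", (src,)
    )
    return rows.fetchall()


add_link("essay-7", "image-12", "illustrates", "instructor")
add_link("essay-7", "audio-3", "cites", "student")
remove_link("essay-7", "audio-3")
remaining = links_from("essay-7")
```

The author column is there only to suggest how links made by different users could be told apart; any front end can render the surviving links in its own format.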

Filter Manager

Digital media must often conform to specific formats for proper display. The filter manager provides a single, simple way to use any number of media format conversion tools to make abstracts (thumbnails) or different versions of media.
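One way to picture a single entry point over many conversion tools is a registry of filters keyed by source and target format. The Python sketch below is hypothetical: the FilterManager class shown here, and in particular the chaining of filters through intermediate formats by breadth-first search, are assumptions added for illustration and are not described in the source. The toy filters operate on strings where real ones would wrap external conversion tools.

```python
from collections import deque


class FilterManager:
    """Registry of converters keyed by (source format, target format)."""

    def __init__(self):
        self._filters = {}

    def register(self, src, dst, func):
        self._filters[(src, dst)] = func

    def route(self, src, dst):
        """Breadth-first search for the shortest chain of formats src -> dst."""
        queue, seen = deque([[src]]), {src}
        while queue:
            path = queue.popleft()
            if path[-1] == dst:
                return path
            for (a, b) in self._filters:
                if a == path[-1] and b not in seen:
                    seen.add(b)
                    queue.append(path + [b])
        return None

    def convert(self, data, src, dst):
        """Apply each filter along the route in turn."""
        path = self.route(src, dst)
        if path is None:
            raise LookupError(f"no filter chain from {src} to {dst}")
        for a, b in zip(path, path[1:]):
            data = self._filters[(a, b)](data)
        return data


filters = FilterManager()
# Toy TeX-to-ASCII filter: strips \emph{...} markup only.
filters.register("TeX", "ASCII", lambda s: s.replace("\\emph{", "").replace("}", ""))
# Toy abstractor: keeps the first 20 characters.
filters.register("ASCII", "abstract", lambda s: s[:20])

chain = filters.route("TeX", "abstract")
abstract = filters.convert("An \\emph{evolving} research model", "TeX", "abstract")
```

New formats are handled by registering another filter rather than by converting the stored media, which is in keeping with the earlier principle of focusing on the transforms rather than on a single base format.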

Network Services Layer:

Commercial Database Engines

The MediaWeaver relies on commercial databases to store descriptive information about the documents found within a given collection. Supported databases include Sybase, Oracle, and Informix. This allows developers to use reliable, existing database software instead of buying new, single-task engines, and ensures that descriptive information will persist even as content and developer applications evolve.

Media Storage on Network

The MediaWeaver system can access media on any networked UNIX server. Developers do not have to copy and store locally media available elsewhere, or purchase specific server hardware or software for media storage. The MediaWeaver uses public domain and research data conversion and search engines running on UNIX platforms over the network.

Filters & Converters

The MediaWeaver accesses public domain filters that allow developers to create text abstracts and image thumbnails, and to convert among various sound, image, and text formats. Other filters can be added.

Search Engines

The MediaWeaver system accesses public domain and research search engines that can retrieve files based on textual content or visual content (eg. color, texture, simple shape). Other engines can be added.

Under the assumption that editors, browsers, search engines, filters, abstractors, and high-level OO inter-operable user environments could be added incrementally and in parallel, we invested more of our energy into the service mediation, plus abstract classes which captured the semantics of search, annotation, and association. In fact, MediaWeaver is now integrated with many of these complementary tools.

For details, see [Sha2].

Work in Progress

We are extending the MediaWeaver in several directions: integrating new content-based search engines (eg. image-based searchers [Wang]), building next-generation front end kits (Java[2]), and adjoining geographic information systems (GIS).

End Notes

[1] We have in mind notions such as using relational grammars to define meta-layouts for user-interfaces. Examples include WRI's Mathematica 2.3, and work by Weitzman and Wittenburg[Weitzman].

[2] Originally called Oak. WebRunner was written in Oak. James Gosling, jag@sun.com.

[3] Here are some snapshots of past and current projects. 

Bibliography

[Alexander]
Christopher Alexander, Sara Ishikawa, Murray Silverstein. A Pattern Language, Oxford University Press, 1977. 
[Ehn]
Pelle Ehn. "Towards a Philosophical Foundation for Skill-Based Participatory Design." In P. Adler and T. Winograd (eds.), Usability: Turning Technologies into Tools, 116-132. Oxford University Press, 1992.
[Lenoir]
Tim Lenoir, Sha Xin Wei. "Networked Scholarly Workspaces for History of High Technology." Talk at MIT, March 1995, online document available at URL http://lummi.stanford.edu/Media2/pix/www/MIT/slides. 1994. 
[MMDD]
"Metamedia Distributed Databases." Online document available at URL http://lummi.Stanford.EDU/Media2/ASD/ASD_Homepage/Multimedia.html. 1994.
[Sha2]
Sha Xin Wei, "MediaWeaver -- A Distributed media authoring system for networked scholarly workspaces," to appear in a special issue on Multimedia Tools, Multimedia Systems Journal 1996.
[Thaller]
Manfred Thaller. mthalle@gwdg.de. Max-Planck Institute for History, Goettingen, Germany. "What is 'source oriented data processing'; what is a 'historical computer science'?" In Historical Informatics: an Essential Tool for Historians? 1994.
[Wang]
James Z. Wang. "Wavelet-enhanced image search methods." Stanford ASD Technical Report, May 1996.
[Weitzman]
Louis Weitzman and Kent Wittenburg. "Automatic Presentation of Multimedia Documents Using Relational Grammars." ACM Multimedia 1994.

Last modified 29.7.96, Sha Xin Wei, 415-725-3152, xinwei@leland.stanford.edu

http://lummi.stanford.edu/Media2/texts/Architecture28.5.96.html