Interactive Composition Work Group Goals V1.0


1. Describe the topic and purpose (goal) of this WG?

Topic:

The definition, categorization (and possible standardization) of the basic
functionality required to create 'Interactive Audio Environments' (IAE's).

By 'Interactive', I refer to any media experience in which progress through
the media is determined in some fashion by the experiencer.

By 'Environments', I refer to audio in a broad enough sense to include (in
any combination):

Purpose:

To develop a common vocabulary for describing the extended capabilities and
concepts required for the creation and performance of IAE's.

Once defined, this vocabulary can be used to aid in the development of:

This vocabulary could also provide a means to effectively communicate the
form and behavior of IAE's during all phases of product development in a
comprehensive, unambiguous, and implementable fashion.

2. Describe the motivation behind (for example, benefit to the market)
addressing this topic?

In the past year or two, the audio playback hardware available on
interactive entertainment platforms has taken a quantum leap. In fact, the
production value now achievable on 32- and 64-bit console systems and the
latest generation of PC sound cards can comfortably be said to meet or exceed
that of broadcast T.V. Therefore, it is no longer sufficient to strive to
produce audio that is of equal production value to broadcast T.V. or home
video. Rather, we as a community need to look at what will allow our medium
to move beyond them, namely interactivity. Thus, the motivation here is to
discover ways of creating more interesting, dynamic, and engaging products
through the addition of interactivity to traditional sound design techniques.

Defining a common vocabulary will allow composers, sound designers,
producers, game designers, game programmers, sound driver programmers, and
hardware companies to work together in:

Progress in these three areas will allow advances to be made into the
largely uncharted possibilities of interactive audio. New efforts will be
able to build on common knowledge accumulated and distributed by the work
group, instead of starting from scratch every time.

3. Describe the current situation (technology, market influence, etc.)?

a. What formats are already appearing, and are already marketed?

There are currently a number of sound drivers with some interactive authoring
and playback capabilities (AIL, HMI, GEMS, Thomas Dolby's Driver, IMUS,
etc.). All have been developed largely in isolation from one another and
contain widely varied functionality and feature sets. Few of these tool sets
are available to independent or third-party developers, and then only under
strict proprietary information agreements.

Thus, the ability to author IAE's is largely limited to a small group of
insider (employee) practitioners and programmers.

b. Which developers are supporting which formats? What libraries exist?

IMUS and Thomas Dolby's tools are proprietary. GEMS, AIL, and HMI are more
widely available. However, the methods of creating interactivity in these
last three tools are overly obscure, extremely labor-intensive, or too
demanding of the game programmer's time and attention to have been widely
used or explored.

c. Define the requisite feature set for the "best" format? Discuss practical
limitations?

The 'best' format for the findings of the group should comprise a definition
of the functionality required of interactive sound drivers, as well as a
glossary of the terms used in that definition. It has been apparent right
from the start of these discussions that focusing on terms and functionality
rather than specific implementation techniques will allow for more open and
useful discussions. This level of generality is also necessary to support the
widely divergent hardware currently available, as well as new hardware yet to
be born.

It is my opinion that there should be at least two distinct levels of
functionality described.

(To describe these levels, I must first define a term. For the purposes of
this document, I will refer to a stream as any logical grouping of sound data
that occurs over time. Some examples of a 'stream' by my definition would be
a MIDI sequence, a track or group of tracks within a MIDI sequence, or
digital audio files that exist as stand-alone elements or as tracks or groups
of tracks within a multichannel digital audio file.)
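
As an illustration only, the definition of a 'stream' given above could be
modeled as a small data structure. The sketch below is in Python. The names
Stream, kind, source, and tracks, as well as the file names, are hypothetical
and are not taken from any existing driver; the sketch assumes that a stream
is either a whole source or a chosen subset of its tracks.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Stream:
    """A logical grouping of sound data that occurs over time (hypothetical model)."""
    kind: str                                  # "midi" or "digital_audio"
    source: str                                # name of the sequence or audio file
    tracks: Optional[Tuple[int, ...]] = None   # None means the whole source

    def describe(self) -> str:
        if self.tracks is None:
            return f"{self.kind}:{self.source} (entire)"
        track_list = ",".join(str(t) for t in self.tracks)
        return f"{self.kind}:{self.source} (tracks {track_list})"


# One instance for each kind of example named in the text (file names made up).
examples = [
    Stream("midi", "theme.mid"),                    # a MIDI sequence
    Stream("midi", "theme.mid", (3, 4)),            # a group of tracks within it
    Stream("digital_audio", "door_slam.wav"),       # a stand-alone audio file
    Stream("digital_audio", "ambience.wav", (1,)),  # one track of a multichannel file
]

for stream in examples:
    print(stream.describe())
```

Under this reading, the same source can appear in several streams at once,
which is what allows a track within a sequence to be treated separately from
the sequence as a whole.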

David Rosenbloom has already suggested that the descriptions focus on the
functional 'services' to be provided by the playback environment. In other
words, we should describe the functionality required of the playback system
to perform real-time IAE's. This will lead naturally to a discussion of the
functional requirements for appropriate authoring environments, and will
require a commonly agreed-upon set of terms. I agree wholeheartedly.

The functionality we will define will most likely focus on MIDI and MIDI-like
environments for the time being. It should, however, be broad enough to
describe the behavior of digital audio playback systems as they exist now,
and as they will exist when technology advances far enough to make multitrack
digital audio playback for interactive media a practical concern.

(At present, the issue of algorithmic composition remains an open one,
requiring much conversation and definition before it can even be written into
this proposal. Therefore, I will not write about it now, but leave the issue
open for discussion, to be included in a future rev. of the work group
goals.)

d. Besides features, what other factors might define the "best" format?

Discussion might include:

*** Defining WG Tasks

4. Determine specific issues which must be addressed by the WG ....

a. Define a glossary of terminology
b. Define the 'basic' or 'intra-stream' functionality
c. Define the 'advanced' or 'inter-stream' functionality
d. Discuss (and possibly define) the format and usage of embedded data
e. Discuss (and possibly define) a high level scripting language
f. Discuss (and possibly define) methods of communication between the
sound driver and the game program
g. Explore the merits of an embedded-script approach vs. a markers-only
approach.
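
The difference between the two approaches named in task g. can be sketched
roughly as follows. This is only one possible reading, written in Python, and
every name in it (Marker, EmbeddedCommand, index_at, play_markers_only,
play_embedded_script, the "label" and "jump" actions, the flag names) is
hypothetical. It assumes that in a markers-only design the stream carries
named positions and the game program decides what happens at each one, while
in an embedded-script design the stream carries the instruction itself and
the driver acts on it using game state.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class Marker:
    tick: int   # position in the stream
    name: str   # label reported to the game program


@dataclass
class EmbeddedCommand:
    tick: int
    action: str          # "label" or "jump" in this sketch
    target_tick: int = 0
    condition: str = ""  # game-state flag that must be set for a jump


def index_at(events, tick: int) -> int:
    """Index of the first event at or after the given tick."""
    for i, event in enumerate(events):
        if event.tick >= tick:
            return i
    return len(events)


def play_markers_only(markers: List[Marker],
                      on_marker: Callable[[str], Optional[int]],
                      max_steps: int = 8) -> List[str]:
    """The driver only reports markers; the game program chooses any jump."""
    log: List[str] = []
    i = 0
    for _ in range(max_steps):  # cap guards against endless loops
        if i >= len(markers):
            break
        log.append(markers[i].name)
        target = on_marker(markers[i].name)
        i = index_at(markers, target) if target is not None else i + 1
    return log


def play_embedded_script(script: List[EmbeddedCommand],
                         flags: Set[str],
                         max_steps: int = 8) -> List[int]:
    """The driver interprets commands stored in the stream itself."""
    log: List[int] = []
    i = 0
    for _ in range(max_steps):  # cap guards against endless loops
        if i >= len(script):
            break
        command = script[i]
        log.append(command.tick)
        if command.action == "jump" and command.condition in flags:
            i = index_at(script, command.target_tick)
        else:
            i += 1
    return log


# Markers only: the game program repeats the loop section once.
markers = [Marker(0, "intro"), Marker(480, "loop_start"),
           Marker(960, "loop_end"), Marker(1440, "ending")]
state = {"loops_left": 1}


def on_marker(name: str) -> Optional[int]:
    if name == "loop_end" and state["loops_left"] > 0:
        state["loops_left"] -= 1
        return 480  # jump back to loop_start
    return None


markers_log = play_markers_only(markers, on_marker)

# Embedded script: the stream itself says "skip ahead if skip_loop is set".
script = [EmbeddedCommand(0, "label"),
          EmbeddedCommand(480, "jump", target_tick=1440, condition="skip_loop"),
          EmbeddedCommand(960, "label"),
          EmbeddedCommand(1440, "label")]
script_log = play_embedded_script(script, {"skip_loop"})
```

In the first case the branching logic lives in the game program (on_marker);
in the second it lives in the authored data, and the game program only
supplies flags. Which side should own that logic is the question the task
asks the group to explore.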

5. Present a timeline for WG results

I would like to present the WG findings on tasks a. and b. at an invited
session of the October '95 A.E.S. convention. (Credit will, of course, be
given to all participating members...)

Therefore: