SAOL and SASL

SAOL and SASL are languages included in MPEG-4's Structured Audio component. In section 5 we will comment on the whole conceptual model behind Structured Audio; here we concentrate on its Music-N facet, represented by the SAOL language for defining orchestras and the SASL language for writing musical scores.

The Structured Audio Orchestra Language (SAOL) [Scheirer, 1999b, Scheirer, 1998a] is a general-purpose synthesis language derived from CSound. Compared to CSound, SAOL offers an improved syntax, a smaller set of core functions, and a number of additional syntactic features. SAOL is optimized for digital audio synthesis and digital audio effects, but any digital signal processing algorithm that can be expressed as a signal flow can be expressed in SAOL [Scheirer, 1998b].

SAOL is a two-rate synthesis language: every variable represents either an audio signal, which varies at the sampling rate, or a control signal, which varies at the control rate. The sampling rate limits the audio frequencies that can be represented, and the control rate limits the speed at which parameters can vary.
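As a brief illustration of this two-rate model (the instrument and variable names are our own, not part of the standard), a variable declared as ksig is recomputed only once per control period, while an asig variable is recomputed for every output sample:

    instr tone(freq) {
      table wave(harm, 2048, 1);        // stored function table (wavetable)
      ksig env;                         // control-rate signal
      asig sound;                       // audio-rate signal

      env = kline(0, 0.1, 1, 0.4, 0);   // envelope, updated at the control rate
      sound = oscil(wave, freq) * env;  // oscillator, updated at the sampling rate
      output(sound);
    }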

Stored function tables are called wavetables. SAOL has about 100 primitive processing instructions, signal generators, and operators, so a typical decoding process also includes a step that resembles linking a high-level language against a fixed library of abstract functions. Most of SAOL's complexity resides in these primitives, which can be optimized for specific implementations.

In practice, two different implementations of SAOL decoders exist at the time of this writing. The reference software included in the standard uses an interpreter approach, resulting in a very inefficient application. The other implementation, called sfront [Lazzaro and Wawrzynek, 2001], consists of a program that translates SAOL into a C program that is then compiled and executed.

SAOL defines a two-level syntax: a bit-level one that describes the messages to be streamed over a network, and a higher-level one that provides a human-readable representation of the language. We will briefly describe the latter, which is more directly related to our interests.

This textual language is defined by a BNF (Backus-Naur Form) grammar. Its lexical elements are punctuation marks that give messages their particular syntax; identifiers that name orchestra symbols; numbers that represent constant values; comments that add internal documentation; and whitespace that lexically separates the different text elements.

The orchestra is the set of signal processing routines and declarations that make up a procedural description in Structured Audio. It is made up of four different elements:

The Global Block contains the definition of those parameters that are global to the orchestra. It must be unique within an orchestra and can hold several kinds of statements: global parameters such as the sampling rate, the control rate, or the number of audio inputs and outputs; definitions of global variables that can be shared by different instruments; routing definitions describing how instrument outputs are assigned to buses; and sequence definitions that specify the order in which instruments are executed.

After the Global Block come the instrument definitions, where the routines needed to process SASL or MIDI instructions are defined. An instrument declaration is made up of the following parts (in the given order): an identifier that defines the instrument name; a list of identifiers that define the names of the parameters of that particular instrument (pfields); an optional value that specifies the MIDI preset; an optional value that specifies the MIDI channel; a list of variable declarations; and a set of statements that define the instrument functionality.
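As a minimal sketch (the instrument, its pfields, and the chosen global parameter values are illustrative assumptions, not mandated by the standard), an orchestra combining a Global Block and one instrument could read:

    global {
      srate 44100;        // sampling rate
      krate 100;          // control rate
      outchannels 1;      // one audio output channel
    }

    instr beep(pitch, amp) {       // two pfields: pitch and amplitude
      table wave(harm, 2048, 1);   // wavetable holding one sine cycle
      asig sound;                  // audio-rate variable declaration

      sound = oscil(wave, pitch) * amp;
      output(sound);
    }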

An Opcode is simply a function that can have several inputs, or variables, and a single output, or result. Opcodes can be used from any instrument in the orchestra. SAOL offers a set of ready-to-use Opcodes called Core Opcodes that include, among other things, mathematical functions and noise generators. The user can rely on these built-in opcodes or define new ones.

An opcode declaration has the following elements, in this order: a rate tag that defines the speed at which the opcode is executed; an identifier that defines its name; a list of its formal parameters; a list of variable declarations; and a set of statements that define its functionality.
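A user-defined opcode might look as follows; this is only a sketch, and the name softclip and its behavior are our own example. The aopcode rate tag states that the opcode runs at the audio rate:

    aopcode softclip(asig x, ivar gain) {
      asig y;

      y = x * gain;
      if (y > 1.0)  { y = 1.0; }    // clamp positive excursions
      if (y < -1.0) { y = -1.0; }   // clamp negative excursions
      return(y);
    }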

Template Instruments provide a compact way of describing multiple instruments that differ only slightly, by means of a limited parameter-substitution syntax.

Elements in the orchestra can appear in any order. For instance, an opcode definition can appear before or after its first use.

The other language in Structured Audio is the Structured Audio Score Language (SASL), an event description language used to trigger sound generation in the orchestra. SASL syntax has been kept very simple and includes very few high-level control structures; such facilities are left to the implementer of the specific tool (sequencer, editor, etc.).

Like SAOL, SASL is defined at two levels, although we will only describe the user level, which is based on a list of textual messages.

Any event in a SASL score has a temporal statement that defines at what moment it takes place. This time statement can only be specified in musical (metrical) time, and therefore the absolute time depends on the value of the tempo global variable.

A SASL score has different kinds of lines: Instrument Lines, Control Lines, Tempo Lines, Table Lines, and End Lines.

An Instrument Line (InstrLine) specifies the instantiation of an instrument at a particular moment. It has the following elements: the first identifier is a label that can later be used to refer to this instrument instance; the first number is the start time of the note; the second identifier is the instrument name, which selects one of the instruments described in the SAOL file; the second number is the duration of the note (if it is -1, the note has no predefined temporal limit); and finally there is a list of parameters (pfields) that are passed to the instrument when it is created.
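For instance, assuming the beep instrument sketched earlier (the label and all numeric values are illustrative), the following line starts a note of duration 1.0 at score time 0.5, passing 440 and 0.8 as pfields:

    note1: 0.5 beep 1.0 440 0.8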

A Control Line specifies an instruction that is sent to the orchestra or to a set of instrument instances. It is made up of the following elements: the first number specifies the time of the event; the first identifier (optional) specifies, by label, which instrument instances will receive the event; the second identifier is the name of the variable that will receive the event; and, finally, the second number is the new value for that control variable.
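As an illustrative sketch (it assumes the targeted instrument declares a control variable that we call vol here), the following line sends the value 0.5 to the instances labelled note1 at time 1.0:

    1.0 control note1 vol 0.5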

A Tempo Line specifies a new value of the tempo global variable for the decoding process. It has two elements: the first number is the time at which the tempo change is applied; the second number is the new tempo value, in beats per minute.
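For instance, the following line changes the tempo to 120 beats per minute at score time 4.0:

    4.0 tempo 120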

A Table Line specifies the creation or destruction of a wavetable. It contains the following elements: the first identifier is the name of the table; the second identifier is the name of the table generator or the ``destroy'' instruction; the list of pfields holds the parameters for the wavetable generator; and, when present, the sample refers to the sound data from which the wavetable is extracted.
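As a sketch (the table name and generator parameters are our own), the first line below creates a 2048-point wavetable with the core harm generator, and the second one destroys it later in the score:

    0.0 table mytab harm 2048 1 0.5 0.25
    8.0 table mytab destroy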

Finally, an End Line specifies the end of the sound generation process.
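Putting these line types together, a complete score for the beep instrument sketched earlier could be as simple as the following (times and values are, again, illustrative):

    0.0 tempo 120
    note1: 0.0 beep 1.0 440 0.8
    2.0 beep 1.0 660 0.8
    4.0 end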

Structured Audio also offers a simpler format for music synthesis: the Structured Audio Sample Bank Format (SASBF), a format for representing banks of wavetables that was created in collaboration with the MIDI Manufacturers Association. These wavetables can be downloaded into the synthesizer and controlled with MIDI sequences.
