Data flow architecture

Many applications apply a series of transformations to a data stream. Their architectures emphasize data flow; control flow is not represented explicitly. They consist of a set of modules that interconnect to form a new module or a network. The modules are self-contained entities that perform generic operations usable in a variety of contexts. A module is a computational unit, while a network is an operational unit. The application's functionality is determined by the types of modules and by the interconnections between them. The application may also be required to adapt dynamically to new requirements.

In this context, a high-performance toolkit applicable to a wide range of problems is sometimes required. The application may need to adapt dynamically, at run-time. In complex applications it is not possible to construct a set of components that covers all potential combinations. The loose coupling associated with the black-box paradigm usually carries a performance penalty: generic, context-free algorithms that are also efficient are difficult to obtain. Software modules may have incompatible interfaces, share state, or rely on global variables.

The solution is to highlight the data flow so that the application's architecture can be seen as a network of modules. Inter-module communication is done by passing messages (sometimes called tokens) through unidirectional input and output ports, which replace direct calls. Depending on the number and types of their ports, modules are classified as sources (only output ports; they interface with input devices), sinks (only input ports; they interface with output devices), and filters (both input and output ports).
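The module taxonomy above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the class names (`OutputPort`, `InputPort`, `Source`, `Filter`, `Sink`) and the synchronous, push-style delivery of tokens are assumptions made for the example.

```python
from typing import Any, Callable, Iterable, List


class OutputPort:
    """Unidirectional output port; forwards each token to connected inputs."""

    def __init__(self) -> None:
        self.targets: List["InputPort"] = []

    def connect(self, target: "InputPort") -> None:
        self.targets.append(target)

    def send(self, token: Any) -> None:
        for target in self.targets:
            target.receive(token)


class InputPort:
    """Unidirectional input port; hands each token to the owning module."""

    def __init__(self, handler: Callable[[Any], None]) -> None:
        self.handler = handler

    def receive(self, token: Any) -> None:
        self.handler(token)


class Source:
    """Has only an output port; stands in for a module reading an input device."""

    def __init__(self, items: Iterable[Any]) -> None:
        self.items = items
        self.out = OutputPort()

    def run(self) -> None:
        for item in self.items:
            self.out.send(item)


class Filter:
    """Has an input and an output port; transforms each token and passes it on."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self.fn = fn
        self.out = OutputPort()
        self.inp = InputPort(lambda token: self.out.send(self.fn(token)))


class Sink:
    """Has only an input port; stands in for a module writing to an output device."""

    def __init__(self) -> None:
        self.received: List[Any] = []
        self.inp = InputPort(self.received.append)


def run_demo() -> List[Any]:
    # Wire source -> filter -> sink; no module calls another one directly.
    source = Source([1, 2, 3])
    double = Filter(lambda x: x * 2)
    sink = Sink()
    source.out.connect(double.inp)
    double.out.connect(sink.inp)
    source.run()
    return sink.received
```

Calling `run_demo()` pushes three tokens through the network and returns what the sink collected. The modules know only their own ports, so the wiring in `run_demo` is the only place where the network topology appears.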

Because each component depends only on its upstream modules, it is possible to change output connections at run-time. For two modules to be connected, the output port of the upstream module and the input port of the downstream module must be plug-compatible. Having more than one data type means that some modules perform specialized processing. Filters that have no internal state can be replaced while the system is running. The network usually triggers recomputation whenever a filter's output changes.
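Plug-compatibility and run-time rewiring might look as follows. This sketch assumes that "plug-compatible" means the output port's declared token type is a subclass of the input port's type; the pattern itself does not prescribe that rule, and the names `TypedOutputPort` and `TypedInputPort` are hypothetical.

```python
from typing import Any, Callable, List, Optional, Tuple


class TypedInputPort:
    """Input port that declares the token type it accepts."""

    def __init__(self, token_type: type, handler: Callable[[Any], None]) -> None:
        self.token_type = token_type
        self.handler = handler

    def receive(self, token: Any) -> None:
        self.handler(token)


class TypedOutputPort:
    """Output port that declares the token type it emits."""

    def __init__(self, token_type: type) -> None:
        self.token_type = token_type
        self.target: Optional[TypedInputPort] = None

    def connect(self, target: TypedInputPort) -> None:
        # Assumed compatibility rule: emitted type must fit the accepted type.
        if not issubclass(self.token_type, target.token_type):
            raise TypeError(
                f"{self.token_type.__name__} output is not plug-compatible "
                f"with {target.token_type.__name__} input"
            )
        # Reassigning the target rewires the network; upstream is unaffected.
        self.target = target

    def send(self, token: Any) -> None:
        if self.target is not None:
            self.target.receive(token)


def run_demo() -> Tuple[List[int], List[int], bool]:
    out = TypedOutputPort(int)
    first: List[int] = []
    second: List[int] = []

    out.connect(TypedInputPort(int, first.append))
    out.send(1)

    # Change the output connection while the network is "running".
    out.connect(TypedInputPort(int, second.append))
    out.send(2)

    # An incompatible connection is rejected and the old wiring is kept.
    try:
        out.connect(TypedInputPort(str, print))
        rejected = False
    except TypeError:
        rejected = True
    return first, second, rejected
```

Here the first token reaches the original downstream module and the second reaches its replacement, while the attempt to plug an `int` output into a `str` input fails at connection time rather than during processing.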

In a network, adjacent performance-critical modules can be regarded as one larger filter and replaced with an optimized version, using the Adaptive Pipeline pattern [Posnak et al., 1996], which trades flexibility for performance. Modules that use static composition cannot be dynamically configured.
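The idea of treating adjacent filters as one larger filter can be illustrated with a small sketch. It is not taken from Posnak et al.; the function names (`compose`, `scale`, `offset`, `scale_then_offset`) are hypothetical and only show the general trade-off between a generic chain and a statically composed replacement.

```python
from functools import reduce
from typing import Any, Callable


def compose(*stages: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Chain per-token functions generically; stages remain individually swappable."""

    def chained(token: Any) -> Any:
        return reduce(lambda value, stage: stage(value), stages, token)

    return chained


def scale(x: int) -> int:
    return x * 2


def offset(x: int) -> int:
    return x + 1


# Flexible version: two separate filters, one call per stage per token.
generic = compose(scale, offset)


def scale_then_offset(x: int) -> int:
    """Statically composed replacement for the scale -> offset pair."""
    # One call per token, but the two stages can no longer be reconfigured.
    return x * 2 + 1
```

Both versions compute the same result for every token. The fused function avoids the per-stage dispatch, at the cost that `scale` and `offset` can no longer be rewired or replaced independently.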

2004-10-18