Copyright © 1999-2001 Stefan Westerfeld & Jeff Tranter
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".
This handbook describes aRts, the Analog Real-time Synthesizer.
Table of Contents
The Analog Real-Time Synthesizer, or aRts, is a modular system for synthesizing sound and music on a digital computer. Using small building blocks called modules, the user can easily build complex audio processing tools. Modules typically provide functions such as sound waveform generators, filters, audio effects, mixing, and playback of digital audio in different file formats.
The artsd sound server mixes audio from several sources in real time, allowing multiple sound applications to transparently share access to sound hardware.
Using MCOP, the Multimedia Communication Protocol, multimedia applications can be network transparent, authenticated for security, and cross-platform using interfaces defined in a language-independent way using IDL. Support is also provided for non aRts-aware legacy applications. As a core component of the KDE 2 desktop environment, aRts provides the basis for the KDE multimedia architecture, and will in future support more media types including video. Like KDE, aRts runs on a number of operating systems, including Linux® and BSD variants. It can also be used independently of KDE.
This manual is intended to provide comprehensive documentation on aRts for users at different skill levels. Depending on whether you are a casual user of multimedia applications that make use of aRts or a multimedia application developer, you may want to take different paths through the manual.
It is suggested that you first read the Downloading and Building aRts chapter if you need to get aRts initially installed and running. If you already have a working system, likely bundled with your operating system distribution, you may choose to skip this section.
You should then read the sections in the aRts Tools chapter, especially artsd, artscontrol, artsshell, and artsdsp. This will help you make the most effective use of aRts.
If you are interested in going further with aRts, read the chapter on aRts-builder and go through the tutorial. This should give you an appreciation of the powerful capabilities of aRts and the provided modules that can be used without the need to be a programmer.
If you want to know more about the internals of aRts, either to develop multimedia applications or extend aRts itself, read some or all of the chapter aRts in Detail. This should give you an understanding of all of the concepts that are prerequisites to aRts software development.
If you are interested specifically in the MIDI capabilities of aRts, you should read the chapter on MIDI.
If you want to develop aRts-aware multimedia applications, the aRts Application Programming Interfaces chapter covers the different APIs in detail.
If you want to extend aRts by creating new modules, read the aRts Modules chapter.
If you are modifying an existing application to run under aRts, read the chapter on Porting Applications to aRts.
You can find out how to contribute to the aRts project in the Contributing to aRts chapter, read about upcoming aRts development in the chapter on Future Work, and find links to more information in the References section.
We have also rounded out the manual with some additional material, including answers to frequently asked questions, a list of contributors, the details on aRts copyright and licensing, and some background material on digital audio and MIDI. A glossary of terms is also included.
This manual is still very much a work in progress. You are welcome to contribute by writing portions of it, but if you wish to do so, contact Jeff Tranter <tranter@kde.org> or Stefan Westerfeld <stefan@space.twc.de> first to avoid duplication of effort.
In late 1997 Stefan Westerfeld started working on a real-time, modular system for sound synthesis. The code initially ran on a PowerPC system running AIX®. This first implementation was quite simple but supported a full-featured flow system that was able to do such things as play MP3 files and pipe audio streams through effects modules.
The next step was to implement a GUI so that modules could be manipulated graphically. Stefan had had some good experience using KDE, so that was chosen as the GUI toolkit (knowing that it might be necessary to do a GNOME/Gtk+ version as well), and this later led to using Linux® as the main development platform. Originally named ksynth, the project was renamed aRts and the pace of development accelerated. The project at this stage was quite complete, with a CORBA-based protocol, dozens of modules, a graphical module editing tool, C and C++ APIs, documentation, utilities, and a mailing list and web site with a small group of developers. The project had come a long way after only a little more than a year of development.
As the KDE team started planning for KDE 2.0, it became clear that KDE needed a more powerful infrastructure for sound and other streaming media. It was decided to adapt aRts, as it was a good step in this direction with a proven architecture. Much new development effort went into this new version of aRts, most notably the replacement of the CORBA code with an entirely new subsystem, MCOP, optimized for multimedia. Version 0.4 of aRts was included in the KDE 2.0 release.
Work continues on aRts, improving performance and adding new functionality. It should be noted that even though aRts is now a core component of KDE, it can be used without KDE, and is also being used for applications that go beyond traditional multimedia. The project has attracted some interest from the GNOME team, opening up the possibility that it may someday become the standard multimedia architecture for UNIX® desktop systems.
Included with aRts are a number of utilities for controlling and configuring its behavior. You need to have some familiarity with most of these tools in order to use aRts effectively. This section describes each of the utilities and their command options.
When running aRts under KDE, the KDE Control Center provides a group of control panel settings under the Sound category. Some of these settings are used by aRts. You can also associate sounds with various window manager and KDE events using the Look & Feel+System Notifications panel. See the KControl manual for information on using the panel settings.
Access to the sound hardware resources is controlled by artsd, the aRts daemon. This allows different applications to simultaneously send requests to the server, where they can be mixed together and played. Without a centralized sound server a single application using a sound device would prevent other applications from using it.
To use aRts there should be one and only one copy of artsd running. It is typically run when KDE starts up if it is enabled in the KControl Sound Server panel.
The program accepts the following arguments:
artsd [-n] [-p port] [-N] [-W n] [-a audiomethod] [-r sampling rate] [-b bits] [-d] [-D devicename] [-F fragments] [-S size] [-s seconds] [-m appName] [-h] [-A] [-v] [-l level]
Set sampling rate to use.
Display command usage.
Enable network transparency.
Set TCP port to use (implies -n).
Public, no authentication (dangerous).
Enable full duplex operation.
Specify audio device (usually /dev/dsp).
Set number of fragments.
Set fragment size, in bytes.
Set server auto-suspend time, in seconds. A value of zero disables auto-suspend.
Specify the name of an application to be used to display error, warning, and informational messages. If you are running KDE you can use the artsmessage utility for this.
Increase the size of network buffers to a value suitable for running over a 10 Mbps LAN. This is equivalent to using the -w 5 option (see below).
When running artsd over a network connection to another host you typically want to use a larger buffer size to avoid dropouts. aRts provides applications with a suggested minimum buffer size. Without this option the default size is based on the fragment size * fragment count. Using this option you can increase the size from the default by a factor of n.
Set information level - 3 (quiet), 2 (warnings), 1 (info), 0 (debug).
Display the version number.
In most cases simply running artsd will suffice.
To provide good real-time response, artsd is usually run as a real-time process (on platforms where real-time priorities are supported). This requires root permissions, so to minimize the security implications, artsd can be started using the small wrapper program artswrapper which simply sets real-time priority (running as root) and then executes artsd as a non-root user.
The artsshell command is intended as a utility to perform miscellaneous functions related to the sound server. It is expected that the utility will be extended with new commands in the future (see the comments in the source code for some ideas).
The command accepts the following format:
artsshell [suspend | status | terminate | autosuspend secs | networkbuffers n | volume [volume] | stereoeffect options] [-h] [-q]
artsshell [options] command [command-options]
The following options are supported:
Suppress all output.
Display command usage.
The following commands are supported:
Suspend the sound server.
Display sound server status information.
Terminate the sound server. This may confuse and/or crash any applications that are currently using it.
Set the autosuspend time to the specified number of seconds. The sound server will suspend itself if idle for that period of time. A value of zero disables auto-suspend.
Set the size of the network buffers to a factor of n times the default size.
Sets volume scaling for sound server audio output. The volume argument is a floating point value. With no argument the current volume is displayed.
List all of the available stereo effect modules.
Insert a stereo effect into the stereo effect stack. Returns an identifier that can be used for later removing it. It can be installed at the top or the bottom (the default).
Removes the stereo effect with identifier id from the effects stack.
The artsplay command is a simple utility to play a sound file. It accepts a single argument corresponding to the name of a sound file which is sent to the sound server to be played. The sound file can be any common sound file type such as wav or au. This utility is good for testing that the sound server is working. By running two commands in parallel or in rapid succession you can demonstrate how the sound server mixes more than one sound source.
The sound server only supports applications that are aRts-aware. Many legacy applications want to access the sound device directly. The artsdsp command provides an interim solution that allows most of these applications to run unchanged.
When an application is run under artsdsp all accesses to the /dev/dsp audio device are intercepted and mapped into aRts API calls. While the device emulation is not perfect, most applications work this way, albeit with some degradation in performance and latency.
The artsdsp command follows the format:
artsdsp [options] application arguments
The following options are recognized:
Show brief help.
Use name to identify player to artsd.
Emulate memory mapping (e.g. for Quake).
Show parameters.
A typical invocation is:
artsdsp -v -m realplay song.mp3
Some applications work better with the --mmap option. Not all features of the sound device are fully emulated, but most applications should work. If you find one that does not, submit a detailed bug report and the developers may be able to fix it. Again, remember this is an interim solution and something of an ugly hack; the best solution is to add native aRts support to the applications. If your favorite sound application does not have aRts support, ask the developer to provide it.
This is a simple utility to send raw audio data to the sound server. You need to specify the data format (sampling rate, sample size, and number of channels). This is probably not a utility that you will use often, but it can be handy for testing purposes. The command syntax is:
artscat [ options ] [ filename ]
If no file name is specified, it reads standard input. The following options are supported:
Set the sampling rate to use.
Set sample size to use (8 or 16).
Set number of channels (1 or 2).
Display command usage and exit.
This is a graphical utility for performing a number of tasks related to the sound server. The default window displays two volume level indicators and a slider to control overall output volume. From the View menu you can select other functions:
Opens a window which shows a real-time spectrum analyzer style display.
Displays active sound sources and allows you to connect them to any of the available busses.
Shows if the sound server is running and if scheduling is real-time. Indicates when server will autosuspend and allows you to suspend it immediately.
Shows active MIDI inputs and outputs and allows you to make connections [TODO: Does this work yet? Need more detail].
Connects a FreeVerb reverb effect to the stack of aRts output effects and allows you to control the effect parameters graphically.
Changes the volume indicators in the main window to use a colored LED display format instead of progress bars.
This is a utility to assist developers using the aRts C API. It outputs the appropriate compiler and linker options needed when compiling and linking code with aRts. It is intended to be used within make files to assist in portability. The command accepts three options:
Displays the compiler flags needed when compiling with the aRts C API.
Displays the linker flags needed when linking with the aRts C API.
Displays the version of the artsc-config command.
Typical output from the command is shown below:
% artsc-config --cflags
-I/usr/local/kde2/include/artsc
% artsc-config --libs
-L/usr/local/kde2/lib -ldl -lartsc -DPIC -fPIC -lpthread
% artsc-config --version
0.9.5
You could use this utility in a make file using a rule such as:
artsc: artsc.c
	gcc `artsc-config --cflags` -o artsc artsc.c `artsc-config --libs`
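For reference, here is a minimal sketch of the kind of client these flags are meant for. It uses the aRts C API to play one second of silence; the file name in the build comment is hypothetical, and error handling is kept to a minimum.

/* minimal aRts C API client: plays one second of silence
 * build (hypothetical file name):
 *   g++ `artsc-config --cflags` -o silence silence.cc `artsc-config --libs`
 */
#include <artsc.h>
#include <cstdio>
#include <cstring>

int main()
{
    int err = arts_init();
    if (err < 0) {
        fprintf(stderr, "arts_init failed: %s\n", arts_error_text(err));
        return 1;
    }

    /* 44.1 kHz, 16 bit, stereo; the last argument names the stream */
    arts_stream_t stream = arts_play_stream(44100, 16, 2, "artsc-example");

    char buffer[4096];
    memset(buffer, 0, sizeof(buffer));        /* silence */

    /* one second of 16 bit stereo at 44.1 kHz is 176400 bytes */
    for (int i = 0; i < 176400 / (int)sizeof(buffer); i++)
        arts_write(stream, buffer, sizeof(buffer));

    arts_close_stream(stream);
    arts_free();
    return 0;
}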
The mcopidl command is the IDL file compiler for MCOP, the Multimedia Communication Protocol used by aRts. Interfaces in aRts are defined in IDL, a language independent Interface Definition Language. The mcopidl utility accepts an IDL file as input and generates C++ header and source files for a class implementing the interface. The command accepts the following syntax:
mcopidl [ options ] filename
The valid options are:
Search in directory for includes.
Exclude a struct, interface, or enum type name from code generation.
Also create .mcoptype/.mcopclass files containing type information for the IDL file.
More information about MCOP and IDL is covered in the section Interfaces and IDL.
First of all, when trying to run aRts-builder, you should also be running the sound server (artsd). Usually, when you use KDE 2.1, this should already be the case. If not, you can configure automatic sound server startup in KControl under Sound->Sound Server.
When aRts is running, it is always running small modules. aRts-builder is a tool to create new structures of small connected modules. You place the modules inside the grid: choose them from the Modules menu, and then click somewhere in the green-grey plane.
Modules usually have ports (where audio signals flow in or out). To connect two ports, click on the first, which causes it to turn orange, and then click on the second. You can only connect an input port (on the upper side of a module) with an output port (on the lower side of a module). If you want to assign a fixed value to a port (or disconnect it), do so by double-clicking on the port.
Start aRts-builder.
You need a Synth_AMAN_PLAY-module to hear the output you are creating. So create a Synth_AMAN_PLAY-module by selecting Modules->Synthesis->SoundIO->Synth_AMAN_PLAY and clicking on the empty module space. Put it below the fifth line or so, because we'll add some stuff above.
The module will have a parameter title (the leftmost port), and autoRestoreID (beside the leftmost port) for finding it. To fill these out, double-click on these ports, select constant value and type tutorial in the edit box. Click OK to apply.
Select File->Execute structure. You will hear absolutely nothing. The play module doesn't have any input yet... ;) If you have listened to the silence for a while, click OK and go to Step 2.
Create a Synth_WAVE_SIN module (from Modules->Synthesis->Waveforms) and put it above the Synth_AMAN_PLAY module. (Leave one line space in between).
As you see, it produces some output, but requires a pos as input. First let's put the output to the speakers. Click on the out port of the Synth_WAVE_SIN and then on the left port of Synth_AMAN_PLAY. Voilà, you have connected two modules.
Oscillators in aRts don't require a frequency as input, but a position in the wave. The position should be between 0 and 1, which for a standard Synth_WAVE_SIN object maps to the range 0..2*pi. To generate oscillating values from a frequency, a Synth_FREQUENCY module is used.
Create a Synth_FREQUENCY module (from Modules->Synthesis->Oscillation & Modulation) and connect its ‘pos’ output to the ‘pos’ input of your Synth_WAVE_SIN. Specify the frequency port of the FREQUENCY generator as constant value 440.
Select File->Execute structure. You will hear a sine wave at 440 Hz on one of your speakers. If you have listened to it for a while, click OK and go to Step 3.
OK, it would be nicer if you could hear the sine wave on both speakers. Connect the right port of Synth_AMAN_PLAY to the outvalue of the Synth_WAVE_SIN as well.
Create a Synth_SEQUENCE object (from Modules->Synthesis->Midi & Sequencing). It should be at the top of the screen. If you need more room you can move the other modules by selecting them (to select multiple modules use Shift), and dragging them around.
Now connect the frequency output of Synth_SEQUENCE to the frequency input of the Synth_FREQUENCY module. Then specify the sequence speed as constant value 0.13 (the speed is the leftmost port).
Now go to the rightmost port (sequence) of Synth_SEQUENCE and type in as constant value A-3;C-4;E-4;C-4; this specifies a sequence. More on that in the Module Reference.
Synth_SEQUENCE really needs a sequence and the speed. Without that you'll perhaps get core dumps.
Select File->Execute Structure. You will hear a nice sequence playing. If you have enjoyed the feeling, click OK and go to Step 4.
Create a Synth_PSCALE module (from Modules->Synthesis->Envelopes). Disconnect the outvalue of the SIN wave by double-clicking it and choosing not connected. Connect:
The SIN outvalue to the PSCALE invalue
The PSCALE outvalue to the AMAN_PLAY left
The PSCALE outvalue to the AMAN_PLAY right
The SEQUENCE pos to the PSCALE pos
Finally, set the PSCALE top to some value, for instance 0.1.
Here is how that works: the Synth_SEQUENCE gives additional information about the position of the note it is playing right now, where 0 means just started and 1 means finished. The Synth_PSCALE module will scale the audio stream that is directed through it from volume 0 (silent) to 1 (original loudness) and back to 0 (silent), according to the position. The position where the peak should occur can be given as pos. 0.1 means that after 10% of the note has been played, the volume has reached its maximum and starts decaying afterwards.
Select File->Execute Structure. You will hear a nice sequence playing. If you have enjoyed the feeling, click OK and go to Step 5.
Start another aRts-builder
Put a Synth_AMAN_PLAY into it, configure it to a sane name. Put a Synth_BUS_DOWNLINK into it and:
Set Synth_BUS_DOWNLINK bus to audio (that is just a name, call it fred if you like)
Connect Synth_BUS_DOWNLINK left to Synth_AMAN_PLAY left
Connect Synth_BUS_DOWNLINK right to Synth_AMAN_PLAY right
Start executing the structure. As expected, you hear nothing, ... not yet.
Go back to the structure with the Synth_WAVE_SIN stuff and replace the Synth_AMAN_PLAY module with a Synth_BUS_UPLINK, and configure the name to audio (or fred if you like). Deleting modules works by selecting them and choosing Edit->delete from the menu (or pressing the Del key).
Hit File->Execute structure. You will hear the sequence with scaled notes, transported over the bus.
If you want to find out why something like this can actually be useful, click OK (in the aRts-builder that is executing the Synth_SEQUENCE stuff, you can leave the other one running) and go to Step 6.
Choose File->Rename structure from the menu of the aRts-builder which contains the Synth_SEQUENCE stuff, and call it tutorial. Hit OK.
Choose File->Save
Start yet another aRts-builder and choose File->Load, and load the tutorial again.
Now you can select File->Execute structure in both aRts-builders that have that structure. You'll now hear the same thing twice. Depending on the time when you start them, it will sound more or less nice.
Another thing that is good to do at this point: start Noatun and play an mp3 file. Start artscontrol. Go to View->View audio manager. What you will see is Noatun and your ‘tutorial’ playback structure playing something. The nice thing you can do is this: double-click on Noatun. You'll now get a list of destinations. And see? You can assign Noatun to send its output via the audio bus your tutorial playback structure provides.
Finally, you should now be able to turn your sine wave into a real instrument. This only makes sense if you have something handy that can send MIDI events to aRts. I'll describe here how you can use an external keyboard, but a midibus-aware sequencer like Brahms will work as well.
First of all, clean up your desktop until you only have one aRts-builder with the sine wave structure open (not executing). Then go three times to Ports->Create IN audio signal, and three times to Ports->Create OUT audio signal. Place the ports somewhere.
Go to Ports->Change positions and names and call the ports frequency, velocity, pressed, left, right, done.
Finally, you can delete the Synth_SEQUENCE module, and instead connect the frequency input port of the structure to the Synth_FREQUENCY frequency port. Hm. But what to do about pos?
We don't have this, because no algorithm in the world can predict when the user will release the note he just pressed on the MIDI keyboard. So instead we have a pressed parameter that just indicates whether the user still holds down the key (pressed = 1: key still held down, pressed = 0: key released).
That means the Synth_PSCALE object also must be replaced now. Plug in a Synth_ENVELOPE_ADSR instead (from Modules->Synthesis->Envelopes). Connect:
The pressed structure input to the ADSR active
The SIN outvalue to the ADSR invalue
The ADSR outvalue to the left structure output
The ADSR outvalue to the right structure output
Set the parameters attack to 0.1, decay to 0.2, sustain to 0.7, release to 0.1.
Another thing we need to think of is that the instrument structure somehow should know when it has finished playing, so that it can be cleaned up; otherwise it would never be stopped, even after the note has been released. Fortunately, the ADSR envelope knows when there will be nothing to hear anymore, since it scales the signal to zero at some point after the note has been released anyway.
This is indicated by setting the done output to 1. So connect this to the done output of the structure. The structure will be removed as soon as done goes up to 1.
Rename your structure to instrument_tutorial (using File->Rename structure). Then save it using Save As (the default name offered should be instrument_tutorial now).
Start artscontrol, and go to View->Midi Manager, and choose Add->aRts Synthesis Midi Output. Finally, you should be able to select your instrument (tutorial) here.
Open a terminal and type midisend. You'll see that midisend and the instrument are listed now in the aRts MIDI manager. After selecting both and hitting connect, we're finally done. Take your keyboard and start playing (of course it should be connected to your computer).
You should now be able to work with aRts. Here are a few tips for what you could try now to improve your structures:
Try using things other than a SIN wave. When you plug in a TRI wave, you will most likely think the sound is not too nice. But try appending a SHELVE_CUTOFF filter right after the TRI wave to cut the frequencies above a certain frequency (try something like 1000 Hz, or even better two times the input frequency, or the input frequency + 200 Hz, or something like that).
Try using more than one oscillator. Synth_XFADE can be used to cross fade (mix) two signals, Synth_ADD to add them.
Try setting the frequencies of the oscillators to not exactly the same value, that gives nice oscillations.
Experiment with more than one envelope.
Try synthesizing instruments with different output left and right.
Try postprocessing the signal after it comes out the bus downlink. You could for instance mix a delayed version of the signal to the original to get an echo effect.
Try using the velocity setting (it's the strength with which the note has been pressed; you could also say volume). It always sounds especially good when this not only modifies the volume of the resulting signal, but also the sound of the instrument (for instance the cutoff frequency).
...
If you have created something great, please consider providing it for the aRts web page, or for inclusion in the next release.
aRts-builder comes with several examples, which can be opened through File->Open Example.... Some of them are in the directory, some of them (which for some reason don't work with the current release) are left in the todo directory.
The examples fall into several categories:
Standalone examples illustrating how to use each of the built-in arts modules (named example_*.arts). These typically send some output to a sound card.
Instruments built from lower level arts modules (named instrument_*.arts). These follow a standard convention for input and output ports so they can be used by the MIDI manager in artscontrol.
Templates for creating new modules (named template_*.arts).
Effects which can be used as reusable building blocks (named effect_*.arts) [ all in todo ]
Mixer elements used for creating mixers, including graphical controls (named mixer_element_*.arts). [ all in todo ]
Miscellaneous modules that don't fit into any of the above categories.
Detailed Description Of Each Module:
Generates a 440Hz sine wave tone in the left channel and an 880Hz sine wave tone in the right channel, and sends it to the sound card output. This is referenced in the aRts documentation.
Generates a 440 Hz sine wave.
Generates a 440 Hz pulse wave with a 20% duty cycle.
Generates a 440 Hz sawtooth wave.
Generates a 440 Hz square wave.
Generates a 440 Hz triangle wave.
Generates white noise.
Generates a dual tone by producing 697 and 1209 Hz sine waves, scaling them by 0.5, and adding them together. This is the DTMF tone for the digit "1" on a telephone keypad.
Runs a triangle wave through the atan saturate filter.
Uses an autopanner to pan a 400 Hz sine wave between the left and right speakers at a 2 Hz rate.
Scales a sine wave by a factor of 5 and then runs it through a brickwall limiter.
Downlinks from a bus called ‘Bus’ and uplinks to the bus ‘out_soundcard’ with the left and right channels reversed.
Downlinks from a bus called ‘Delay’, uplinks the right channel through a 0.5 second cdelay, and the left channel unchanged. You can use artscontrol to connect the effect to a sound player and observe the results.
This is the same as example_cdelay.arts but uses the delay effect.
This uses the Synth_CAPTURE_WAV to save a 400 Hz sine wave as a wav file. Run the module for a few seconds, and then examine the file created in /tmp. You can play the file with a player such as kaiman.
This uses the Data module to generate a constant stream of the value ‘3’ and sends it to a Debug module to periodically display it. It also contains a Nil module, illustrating how it can be used to do nothing at all.
Shows how to create a simple instrument sound using the Envelope Adsr module, repetitively triggered by a square wave.
This uses the FM Source module to generate a 440 Hz sine wave which is frequency modulated at a 5 Hz rate.
This connects the Freeverb effect from a bus downlink to a bus outlink. You can use artscontrol to connect the effect to a sound player and observe the results.
This implements a simple flanger effect (it doesn't appear to work yet, though).
This structure combines the two channels from a bus into one, passes it through the Moog VCF filter, and sends it out the out_soundcard bus.
This structure passes the left channel of sound card data through the Pitch Shift effect. Adjust the speed parameter to vary the effect.
This structure passes a white noise generator through an RC filter and out to the sound card. By viewing the FFT Scope display in artscontrol you can see how this varies from an unfiltered noise waveform.
This demonstrates the Sequence module by playing a sequence of notes.
This structure passes a white noise generator through a Shelve Cutoff filter and out to the sound card. By viewing the FFT Scope display in artscontrol you can see how this varies from an unfiltered noise waveform.
This demonstrates the Std_Equalizer module. It boosts the low and high frequencies by 6 dB.
This demonstrates the Tremolo effect. It modulates the left and right channels using a 10 Hz tremolo.
This example mixes 440 and 880 Hz sine waves using a cross fader. Adjust the value of the cross fader's percentage input from -1 to 1 to control the mixing of the two signals.
This illustrates the Pscale module (I'm not sure if this is a meaningful example).
This illustrates the Play Wave module. You will need to enter the full path to a .wav file as the filename parameter.
This shows the Multi Add module which accepts any number of inputs. It sums three Data modules which produce inputs of 1, 2, and 3, and displays the result 6.
The idea behind aRts is that synthesis can be done using small modules, which each do only one thing, and which are then recombined into complex structures. The small modules normally have inputs, where they get signals or parameters, and outputs, where they produce signals.
One module (Synth_ADD), for instance, just takes the two signals at its inputs and adds them together. The result is available as an output signal. The places where modules provide their input/output signals are called ports.
A structure is a combination of connected modules, some of which may have parameters coded directly to their input ports, others of which may be connected, and others which are not connected at all.
What you do with aRts-builder is describe structures. You describe which modules you want to be connected with which other modules. When you are done, you can save that structure description to a file, or tell aRts to create the structure you described (Execute).
Then you'll probably hear some sound, if you did everything the right way.
Suppose you have an application called ‘mousepling’ that should make a ‘pling’ sound when you click a button. The latency is the time between your finger clicking the mouse button and you hearing the pling. The latency in this setup is composed of several latencies that have different causes.
In this simple application, latency occurs at these places:
The time until the kernel has notified the X11 server that a mouse button was pressed.
The time until the X11 server has notified your application that a mouse button was pressed.
The time until the mousepling application has decided that this button is worth playing a pling.
The time it takes the mousepling application to tell the soundserver that it should play a pling.
The time it takes for the pling (which the soundserver starts mixing to the other output at once) to go through the buffered data, until it really reaches the position where the soundcard plays.
The time it takes the pling sound from the speakers to reach your ear.
The first three items are latencies external to aRts. They are interesting, but beyond the scope of this document. Nevertheless be aware that they exist, so that even if you have optimized everything else to really low values, you may not necessarily get exactly the result you calculated.
Telling the server to play something usually involves one single MCOP call. There are benchmarks which confirm that, on the same host with unix domain sockets, telling the server to play something can be done about 9000 times in one second with the current implementation. I expect that most of this is kernel overhead, switching from one application to another. Of course this value changes with the exact type of the parameters. If you transfer a whole image with one call, it will be slower than if you transfer only one long value. The same is true for the return code. However, for ordinary strings (such as the filename of the wav file to play) this shouldn't be a problem.
That means we can approximate this time with 1/9000 sec, which is below 0.15 ms. We'll see that this is not relevant.
Next is the time between the server starting to play and the soundcard getting something. The server needs to do buffering, so that no dropouts are heard when other applications, such as your X11 server or the ‘mousepling’ application, are running. The way this is done under Linux® is that there are a number of fragments of a given size. The server refills the fragments, and the soundcard plays them.
So suppose there are three fragments. The server refills the first, the soundcard starts playing it. The server refills the second. The server refills the third. The server is done, other applications can do something now.
As the soundcard has played the first fragment, it starts playing the second and the server starts refilling the first. And so on.
The maximum latency you get with all that is (number of fragments) * (size of each fragment) / (sampling rate * size of each sample). If we assume 44 kHz stereo and 7 fragments of 1024 bytes each (the current aRts defaults), we get about 40 ms.
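As a quick check of the figures quoted above, here is a small stand-alone sketch (not part of aRts) that evaluates this formula for the default configuration and for a low-latency one:

#include <cstdio>

// latency = fragments * fragment size / (sampling rate * bytes per sample)
double bufferLatencyMs(int fragments, int fragmentSize,
                       int samplingRate, int bytesPerSample)
{
    return 1000.0 * fragments * fragmentSize
                  / (double(samplingRate) * bytesPerSample);
}

int main()
{
    // 44.1 kHz, 16 bit stereo => 4 bytes per sample
    printf("default (7 x 1024 bytes): %.1f ms\n",
           bufferLatencyMs(7, 1024, 44100, 4));    // about 40 ms
    printf("low latency (3 x 256 bytes): %.1f ms\n",
           bufferLatencyMs(3, 256, 44100, 4));     // about 4.4 ms
    return 0;
}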
These values can be tuned according to your needs. However, the CPU usage increases with smaller latencies, as the sound server needs to refill the buffers more often, and in smaller parts. It is also mostly impossible to reach better values without giving the soundserver realtime priority, as otherwise you'll often get drop-outs.
However, it is realistic to do something like 3 fragments with 256 bytes each, which would make this value 4.4 ms. With 4.4ms delay the idle CPU usage of aRts would be about 7.5%. With 40ms delay, it would be about 3% (of a PII-350, and this value may depend on your soundcard, kernel version and others).
Then there is the time it takes the pling sound to get from the speakers to your ear. Suppose your distance from the speakers is 2 meters. Sound travels at a speed of 330 meters per second. So we can approximate this time with 6 ms.
Streaming applications are those that produce their sound themselves. Take a game which outputs a constant stream of samples and should now be adapted to replay things via aRts. As an example: when I press a key, the figure I am playing jumps, and a boing sound is played.
First of all, you need to know how aRts does streaming. It's very similar to the I/O with the soundcard. The game sends some packets with samples to the sound server, let's say three packets. As soon as the sound server is done with the first packet, it sends a confirmation back to the game that this packet is done.
The game creates another packet of sound and sends it to the server. Meanwhile the server starts consuming the second sound packet, and so on. The latency here looks similar to the simple case:
The time until the kernel has notified the X11 server that a key was pressed.
The time until the X11 server has notified the game that a key was pressed.
The time until the game has decided that this key is worth playing a boing.
The time until the packet of sound in which the game has started putting the boing sound reaches the sound server.
The time it takes for the boing (which the soundserver starts mixing to the other output at once) to go through the buffered data, until it really reaches the position where the soundcard plays.
The time it takes the boing sound from the speakers to reach your ear.
The external latencies, as above, are beyond the scope of this document.
Obviously, the streaming latency depends on the time it takes all packets that are used for streaming to be played once. So it is (number of packets) * (size of each packet) / (sampling rate * size of each sample).
As you see, that is the same formula as applies for the fragments. However, for games it makes no sense to use delays as small as above. I'd say a realistic configuration for games would be 3 packets of 2048 bytes each. The resulting latency would be 35 ms.
This is based on the following: assume that the game renders 25 frames per second (for the display). It is probably safe to assume that you won't notice a difference of sound output of one frame. Thus 1/25 second delay for streaming is acceptable, which in turn means 40ms would be okay.
Most people will also not run their games with realtime priority, and the danger of drop-outs in the sound is not to be neglected. Streaming with 3 packets of 256 bytes each is possible (I tried that), but causes a lot of CPU usage for streaming.
For server side latencies, you can calculate these exactly as above.
There are a lot of factors which influence CPU usage in a complex scenario, with some streaming applications and some other applications, some plugins on the server, etc. To name a few:
Raw CPU usage by the calculations necessary.
aRts internal scheduling overhead - how aRts decides when which module should calculate what.
Integer to float conversion overhead.
MCOP protocol overhead.
Kernel: process/context switching.
Kernel: communication overhead
For the raw CPU usage of the calculations: if you play two streams simultaneously, you need to do additions. If you apply a filter, some calculations are involved. As a simplified example, adding two streams involves maybe four CPU cycles per addition; on a 350 MHz processor, this is 44100*2*4/350000000 = 0.1% CPU usage.
aRts internal scheduling: aRts needs to decide when which plugin calculates what. This takes time. Take a profiler if you are interested in the details. Generally what can be said is: the less realtime you do (i.e. the larger the blocks that can be calculated at a time), the less scheduling overhead you have. Above calculating blocks of 128 samples at a time (thus using fragment sizes of 512 bytes), the scheduling overhead is probably not worth thinking about.
Integer to float conversion overhead: aRts uses floats internally as data format. These are easy to handle and on recent processors not slower than integer operations. However, if there are clients which play data which is not float (like a game that should do its sound output via aRts), it needs to be converted. The same applies if you want to replay the sounds on your soundcard. The soundcard wants integers, so you need to convert.
Here are numbers for a Celeron, approx. ticks per sample, with -O2 and egcs 2.91.66 (taken by Eugene Smith <hamster@null.ru>). This is of course highly processor dependent:
convert_mono_8_float: 14
convert_stereo_i8_2float: 28
convert_mono_16le_float: 40
interpolate_mono_16le_float: 200
convert_stereo_i16le_2float: 80
convert_mono_float_16le: 80
So that means 1% CPU usage for conversion and 5% for interpolation on this 350 MHz processor.
MCOP protocol overhead: MCOP does, as a rule of thumb, 9000 invocations per second. Much of this is not MCOP's fault, but relates to the two kernel causes named below. However, this gives a basis for calculating what the cost of streaming is.
Each data packet transferred through streaming can be considered one MCOP invocation. Of course large packets are slower than 9000 packets/s, but it gives the idea.
Suppose you use packet sizes of 1024 bytes. Thus, to transfer a stream with 44 kHz stereo, you need to transfer 44100*4/1024 = 172 packets per second. Suppose you could transfer 9000 packets with 100% CPU usage; then you get (172*100)/9000 = 2% CPU usage due to streaming with 1024 byte packets.
These are approximations. However, they show that you would be much better off (if you can afford it latency-wise) using, for instance, packets of 4096 bytes. We can make a compact formula here, by calculating the packet size which would cause 100% CPU usage as 44100*4/9000 = 19.6 bytes, and thus getting the quick formula:
streaming CPU usage in percent = 1960/(your packet size)
which gives us 0.5% CPU usage when streaming with 4096 byte packets.
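The same rule of thumb, expressed as a small sketch based on the approximate figure of 9000 invocations per second quoted above (these are estimates, not measurements of your system):

#include <cstdio>

// approximate CPU cost of streaming 44.1 kHz 16 bit stereo audio,
// assuming roughly 9000 MCOP invocations per second at 100% CPU
double streamingCpuPercent(int packetSizeBytes)
{
    const double bytesPerSecond = 44100.0 * 4.0;
    const double invocationsPerSecond = 9000.0;
    return 100.0 * (bytesPerSecond / packetSizeBytes) / invocationsPerSecond;
}

int main()
{
    printf("1024 byte packets: %.1f%%\n", streamingCpuPercent(1024));  // about 2%
    printf("4096 byte packets: %.1f%%\n", streamingCpuPercent(4096));  // about 0.5%
    return 0;
}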
Kernel process/context switching: this is part of the MCOP protocol overhead. Switching between two processes takes time. There is new memory mapping, the caches are invalid, whatever else (if there is a kernel expert reading this - let me know what exactly are the causes). This means: it takes time.
I am not sure how many context switches Linux® can do per second, but that number isn't infinite. Thus, of the MCOP protocol overhead I suppose quite a bit is due to context switching. In the beginning of MCOP, I did tests to use the same communication inside one process, and it was much faster (four times as fast or so).
Kernel: communication overhead: This is part of the MCOP protocol overhead. Transferring data between processes is currently done via sockets. This is convenient, as the usual select() methods can be used to determine when a message has arrived. It can also easily be combined with other I/O sources such as audio I/O, the X11 server, or whatever else.
However, those read and write calls certainly cost processor cycles. For small invocations (such as transferring one midi event) this is probably not so bad, but for large invocations (such as transferring one video frame of several megabytes) this is clearly a problem.
Adding the use of shared memory to MCOP where appropriate is probably the best solution. However, it should be done transparently to the application programmer.
Take a profiler or do other tests to find out exactly how much current audio streaming is impacted by not using shared memory. However, it's not bad, as audio streaming (replaying mp3) can be done with 6% total CPU usage for artsd and artscat (and 5% for the mp3 decoder). However, this includes everything from the necessary calculations up to the socket overhead, so I'd say in this setup you could perhaps save 1% by using shared memory.
These measurements were made with the current development snapshot. I also wanted to try out the really hard cases, so this is not what everyday applications should do.
I wrote an application called streamsound which sends streaming data to aRts. Here it is running with realtime priority (without problems), and one small server-side (volume-scaling and clipping) plugin:
4974 stefan  20   0  2360 2360  1784 S       0 17.7  1.8   0:21 artsd
5016 stefan  20   0  2208 2208  1684 S       0  7.2  1.7   0:02 streamsound
5002 stefan  20   0  2208 2208  1684 S       0  6.8  1.7   0:07 streamsound
4997 stefan  20   0  2208 2208  1684 S       0  6.6  1.7   0:07 streamsound
Each of them is streaming with 3 fragments of 1024 bytes each (18 ms). There are three such clients running simultaneously. I know that does look like a bit much, but as I said: take a profiler, find out what costs time, and if you like, improve it.
However, I don't think using streaming like that is realistic or makes sense. To take it even more to the extreme, I tried what would be the lowest latency possible. Result: you can do streaming without interruptions with one client application, if you take 2 fragments of 128 bytes between aRts and the soundcard, and between the client application and aRts. This means that you have a total maximum latency of 128*4/(44100*4) ≈ 3 ms, where 1.5 ms is generated due to soundcard I/O and 1.5 ms is generated through communication with aRts. Both applications need to run with realtime priority.
But: this costs an enormous amount of CPU. This example costs about 45% of my P-II/350. It also starts to click if you start top, move windows on your X11 display, or do disk I/O. All these are kernel issues. The problem is that scheduling two or more applications with realtime priority costs an enormous amount of effort, too, even more if they communicate, notify each other, etc.
Finally, a more real-life example: this is aRts with artsd and one artscat (one streaming client) running with 16 fragments of 4096 bytes each:
5548 stefan  12   0  2364 2364  1752 R       0  4.9  1.8   0:03 artsd
5554 stefan   3   0   752  752   572 R       0  0.7  0.5   0:00 top
5550 stefan   2   0  2280 2280  1696 S       0  0.5  1.7   0:00 artscat
Busses are dynamically built connections that transfer audio. Basically, there are some uplinks and some downlinks. All signals from the uplinks are added and sent to the downlinks.
Busses as currently implemented operate in stereo, so you can only transfer stereo data over busses. If you want mono data, well, transfer it over one channel only and set the other to zero or whatever. What you need to do is create one or more Synth_BUS_UPLINK objects and tell them the name of the bus they should talk to (e.g. “audio” or “drums”). Then simply throw the data in there.
Then, you'll need to create one or more Synth_BUS_DOWNLINK objects, and tell them the bus name (“audio” or “drums” ... if it matches, the data will get through), and the mixed data will come out again.
The uplinks and downlinks can reside in different structures, you can even have different aRts-builders running and start an uplink in one and receive the data from the other with a downlink.
What is nice about busses is that they are fully dynamic. Clients can plug in and out on the fly. There should be no clicking or noise as this happens.
Of course, you should not unplug a client while it is playing a signal, since its level will probably not be zero when it is unplugged from the bus, and then it will click.
aRts/MCOP heavily relies on splitting things up into small components. This makes things very flexible, as you can extend the system easily by adding new components which implement new effects, file formats, oscillators, GUI elements, ... As almost everything is a component, almost everything can be extended easily, without changing existing sources. New components can simply be loaded dynamically to enhance existing applications.
However, to make this work, two things are required:
Components must advertise themselves - they must describe what great things they offer, so that applications will be able to use them.
Applications must actively look for components that they could use, instead of always using the same thing for a given task.
This combination of components which say “here I am, I am cool, use me”, and applications (or, if you like, other components) which go out and look for a component they could use to get a thing done, is called trading.
In aRts, components describe themselves by specifying values that they “support” for properties. A typical property for a file-loading component could be the extension of the files that it can process. Typical values could be wav, aiff or mp3.
In fact, every component may choose to offer many different values for one property. So one single component could offer to read both wav and aiff files, by specifying that it supports these values for the property "Extension".
To do so, a component has to place a .mcopclass file in an appropriate place, containing the properties it supports. For our example, this could look like the following (and would be installed in componentdir/Arts/WavPlayObject.mcopclass):
Interface=Arts::WavPlayObject,Arts::PlayObject,Arts::SynthModule,Arts::Object
Author="Stefan Westerfeld <stefan@space.twc.de>"
URL="http://www.arts-project.org"
Extension=wav,aiff
MimeType=audio/x-wav,audio/x-aiff
It is important that the filename of the .mcopclass file also says what the interface of the component is called. The trader doesn't look at the contents at all; if the file (as here) is called Arts/WavPlayObject.mcopclass, the component interface is called Arts::WavPlayObject (modules map to directories).
To look for components, there are two interfaces (which are defined in core.idl, so you have them in every application), called Arts::TraderQuery and Arts::TraderOffer. You go on a “shopping tour” for components like this:
Create a query object:
Arts::TraderQuery query;
Specify what you want. As you saw above, components describe themselves using properties, for which they offer certain values. So specifying what you want is done by selecting components that support a certain value for a property. This is done using the supports method of a TraderQuery:
query.supports("Interface","Arts::PlayObject"); query.supports("Extension","wav");
Finally, perform the query using the query method. Then, you'll (hopefully) get some offers:
vector<Arts::TraderOffer> *offers = query.query();
Now you can examine what you found. Important here is the interfaceName method of TraderOffer, which will tell you the name of the component that matched the query. You can also find out further properties with getProperty. The following code will simply iterate through all components, print their interface names (which could be used for creation), and delete the results of the query again:
vector<Arts::TraderOffer>::iterator i; for(i = offers->begin(); i != offers->end(); i++) cout << i->interfaceName() << endl; delete offers;
For this kind of trading service to be useful, it is important to somehow agree on what kinds of properties components should usually define. It is essential that more or less all components in a certain area use the same set of properties to describe themselves (and the same set of values where applicable), so that applications (or other components) will be able to find them.
Author (type string, optional): This can be used to ultimately let the world know that you wrote something. You can write anything you like in here; an e-mail address is of course helpful.
Buildable (type boolean, recommended): This indicates whether the component is usable with RAD tools (such as aRts-builder) which use components by assigning properties and connecting ports. It is recommended to set this value to true for almost any signal processing component (such as filters, effects, oscillators, ...), and for all other things which can be used in a RAD-like fashion, but not for internal stuff like, for instance, Arts::InterfaceRepo.
Extension (type string, used where relevant): Everything dealing with files should consider using this. You should put the lowercase version of the file extension without the “.” here, so something like wav should be fine.
Interface (type string, required): This should include the full list of (useful) interfaces your component supports, probably including Arts::Object and, if applicable, Arts::SynthModule.
Language (type string, recommended): If you want your component to be dynamically loaded, you need to specify the language here. Currently, the only allowed value is C++, which means the component was written using the normal C++ API. If you do so, you'll also need to set the “Library” property below.
Library (type string, used where relevant): Components written in C++ can be dynamically loaded. To do so, you have to compile them into a dynamically loadable libtool (.la) module. Here, you can specify the name of the .la file that contains your component. Remember to use REGISTER_IMPLEMENTATION (as always); a sketch of how this fits together follows this list.
MimeType (type string, used where relevant): Everything dealing with files should consider using this. You should put the lowercase version of the standard mimetype here, for instance audio/x-wav.
URL (type string, optional): If you would like to let people know where they can find a new version of the component (or a homepage or anything), you can do it here. This should be a standard HTTP or FTP URL.
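To give an idea of how the Language and Library properties relate to actual code, here is a minimal, hedged sketch of a dynamically loadable C++ component. It implements the Synth_TWEAK interface that appears as an IDL example in the next section; the header name is an assumption (use whatever mcopidl generated from your IDL file), and the StdSynthModule base with its invalue/outvalue members follows the usual conventions of the C++ API.

// sketch: implementing an IDL interface in C++ and registering it, so that
// it can be created from the .la file named in the Library property
#include "synth_tweak.h"   // assumed name of the mcopidl-generated header

using namespace Arts;

class Synth_TWEAK_impl : virtual public Synth_TWEAK_skel,
                         virtual public StdSynthModule
{
    float _tweakFactor;
public:
    // the IDL attribute "float tweakFactor" becomes a getter/setter pair
    float tweakFactor() { return _tweakFactor; }
    void tweakFactor(float newFactor) { _tweakFactor = newFactor; }

    // called by the flow system to process one block of samples
    void calculateBlock(unsigned long samples)
    {
        for(unsigned long i = 0; i < samples; i++)
            outvalue[i] = invalue[i] * _tweakFactor;
    }
};

// makes the component creatable by name once the library is loaded
REGISTER_IMPLEMENTATION(Synth_TWEAK_impl);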
Each namespace declaration corresponds to a ‘module’ declaration in the MCOP IDL.
// mcop idl
module M {
    interface A
    {
    }
};
interface B;
In this case, the generated C++ code for the IDL snippet would look like this:
// C++ header
namespace M {
    /* declaration of A_base/A_skel/A_stub and similar */
    class A {        // Smartwrapped reference class
        /* [...] */
    };
}

/* declaration of B_base/B_skel/B_stub and similar */
class B {
    /* [...] */
};
So when referring to the classes from the above example in your C++ code, you would have to write M::A, but only B. However, you can of course use ‘using namespace M’ somewhere, as with any namespace in C++.
There is one global namespace called ‘Arts’, which all programs and libraries that belong to aRts itself use to put their declarations in. This means, that when writing C++ code that depends on aRts, you normally have to prefix every class you use with Arts::, like this:
int main(int argc, char **argv)
{
    Arts::Dispatcher dispatcher;
    Arts::SimpleSoundServer server(Arts::Reference("global:Arts_SimpleSoundServer"));

    server.play("/var/foo/somefile.wav");
The other alternative is to write a using once, like this:
using namespace Arts;

int main(int argc, char **argv)
{
    Dispatcher dispatcher;
    SimpleSoundServer server(Reference("global:Arts_SimpleSoundServer"));

    server.play("/var/foo/somefile.wav");
    [...]
In IDL files, you don't exactly have a choice. If you are writing code that belongs to aRts itself, you'll have to put it into module Arts.
// IDL File for aRts code:
#include <artsflow.idl>
module Arts {        // put it into the Arts namespace
    interface Synth_TWEAK : SynthModule
    {
        in audio stream invalue;
        out audio stream outvalue;
        attribute float tweakFactor;
    };
};
If you write code that doesn't belong to aRts itself, you should not put it into the ‘Arts’ namespace. However, you can make an own namespace if you like. In any case, you'll have to prefix classes you use from aRts.
// IDL File for code which doesn't belong to aRts:
#include <artsflow.idl>

// either write without module declaration, then the generated classes will
// not use a namespace:
interface Synth_TWEAK2 : Arts::SynthModule
{
    in audio stream invalue;
    out audio stream outvalue;
    attribute float tweakFactor;
};

// however, you can also choose your own namespace, if you like, so if you
// write an application "PowerRadio", you could for instance do it like this:
module PowerRadio {
    struct Station {
        string name;
        float frequency;
    };

    interface Tuner : Arts::SynthModule {
        attribute Station station;   // no need to prefix Station, same module
        out audio stream left, right;
    };
};
Often, in interfaces, casts, method signatures and similar, MCOP needs to refer to names of types or interfaces. These are represented as strings in the common MCOP data structures, and the namespace is always fully represented in the C++ style. This means the strings would contain ‘M::A’ and ‘B’, following the example above.
Note that this applies even if the namespace qualifiers were not given inside the IDL text, since the context made it clear which namespace the interface A was meant to be used in.
Using threads isn't possible on all platforms. This is why aRts was originally written without using threading at all. For almost all problems, for each threaded solution to the problem, there is a non-threaded solution that does the same.
For instance, instead of putting audio output in a separate thread and making it blocking, aRts uses non-blocking audio output and figures out when to write the next chunk of data using select().
However, aRts (in very recent versions) at least provides support for people who do want to implement their objects using threads. For instance, if you already have code for an mp3 player, and the code expects the mp3 decoder to run in a separate thread, it's usually easiest to keep this design.
The aRts/MCOP implementation is built around sharing state between separate objects in obvious and non-obvious ways. A small list of the shared state includes:
The Dispatcher object which does MCOP communication.
The Reference counting (Smartwrappers).
The IOManager which does timer and fd watches.
The ObjectManager which creates objects and dynamically loads plugins.
The FlowSystem which calls calculateBlock in the appropriate situations.
None of the above objects expect to be used concurrently (i.e. called from separate threads at the same time). Generally there are two ways of solving this:
Require the caller of any functions on these objects to acquire a lock before using them.
Make these objects really threadsafe and/or create per-thread instances of them.
aRts follows the first approach: you will need a lock whenever you talk to any of these objects. The second approach is harder to do. A hack which tries to achieve it is available at http://space.twc.de/~stefan/kde/download/arts-mt.tar.gz, but at the current point in time, a minimalistic approach will probably work better, and cause fewer problems with existing applications.
You can acquire and release the lock with the two functions Arts::Dispatcher::lock() and Arts::Dispatcher::unlock().
Generally, you don't need to acquire the lock (and you shouldn't try to do so) if it is already held. Conditions where this is the case are:
You receive a callback from the IOManager (timer or fd).
You get called due to some MCOP request.
You are called from the NotificationManager.
You are called from the FlowSystem (calculateBlock).
There are also some exceptions: functions which you can only call in the main thread, and which for that reason never need a lock:
Constructor/destructor of Dispatcher/IOManager.
Dispatcher::run() / IOManager::run()
IOManager::processOneEvent()
But that is it. For everything else that is somehow related to aRts, you will need to get the lock, and release it again when done. Always. Here is a simple example:
class SuspendTimeThread : Arts::Thread {
public:
    void run() {
        /*
         * you need this lock because:
         *  - constructing a reference needs a lock (as global: will go to
         *    the object manager, which might in turn need the GlobalComm
         *    object to look up where to connect to)
         *  - assigning a smartwrapper needs a lock
         *  - constructing an object from a reference needs a lock (because
         *    it might need to connect to a server)
         */
        Arts::Dispatcher::lock();
        Arts::SoundServer server = Arts::Reference("global:Arts_SoundServer");
        Arts::Dispatcher::unlock();

        for(;;) {
            /*
             * you need a lock here, because
             *  - dereferencing a smartwrapper needs a lock (because it might
             *    do lazy creation)
             *  - doing an MCOP invocation needs a lock
             */
            Arts::Dispatcher::lock();
            long seconds = server.secondsUntilSuspend();
            Arts::Dispatcher::unlock();

            printf("seconds until suspend = %ld\n", seconds);
            sleep(1);
        }
    }
};
The following threading related classes are currently available:
Arts::Thread - which encapsulates a thread.
Arts::Mutex - which encapsulates a mutex.
Arts::ThreadCondition - which provides support to wake up threads which are waiting for a certain condition to become true.
Arts::SystemThreads - which encapsulates the operating system threading layer (which offers a few helpful functions to application programmers).
See the links for documentation.
MCOP references are one of the most central concepts in MCOP programming. This section will try to describe how exactly references are used, and will especially also try to cover cases of failure (server crashes).
An MCOP reference is not an object, but a reference to an object: Even though the following declaration
Arts::Synth_PLAY p;

looks like a definition of an object, it only declares a reference to an object. As a C++ programmer, you might also think of it as Synth_PLAY *, a kind of pointer to a Synth_PLAY object. This especially means that p can be the same thing as a NULL pointer.
You can create a NULL reference by assigning it explicitly
Arts::Synth_PLAY p = Arts::Synth_PLAY::null();
Invoking things on a NULL reference leads to a core dump
Arts::Synth_PLAY p = Arts::Synth_PLAY::null(); string s = p.toString();
will lead to a core dump. Comparing this to a pointer, it is essentially the same as
QWindow* w = 0;
w->show();

which every C++ programmer would know to avoid.
Uninitialized objects try to lazy-create themselves upon first use
Arts::Synth_PLAY p; string s = p.toString();
is something different than dereferencing a NULL pointer. You didn't tell the object at all what it is, and now you try to use it. The guess here is that you want to have a new local instance of an Arts::Synth_PLAY object. Of course you might have wanted something else (like creating the object somewhere else, or using an existing remote object). However, it is a convenient shortcut to creating objects. Lazy creation will not work once you have assigned something else (like a null reference).
The equivalent C++ terms would be
QWidget* w;
w->show();

which obviously in C++ just plain segfaults. So this is different here. This lazy creation is tricky, especially since an implementation does not necessarily exist for your interface.
For instance, consider an abstract thing like an Arts::PlayObject. There are certainly concrete PlayObjects like those for playing mp3s or wavs, but
Arts::PlayObject po;
po.play();

will certainly fail. The problem is that although lazy creation kicks in, and tries to create a PlayObject, it fails, because there are only things like Arts::WavPlayObject and similar. Thus, use lazy creation only when you are sure that an implementation exists.
References may point to the same object
Arts::SimpleSoundServer s = Arts::Reference("global:Arts_SimpleSoundServer");
Arts::SimpleSoundServer s2 = s;
creates two references referring to the same object. It doesn't copy any value, and doesn't create two objects.
All objects are reference counted. So once an object isn't referred to any longer by any references, it gets deleted. There is no way to explicitly delete an object, however, you can use something like this
Arts::Synth_PLAY p;
p.start();
[...]
p = Arts::Synth_PLAY::null();

to make the Synth_PLAY object go away in the end. In particular, it should never be necessary to use new and delete in conjunction with references.
As references can point to remote objects, the servers containing these objects can crash. What happens then?
A crash doesn't change whether a reference is a null reference. This means that if foo.isNull() was true before a server crash then it is also true after a server crash (which is clear). It also means that if foo.isNull() was false before a server crash (foo referred to an object) then it is also false after the server crash.
Invoking methods on a valid reference stays safe. Suppose the server containing the object calc crashed. Calling things like
int k = calc.subtract(i,j);

is still safe. Obviously subtract has to return something here, which it can't because the remote object no longer exists. In this case (k == 0) would be true. Generally, operations try to return something ‘neutral’ as result, such as 0.0, a null reference for objects or empty strings, when the object no longer exists.
Checking error() reveals whether something worked.
In the above case,
int k = calc.subtract(i,j);
if(calc.error()) {
    printf("k is not i-j!\n");
}

would print out k is not i-j whenever the remote invocation didn't work. Otherwise k is really the result of the subtract operation as performed by the remote object (no server crash). However, for methods doing things like deleting a file, you can't know for sure whether it really happened. Of course it happened if .error() is false. However, if .error() is true, there are two possibilities:
The file got deleted, and the server crashed just after deleting it, but before transferring the result.
The server crashed before being able to delete the file.
Using nested invocations is dangerous in crash-resistant programs
Using something like
window.titlebar().setTitle("foo");

is not a good idea. Suppose you know that window contains a valid Window reference. Suppose you know that window.titlebar() will return a Titlebar reference because the Window object is implemented properly. However, still the above statement isn't safe.
What could happen is that the server containing the Window object has crashed. Then, regardless of how good the Window implementation is, you will get a null reference as result of the window.titlebar() operation. And then of course invoking setTitle on that null reference will lead to a crash as well.
So a safe variant of this would be
Titlebar titlebar = window.titlebar();
if(!window.error())
    titlebar.setTitle("foo");

Add the appropriate error handling if you like. If you don't trust the Window implementation, you might as well use
Titlebar titlebar = window.titlebar();
if(!titlebar.isNull())
    titlebar.setTitle("foo");

which are both safe.
There are other conditions of failure, such as network disconnection (suppose you remove the cable between your server and client while your application runs). However, their effect is the same as a server crash.
Overall, it is of course a matter of policy how strictly you try to trap communication errors throughout your application. You might follow the ‘if the server crashes, we need to debug the server until it never crashes again’ approach, which would mean you need not bother about all these problems.
An object, to exist, must be owned by someone. If it isn't, it will cease to exist (more or less) immediately. Internally, ownership is indicated by calling _copy(), which increments a reference count, and given back by calling _release(). As soon as the reference count drops to zero, a delete will be done.
As a variation of the theme, remote usage is indicated by _useRemote(), and dissolved by _releaseRemote(). These functions keep a list of which server has invoked them (and thus owns the object). This is used in case that server disconnects (i.e. crash, network failure), to remove the references that are still on the objects. This is done in _disconnectRemote().
Now there is one problem. Consider a return value. Usually, the return value object will not be owned by the calling function any longer. It will however also not be owned by the caller, until the message holding the object is received. So there is a time of ‘ownershipless’ objects.
Now, when sending an object, one can be reasonably sure that as soon as it is received, it will be owned by somebody again, unless, again, the receiver dies. However, this means that special care needs to be taken about the object at least while sending, and probably also while receiving, so that it doesn't die at once.
The way MCOP does this is by ‘tagging’ objects that are in the process of being copied across the wire. Before such a copy is started, _copyRemote is called. This prevents the object from being freed for a while (5 seconds). Once the receiver calls _useRemote(), the tag is removed again. So all objects that are sent over the wire are tagged before transfer.
If the receiver receives an object which is on its own server, it will of course not _useRemote() it. For this special case, _cancelCopyRemote() exists to remove the tag manually. Other than that, there is also timer based tag removal, if tagging was done but the receiver didn't really get the object (due to crash or network failure). This is done by the ReferenceClean class.
GUI elements are currently in an experimental state. However, this section will describe what is supposed to happen here, so if you are a developer, you will be able to understand how aRts will deal with GUIs in the future. There is some code there already, too.
GUI elements should be used to allow synthesis structures to interact with the user. In the simplest case, the user should be able to modify some parameters of a structure directly (such as a gain factor which is used before the final play module).
In more complex settings, one could imagine the user modifying parameters of groups of structures and/or not yet running structures, such as modifying the ADSR envelope of the currently active MIDI instrument. Another thing would be setting the filename of some sample based instrument.
On the other hand, the user could like to monitor what the synthesizer is doing. There could be oscilloscopes, spectrum analyzers, volume meters and “experiments” that figure out the frequency transfer curve of some given filter module.
Finally, the GUI elements should be able to control the whole structure of what is running inside aRts and how. The user should be able to assign instruments to MIDI channels, start new effect processors, and configure his main mixer console (which is built of aRts structures itself) to have one channel more and use another strategy for its equalizers.
As you can see, the GUI elements should bring all possibilities of the virtual studio that aRts should simulate to the user. Of course, they should also gracefully interact with MIDI inputs (for example, sliders should move if they get MIDI inputs which change just that parameter), and probably even generate events themselves, to allow the user interaction to be recorded via a sequencer.
Technically, the idea is to have an IDL base class for all widgets (Arts::Widget), and derive a number of commonly used widgets from there (like Arts::Poti, Arts::Panel, Arts::Window, ...).
Then, one can implement these widgets using a toolkit, for instance Qt™ or Gtk. Finally, effects should build their GUIs out of existing widgets. For instance, a freeverb effect could build its GUI out of five Arts::Poti thingies and an Arts::Window. So if there is a Qt™ implementation for these base widgets, the effect will be able to display itself using Qt™. If there is a Gtk implementation, it will also work for Gtk (and more or less look/work the same).
Finally, as we're using IDL here, aRts-builder (or other tools) will be able to plug GUIs together visually, or autogenerate GUIs given hints for parameters, based only on the interfaces. It should be relatively straightforward to write a “create GUI from description” class, which takes a GUI description (containing the various parameters and widgets), and creates a living GUI object out of it.
Based on IDL and the aRts/MCOP component model, it should be just as easy to extend the possible objects which can be used for the GUI as it is to add a plugin implementing a new filter to aRts.
The MIDI support in aRts can do a number of things. First of all, it allows communication between different pieces of software that produce or consume MIDI events. If you for instance have a sequencer and a sampler that are both aRts aware, aRts can send the MIDI events from the sequencer to the sampler.
On the other hand, aRts can also help applications to interact with the hardware. If a piece of software (for instance the sampler) works together with aRts, it will be able to receive the MIDI events from an external MIDI keyboard as well.
Finally, aRts makes a great modular synthesizer. It is designed to do exactly this. So you can build instruments out of small modules using artsbuilder, and then use these instruments to compose or play music. Synthesis does not necessarily mean pure synthesis, there are modules you can use to play samples. So aRts can be a sampler, synthesizer, and so on, and being fully modular, it is very easy to extend, very easy to experiment with, powerful and flexible.
The central component in aRts that keeps track of which applications are connected and how MIDI events should be passed between them is the MIDI manager. To see or influence what it does, start artscontrol. Then, choose View+View Midi Manager from the menu.
On the left side, you will see Midi Inputs. All objects that produce MIDI events, such as an external MIDI port which sends data from a connected MIDI keyboard, or a sequencer which plays a song, will be listed there. On the right side, you will see Midi Outputs. All things that consume MIDI events, such as a simulated sampler (as software), or the external MIDI port where your hardware sampler outside your computer is connected, will be listed there. New applications, such as sequencers, will register themselves, so the list will change over time.
You can connect inputs and outputs if you mark the input on the left side and the output on the right side, and choose Connect with the button below. Disconnect works the same. You will see what is connected as small lines between the inputs and outputs, in the middle of the window. Note that you can connect one sender to more than one receiver (and the other way round).
Programs (like the Brahms sequencer) will add themselves when they start and be removed from the list when they are terminated. But you can also add new things in the Add menu:
This will create a new aRts object that talks to an external midi port.
As external MIDI ports can both send and receive data, choosing this option will add a MIDI input and a MIDI output. Under Linux®, you should have either an OSS (or OSS/Free, the thing that comes with your Linux® kernel) or an ALSA driver installed for your soundcard, to make it work. It will ask for the name of the device. Usually, this is /dev/midi or /dev/midi00.
However, if you have more than one MIDI device or a MIDI loopback driver installed, there might be more choices. To see information about your MIDI ports, start the KDE Control Center, and choose Information+Sound.
This will add a new MIDI output with an aRts synthesis instrument. If you choose the menu item, a dialog will pop up, and allow you to choose an instrument. You can create new instruments using artsbuilder. All .arts files with a name that starts with instrument_ will appear here.
Actually, getting started is quite easy. You need a KDE 2.1-aware version of Brahms, which can be found in the kmusic CVS module. There is also information on how to get Brahms on the aRts Homepage in the Download section.
When you start it, it will show up in the MIDI manager. If you want to do synthesis, simply add a synthesis MIDI instrument via Add+aRts Synthesis Midi Output.
Choose an instrument (for instance organ2). Connect them using the Connect button. Finally, you can start composing in Brahms, and the output will be synthesized with aRts.
It is usually a good idea to have the artscontrol window open, and see that the volume is not too loud (quality gets bad when the bars hit the upper limit). Now you can start working on a new aRts demosong, and if you are done, you can get it published on aRts-project.org ;-).
midisend is a small application that will allow you to send MIDI events from the shell. It will register as a client like all other applications. The simplest way to use it is to do
% midisend -f /dev/midi00

which will achieve about the same as adding a system MIDI port in artscontrol. (Not quite, because midisend only sends events.) The difference is that it is easy, for instance, to start midisend on different computers (and that way, use network transparency).
It is also possible to make midisend send data from stdin, which you can use to pipe data from non-aRts-aware applications to aRts, like this:
% applicationwhichproducesmidieventsonstdout | midisend -f -
The way aRts does MIDI synthesis is this: you have a structure which has some input ports, where it gets the frequency, the velocity (volume) and a parameter which indicates whether the note is still pressed. The structure should now synthesize exactly that note with that volume, and react to the pressed parameter (where pressed = 1 means the user still holds down that key and pressed = 0 means the user has released that key).
When MIDI events arrive, aRts will create new structures for the notes as needed, give them the parameters, and clean them up once they are done.
To create and use such a structure, you should do the following:
To get started, the most convenient way is to open template_Instrument.arts in aRts-builder.
This can be achieved by using File->Open Example... and choosing template_Instrument in the file selector. This will give you an empty structure with the required parameters, which you only need to “fill out”.
To process the pressed parameter, it is convenient to use Synth_ENVELOPE_ADSR, or, in case of playing some drum wav, just play it anyway, and ignore the pressed parameter.
The structure should indicate when it is no longer needed on the “done” output. If done is 1, aRts assumes that it can delete the structure. Conveniently, the ADSR envelope provides a parameter when it is done, so you just need to connect this to the done output of the structure.
You should rename your structure to some name starting with instrument_, like instrument_piano.arts, and save the file under the same name in your $HOME/arts/structures directory (which is where artsbuilder wants to save files normally).
Finally, once you have saved it, you will be able to use it with artscontrol in the MIDI manager.
Oh, and of course your structure should play the audio data it generates to the left and right output of the structure, which will then be played via the audio manager (you can see that in artscontrol), so that you can finally hear it (or postprocess it with effects).
A good way to learn how to do instruments is to open an existing instrument via File+Open Example and see how it works ;)
Mapped instruments are instruments that behave differently depending on the pitch, the program, the channel or the velocity. You could for instance build a piano of 5 octaves, using one sample for each octave (pitchshifting it accordingly). That sounds a whole lot better than only using one sample.
You could also build a drum map, that plays one specific drum sample per key.
Finally, it is very useful to put quite a few different sounds into one mapped instrument on different programs. That way, you can use your sequencer, external keyboard or other MIDI source to switch between the sounds without having to tweak aRts as you work.
A good example of this is the instrument arts_all, which just puts together all instruments that come with aRts in one map. That way, you just need to set this “instrument” up once in artscontrol, and then you can compose a whole song in a sequencer without ever bothering about aRts. Need another sound? Simply change the program in the sequencer, and aRts will give you another sound.
Creating such maps is pretty straightforward. You just need to create a text file, and write rules which look like this:
ON [ conditions ...] DO structure=somestructure.arts
The conditions could be one or more than one of the following:
The pitch that is being played. You would use this if you want to split your instrument depending on the pitch. In our initial examples, a piano which uses different samples for different octaves would use this as condition. You can specify a single pitch, like pitch=62 or a range of pitches, like pitch=60-72. The possible pitches are between 0 and 127.
The program that is active on the channel that the note is being sent on. Usually, sequencers let you choose the ‘instrument’ via the program setting. Single programs or ranges are allowed, that is program=3 or program=3-6. The possible programs are between 0 and 127.
The channel that the note is being sent on. Single channels or ranges are allowed, that is channel=0 or channel=0-8. The possible channels are between 0 and 15.
The velocity (volume) that the note has. Single velocities (who would use that?) or ranges are allowed, that is velocity=127 or velocity=64-127. The possible velocities are between 0 and 127.
A complete example for a map would be (this is taken from the current instrument_arts_all.arts-map):
ON program=0 DO structure=instrument_tri.arts
ON program=1 DO structure=instrument_organ2.arts
ON program=2 DO structure=instrument_slide1.arts
ON program=3 DO structure=instrument_square.arts
ON program=4 DO structure=instrument_neworgan.arts
ON program=5 DO structure=instrument_nokind.arts
ON program=6 DO structure=instrument_full_square.arts
ON program=7 DO structure=instrument_simple_sin.arts
ON program=8 DO structure=instrument_simple_square.arts
ON program=9 DO structure=instrument_simple_tri.arts
ON program=10 DO structure=instrument_slide.arts
ON program=11 pitch=60 DO structure=instrument_deepdrum.arts
ON program=11 pitch=61 DO structure=instrument_chirpdrum.arts
As you see, the structure is chosen depending on the program. On program 11, you see a “drum map” (with two entries), which would play a “deepdrum” on C-5 (pitch=60), and a “chirpdrum” on C#5 (pitch=61).
To make map files automatically appear in artscontrol as choice for the instrument, they have to be called instrument_something.arts-map and reside either in your Home Directory, under $HOME/arts/structures, or in the KDE directory under $KDEDIR/usr/local/kde/share/apps/artsbuilder/examples. Structures that are used by the map can either be given with an absolute path, or relative to the directory the map file resides in.
Extending the arts_all map or even making a complete general MIDI map for aRts is a good idea for making aRts easier to use out-of-the-box. Please consider contributing interesting instruments you make, so that they can be included in further versions of aRts.
MCOP is the standard aRts uses for:
Communication between objects.
Network transparency.
Describing object interfaces.
Language independence.
One major aspect of MCOP is the interface description language, IDL, in which many of the aRts interfaces and APIs are defined in a language independent way.
To use an IDL interface from C++, it is compiled by the IDL compiler into C++ code. When you implement an interface, you derive from the skeleton class the IDL compiler has generated. When you use an interface, you do so using a wrapper. This way, MCOP can use a protocol if the object you are talking to is not local - you get network transparency.
This chapter is supposed to describe the basic features of the object model that results from the use of MCOP, the protocol, how to use MCOP in C++ (language binding), and so on.
Many of the services provided by aRts, such as modules and the sound server, are defined in terms of interfaces. Interfaces are specified in a programming language independent format: IDL.
This allows many of the implementation details such as the format of multimedia data streams, network transparency, and programming language dependencies, to be hidden from the specification for the interface. A tool, mcopidl, translates the interface definition into a specific programming language (currently only C++ is supported).
The tool generates a skeleton class with all of the boilerplate code and base functionality. You derive from that class to implement the features you want.
The IDL used by aRts is similar to that used by CORBA and DCOM.
IDL files can contain:
C-style #include directives for other IDL files.
Definitions of enumerated and struct types, as in C/C++.
Definitions of interfaces.
In IDL, interfaces are defined much like a C++ class or C struct, albeit with some restrictions. Like C++, interfaces can subclass other interfaces using inheritance. Interface definitions can include three things: streams, attributes, and methods.
Streams define multimedia data, one of the most important components of a module. Streams are defined in the following format:
[ async ] in|out [ multi ] type stream name [ , name ] ;
Streams have a defined direction in reference to the module, as indicated by the required qualifiers in or out. The type argument defines the type of data, which can be any of the types described later for attributes (not all are currently supported). Many modules use the stream type audio, which is an alias for float since that is the internal data format used for audio streams. Multiple streams of the same type can be defined in the same definition using comma separated names.
Streams are by default synchronous, which means they are continuous flows of data at a constant rate, such as PCM audio. The async qualifier specifies an asynchronous stream, which is used for non-continuous data flows. The most common example of an async stream is MIDI messages.
The multi keyword, only valid for input streams, indicates that the interface supports a variable number of inputs. This is useful for implementing devices such as mixers that can accept any number of input streams.
Attributes are data associated with an instance of an interface. They are declared like member variables in C++, and can use any of the primitive types boolean, byte, long, string, or float. You can also use user-defined struct or enum types as well as variable sized sequences using the syntax sequence<type>. Attributes can optionally be marked readonly.
As in C++, methods can be defined in interfaces. The method parameters are restricted to the same types as attributes. The keyword oneway indicates a method which returns immediately and is executed asynchronously.
Several standard module interfaces are already defined for you in aRts, such as StereoEffect, and SimpleSoundServer.
A simple example of a module taken from aRts is the constant delay module, found in the file kdemultimedia/arts/modules/artsmodules.idl. The interface definition is listed below.
interface Synth_CDELAY : SynthModule {
    attribute float time;
    in audio stream invalue;
    out audio stream outvalue;
};
This module inherits from SynthModule. That interface, defined in artsflow.idl, defines the standard methods implemented in all music synthesizer modules.
The CDELAY effect delays an audio stream by the time value specified as a floating point parameter. The interface definition has an attribute of type float to store the delay value. It defines one input audio stream and one output audio stream. No methods are required other than those it inherits.
This section covers some additional topics related to streams.
There are various requirements for how a module can do streaming. To illustrate this, consider these examples:
Scaling a signal by a factor of two.
Performing sample frequency conversion.
Decompressing a run-length encoded signal.
Reading MIDI events from /dev/midi00 and inserting them into a stream.
The first case is the simplest: upon receiving 200 samples of input the module produces 200 samples of output. It only produces output when it gets input.
The second case produces different numbers of output samples when given 200 input samples. It depends what conversion is performed, but the number is known in advance.
The third case is even worse. From the outset you cannot even guess how much data 200 input bytes will generate (probably a lot more than 200 bytes, but...).
The last case is a module which becomes active by itself, and sometimes produces data.
In aRts 0.3.4, only streams of the first type were handled, and most things worked nicely. This is probably what you need most when writing modules that process audio. The problem with the other, more complex types of streaming is that they are hard to program, and that you don't need the features most of the time. That is why we do this with two different stream types: synchronous and asynchronous.
Synchronous streams have these characteristics:
Modules must be able to calculate data of any length, given enough input.
All streams have the same sampling rate.
The calculateBlock() function will be called when enough data is available, and the module can rely on the pointers pointing to data.
There is no allocation and deallocation to be done.
Asynchronous streams, on the other hand, have this behaviour:
Modules may produce data sometimes, or with varying sampling rate, or only if they have input from some file descriptor. They are not bound by the rule ‘must be able to satisfy requests of any size’.
Asynchronous streams of a module may have entirely different sampling rates.
Outgoing streams: there are explicit functions to allocate packets, to send packets - and an optional polling mechanism that will tell you when you should create some more data.
Incoming streams: you get a call when you receive a new packet. You have to say when you are through with processing all data of that packet, which need not happen at once (you can say that any time later, and once everybody has processed a packet, it will be freed/reused).
When you declare streams, you use the keyword ‘async’ to indicate you want to make an asynchronous stream. So, for instance, assume you want to convert an asynchronous stream of bytes into a synchronous stream of samples. Your interface could look like this:
interface ByteStreamToAudio : SynthModule {
    async in byte stream indata;    // the asynchronous input sample stream

    out audio stream left,right;    // the synchronous output sample streams
};
Suppose you decided to write a module to produce sound asynchronously. Its interface could look like this:
interface SomeModule : SynthModule {
    async out byte stream outdata;
};
How do you send the data? The first method is called ‘push delivery’. With asynchronous streams you send the data as packets. That means you send individual packets with bytes as in the above example. The actual process is: allocate a packet, fill it, send it.
Here it is in terms of code. First we allocate a packet:
DataPacket<mcopbyte> *packet = outdata.allocPacket(100);
Then we fill it:
// cast so that fgets is happy that it has a (char *) pointer
char *data = (char *)packet->contents;

// as you can see, you can shrink the packet size after allocation
// if you like
if(fgets(data,100,stdin))
    packet->size = strlen(data);
else
    packet->size = 0;
Now we send it:
packet->send();
This is quite simple, but if we want to send packets exactly as fast as the receiver can process them, we need another approach, the ‘pull delivery’ method. You ask to send packets as fast as the receiver is ready to process them. You start with a certain amount of packets you send. As the receiver processes one packet after another, you start refilling them with fresh data, and send them again.
You start that by calling setPull. For example:
outdata.setPull(8, 1024);
This means that you want to send packets over outdata. You want to start sending 8 packets at once, and as the receiver processes some of them, you want to refill them.
Then, you need to implement a method which fills the packets, which could look like this:
void request_outdata(DataPacket<mcopbyte> *packet)
{
    packet->size = 1024;    // shouldn't be more than 1024
    for(int i = 0;i < 1024; i++)
        packet->contents[i] = (mcopbyte)'A';
    packet->send();
}
That's it. When you don't have any data any more, you can start sending packets with zero size, which will stop the pulling.
Note that it is essential to give the method the exact name request_streamname.
We just discussed sending data. Receiving data is much simpler. Suppose you have a simple ToLower filter, which simply converts all letters to lowercase:
interface ToLower {
    async in byte stream indata;
    async out byte stream outdata;
};
This is really simple to implement; here is the whole implementation:
class ToLower_impl : public ToLower_skel {
public:
    void process_indata(DataPacket<mcopbyte> *inpacket)
    {
        DataPacket<mcopbyte> *outpacket = outdata.allocPacket(inpacket->size);

        // convert to lowercase letters
        char *instring  = (char *)inpacket->contents;
        char *outstring = (char *)outpacket->contents;

        for(int i=0;i<inpacket->size;i++)
            outstring[i] = tolower(instring[i]);

        inpacket->processed();
        outpacket->send();
    }
};

REGISTER_IMPLEMENTATION(ToLower_impl);
Again, it is essential to name the method process_streamname.
As you see, for each arriving packet you get a call for a function (the process_indata call in our case). You need to call the processed() method of a packet to indicate you have processed it.
Here is an implementation tip: if processing takes longer (i.e. if you need to wait for soundcard output or something like that), don't call processed immediately, but store the whole data packet and call processed only once you have really processed that packet. That way, senders have a chance to know how long it really takes to do your work.
As synchronization isn't so nice with asynchronous streams, you should use synchronous streams wherever possible, and asynchronous streams only when necessary.
Suppose you have two objects, for example an AudioProducer and an AudioConsumer. The AudioProducer has an output stream and the AudioConsumer has an input one. Each time you want to connect them, you will use those two streams. The first use of defaulting is to enable you to make the connection without specifying the ports in that case.
Now suppose the two objects above can handle stereo, and each has a ‘left’ and a ‘right’ port. You'd still like to connect them as easily as before. But how can the connecting system know which output port to connect to which input port? It has no way to correctly map the streams. Defaulting is then used to specify several streams, with an order. Thus, when you connect an object with two default output streams to another one with two default input streams, you don't have to specify the ports, and the mapping will be done correctly.
Of course, this is not limited to stereo. Any number of streams can be made default if needed, and the connect function will check that the number of defaults for the two objects match (in the required direction) if you don't specify the ports to use.
The syntax is as follows: in the IDL, you can use the default keyword in the stream declaration, or on a single line. For example:
interface TwoToOneMixer {
    default in audio stream input1, input2;
    out audio stream output;
};
In this example, the object will expect its two input ports to be connected by default. The order is the one specified on the default line, so an object like this one:
interface DualNoiseGenerator {
    out audio stream bzzt, couic;
    default couic, bzzt;
};
will make connections from ‘couic’ to ‘input1’, and ‘bzzt’ to ‘input2’ automatically. Note that since there is only one output for the mixer, it will be made default in this case (see below). The syntax used in the noise generator is useful for declaring a different order than the declaration order, or for selecting only a few ports as default. The directions of the ports on this line will be looked up by mcopidl, so don't specify them. You can even mix input and output ports in such a line; only the order is important.
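In C++, the effect of defaulting is that the connect call can omit the port names entirely. The snippet below is only a sketch based on the rules just described; it assumes a connect(sender, receiver) overload that takes no port names, as implied by the text above.

#include <connect.h>
using namespace Arts;
[...]
DualNoiseGenerator gen;
TwoToOneMixer mixer;

// no port names given: the default lists are used, so
// gen.couic -> mixer.input1 and gen.bzzt -> mixer.input2
connect(gen, mixer);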
There are some rules that are followed when using inheritance:
If a default list is specified in the IDL, then use it. Parent ports can be put in this list as well, whether they were default in the parent or not.
Otherwise, inherit parent's defaults. Ordering is parent1 default1, parent1 default2..., parent2 default1... If there is a common ancestor using 2 parent branches, a ‘virtual public’-like merging is done at that default's first occurrence in the list.
If there is still no default and a single stream in a direction, use it as default for that direction.
Attribute change notifications are a way to know when an attribute changed. They are a bit comparable with Qt™'s or Gtk's signals and slots. For instance, if you have a GUI element, a slider, which configures a number between 0 and 100, you will usually have an object that does something with that number (for instance, it might be controlling the volume of some audio signal). So you would like that whenever the slider is moved, the object which scales the volume gets notified. A connection between a sender and a receiver.
MCOP deals with that by being able to provide notifications when attributes change. Whatever is declared as “attribute” in the IDL can emit such change notifications, and should do so whenever it is modified. Whatever is declared as “attribute” can also receive such change notifications. So for instance if you had two IDL interfaces, like these:
interface Slider {
    attribute long min,max;
    attribute long position;
};

interface VolumeControl : Arts::StereoEffect {
    attribute long volume;  // 0..100
};
You can connect them using change notifications. It works using the normal flowsystem connect operation. In this case, the C++ code to connect two objects would look like this:
#include <connect.h>
using namespace Arts;
[...]
connect(slider,"position_changed",volumeControl,"volume");
As you see, each attribute offers two different streams, one for sending the change notifications, called attributename_changed, and one for receiving change notifications, called attributename.
It is important to know that change notifications and asynchronous streams are compatible. They are also network transparent. So you can connect a change notification of a float attribute of a GUI widget to an asynchronous stream of a synthesis module running on another computer. This of course also implies that change notifications are not synchronous; after you have sent the change notification, it may take some time until it really gets received.
When implementing objects that have attributes, you need to send change notifications wherever an attribute changes. The code for doing this looks like this:
void KPoti_impl::value(float newValue)
{
    if(newValue != _value)
    {
        _value = newValue;
        value_changed(newValue);    // <- send change notification
    }
}
It is strongly recommended to use code like this for all objects you implement, so that change notifications can be used by other people. You should however avoid sending notifications too often, so if you are doing signal processing, it is probably best to keep track of when you sent your last notification, so that you don't send one with every sample you process.
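A minimal sketch of that throttling idea follows. The method name, the member samplesSinceNotify and the constant NOTIFY_INTERVAL are hypothetical and only introduced for illustration; only value_changed() matches the example above.

// sketch: rate-limit change notifications while processing samples;
// samplesSinceNotify and NOTIFY_INTERVAL are hypothetical members
void KPoti_impl::setValueFromSignal(float newValue)
{
    _value = newValue;

    if(++samplesSinceNotify >= NOTIFY_INTERVAL)
    {
        samplesSinceNotify = 0;
        value_changed(_value);  // at most one notification per interval
    }
}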
It will be especially useful to use change notifications in conjunction with scopes (things that visualize audio data for instance), gui elements, control widgets, and monitoring. Code using this is in kdelibs/arts/tests, and in the experimental artsgui implementation, which you can find under kdemultimedia/arts/gui.
The .mcoprc file (in each user's home directory) can be used to configure MCOP in some ways. Currently, the following is possible:
The name of an interface to be used for global communication. Global communication is used to find other objects and obtain the secret cookie. Multiple MCOP clients/servers that should be able to talk to each other need to have a GlobalComm object which is able to share information between them. Currently, the possible values are “Arts::TmpGlobalComm” to communicate via /tmp/mcop-username directory (which will only work on the local computer) and “Arts::X11GlobalComm” to communicate via the root window properties on the X11 server.
Specifies where to look for trader information. You can list more than one directory here and separate them with commas, as in the example below.
Specifies from which directories extensions (in the form of shared libraries) are loaded. Multiple values can be specified comma separated.
An example which uses all of the above is:
# $HOME/.mcoprc file
GlobalComm=Arts::X11GlobalComm

# if you are a developer, it might be handy to add a directory in your home
# to the trader/extension path to be able to add components without
# installing them
TraderPath="/opt/kde2/lib/mcop","/home/joe/mcopdevel/mcop"
ExtensionPath="/opt/kde2/lib","/home/joe/mcopdevel/lib"
If you have used CORBA before, you will see that MCOP is much the same thing. In fact, aRts prior to version 0.4 used CORBA.
The basic idea of CORBA is the same: you implement objects (components). By using the MCOP features, your objects are not only available as normal classes from the same process (via standard C++ techniques) - they also are available to remote servers transparently. For this to work, the first thing you need to do is to specify the interface of your objects in an IDL file - just like CORBA IDL. There are only a few differences.
In MCOP there are no “in” and “out” parameters on method invocations. Parameters are always incoming, the return code is always outgoing, which means that the interface:
// CORBA idl
interface Account {
    void deposit( in long amount );
    void withdraw( in long amount );
    long balance();
};
is written as
// MCOP idl
interface Account {
    void deposit( long amount );
    void withdraw( long amount );
    long balance();
};
in MCOP.
There is no exception support. MCOP doesn't have exceptions - it uses something else for error handling.
There are no union types and no typedefs. I don't know if that is a real weakness; it is hardly something one would desperately need to survive.
There is no support for passing interfaces or object references.
You declare sequences as sequence<type> in MCOP. There is no need for a typedef. For example, instead of:
// CORBA idl
struct Line {
    long x1,y1,x2,y2;
};
typedef sequence<Line> LineSeq;
interface Plotter {
    void draw(in LineSeq lines);
};
you would write
// MCOP idl
struct Line {
    long x1,y1,x2,y2;
};
interface Plotter {
    void draw(sequence<Line> lines);
};
You can declare streams, which will then be evaluated by the aRts framework. Streams are declared in a similar manner to attributes. For example:
// MCOP idl
interface Synth_ADD : SynthModule {
    in audio stream signal1,signal2;
    out audio stream outvalue;
};
This says that your object will accept two incoming synchronous audio streams called signal1 and signal2. Synchronous means that these are streams that deliver x samples per second (or other unit of time), so that the scheduler will guarantee to always provide you a balanced amount of input data (e.g. 200 samples of signal1 are there and 200 samples of signal2 are there). You guarantee that if your object is called with those 200 samples of signal1 + signal2, it is able to produce exactly 200 samples for outvalue.
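As an illustration of this contract, a minimal sketch of the implementation side could look as follows. It follows the skeleton/REGISTER_IMPLEMENTATION conventions shown elsewhere in this chapter; the Arts::StdSynthModule base and the exact calculateBlock() signature are assumptions here and may differ from the real headers.

// sketch only: a synchronous module that adds its two input streams;
// signal1, signal2 and outvalue are float arrays provided by the scheduler
class Synth_ADD_impl : public Synth_ADD_skel, public Arts::StdSynthModule
{
public:
    void calculateBlock(unsigned long samples)
    {
        // the scheduler guarantees 'samples' values on both inputs and
        // expects exactly 'samples' values on the output
        for(unsigned long i = 0; i < samples; i++)
            outvalue[i] = signal1[i] + signal2[i];
    }
};

REGISTER_IMPLEMENTATION(Synth_ADD_impl);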
This differs from CORBA mostly in the following ways:
Strings use the C++ STL string class. When stored in sequences, they are stored “plain”, that means they are considered to be a primitive type. Thus, they need copying.
longs are plain longs (expected to be 32 bit).
Sequences use the C++ STL vector class.
Structures are all derived from the MCOP class Type, and generated by the MCOP IDL compiler. When stored in sequences, they are not stored “plain”, but as pointers, as otherwise too much copying would occur.
After passing them through the IDL compiler, you need to derive from the _skel class. For instance, suppose you have defined your interface like this:
// MCOP idl: hello.idl
interface Hello {
    void hello(string s);
    string concat(string s1, string s2);
    long sum2(long a, long b);
};
You pass that through the IDL compiler by calling mcopidl hello.idl, which will in turn generate hello.cc and hello.h. To implement it, you need to define a C++-class that inherits the skeleton:
// C++ header file - include hello.h somewhere
class Hello_impl : virtual public Hello_skel {
public:
    void hello(const string& s);
    string concat(const string& s1, const string& s2);
    long sum2(long a, long b);
};
Finally, you need to implement the methods as normal C++:
// C++ implementation file

// as you see, strings are passed as const string references
void Hello_impl::hello(const string& s)
{
    printf("Hello '%s'!\n",s.c_str());
}

// when they are a return code they are passed as "normal" strings
string Hello_impl::concat(const string& s1, const string& s2)
{
    return s1+s2;
}

long Hello_impl::sum2(long a, long b)
{
    return a+b;
}
Once you do that, you have an object which can communicate using MCOP. Just create one (using the normal C++ facilities to create an object):
Hello_impl server;
And as soon as you give somebody the reference
string reference = server._toString();
printf("%s\n",reference.c_str());
and go to the MCOP idle loop
Dispatcher::the()->run();
People can access the thing using
// this code can run anywhere - not necessarily in the same process
// (it may also run on a different computer/architecture)

Hello *h = Hello::_fromString([the object reference printed above]);
and invoke methods:
if(h)
    h->hello("test");
else
    printf("Access failed?\n");
Since MCOP servers will listen on a TCP port, potentially everybody (if you are on the Internet) may try to connect to MCOP services. Thus, it is important to authenticate clients. MCOP uses the md5-auth protocol.
The md5-auth protocol does the following to ensure that only selected (trusted) clients may connect to a server:
It assumes you can give every client a secret cookie.
Every time a client connects, it verifies that this client knows that secret cookie, without actually transferring it (not even in a form that somebody listening to the network traffic could find it out).
To give each client the secret cookie, MCOP will (normally) put it in the mcop directory (under /tmp/mcop-USER/secret-cookie). Of course, you can copy it to other computers. However, if you do so, use a secure transfer mechanism, such as scp (from ssh).
The authentication of clients uses the following steps (a sketch of the final verification follows the list):
Procedure 6.1.
[SERVER] generate a new (random) cookie R
[SERVER] send it to the client
[CLIENT] read the "secret cookie" S from a file
[CLIENT] mangle the cookies R and S to a mangled cookie M using the MD5 algorithm
[CLIENT] send M to the server
[SERVER] verify that mangling R and S gives just the same thing as the cookie M received from the client. If yes, authentication is successful.
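A sketch of that final verification step on the server side follows. The helper md5_hex() is hypothetical, and the exact way MCOP combines R and S may differ in detail; this only illustrates that the secret S itself never travels over the wire.

#include <string>

// hypothetical helper: returns the MD5 digest of its input as a hex string
std::string md5_hex(const std::string &data);

// R: random cookie the server sent, S: secret cookie read from the file,
// M: mangled cookie received from the client
bool verifyClient(const std::string &R, const std::string &S, const std::string &M)
{
    // the client is expected to have computed the same mangling;
    // only M is transferred, never the secret S
    return md5_hex(R + S) == M;
}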
This algorithm should be secure, given that
The secret cookies and random cookies are ‘random enough’ and
The MD5 hashing algorithm doesn't allow one to find out the ‘original text’, that is the secret cookie S and the random cookie R (which is known anyway), from the mangled cookie M.
The MCOP protocol will start every new connection with an authentication process. Basically, it looks like this:
Procedure 6.2.
Server sends a ServerHello message, which describes the known authentication protocols.
Client sends a ClientHello message, which includes authentication info.
Server sends an AuthAccept message.
To see that the security actually works, we should look at how messages are processed on unauthenticated connections:
Before the authentication succeeds, the server will not receive other messages from the connection. Instead, if the server for instance expects a ‘ClientHello’ message, and gets an mcopInvocation message, it will drop the connection.
If the client doesn't send a valid MCOP message at all (no MCOP magic in the message header) in the authentication phase, but something else, the connection is dropped.
If the client tries to send a very large message (> 4096 bytes) in the authentication phase, the message size is truncated to 0 bytes, which will cause it not to be accepted for authentication. This is to prevent unauthenticated clients from sending e.g. 100 megabytes of message, which would be received and could cause the server to run out of memory.
If the client sends a corrupt ClientHello message (one, for which demarshalling fails), the connection is dropped.
If the client sends nothing at all, then a timeout should occur (to be implemented).
MCOP has conceptual similarities to CORBA, but it is intended to extend it in all ways that are required for real time multimedia operations.
It provides a multimedia object model, which can be used for both communication between components in one address space (one process), and between components that are in different threads, processes or on different hosts.
All in all, it is designed for extremely high performance (so everything shall be optimized to be blazingly fast), suitable for very communicative multimedia applications. For instance, streaming videos around is one of the applications of MCOP, where most CORBA implementations would be brought to their knees.
The interface definitions can handle the following natively:
Continuous streams of data (such as audio data).
Event streams of data (such as MIDI events).
Real reference counting.
and the most important CORBA gimmicks, like
Synchronous method invocations.
Asynchronous method invocations.
Constructing user defined data types.
Multiple inheritance.
Passing object references.
Design goals/ideas:
Marshalling should be easy to implement.
Demarshalling requires the receiver to know what type he wants to demarshall.
The receiver is expected to use every information - so skipping is only in the protocol to a degree that:
If you know you are going to receive a block of bytes, you don't need to look at each byte for an end marker.
If you know you are going to receive a string, you don't need to read it until the zero byte to find out its length while demarshalling, however,
If you know you are going to receive a sequence of strings, you need to look at the length of each of them to find the end of the sequence, as strings have variable length. But if you use the strings for something useful, you'll need to do that anyway, so this is no loss.
As little overhead as possible.
Marshalling of the different types is shown in the table below:
Type | Marshalling Process | Result |
---|---|---|
void | void types are marshalled by omitting them, so nothing is written to the stream for them. | |
long | is marshalled as four bytes, the most significant byte first, so the number 10001025 (which is 0x989a81) would be marshalled as: | 0x00 0x98 0x9a 0x81 |
enums | are marshalled like longs | |
byte | is marshalled as a single byte, so the byte 0x42 would be marshalled as: | 0x42 |
string | is marshalled as a long, containing the length of the following string, and then the sequence of characters; strings must end with one zero byte (which is included in the length counting). Important: include the trailing 0 byte in the length counting! ‘hello’ would be marshalled as: | 0x00 0x00 0x00 0x06 0x68 0x65 0x6c 0x6c 0x6f 0x00 |
boolean | is marshalled as a byte, containing 0 if false or 1 if true, so the boolean value true is marshalled as: | 0x01 |
float | is marshalled after the four byte IEEE754 representation - detailed docs how IEEE works are here: http://twister.ou.edu/workshop.docs/common-tools/numerical_comp_guide/ncg_math.doc.html and here: http://java.sun.com/docs/books/vmspec/2nd-edition/html/Overview.doc.html. So, the value 2.15 would be marshalled as: | 0x9a 0x99 0x09 0x40 |
struct | A structure is marshalled by marshalling its contents. There are no additional prefixes or suffixes required, so the structure struct test { string name; // which is "hello" long value; // which is 10001025 (0x989a81) }; would be marshalled as: | 0x00 0x00 0x00 0x06 0x68 0x65 0x6c 0x6c 0x6f 0x00 0x00 0x98 0x9a 0x81 |
sequence | a sequence is marshalled by listing the number of elements that follow, and then marshalling the elements one by one. So a sequence of 3 longs a, with a[0] = 0x12345678, a[1] = 0x01 and a[2] = 0x42 would be marshalled as: | 0x00 0x00 0x00 0x03 0x12 0x34 0x56 0x78 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x42 |
If you need to refer to a type, all primitive types are referred to by the names given above. Structures and enums get their own names (like Header). Sequences are referred to as *normal type, so that a sequence of longs is “*long” and a sequence of Header structs is “*Header”.
The MCOP message header format is defined by this structure:
struct Header {
    long magic;            // the value 0x4d434f50, which is marshalled as MCOP
    long messageLength;
    long messageType;
};
The possible messageTypes are currently
mcopServerHello = 1
mcopClientHello = 2
mcopAuthAccept = 3
mcopInvocation = 4
mcopReturn = 5
mcopOnewayInvocation = 6
A few notes about the MCOP messaging:
Every message starts with a Header.
Some message types should be dropped by the server, as long as the authentication is not complete.
After receiving the header, the protocol (connection) handling can receive the message completely, without looking at the contents.
The messageLength in the header is of course in some cases redundant, which means that this approach is not minimal regarding the number of bytes.
However, it leads to an easy (and fast) implementation of non-blocking message processing. With the help of the header, the messages can be received by protocol handling classes in the background (non-blocking); if there are many connections to the server, all of them can be served in parallel. You don't need to look at the message content to receive the message (and to determine when you are done), just at the header, so the code for that is pretty easy (see the sketch after this list).
Once a message is there, it can be demarshalled and processed in one single pass, without caring about cases where not all data may have been received (because the messageLength guarantees that everything is there).
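To illustrate why this is easy, here is a small sketch of parsing the 12 byte header from a receive buffer. The names and helper functions are made up for this example; longs are read most significant byte first, as defined in the marshalling table above.

#include <cstdint>
#include <cstddef>

struct ParsedHeader {
    uint32_t magic, messageLength, messageType;
};

// read a long exactly as the marshalling table defines it:
// four bytes, most significant byte first
static uint32_t readLong(const unsigned char *p)
{
    return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16)
         | (uint32_t(p[2]) << 8)  |  uint32_t(p[3]);
}

// returns true once a complete, valid header is in the buffer; after that,
// the caller knows from messageLength how many bytes to wait for before
// demarshalling the body in one pass
bool parseHeader(const unsigned char *buf, size_t len, ParsedHeader &h)
{
    if(len < 12)
        return false;                   // header not complete yet

    h.magic         = readLong(buf);
    h.messageLength = readLong(buf + 4);
    h.messageType   = readLong(buf + 8);

    return h.magic == 0x4d434f50;       // the four bytes "MCOP"
}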
To call a remote method, you need to send the following structure in the body of an MCOP message with messageType = 4 (mcopInvocation):
struct Invocation {
    long objectID;
    long methodID;
    long requestID;
};
After that, you send the parameters as a structure, e.g. if you invoke the method string concat(string s1, string s2), you send a structure like
struct InvocationBody {
    string s1;
    string s2;
};
If the method was declared to be oneway - that means asynchronous without return code - then that was it. Otherwise, you'll receive as answer a message with messageType = 5 (mcopReturn):
struct ReturnCode {
    long requestID;
    <resulttype> result;
};
where <resulttype> is the type of the result. As void types are omitted in marshalling, you write only the requestID if you return from a void method.
So our string concat(string s1, string s2) would lead to a returncode like
struct ReturnCode {
    long requestID;
    string result;
};
To do invocations, you need to know the methods an object supports. To do so, the methodIDs 0, 1, 2 and 3 are hardwired to certain functionalities. That is:
long _lookupMethod(MethodDef methodDef);        // methodID always 0
string _interfaceName();                        // methodID always 1
InterfaceDef _queryInterface(string name);      // methodID always 2
TypeDef _queryType(string name);                // methodID always 3
To read that, of course, you also need:

struct MethodDef {
    string methodName;
    string type;
    long flags;                     // set to 0 for now (will be required for streaming)
    sequence<ParamDef> signature;
};

struct ParamDef {
    string name;
    long typeCode;
};
The signature field contains ParamDef entries which specify the names and types of the parameters. The type of the return code is specified in the MethodDef's type field.
Strictly speaking, only the methods _lookupMethod() and _interfaceName() differ from object to object, while _queryInterface() and _queryType() are always the same.
What are those methodIDs? If you do an MCOP invocation, you are expected to pass a number for the method you are calling, because numbers can be processed much faster than strings when executing an MCOP request.
So how do you get those numbers? If you know the signature of the method, that is, a MethodDef that describes the method (which contains name, type, parameter names, parameter types and such), you can pass that to the _lookupMethod of the object on which you wish to call a method. As _lookupMethod is hardwired to methodID 0, you should encounter no problems doing so.
On the other hand, if you don't know the method signature, you can find out which methods are supported by using _interfaceName, _queryInterface and _queryType.
Since KDE dropped CORBA completely and is using DCOP everywhere instead, the question naturally arises why aRts isn't doing so as well. After all, DCOP support is in KApplication, it is well-maintained, it is supposed to integrate nicely with libICE, and whatever else.
Since there will be (potentially) a lot of people asking whether having MCOP besides DCOP is really necessary, here is the answer. Please don't get me wrong, I am not trying to say ‘DCOP is bad’. I am just trying to say ‘DCOP isn't the right solution for aRts’ (while it is a nice solution for other things).
First, you need to understand what exactly DCOP was written for. Created in two days during the KDE-TWO meeting, it was intended to be as simple as possible, a really ‘lightweight’ communication protocol. In particular, the implementation left out everything that could involve complexity, for instance a full-blown concept of how data types shall be marshalled.
Even though DCOP doesn't care about certain things (like: how do I send a string in a network-transparent manner?), these still need to be done. So everything that DCOP doesn't do is left to Qt™ in the KDE apps that use DCOP today. This is mostly type management (using the Qt™ serialization operator).
So DCOP is a minimal protocol which perfectly enables KDE applications to send simple messages like ‘open a window pointing to http://www.kde.org’ or ‘your configuration data has changed’. However, inside aRts the focus lies on other things.
The idea is that the little plugins in aRts will communicate using data structures such as ‘midi events’, ‘song position pointers’ and ‘flow graphs’.
These are complex data types, which must be sent between different objects, and be passed as streams or parameters. MCOP supplies a type concept, to define complex data types out of simpler ones (similar to structs or arrays in C++). DCOP doesn't care about types at all, so this problem would be left to the programmer: writing C++ classes for the types and making sure they can serialize properly (for instance, by supporting the Qt™ streaming operator).
But that way, they would be inaccessible to everything but direct C++ coding. Specifically, you could not design a scripting language that knows all the types plugins may ever expose, as they are not self-describing.
Much the same argument is valid for interfaces as well. DCOP objects don't expose their relationships, inheritance hierarchies, etc. - if you were to write an object browser which shows you ‘what attributes does this object have’, you'd fail.
While Matthias told me that you have a special function ‘functions’ on each object that tells you about the methods that an object supports, this leaves out things like attributes (properties), streams and inheritance relations.
This seriously breaks applications like aRts-builder. But remember: DCOP was not so much intended to be an object model (as Qt™ already has one with moc and similar), nor to be something like CORBA, but to supply inter-application communication.
Another reason why MCOP exists is that it should work well with streams between objects. aRts makes heavy use of small plugins which interconnect themselves with streams. The CORBA version of aRts had to introduce a very annoying split between ‘the SynthModule objects’, which were the internal work modules that did the streaming, and ‘the CORBA interface’, which was something external.
Much code was concerned with making the interaction between ‘the SynthModule objects’ and ‘the CORBA interface’ look natural, but it never did, because CORBA knew nothing at all about streams. MCOP does. Look at the code (something like simplesoundserver_impl.cc): way better! Streams can be declared in the interface of modules and implemented in a natural-looking way.
One can't deny it. One of the reasons why I wrote MCOP was speed. Here are some arguments why MCOP will definitely be faster than DCOP (even without giving figures).
An invocation in MCOP has a six-‘long’ header. That is:
magic ‘MCOP’
message type (invocation)
size of the request in bytes
request ID
target object ID
target method ID
After that, the parameters follow. Note that the demarshalling of this is extremely fast. You can use table lookups to find the object and the method demarshalling function, which means that complexity is O(1) [ it will take the same amount of time, no matter how many objects are alive, or how many functions there are ]. A sketch of this lookup scheme follows below.
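The following hedged sketch shows the kind of table lookup meant here; the names (Object_skel, DispatchFunction, objectTable) are illustrative, not necessarily those of the real dispatcher.

// Illustrative sketch of O(1) dispatch: objectID indexes a table of
// objects, methodID indexes a table of per-method demarshalling
// functions, so no string parsing is needed. Names are invented.
#include <vector>

class Buffer;        // marshalled request / result data
class Object_skel;   // server-side object base class (illustrative)

typedef void (*DispatchFunction)(Object_skel *object,
                                 Buffer &request, Buffer &result);

struct ObjectEntry {
    Object_skel *object;
    std::vector<DispatchFunction> methods;   // indexed by methodID
};

std::vector<ObjectEntry> objectTable;        // indexed by objectID

void dispatchInvocation(long objectID, long methodID,
                        Buffer &request, Buffer &result)
{
    ObjectEntry &entry = objectTable[objectID];              // O(1)
    entry.methods[methodID](entry.object, request, result);  // O(1)
}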
Comparing this to DCOP, you'll see that a request contains at least:
a string for the target object - something like ‘myCalculator’
a string like ‘addNumber(int,int)’ to specify the method
several pieces of protocol information added by libICE, and other DCOP specifics I don't know about
These are much more painful to demarshal, as you'll need to parse the strings, search for the function, etc.
In DCOP, all requests are running through a server (DCOPServer). That means, the process of a synchronous invocation looks like this:
Client process sends invocation.
DCOPserver (man-in-the-middle) receives invocation and looks where it needs to go, and sends it to the ‘real’ server.
Server process receives invocation, performs request and sends result.
DCOPserver (man-in-the-middle) receives result and ... sends it to the client.
Client decodes reply.
In MCOP, the same invocation looks like this:
Client process sends invocation.
Server process receives invocation, performs request and sends result.
Client decodes reply.
Assuming both were implemented correctly, MCOP's peer-to-peer strategy should be faster than DCOP's man-in-the-middle strategy by a factor of two. Note, however, that there were of course reasons to choose the DCOP strategy, namely: if you have 20 applications running, and each app is talking to each other app, you need 20 connections in DCOP, and 200 with MCOP. However, in the multimedia case this is not supposed to be the usual setting.
I tried to compare MCOP and DCOP, doing an invocation like adding two numbers. I modified testdcop to achieve this. However, the test may not have been precise on the DCOP side. I invoked the method in the same process that did the call for DCOP, and I didn't know how to get rid of one debugging message, so I used output redirection.
The test only used one object and one function; expect DCOP's results to decrease with more objects and functions, while MCOP's results should stay the same. Also, the dcopserver process wasn't connected to other applications; it might be that if many applications are connected, the routing performance decreases.
The result I got was that while DCOP got slightly more than 2000 invocations per second, MCOP got slightly more than 8000 invocations per second. That makes a factor of 4. I know that MCOP isn't tuned to the maximum possible yet. (For comparison: CORBA, as implemented by mico, does something between 1000 and 1500 invocations per second.)
If you want ‘harder’ data, consider writing some small benchmark app for DCOP and send it to me.
CORBA had the nice feature that you could use objects you implemented once as a ‘separate server process’, or as a ‘library’. You could use the same code to do so, and CORBA would transparently decide what to do. With DCOP, that is not really intended, and as far as I know not really possible.
MCOP on the other hand should support that from the beginning. So you can run an effect inside artsd. But if you are a wave editor, you can choose to run the same effect inside your process space as well.
While DCOP is mostly a way to communicate between apps, MCOP is also a way to communicate inside apps. Especially for multimedia streaming, this is important (as you can run multiple MCOP objects in parallel, to solve a multimedia task in your application).
Although MCOP does not currently do so, the possibilities are open to implement quality of service features. Something like ‘that MIDI event is really really important, compared to this invocation’. Or something like ‘needs to be there in time’.
On the other hand, stream transfer can be integrated in the MCOP protocol nicely, and combined with QoS stuff. Given that the protocol may be changed, MCOP stream transfer should not really get slower than conventional TCP streaming, but: it will be easier and more consistent to use.
There is no need to base a middleware for multimedia on Qt™. Deciding to do so, and using all that nice Qt™ streaming and stuff, will easily lead to the middleware becoming a Qt™-only (or rather KDE-only) thing. I mean: as soon as I see GNOME using DCOP too, or something like that, I am certainly proven wrong.
While I do know that DCOP basically doesn't know about the data types it sends, so that you could use DCOP without using Qt™, look at how it is used in daily KDE usage: people send types like QString, QRect, QPixmap, QCString, ... around. These use Qt™ serialization. So if somebody chose to support DCOP in a GNOME program, he would either have to claim to use the QString, ... types (although he doesn't do so) and emulate the way Qt™ does the streaming, or he would send other string, pixmap and rect types around, and thus not be interoperable.
Well, whatever. aRts was always intended to work with or without KDE, with or without Qt™, with or without X11, and maybe even with or without Linux® (and I even have no problem with people who port it to popular non-free operating systems).
It is my position that non-GUI components should be written without GUI dependencies, to make it possible to share them among a wider range of developers (and users).
I see that using two IPC protocols may cause inconveniences, even more so if they are both non-standard. However, for the reasons given above, switching to DCOP is not an option. If there is significant interest in finding a way to unite the two, okay, we can try. We could even try to make MCOP speak IIOP, then we'd have a CORBA ORB ;).
I talked with Matthias Ettrich a bit about the future of the two protocols, and we found lots of ways things could go from here. For instance, MCOP could handle the message communication in DCOP, thus bringing the protocols a bit closer together.
So some possible solutions would be:
Write an MCOP - DCOP gateway (which should be possible, and would make interoperation possible) - note: there is an experimental prototype, if you would like to work on that.
Integrate everything DCOP users expect into MCOP, and try to do only MCOP - one could add a ‘man-in-the-middle’ option to MCOP, too ;)
Base DCOP on MCOP instead of libICE, and slowly start integrating things closer together.
However, it may not be the worst possibility to use each protocol for everything it was intended for (there are some big differences in the design goals), and not try to merge them into one.
aRts is not only a piece of software, it also provides a variety of APIs for a variety of purposes. In this section, I will try to describe the "big picture", a brief glance at what those APIs are supposed to do and how they interact.
There is one important distinction to make: most of the APIs are language- and location-independent because they are specified in IDL and processed by mcopidl. That is, you can basically use the services they offer from any language, implement them in any language, and you will not have to care whether you are talking to local or remote objects. Here is a list of these first:
Basic definitions that form the core of the MCOP functionality, such as the protocol itself, definitions of the object, the trader, the flow system and so on.
These contain the flow system you will use for connecting audio streams, the definition of Arts::SynthModule, which is the base for any interface that has streams, and finally a few useful audio objects.
Here, an object that can play a medium, Arts::PlayObject, gets defined. Media players such as the KDE media player noatun will be able to play any medium for which a PlayObject can be found. So it makes sense to implement PlayObjects for various formats (such as mp3, mpg video, midi, wav, ...) on that base, and there are a lot already.
Here, an interface for the system-wide sound server artsd is defined. The interface is called Arts::SoundServer, which implements functionality like accepting streams from the network, playing samples, creating other custom aRts objects and so on. Network transparency is implied due to the use of MCOP (as for everything else here).
This module defines basic flow graph functionality, that is, combining simpler objects into more complex ones by defining a graph of them. It defines the basic interfaces Arts::StructureDesc, Arts::ModuleDesc and Arts::PortDesc, which contain descriptions of a structure, a module, and a port. There is also a way to get a "living network of objects" out of these connection and value descriptions, using a factory.
This module defines basic MIDI functionality, like objects that produce MIDI events, what a MIDI event is, an Arts::MidiManager to connect the producers and consumers of MIDI events, and so on. As always, network transparency is implied.
Here are various additional filters, oscillators, effects, delays and so on, everything required for really useful signal processing, and to build complex instruments and effects out of these basic building blocks.
This cares about visual objects. It defines the basic type Arts::Widget from which all GUI modules derive. This will provide toolkit independence, visual GUI editing, and serializable GUIs. Also, as the GUI elements have normal attributes, their values can be connected straightforwardly to signal-processing modules (e.g. the value of a slider to the cutoff of a filter). As always: network transparent.
Where possible, aRts itself is implemented using IDL. On the other hand, there are some language specific APIs, using either plain C++ or plain C. It is usually wise to use IDL interfaces where possible, and the other APIs where necessary. Here is a list of language specific APIs:
These are convenience KDE APIs for the simple and common case where you just want to play a sample. The APIs are plain C++, Qt/KDE optimized, and as easy as it can get.
Plain C interface for the sound server. Very useful for porting legacy applications.
This is where all the magic for MCOP happens. The library contains the basic things you need for writing a simple MCOP application: the dispatcher, timers, I/O management, but also the internals that make the MCOP protocol itself work.
Besides the implementation of artsflow.idl, some useful utilities like sampling rate conversion.
Integration of MCOP into the Qt event loop, when you write Qt applications using MCOP.
The aRts C API was designed to make it easy to write and port plain C applications to the aRts sound server. It provides streaming functionality (sending sample streams to artsd), either blocking or non-blocking. For most applications you simply remove the few system calls that deal with your audio device and replace them with the appropriate aRts calls.
I did two ports as a proof of concept: mpg123 and quake. You can get the patches from here. Feel free to submit your own patches to the maintainer of aRts or of multimedia software packages so that they can integrate aRts support into their code.
Sending audio to the sound server with the API is very simple:
Procedure 7.1.
include the header file using #include <artsc.h>
initialize the API with arts_init()
create a stream with arts_play_stream()
configure specific parameters with arts_stream_set()
write sampling data to the stream with arts_write()
close the stream with arts_close_stream()
free the API with arts_free()
Here is a small example program that illustrates this:
#include <stdio.h>
#include <artsc.h>

int main()
{
    arts_stream_t stream;
    char buffer[8192];
    int bytes;
    int errorcode;

    errorcode = arts_init();
    if (errorcode < 0) {
        fprintf(stderr, "arts_init error: %s\n", arts_error_text(errorcode));
        return 1;
    }

    stream = arts_play_stream(44100, 16, 2, "artsctest");

    while((bytes = fread(buffer, 1, 8192, stdin)) > 0) {
        errorcode = arts_write(stream, buffer, bytes);
        if(errorcode < 0) {
            fprintf(stderr, "arts_write error: %s\n", arts_error_text(errorcode));
            return 1;
        }
    }

    arts_close_stream(stream);
    arts_free();

    return 0;
}
To easily compile and link programs using the aRts C API, the artsc-config utility is provided which knows which libraries you need to link and where the includes are. It is called using
artsc-config --libs
to find out the libraries and
artsc-config --cflags
to find out additional C compiler flags. The example above could have been compiled using the command line:
cc -o artsctest artsctest.c `artsc-config --cflags` `artsc-config --libs`
This chapter describes all of the standard aRts modules. One of the most powerful features of aRts, modules can be connected together into structures to implement new functions such as effects and instruments.
Modules are broken down into two categories. Synthesis modules are used for implementing the “plumbing” that manipulates multimedia data streams to implement new effects, instruments, mixers, and applications. Visual modules allow you to provide a graphical user interface to control the sound structures that are built up with the synthesis modules.
This multiplies a signal by a factor. You can use this to scale signals down (0 < factor < 1), up (factor > 1) or to invert signals (factor < 0). Note that the factor may itself be a signal and doesn't have to be constant (e.g. an envelope or a real signal).
This adds an arbitrary number of signals. If you need to sum up the waveforms produced by four different oscillators, you can for instance connect all their outputs to one Synth_MULTI_ADD module. This is more efficient than using three Synth_ADD modules.
This crossfades two signals. If the percentage input is -1, only the left signal is heard; if it is 1, only the right signal is heard. When it is 0, both signals are heard with the same volume.
This allows you to ensure that your signal stays in a well defined range. If you had two signals that were between -1 and 1 before crossfading, they will be in the same range after crossfading.
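The exact formula used is not spelled out here; a crossfade consistent with this description (percentage -1 gives only the left signal, +1 only the right, 0 an equal mix that stays in range) could look like this hedged C++ sketch, which is an assumption rather than the module's actual code:

// Hedged sketch of a crossfade matching the description above; the
// actual implementation may differ in detail.
float crossfade(float left, float right, float percentage /* -1 .. 1 */)
{
    float leftGain  = (1.0f - percentage) * 0.5f;   // 1 at -1, 0 at +1
    float rightGain = (1.0f + percentage) * 0.5f;   // 0 at -1, 1 at +1
    return left * leftGain + right * rightGain;
}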
The opposite of a crossfader. This takes a mono signal and splits it into a stereo signal: it is used to automatically pan the input signal between the left and the right output. This makes mixes more lively. A standard application would be a guitar or lead sound.
Connect an LFO - a sine or saw wave, for example - to inlfo and select a frequency between 0.1 and 5 Hz for a traditional effect, or even higher for special FX.
This delays the input signal for an amount of time. The time specification must be between 0 and maxdelay for a delay between 0 and maxdelay seconds.
This kind of delay may not be used in feedback structures, because it is a variable delay. You can modify its length while it is running, and even set it down to zero. But since in a feedback structure the module's own output is needed to calculate the next samples, a delay whose value could drop to zero during synthesis could lead to a stall situation.
Use CDELAYs in that setup, or perhaps combine a small constant delay (of 0.001 seconds) with a flexible delay.
You can also combine a CDELAY and a DELAY to achieve a variable length delay with a minimum value in a feedback loop. Just make sure that you have a CDELAY involved.
This delays the input signal for an amount of time. The time specification must be 0 or greater, for a delay of 0 seconds or more. The delay is constant during the calculation, which means it can't be modified.
This saves computing time as no interpolation is done, and is useful for recursive structures. See description above (Synth_DELAY).
This is a classic ADSR envelope which means you specify:
Whether the note is being pressed right now by the user.
The input signal.
The time that should pass between the user pressing the note and the signal reaching its maximum amplitude (in seconds).
The time that should pass between the signal reaching its maximum amplitude and the signal going back to some constant level (in seconds).
The constant level the signal is held at afterwards, until the user releases the note.
The time that should pass after the user has released the note until the signal is scaled down to zero (in seconds).
You'll get the scaled signal at outvalue. If the ADSR envelope is finished, it will set done to 1. You can use this to provide the “done” output of an instrument (which will make the instrument structure be deleted by the MIDI router object once the release phase is over). A simplified sketch of the envelope shape follows below.
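This sketch only illustrates how the attack, decay, sustain and release values shape the output gain; the real module works sample by sample inside the flow system and can be released from any phase, so treat it as an approximation rather than the actual implementation.

// Simplified illustration of the ADSR envelope shape described above
// (not the actual module code). Assumes attack, decay and release are
// greater than zero; the gain returned here scales invalue.
float adsrGain(float t,             // seconds since the note was pressed
               float releasedAt,    // seconds at which the note was released (< 0: still held)
               float attack, float decay, float sustain, float release)
{
    if(releasedAt >= 0.0f && t >= releasedAt) {
        // release phase: fade from the sustain level down to zero
        float r = (t - releasedAt) / release;
        return (r >= 1.0f) ? 0.0f : sustain * (1.0f - r);
    }
    if(t < attack)
        return t / attack;                    // rise to maximum amplitude
    if(t < attack + decay) {
        float d = (t - attack) / decay;       // fall towards the sustain level
        return 1.0f + d * (sustain - 1.0f);
    }
    return sustain;                           // hold until the note is released
}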
The Synth_PSCALE module will scale the audio stream that is directed through it from a volume of 0 (silent) up to 1 (original loudness) and back to 0 (silent), according to the position (which you can get from Synth_SEQUENCE). The position where the peak should occur can be given as pos.
Example: Setting top to 0.1 means that after 10% of the note has been played, the volume has reached its maximum, and starts decaying afterwards.
This is a reverb effect. In the current implementation, it is intended that you pass a stereo signal through the reverb, and it will add its reverb effect to the signal.
This means that it can be used inside a StereoEffectStack as well.
The input signal should be connected to (inleft, inright), the output signal will be (outleft, outright).
The parameters which you can configure are:
The size of the room which the reverb simulates (range: 0..1, where 1 is the largest possible room).
This specifies a filter which will make the simulated room absorb high frequencies (range 0..1, where 1 means absorb high frequencies quite aggressively).
The amount of reverb signal - that is, the amount of the signal that should be modified by the filters, resulting in a ‘wet’, that is, ‘reverb’, sound.
The amount of pure signal passed through, resulting in an echo (or combined delay) rather than reverb effect (range: 0..1).
The amount of stereo-magic the reverb algorithm adds to the reverb effect, making the reverb sound wider in the stereo panorama (range: 0..1).
[ TODO: I think if mode is 1, the reverb holds the current image of the sound, whereas 0 is normal operation ]
The tremolo module modulates the amplitude according to an LFO wave. Traditionally you would use a sine wave, but why limit yourself? What you get is a very intense effect that cuts through most arrangements because of its high dynamic range. The tremolo effect is still one of guitarists' favourite effects, although it's not as popular as it was in the 1960s.
[ TODO: currently this is implemented as invalue + abs(inlfo) - maybe it would make more sense to implement it as invalue * (1+inlfo*depth), where depth would be a parameter between 0..1 - decide this after KDE2.1 ; if you have a comment, send a mail to the aRts list ;). ]
A flanger is a time-varying delay effect. To make development of complex flanger effects simpler, this module is provided, which contains the core of a one-channel flanger.
It has the following ports:
The signal which you want to process.
Preferably a sine wave which modulates the delay time inside the flanger (-1 .. 1).
The minimum value for the delay inside the flanger in milliseconds. Suggested values: try something like 1 ms. Please use values < 1000 ms.
The maximum value for the delay inside the flanger in milliseconds. Suggested values: try something like 5 ms. Please use values < 1000 ms.
The output signal. It is important that you mix that with the original (unflanged) signal to get the desired effect.
You can use this as a basis for a chorus effect.
This pitch shifting effect changes the frequency of the input signal without affecting the speed. An application for this is for instance changing the pitch of your voice while you record (and replay) it in realtime.
The speed parameter is the relative speed with which the signal will be replayed. So a speed of two would make it sound twice as high (i.e. an input frequency of 440 Hz would result in an output frequency of 880 Hz).
The frequency parameter is used internally to switch between different grains of the signal. It is tunable, and depending on your choice, the pitch shifting will sound more or less realistic for your use case. A good value to start with is something like 5 or 10.
This module clips a signal to make it fit into the range [-1;1]. It doesn't do anything to prevent the distortion that happens when clipping loud signals. You can use this as an effect (for instance to create a slightly clipped sine wave). However, it's probably a good idea to run the signal through a lowpass filter afterwards if you do so, to make it sound less aggressive.
This is a nice parametric equalizer building block. Its parameters are:
The signal that gets filtered by the equalizer.
How low frequencies should be changed. The value is in dB: 0 means don't change the low frequencies, -6 means attenuate them by 6 dB, and +6 means boost them by 6 dB.
How middle frequencies should be changed by the equalizer in dB (see low).
How high frequencies should be changed by the equalizer in dB (see low).
This is the center frequency of the equalizer in Hz; the mid frequencies lie around that value, and the low and high frequencies below and above it. Note that the frequency may not be higher than half the sampling rate (usually 22050 Hz), and not lower than 1 Hz.
This influences how broad the mid spectrum is. It must be a positive number > 0. A value of one is reasonable; higher values of q mean a narrower spectrum of middle frequencies, and values lower than one mean a broader spectrum.
A damped resonator filter filtering all frequencies around some peak value. There is no useful way of specifying the middle frequency (the one that won't be cut), since the inputs are two strange constants f and b. The code is very old, from the first days of the synthesizer, and will probably be replaced by a new filter which will have a frequency and a resonance value as parameters.
Try something like b=5, f=5 or b=10, f=10 or b=15, f=15 though.
This module loads an instrument structure from a file and registers itself as a MIDI output with the aRts MIDI manager. Notes sent to this output will result in instrument voices being created.
You can set up something like this more conveniently in artscontrol than manually in aRts-builder.
Will play a sequence of notes over and over again. The notes are given in tracker notation and are separated by semicolons. An example is A-3;C-4;E-4;C-4;. The speed is given as seconds per note, so if you want to get 120 bpm, you will probably specify 0.5 seconds/note, as 60 seconds / 0.5 seconds per note = 120 bpm.
You can give each note a length relative to the speed by using a colon after the note and then the length. A-3:2;C-4:0.5;D-4:0.5;E-4; demonstrates this. As you can see, MIDI composing programs tend to offer more comfort ;)
The Synth_SEQUENCE also gives additional information about the position within the note it is playing right now, where 0 means just started and 1 means finished. You can use this information with Synth_PSCALE (see above).
This will play a wav file. It will only be present if you have libaudiofile on your computer. The wave file will start as soon as the module gets created.
It will stop as soon as it's over, then finished will be set to 1. The speed parameter can be used to replay the file faster or slower, where 1.0 is the normal (recorded) speed.
You will normally not need this module, unless you are writing standalone applications. Inside artsd, there normally is already a Synth_PLAY module, and creating another one will not work.
The Synth_PLAY module will output your audio signal to the soundcard. The left and right channels should contain the normalized input for the channels. If your input is not between -1 and 1, you get clipping.
As already mentioned, there may only be one Synth_PLAY module used, as this one directly accesses your soundcard. Use busses if you want to mix more than one audio stream together before playing. Use the Synth_AMAN_PLAY module to get something like an output inside artsd.
Note that Synth_PLAY also does the timing of the whole structure. This means: no Synth_PLAY = no source for timing = no sound. So you absolutely need (exactly) one Synth_PLAY object.
You will normally not need this module, unless you are writing standalone applications. Inside artsd, there normally is already a Synth_RECORD module, and creating another one will not work.
The Synth_RECORD module will record a signal from the soundcard. The left and right channels will contain the input for the channels (between -1 and 1).
As already mentioned, there may only be one Synth_RECORD module in use, as this one directly accesses your soundcard. Use busses if you want to use the recorded audio stream in more than one place. Use the Synth_AMAN_RECORD module to get something like an input inside artsd. For this to work, artsd must run with full duplex enabled.
The Synth_AMAN_PLAY module will output your audio signal. It is nice (but not necessary) if you output a normalized signal (between -1 and 1).
This module will use the audio manager to assign where the signal will be played. The audio manager can be controlled through artscontrol. To make it more intuitive to use, it is good to give the signal you play a name. This can be achieved through setting title. Another feature of the audio manager is to be able to remember where you played a signal the last time. To do so it needs to be able to distinguish signals. That is why you should assign something unique to autoRestoreID, too.
The Synth_AMAN_RECORD module will record an audio signal from an external source (e.g. line in or microphone) within artsd. The output will be a normalized signal (between -1 and 1).
This module will use the audio manager to assign where the signal will be played. The audio manager can be controlled through artscontrol. To make it more intuitive to use, it is good to give the signal you record a name. This can be achieved through setting title. Another feature of the audio manager is to be able to remember where you recorded a signal the last time. To do so it needs to be able to distinguish signals. That is why you should assign something unique to autoRestoreID, too.
You can use this for debugging. It will print out the value of the signal at invalue at regular intervals (approximately 1 second), combined with the comment you have specified. That way you can find out if some signals stay in certain ranges, or if they are there at all.
You can use this to debug how your MIDI events are actually arriving in aRts.
When a MIDI_DEBUG module is running, the aRts server will print out lines like:
201 100753.837585 on 0 42 127
202 101323.128355 off 0 42
The first line tells you that 100753 ms (that is, about 100 seconds) after the MIDI_DEBUG started, a MIDI on event arrived on channel 0. This MIDI on event had a velocity (volume) of 127, the loudest possible. The next line shows the MIDI release event. [ TODO: this does not work currently, make it work, and do it via the MIDI manager ].
The oscillators in aRts do not require a frequency as input, but a position in the wave. The position should be between 0 and 1, which for a standard Synth_WAVE_SIN object maps to the range 0..2*pi. To generate oscillating values from a frequency, a Synth_FREQUENCY module is used.
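The usual way to turn a frequency into such a position is a phase accumulator, as in the hedged C++ sketch below; whether Synth_FREQUENCY does exactly this internally is an assumption.

// Hedged sketch of a frequency-to-position conversion as a phase
// accumulator; the output stays in the range 0..1, which oscillators
// such as Synth_WAVE_SIN map to 0..2*pi internally.
float nextPosition(float &pos, float frequency, float samplingRate)
{
    pos += frequency / samplingRate;   // advance by one sample
    if(pos >= 1.0f)
        pos -= 1.0f;                   // wrap around at the end of each cycle
    return pos;
}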
This is used for frequency modulation. Put your frequency on the frequency input and put another signal on the modulator input. Then set modlevel to something, say 0.3. The frequency will then be modulated with the modulator. Just try it. It works nicely when you put feedback in there, that is, a combination of the delayed output signal from the Synth_FM_SOURCE (you need to feed it into some oscillator, as it only takes the role of Synth_FREQUENCY) and some other signal, to get good results.
Works nicely in combination with Synth_WAVE_SIN oscillators.
Sine oscillator. Put a pos signal from Synth_FREQUENCY or Synth_FM_SOURCE at the input and get a sine wave as output. The pos signal specifies the position in the wave; the range 0..1 is mapped to 0..2*pi internally.
Triangle oscillator. Put a pos signal from Synth_FREQUENCY or Synth_FM_SOURCE at the input and get a triangle wave as output. The pos signal specifies the position in the wave; the range 0..1 is mapped to 0..2*pi internally. Be careful: the input signal must be in the range 0..1 for the output signal to produce good results.
Square oscillator. Put a pos signal from Synth_FREQUENCY or Synth_FM_SOURCE at the input and get a square wave as output. The pos signal specifies the position in the wave; the range 0..1 is mapped to 0..2*pi internally. Be careful: the input signal must be in the range 0..1 for the output signal to produce good results.
Softened saw wave, similar in shape to the Synth_WAVE_TRI oscillator's output. Put a pos signal from Synth_FREQUENCY or Synth_FM_SOURCE at the input and you'll get a softened saw wave as output. The pos signal specifies the position in the wave; the range 0..1 is mapped to 0..2*pi internally. Be careful: the input signal must be in the range 0..1 for the output signal to produce good results.
Pulse oscillator - this module is similar in spirit to the rectangular oscillator (Synth_WAVE_RECT), but it provides a configurable up/down ratio through the dutycycle parameter. Put a pos signal from Synth_FREQUENCY or Synth_FM_SOURCE at the input and get a pulse wave as output. The pos signal specifies the position in the wave; the range 0..1 is mapped to 0..2*pi internally. Be careful: the input signal must be in the range 0..1 for the output signal to produce good results.
This module reduces the dynamic range of the signal. For example, compressors are useful in compensating for the wide variations in loudness of somebody talking into a microphone.
As soon as the input level exceeds a certain level (the threshold), the signal gets compressed: everything above the threshold is simply multiplied by the ratio, which should be a number between 0 and 1. Finally, the whole signal is multiplied by the output factor.
The attack and release arguments delay the start and end of the compression. Use these if you, for example, still want to hear the loud beginning of a bass drum. The arguments are in milliseconds, and an attack or release of 0 ms is possible but may result in a slight noise.
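Ignoring the attack and release smoothing, the rule described above translates into a per-sample computation roughly like the following hedged C++ sketch; the real module is more involved, so treat this only as an illustration of threshold, ratio and output factor.

// Hedged per-sample sketch of threshold / ratio / output as described
// above; attack and release smoothing are left out, and negative
// samples are treated symmetrically via the absolute value.
#include <cmath>

float compressSample(float in, float threshold, float ratio, float output)
{
    float magnitude = std::fabs(in);
    if(magnitude > threshold) {
        // everything above the threshold gets multiplied by the ratio (0..1)
        magnitude = threshold + (magnitude - threshold) * ratio;
    }
    float out = (in < 0.0f) ? -magnitude : magnitude;
    return out * output;   // finally, multiply the whole signal by the output factor
}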
The artsdsp utility, described previously, allows most legacy sound applications that talk to the audio devices directly, to work properly under aRts. Applications written to use the Enlightenment Sound Daemon (esd) will also work in most cases by running esd under artsdsp.
This makes a good short-term solution for porting existing applications to KDE. However, it does not allow the application to directly take advantage of the full power of aRts, such as using modules and multimedia streams other than digital audio. If the application goes beyond simple playing of sound files, it usually makes sense to add native support for aRts to the application.
Using aRts also means that the application does not have to do as much work -- it can leverage the functions in aRts to handle issues like codecs for different media formats and control of the sound hardware.
When using aRts, you have a number of different APIs to choose from. The decision of which to use depends on a number of factors, including what type of streaming media is used (sound, MIDI, CD audio, etc.), the API features required, and whether it is written in C++. In most cases the choice should be relatively obvious based on the required features.
For cross-platform portability, applications that need to run on environments other than KDE cannot rely on aRts being present. Using the plug-ins paradigm is a good way to support different multimedia environments. Making the plug-in API open and documented (especially for closed source applications) also has the advantage of allowing someone other than the application developer to implement an aRts plug-in.
The aRts project can use help from developers to make existing multimedia applications aRts-aware, write new multimedia applications, and enhance the capabilities of aRts. However, you don't have to be a developer to contribute. We can also use help from testers to submit bug reports, translators to translate the application text and documentation into other languages, artists to design bitmaps (especially for artsbuilder modules), musicians to create sample aRts modules, and writers to write or proofread documentation.
Most development discussions on aRts take place on two mailing lists. This is the place to discuss new feature and implementation ideas or ask for help with problems.
The KDE Multimedia mailing list is for general KDE multimedia issues including aRts as well as the multimedia applications like Noatun and aKtion. You can subscribe from the web page at http://www.kde.org/mailinglists.html or send an email with the subject set to subscribe your-email-address to <kde-multimedia-request@kde.org>. The list is also archived at http://lists.kde.org.
The aRts mailing list is for issues specific to aRts, including non-KDE use of aRts. To subscribe, send an email containing the message body subscribe your-email-address to <arts-request@space.twc.de>. The list is archived at http://space.twc.de/~stefan/arts-archive.
To get a consistent reading through all the sources, it is important to keep the coding style the same all over the aRts source. Please, even if you just write a module, try to write/format your source accordingly, as this will make it easier for different people to maintain the source tree, and easier to copy pieces from one source file to another.
Qt™/Java™ style. That means capitalization on word breaks, and first letter always without capitalization; no underscores.
This means for instance:
createStructureDesc()
updateWidget();
start();
Class members are not capitalized, such as menubar or button.
For accessor functions, the standard should be the MCOP way, that is, when you have a long member foo which shouldn't be visible directly, you create:

foo(long new_value);
long foo();
functions to get and set the value. In that case, the real value of foo should be stored in _foo.
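As a minimal illustration (the class name is invented for the example):

// Illustration of the accessor convention: the attribute foo is
// accessed through foo() / foo(long), and its value lives in _foo.
class SomeModule {
public:
    long foo()              { return _foo; }       // getter
    void foo(long newValue) { _foo = newValue; }   // setter

private:
    long _foo;
};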
All class names should be capitalized word-wise, for example ModuleView, SynthModule. All classes that belong to the libraries should use the Arts namespace, like Arts::SoundServer.
The implementations of MCOP classes should get called Class_impl, such as SoundServer_impl.
Parameters are always uncapitalized.
Local variables are always uncapitalized, and may have names like i, p, x, etc. where appropriate.
One tab is as long as 4 spaces.
You normally don't need to use spaces in expressions. You can, however, use them between operators and their operands. However, if you put a space before an operator (e.g. +), you also need to put a space after it. The only exception to this are list-like expressions (with ,), where you should only put a space after the ",", but not before. It's okay to omit the space here, too.
The following examples demonstrate good use of spaces:
{
    int a,b;
    int c, d, e;
    int f = 4;

    a=b=c=d+e+f;
    a = b = c = d + e + f;

    if(a == 4) {
        a = b = c = (d+e)/2;
    }

    while(b<3)
        c--;

    arts_debug("%d\n", c);
}
The following examples demonstrate how not to use spaces. No space is written after function names, nor after if, while, for, switch and so on.

{
    // BAD: if you write a list, write spaces only after the ","
    int a , b , c , d , e , f;

    // BAD: non-symmetric use of spaces for = operator
    a= 5;

    // BAD: if is considered a function, and isn't followed by a space
    if (a == 5) {
    }

    // BAD: don't write a space after while
    while (a--)
        b++;

    // BAD: function names are not followed by a space
    arts_debug ("%d\n", c);

    // BAD: neither are member names
    Arts::Object o = Arts::Object::null ();
}
Source files should have no capitalization in the name. They should have the name of the class when they implement a single class. Their extension is .cc if they refer to Qt™/GUI-independent code, and .cpp if they refer to Qt™/GUI-dependent code. Implementation files for interfaces should be called foo_impl, if Foo was the name of the interface.
IDL files should be named in a way that describes the collection of interfaces they contain, also all lower case. In particular, it is not good to name an IDL file after the class itself, as the .mcopclass trader and type info entries would then collide.
This section describes some of the aRts work that is in progress. Development progresses quickly, so this information may be out of date. You should check the TODO list file and the mailing list archives to see what new functionality is planned. Feel free to get involved in new design and implementation.
This is a draft document which tries to give you an overview of how new technologies will be integrated in aRts. Namely, it covers the following:
How interfaces work.
Codecs - decoding of mp3 or wav streams in a form that they can be used as data.
Video.
Threading.
Synchronization.
Dynamic expansion/masquerading.
Dynamic composition.
GUI
MIDI
This is work in progress. However, it should be the basis if you want to see new technology in aRts. It should give you a general idea of how these problems will be addressed. However, feel free to correct anything you see here.
Things that will use aRts technology (so please coordinate your efforts):
KPhone (voice over IP)
Noatun (video / audio player)
artscontrol (sound server control program, for scopes)
Brahms (music sequencer)
Kaiman (KDE2 media player - kmedia2 compliant)
mpglib/kmpg (mpg audio and video playing technology)
SDL (direct media layer for games not yet started but maybe nice)
electric ears (author contacted me - status unknown)
MCOP interfaces are the base of the aRts concept. They are the network transparent equivalent to C++ classes. Whenever possible you should orient your design towards interfaces. Interfaces consist of four parts:
Synchronous streams
Asynchronous streams
Methods
Attributes
These can be mixed in any way you like. New technologies should be defined in terms of interfaces. Read the sections about asynchronous streams and synchronous streams, as well as the KMedia2 interfaces, which are a good example of how such things work.
Interfaces are specified in .idl code and run through the mcopidl compiler. You derive the Interfacename_impl class to implement them, and use REGISTER_IMPLEMENTATION(Interfacename_impl) to insert your object implementations into the MCOP object system.
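A minimal sketch of that pattern, assuming a hypothetical IDL interface called Example and assuming that mcopidl generates a base class named Example_skel for it (the _skel name is an assumption here):

// Hedged sketch: implementing a hypothetical IDL interface "Example".
// example.h would be generated by mcopidl from example.idl.
#include "example.h"

using namespace Arts;

class Example_impl : virtual public Example_skel {
public:
    // implement the methods, attributes and streams declared in the IDL here
};

REGISTER_IMPLEMENTATION(Example_impl);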
The kmedia2 interfaces allow you to ignore that wav files, mp3s and whatever consist of data streams. Instead, you only implement methods to play them.
Thus, you can write a wave loading routine in a way that you can play wave files (as PlayObject), but nobody else can use your code.
Asynchronous streams would be the alternative. You define an interface which allows you to pass data blocks in and get data blocks out. In MCOP this looks like:

interface Codec {
    in async byte stream indata;
    out async byte stream outdata;
};
Of course codecs could also provide attributes to emit additional data, such as format information.
interface ByteAudioCodec {
    in async byte stream indata;
    out async byte stream outdata;
    readonly attribute samplingRate, bits, channels;
};
This ByteAudioCodec for instance could be connected to a ByteStreamToAudio object, to make real float audio.
Of course, other Codec types could involve directly emitting video data, such as
interface VideoCodec {
    in async byte stream indata;
    out video stream outdata;    /* note: video streams do not exist yet */
};

Most likely, a codec concept should be employed rather than the “you know how to play and I don't” approach that, for instance, WavPlayObject currently uses. However, somebody needs to sit down and do some experiments before an API can be finalized.
My idea is to provide video as asynchronous streams of some native MCOP data type which contains images. This data type is yet to be created. That way, plugins which deal with video images could be connected the same way audio plugins can be.
There are a few things that are important not to leave out, namely:
There are RGB and YUV colorspaces.
The format should be somehow tagged to the stream.
Synchronization is important.
My idea is to make it possible to reimplement the VideoFrame class so that it can store its data in a shared memory segment. That way, even video streaming between different processes would be possible without too much pain.
However, the standard situation for video is that things are in the same process, from the decoding to the rendering.
I have done a prototype video streaming implementation, which you can download here. This would need to be integrated into MCOP after some experiments.
A rendering component should be provided that supports XMITSHM (with RGB and YUV); Martin Vogt told me he is working on such a thing.
Currently, MCOP is all single threaded. Maybe for video we will no longer be able to get around threading. Ok. There are a few things that should be treated carefully:
SmartWrappers - they are not threadsafe due to non-safe reference counting and similar.
Dispatcher / I/O - also not threadsafe.
However, what I could imagine is making selected modules threadsafe, for both synchronous and asynchronous streaming. That way - with a thread-aware flow system - you could schedule the signal flow over two or more processors. This would also help audio a lot on multiprocessor machines.
How it would work:
The Flow System decides which modules should calculate what - that is:
video frames (with process_indata method)
synchronous audio streams (calculateBlock)
other asynchronous streams, mainly byte streams
Modules can calculate these things in their own threads. For audio, it makes sense to reuse threads (e.g. render on four threads for four processors, no matter whether 100 modules are running). For video and byte decompression, it may be more comfortable to have a blocking implementation in its own thread, which is synchronized against the rest of MCOP by the flow system.
Modules may not use MCOP functionality (such as remote invocations) during threaded operation.
Video and MIDI (and audio) may require synchronization. Basically, that is timestamping. The idea I have is to attach timestamps to asynchronous streams, by adding one timestamp to each packet. If you send two video frames, simply make it two packets (they are large anyway), so that you can have two different timestamps.
Audio should implicitly have timestamps, as it is synchronous.
The MIDI stuff will be implemented as asynchronous streams. There are two options: one is to use normal MCOP structures to define the types, and the other is to introduce yet more custom types.
I think normal structures may be enough, that is something like:
struct MidiEvent {
    byte b1, b2, b3;
    sequence<byte> sysex;
};
Asynchronous streams should support custom stream types.
This is the primary web site for KDE-related multimedia information.
This is the home page for the aRts project.
Chapter 14 of this published book covers multimedia, including aRts. It is available in print or on-line with annotations at http://www.andamooka.org.
This site has a comprehensive listing of sound and MIDI applications for Linux®.
This section answers some frequently asked questions about aRts.
13.1. Does KDE support my sound card for audio output?
KDE uses aRts to play sound, and aRts uses the Linux® kernel sound drivers, either OSS or ALSA (using OSS emulation). If your sound card is supported by either ALSA or OSS and properly configured (i.e. any other Linux® application can output sound), it will work. There are, however, some problems with some specific hardware; please read the section on hardware-specific problems if you're having problems with artsd on your machine. Meanwhile, support for various other platforms has been added as well. Here is a complete list of how the most recent version of aRts can play sound. If you have an unsupported platform, please consider porting aRts to your platform.
13.2. I can't play wav files with artsd!
Check that artsd is linked to libaudiofile (ldd artsd). If it isn't, download kdesupport, recompile everything, and it will work.
13.3. I hear sound when logged in as root but no other users have sound!
The permissions of the file /dev/dsp affect which users will have sound. To allow everyone to use it, do this: Procedure 13.1.
You can achieve the same effect in a terminal window using the command chmod 666 /dev/dsp. To restrict access to sound to specific users, you can use group permissions. On some Linux® distributions, for instance Debian/Potato, /dev/dsp is already owned by a group called audio, so all you need to do is add the users to this group.
13.4. This helps for artsd, but what about KMix, KMid, Kscd, etc.?
There are various other devices which provide functionality accessed by multimedia applications. You can treat them in the same way, either by making them accessible for everyone, or using groups to control access. Here is a list, which may still be incomplete (also if there are various devices in a form like midi0, midi1, ..., then only the 0-version is listed here):
13.5. What can I do if artsd doesn't start or crashes while running?
First of all, try using the default settings in KControl (or, if you are starting manually, don't give additional options besides maybe -F10 -S4096 for latency). Full duplex in particular is likely to break with various drivers, so try disabling it. A good way to figure out why artsd doesn't start (or crashes while running) is to start it manually. Open a Konsole window and do:

% artsd -F10 -S4096

You can also add the -l0 option, which will print more information about what is happening, like this:

% artsd -l0 -F10 -S4096

Doing so, you will probably get some useful information about why it didn't start. Or, if it crashes when doing this-and-that, you can do this-and-that and see "how" it crashes. If you want to report a bug, producing a backtrace with gdb and/or an strace may help in finding the problem.
13.6. Can I relocate artsd (move compiled files to another directory)?
You can't relocate aRts perfectly. The problem is that artswrapper has the location of artsd compiled in for security reasons. You can, however, use the .mcoprc file (TraderPath/ExtensionPath entries) to at least make a relocated artsd find its components. See the chapter about the .mcoprc file for details on how to do this.
13.7. Can I compile aRts with gcc-3.0?
Short answer: no, aRts will not work if you compile it with gcc-3.0. Long answer: In the official release, there are two gcc-3.0 bugs which affect aRts. The first, gcc-3.0 bug c++/2733, is relatively harmless (and has to do with problems with the asm statement). It breaks compilation of convert.cc. It has been fixed in the gcc-3.0 CVS, and will no longer be a problem with gcc-3.0.1 and higher. A workaround has also been added to the CVS version of KDE/aRts. The second gcc-3.0 bug, c++/3145 (the generation of wrong code for some cases of multiple virtual inheritance), is critical. Applications like artsd will simply crash on startup when compiled with gcc-3.0. Even though some progress has been made in the gcc-3.0 branch at the time of this writing, artsd still crashes quite often and unpredictably.
13.8. What applications run under aRts?
Obviously, all of the applications included with KDE are aRts-aware. This includes:
Some KDE applications that are not yet included in the KDE release (e.g. in kdenonbeta) also support aRts, including:
The following non-KDE applications are known to work with aRts:
The following applications are known not to work with aRts:
See also the answers to the questions in the section on non-aRts applications. This section is incomplete -- if you have more information on supported and unsupported applications, please send it to the author so it can be included here.
13.1. I can't use aRts-builder. It crashes when executing a module!
The most likely cause is that you are using old structures or modules which aren't supported by the KDE 2 version. Unfortunately, the documentation on the web refers to aRts-0.3.4.1, which is quite outdated. The most frequently reported crash is that performing Execute Structure in aRts-builder results in the error message [artsd] Synth_PLAY: audio subsystem is already used. You should use a Synth_AMAN_PLAY module instead of a Synth_PLAY module and the problem will go away. Also see the aRts-builder help file (hit F1 in aRts-builder). Recent versions of aRts-builder (KDE 2.1 beta 1 and later) come with a set of examples which you can use.
aRts software copyright 1998-2001 Stefan Westerfeld <stefan@space.twc.de>
Documentation copyright 1999-2001 Stefan Westerfeld <stefan@space.twc.de> and Jeff Tranter <tranter@kde.org>.
This documentation is licensed under the terms of the GNU Free Documentation License.
All libraries that are in aRts are licensed under the terms of the GNU Lesser General Public license. The vast majority of the aRts code is in the libraries, including the whole of MCOP and ArtsFlow. This allows the libraries to be used for non-free/non-open source applications if desired.
There are a few programs (such as artsd), that are released under the terms of the GNU General Public License. As there have been different opinions on whether or not linking GPL programs with Qt™ is legal, I also added an explicit notice which allows that, in addition to the GPL: permission is also granted to link this program with the Qt™ library, treating Qt™ like a library that normally accompanies the operating system kernel, whether or not that is in fact the case.
In order to use aRts you obviously need to have it installed and running on your system. There are two approaches for doing this, which are described in the next sections.
The quickest and easiest way to get aRts up and running is to install precompiled binary packages for your system. Most recent Linux® distributions include KDE, and if it is KDE 2.0 or later it will include aRts. If KDE is not included on your installation media it may be available as a download from your operating system vendor. Alternatively it may be available from third parties. Make sure that you use packages that are compatible with your operating system version.
A basic install of KDE will include the sound server, allowing most applications to play sound. If you want the full set of multimedia tools and applications you will likely need to install additional optional packages.
The disadvantage of using precompiled binaries is that they may not be the most recent version of aRts. This is particularly likely if they are provided on CD-ROM, as the pace of development of aRts and KDE is such that CD-ROM media cannot usually keep pace. You may also find that, if you have one of the less common architectures or operating system distributions, precompiled binary packages may not be available and you will need to use the second method.
While time consuming, the most flexible way to build aRts is to compile it yourself from source code. This ensures you have a version compiled optimally for your system configuration and allows you to build the most recent version.
You have two choices here -- you can either install the most recent stable version included with KDE or you can get the most recent (but possibly unstable) version directly from the KDE project CVS repository. Most users who aren't developing for aRts should use the stable version. You can download it from ftp://ftp.kde.org or one of the many mirror sites. If you are actively developing for aRts you probably want to use the CVS version. If you want to use aRts without KDE, you can download a standalone development snapshot from http://space.twc.de/~stefan/kde/arts-snapshot-doc.html.
Note that if you are building from CVS, some components of aRts (i.e. the basic core components including the sound server) are found in the CVS module kdelibs, while additional components (e.g. artsbuilder) are included in a separate module. This may change in the future. You may also find a version in the kmusic module; this is the old (pre-KDE 2.0) version which is now obsolete.
The requirements for building aRts are essentially the same as for building KDE. The configure scripts should detect your system configuration and indicate if any required components are missing. Make sure that you have a working sound driver on your system (either the OSS/Free driver in the kernel, OSS driver from 4Front Technologies, or ALSA driver with OSS emulation).
More information on downloading and installing KDE (including aRts) can be found in the KDE FAQ.
Advanced Linux® Sound Architecture; a Linux® sound card driver; not currently included with the standard kernel source code.
Analog Real-Time Synthesizer; the name of the multimedia architecture/library/toolkit used by the KDE project (note capitalization)
Berkeley Software Distribution; here refers to any of several free UNIX®-compatible operating systems derived from BSD UNIX®.
Common Object Request Broker Architecture; a standard for implementing object-oriented remote execution.
Concurrent Versions System; a software configuration management system used by many software projects including KDE and aRts.
Fast Fourier Transform; an algorithm for converting data from the time to frequency domain; often used in signal processing.
The ability of a sound card to simultaneously record and play audio.
GNU General Public License; a software license created by the Free Software Foundation defining the terms for releasing free software.
Graphical User Interface
Interface Definition Language; a programming language independent format for specifying interfaces (methods and data).
K Desktop Environment; a project to develop a free graphical desktop environment for UNIX® compatible systems.
GNU Lesser General Public License; a software license created by the Free Software Foundation defining the terms for releasing free software; less restrictive than the GPL and often used for software libraries.
Multimedia COmmunication Protocol; the protocol used for communication between aRts software modules; similar to CORBA but simpler and optimized for multimedia.
Musical Instrument Digital Interface; a standard protocol for communication between electronic musical instruments; often also used to refer to a file format for storing MIDI commands.
Open Sound System; the sound drivers included with the Linux® kernel (sometimes called OSS/Free) or a commercial version sold by 4Front Technologies.