Inside Out

Human experience, the one we are always already situated in, is something mysterious and, as a philosopher said, intrinsically paradoxical. Yet we can’t help but look for sense and meaning throughout it, and an audio/visual performance is a very powerful tool to say and discover something about it. It is often said that its strength lies in the interaction between sounds and visuals, and that is certainly true; however, the same can be said of two musical instruments within a well-composed piece. What is it, then, that really makes an audio/visual performance such an engaging event? I will try to offer a few hypotheses based on the experience I gained throughout the Studio Project.

To use an overused saying, people can be divided into two categories when listening to music: some like to shut their eyes, others let them wander around. Sometimes I belong to the first group, and I can even end up placing my fingers where the forehead meets the nose, in an attempt to keep my eyes shut; when I do that, it’s because what I see in my mind doesn’t really match what’s physically around me. More than that: it is the music itself that evokes images in my mind, which would be weakened by the outside world’s interference. This can happen either when the music doesn’t engage me so much or when what I see around me is too mismatched and invasive, but in any case the gesture seems to have the power to amplify music through the amplification of inner images. Actually, if one thinks about it carefully, the opposite is also true: images and pictures call, often strongly, for a sonic imaginative act:


This seems to be a relevant result for understanding the mechanism behind audio/visual engagement: in fact, if an artist is able to provide visual inputs that not only match, but even support and boost the musical side (and vice versa), the positive feedback can really open an experiential dimension beyond mere sound and mere image.

For my performance I chose not to have just a big screen for the audience to stare at the whole time, but rather a three-dimensional environment that could represent a mental space, inhabited by entities of different natures. This space is defined by four televisions and a subwoofer, which form an ideal flat surface separating the audience from the stage. There are a number of things that can be said about the TV & sub complex. First, it can suggest a criticism of facile audio/visual approaches, which only juxtapose the two; from this point of view, the flatness of the screens couldn’t be more different from the subwoofer, which is a black box that disappears in the dark and whose sonic emission, for physical reasons, is almost non-directional (if someone really wanted to push this observation further, s/he could see in it Heidegger’s criticism of being as simple presence versus being as emerging within a world…). Secondly, it could suggest a criticism of consumerism: after all, I was able to collect, for free, working televisions that people are discarding just because they “want to upgrade” (a direct quotation from one of the ex-owners). Moreover, there is a theme which is probably specific to Italy: over many years spent there, I realized that a large part of public opinion was manipulated by a constant bombardment of lies via television, and the way those lies were passed off as truth was to create a fake world that would turn people into superficial persons. Remarkably, the word surface translates into Italian as superficie, giving a linguistic identity between the physical and the moral concepts. At the same time, however, televisions per se are just a representational medium, and this is the reason why we find Magritte’s famous ceci n’est pas une pipe transformed into ceci n’est pas un homme, meaning “this is not a human being” (the linguistic gender issue is perhaps exaggerated here).
From this perspective, the TV “wall” also becomes a gate, a perceptual door that invites us inside a new, psychic space.


As well as inviting in, doors usually protect what’s behind them. In this case they are protecting an internal, intimate space. The figure of the woman is emblematic in this perspective, symbolizing the fact that whatever tries to go outside gets transmuted, chopped up, destroyed. Above all, her hand is (hopefully) a powerful image representing our internal thoughts and feelings trying to connect to the external world, but failing. The typical theme of incommunicability is here presented.

The mannequin is in a contrapuntal relationship with the performer: as said, he is safe and protected, while she dies on the surface. But also: she is fake while he is alive, she is static while he flounders, she is naked while he is dressed. As always in counterpoint, there are also connections: his gestures cast light on her, and she is a woman but on her body we read homme (French for “man” as well as for “human being”).

At the far back of the stage, a strong red light cast on an elevated curtain evokes the inscrutable wildness of the human unconscious.

The other side of the surface (the one defined by the TVs) is the realm of sound, in which the audience is fully immersed. Sounds circulate, with no preferred direction, between four speakers placed at the corners. The strong spatial difference between the flatness of the television screens and the permeating omnipresence of sound is a way of highlighting the theme of the surface, while at the same time it suggests a deep space that the eye is invited to look for on the stage; the nature of the sounds themselves, with an extreme use of several different kinds of resonances, pushes in the same direction.

I thought that a good title for the performance would be On The Surface.

final performance: technical details

The performance I presented as the final project involves an audio/visual instrument composed of a few different systems.


The audio processes are essentially embodied by a Max (Cycling ’74) patch. It is a rather complex system and makes use of many non-native objects as well as one proprietary process (a compressor), so anyone interested in obtaining it is invited to contact me by e-mail for instructions.


The first sound heard in the piece is the signal of a microphone attached to a wooden box; the signal is then processed by two sets of eight resonant filter banks, each bank featuring eleven filters and each being independent of the others (a bank can keep ringing while other ones are excited). Let us consider the first set: each bank has fixed gains for each filter and a fixed fundamental frequency assigned to the first one, but this first filter has a very low gain, so that it is hard to perceive compared to the others. The remaining ten filters have variable frequencies that are, most of the time, set beyond the Nyquist frequency and thus alias, reflecting symmetrically back into the audible spectrum; this contributes to making the fundamental frequency of each bank imperceptible and to achieving a general timbral complexity. The currently active bank is selected after each attack exceeding 2.5 % of the full digital scale, while frequencies are shifted after a certain number of these attacks; this number is controlled by the height of the performer’s left shoulder. The second set works in a slightly different way: here the fundamental frequency of each bank is lower and more audible, and there is no aliasing, so that the final result is more similar to standard modal processes; however, fundamental frequencies across the banks are more spread out (55 to 220 Hz), so variety is preserved. The current bank is selected after each attack exceeding 2.0 % of the full digital scale, and frequencies are always moving according to a random signal, whose rate and scale, though, are controlled by the height of the performer’s left shoulder. For both sets, the Q factor of the filters is scaled depending on their index (the higher the index, the higher the Q, but because of the aliasing a higher index does not mean a higher frequency) and globally multiplied by the value of the height of the performer’s shoulder.
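As a rough illustration of the aliasing described above, the following Python sketch computes where a frequency set beyond the Nyquist limit ends up in the audible spectrum. It is only a sketch and not the Max patch itself: the function name fold_frequency, the 44.1 kHz sample rate and the example target frequencies are all assumptions made for the sake of the example, and it presumes that a filter frequency above Nyquist simply lands on its mirror image.

```python
def fold_frequency(freq_hz, sample_rate_hz=44100.0):
    """Return the frequency heard when freq_hz is aliased at sample_rate_hz.

    Assumption: ideal sampling of a real signal, so the spectrum repeats every
    sample_rate_hz and mirrors around the Nyquist frequency.
    """
    nyquist = sample_rate_hz / 2.0
    wrapped = freq_hz % sample_rate_hz  # the spectrum is periodic in the sample rate
    if wrapped > nyquist:
        return sample_rate_hz - wrapped  # mirror back below Nyquist
    return wrapped


# Hypothetical target frequencies for three filters of increasing index.
targets_hz = [20000.0, 25000.0, 30000.0]
heard_hz = [fold_frequency(f) for f in targets_hz]
print(heard_hz)
```

With these made-up values the second and third filters, although set higher than the first, are heard lower (19100 Hz and 14100 Hz), which is why a higher index does not imply a higher audible frequency.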
Lastly, the bank selection is not abrupt, but happens via a smooth routing with a ramp time of 5 milliseconds: this feature had to be introduced to deal with more continuous signals coming from the microphone; for similar reasons, a V-shaped envelope with a total time of 10 milliseconds multiplies the audio input so that, when the frequencies of the first bank are shifted, its level is zero.
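The V-shaped envelope can be pictured with a few lines of Python. This is a minimal sketch under assumptions not stated in the text: a 44.1 kHz sample rate, linear ramps, and a hypothetical helper name v_envelope; the actual patch may shape the ramps differently.

```python
def v_envelope(total_ms=10.0, sample_rate_hz=44100.0):
    """Return gain values that go linearly from 1 down to 0 and back up to 1."""
    n = int(round(total_ms * sample_rate_hz / 1000.0))  # envelope length in samples
    half = n // 2
    down = [1.0 - i / half for i in range(half)]        # 1 -> just above 0
    up = [i / (n - half) for i in range(n - half + 1)]  # 0 -> 1
    return down + up


env = v_envelope()
zero_index = env.index(0.0)  # the sample at which the input is fully muted

# Multiplying the input by the envelope mutes it at zero_index, which is the
# moment at which the filter frequencies could be changed without a click.
dummy_input = [0.5] * len(env)  # placeholder signal, not real audio
gated = [x * g for x, g in zip(dummy_input, env)]
```

The point of the design is that the input is silent at the bottom of the V, so a frequency change made at that instant does not excite the filters with a discontinuity.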

A second, slightly less “natural” voice is a live granulation of the reverberation of the first voice (incidentally, a signal-driven, attack-responsive reverberation, and in turn an ever-changing space). Even if it is controlled by the right shoulder of the performer, its sonic nature is more abstract and the gestures much less “organic”, most of the time providing crescendos which culminate in evenly spaced short grains.
A third voice is the feedback system which can be heard towards the end of the first section of the performance.


As mentioned above, many parameters are controlled by the movement of the performer’s shoulders. This is done using a pair of stretch sensors (available here: attached to my pants:


the other end of the electrical cable is then connected to a voltage divider circuit connected to an Arduino board, as is well explained in this tutorial:

The Max patch I used to interface Arduino is available here:

I found this solution very effective compared to other body motion tracking systems, first of all because those can cost thousands of pounds and secondly because the sensor naturally provides physical feedback of the stretching force.
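To make the voltage divider reading concrete, here is a minimal Python sketch of the arithmetic involved. It assumes a 10-bit analog reading (0 to 1023), the stretch sensor wired between the supply and the analog input, and a fixed resistor between the input and ground; the 10 kOhm value, the calibration resistances and the function names are hypothetical and not taken from the tutorial or the patch.

```python
def sensor_resistance(adc_value, fixed_ohms=10000.0, adc_max=1023):
    """Estimate the sensor resistance from a raw analog reading.

    Divider assumed: supply -> sensor -> analog pin -> fixed resistor -> ground,
    so adc_value / adc_max = fixed / (sensor + fixed).
    """
    if not 0 < adc_value <= adc_max:
        raise ValueError("reading must be in the range 1..adc_max")
    return fixed_ohms * (adc_max / adc_value - 1.0)


def normalised_stretch(resistance, relaxed_ohms, stretched_ohms):
    """Map a resistance to a 0..1 control value, clipped at both ends."""
    position = (resistance - relaxed_ohms) / (stretched_ohms - relaxed_ohms)
    return min(1.0, max(0.0, position))


# Hypothetical calibration: 10 kOhm when relaxed, 20 kOhm when fully stretched.
reading = 511  # example raw value, roughly half of the supply voltage
control = normalised_stretch(sensor_resistance(reading), 10000.0, 20000.0)
```

A control value of this kind could then scale a parameter such as the global Q multiplier, though the real mapping depends on the sensor and on how the divider is actually wired.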


I worked on a set of four televisions fed with an audio signal (scaled up 10000 times) to make them flicker. In most cases I did it using their SCART plug:


to do that I soldered some odd cables with a TRS jack on one end and a SCART plug on the other:


Here’s a video of a test:

For previous experiments, some of which ended up with nice videos, see my blog:

an improvised solo

After having refined my instrument (redesigning almost all of its control messages), I started to explore possible gestures and short forms. I think I achieved a good variety of sounds and found a few interesting gestures, so I recorded a demo of about 8 minutes. Once again, great inspiration comes from Di Scipio. It’s a one-take improvisation, but I think there’s some “storytelling” in it. Having built this instrument from scratch, being able to paint a beginning and an end is a big result for me. The left channel is much hotter than the right one (I love asymmetry).

stretching sound

This time I’ve been using one of the stretch sensors. I like the feel of it a lot, especially because it gives absolutely clear feedback, allowing for really fine control. This is quite visible around 1:05 in the video. There are still several issues, though. First and foremost, I’d like to use its own sounds and noises as the processing material, but I can’t do it using Arduino (as I’m doing now) because its 5V power is so noisy that it completely covers the sensor. I read that the 3.3V source is less noisy, but I’m running out of time to continue experimenting (sadly). For the DMSP, I might just use a few sensors to control the processing of the sound(s) coming from a contact mic placed on a “sculpture” (and that is another issue). I like the sounds of this new video, but they’re coming from the laptop mic and, to mimic a contact mic, I had to tap and scratch its surface. I did it both with my “wired” arm, having a perfect sync between the stretching and the impacts (but unfortunately that’s out of the camera’s field of view), and with my other hand, but it turned out I was not syncing them, I’m not sure why (maybe it felt insipid while playing). Finally, I’m not entirely sure where and how I can place the sensors on my body, but I’ll figure it out soon. Good night, and good luck.

something else

This short video was born from experiments with sound generation processes. The only source of all sounds is the laptop microphone, as I was trying to expand the palette of one of the systems I’ve been using for my A/V instrument. In particular, I focused on the interaction between a modal synthesis unit and a granulator I recently built. The latter seems not to be very flexible and probably needs some more work, but since it is built completely in the signal domain, it has an interesting sound, at times almost analog. I experimented with the fundamental pitches of the modal synthesis, but I’ll get more variety as soon as I start to dynamically modify the mutual relationships between the filters, that is to say, modify the timbre. As I said, this is just one portion of the system I’m working on: this video doesn’t feature the actual contact mic, the electromagnetic feedback from the television screen or the background resonances/feedback running in Pure Data. Nevertheless, it shows some work done on the sounds that I want to be the centerpiece of my “thing”.

I thought the general mood was a pretty good match for this phone-resolution video I shot some time ago.

From Audio/Vision to Transmedial Experience

Many different ways of approaching the idea of an Audio/Visual Ensemble are possible, and it might be argued that vision and music were originally united in performative and ritual practices: for instance, archeological studies of prehistoric art place the birth of painting and that of music on the same time scale. Moreover, many ritual practices around the globe involve this unity: a big fire, a circle of people playing musical instruments, other people dancing, clapping hands, singing along in colourful dresses, casting shadows around: all that is food for eyes and ears. Actually, also for nose and skin (think of the heat of the fire, of the contact with others’ skin and with the ground…). All that is no different from “occidental” disco clubs.

The way our ensemble approaches audio/visual projects comes directly from the idea of preserving this original unity: we do not aim at recomposing it a posteriori, but rather at making it the starting point from which to grow different fronds, all belonging to the audio/visual realm. Therefore, we understand the “/” sign as a continuum consisting of a range of experiences (including the strictly auditory and visual ones) that can be freely navigated. Free navigation, however, may well lead us off the map: that is to say, other expressive forms may at a certain point be included in our projects, for example spoken word or performing arts. To be ready to welcome them, we should start to think of our group as a Transmedial Ensemble.

As a matter of fact, some of our projects already include performative elements. Timo’s instrument, for example, consists of a system whose output massively involves light and sound; at the same time, though, it can only be appreciated by considering its performance-driven nature. The source of it all is Timo playing a bass guitar, an action that is indeed one of the most ancient and well-recognised ways of performing. Its performative nature becomes even clearer as Timo makes use of unconventional gestures. Russell and Jessamine have been working together on two different projects that go in an even more abstract direction, leaving behind the idea of playing a musical instrument and freeing the performative element. Marco’s system can be placed somewhere in between, as the physical interaction that “plays” it can vary from a percussive-instrument style to more “theatrical” gestures.

Submission 1: including analog

“Get away from the computer screen,” Martin said one day; “for instance, you could make this TV flicker,” he added. I had no idea how that could be done, but I had always been fascinated by small, old TV screens lying in a corner, maybe in a group of three or four, in exhibitions of different kinds and atmospheres. Video art, they call it. So I asked him, and he replied: “touch these contacts with an audio cable”. The next thing I knew, I was in front of a screen (an analog one, though) that seemed to appreciate the wild noise I was feeding it in a very responsive way.
I then had the idea of putting together a small, local audio/visual unit by placing a small speaker on the monitor. The idea is to have many of these units scattered around a performative space, maybe on the sides or behind the audience, and use them to create a counterpoint with the main stage. While I was experimenting with this setup, I discovered an interesting phenomenon: as I tapped on the surface of the speaker, a signal was generated that made the TV screen flash quite intensely. The TV and the speaker actually receive exactly the same signal, which translates at the same time into light and sound: this was achieved through a physical connection between the two, which in turn allowed the use of the speaker as a vibrational pickup (basically a bad dynamic mic). I ended up discarding this feature, but I took two different ideas from it: first of all, the use of contact mics to pick up physical interaction between a performer and an object (this is something I had done before, and I process the signal quite heavily); secondly, the idea of multidirectional flows of electrical signal, which eventually made me use an EM microphone (an electric guitar pickup) to pick up the EM field variations produced by the TV screen as it flickered. This signal is then processed in a very similar way to the contact mic one and fed back to the screens and the speakers, but it also enters a digital feedback system whose sound is then sent to the main amplification.

A large part of the hacking happened in collaboration with Russell, while the guitar pickup belongs to Wolfgang Thomas.