Creating Audio-Visual Metaphor

With the gifts given by nature, we are able to see, hear, smell and touch. These sensory perceptions help us become aware of what is happening in our surroundings and keep us away from danger so that we can survive. Thanks to the efficiency of nature, the combined result of these perceptions is far more than the simple sum of its parts.

In a piece of audio-visual artwork, the audio-visual language covers two fundamental types of sensory perception. What the audience expects when seeing the visual effects and hearing the sonic content is not only a simple rhythmic coherence between audio and visuals, but also the abstract concept or metaphor that is represented or constructed both by the audio and visual elements themselves and by the internal relationship between them.

In terms of an audio-visual ensemble, we could consider audio and visuals as two players in a band. Rather than producing sound that reacts to video or playing video that reacts to audio, what this band could and should do is organize audio and visual elements into particular metaphors and integrate these metaphors to create the narrative of the performance. As we know, instruments in a band are played according to a well-composed structure in order to achieve an effect of 1+1>2. An audio-visual ensemble can work in the same way. Audio and visual elements do not necessarily have to follow each other, but should be organized so that they complement each other.

The filmmaker Sergei Eisenstein maintained that in filmmaking audio and visual elements should not simply accompany each other in dependent synchronization, but should be structured into a more complex composition (Robertson, 2011). As filmmaking is also a process of expressing audio-visual language, this theory can also be applied to audio-visual live performance. Another area that deploys audio-visual language is the game industry. One mobile game, Monument Valley, provides a good example of the complementary effect of audio and visual elements. During puzzle solving, when players operate the controls, notes are produced while something changes visually. Different notes represent different degrees of change as the handle is turned, a difference that might not be obvious to the eye alone. In this way, sound contributes to the gameplay itself rather than merely being a sound effect that embellishes what is already there.

Therefore, in the construction of an audio-visual ensemble performance, while rhythmic coherence shows the external link between audio and visual elements, the creation and organization of metaphor reveals their internal integration.

In my view, the methods for creating audio-visual metaphor fall into two types.

One way is to use different combinations of audio and visual elements to create different metaphors that represent a specific physical event or emotional situation. For example, while the combination of soft music and a cloudy day might convey a sense of sadness, the combination of intense music and a cloudy day could express anger. Similarly, leisurely sound with a sunny day makes people feel relaxed, while rhythmic music with a sunny day makes them feel excited.

The other way is to represent a particular phenomenon or feeling with either an audio or a visual element, and to organize such elements together to create a context for the narrative. The storytelling flow can then be developed through composition within this context. In my own experience, the creation of metaphor in our project followed this second way.

We chose the allegory of the cave as the framework of the narrative. While constructing the storyline, we assigned our existing resources as representations of different elements in the allegory. The audience was placed in the situation of the character in the allegory, the prisoner chained in the cave. Graphics on the main screen worked as a representation of the subjective consciousness of the character, who was only allowed to see the shadows in front of him. On the screen there were lights controlled by audio input and shadows that came along with the lights. The shadows represented the character’s opinion of the world, even though they were actually illusions. The lights provided an environment for the narrative. Through flickering and switching colours, they provided “keynotes” for each section of the story flow. At the same time, the video projected on the surrounding walls was an abstract metaphor for the reality outside the cave, and the combination of abstract lines and colours revealed the clash and struggle of opinions.

While sonic content structured the story flow, visual elements filled the space by illustrating each narrative section with a different combination of metaphors. At first, as the prisoners were only able to see the shadows in front of them, the visual effects existed mainly on the main screen. Shadows in this part seemed scary and evil. As the story developed, struggle appeared in the character’s thinking, expressed through the lines and colours. When the character escaped from the cave, videos were displayed on the walls and the shadows were muted. Then, with the return to the cave, all three visual elements finally appeared at the same time as new and old theories clashed and created a dramatic conflict. At the end of the performance, while the video on the walls implied a probably sad ending for the innovator, the shadows on the screen showed symbols of the beautiful existence of the real world, which implied that the seed of wisdom had already been planted in people’s minds.

By creating audio-visual metaphors and placing them appropriately on the timeline, we finally integrated the audio and visual elements of the project tightly. However, there are still many details that need to be improved. In my opinion, the combinations of metaphors could be more diverse, and the content of the metaphors in different sections could be further differentiated to achieve a better dramatic effect.


Robertson, Robert. 2011. Eisenstein on the Audiovisual: The Montage of Music, Image and Sound in Cinema. New York, NY: I.B. Tauris.

Gadassik, Alla. 2013. A Review of “Eisenstein on the Audiovisual: The Montage of Music, Image and Sound in Cinema”. Quarterly Review of Film and Video 30(4): 377-381.
