Invoke Features

Invoke offers a new way of working with spatial audio. As a spatial audio production tool, it explores different ways to embody audio workflow. The main feature of the app is a voice-based drawing tool that is used to make trajectories for spatial audio objects.

Voice Drawing

The interaction dynamic for drawing trajectories is a new way to work with spatial audio. Combining input from the voice and hand provides a continuous space-time method of composition. Voice Drawing allows detailed production of trajectories to spatially and temporally mix tracks. As a user, you position a “pen” in virtual space, pull the controller trigger, make a sound with your voice, and build shapes in space. After you create a Voice Sketch, the line data is converted into a control-point-based Bézier curve, a trajectory, that retains the volume information of the voice input. Then, when an audio object is placed on a trajectory, its volume is automated based on the recorded volume of the voice.
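As a rough illustration of how a trajectory can carry both position and volume, here is a minimal Python sketch. It is not Invoke's implementation: it assumes a single cubic Bézier segment with four control points, and it assumes the recorded voice volumes are spread evenly along the curve parameter. The function names (`cubic_bezier`, `volume_at`, `sample_trajectory`) are hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def cubic_bezier(p0: Vec3, p1: Vec3, p2: Vec3, p3: Vec3, t: float) -> Vec3:
    """Evaluate a cubic Bezier segment at parameter t in [0, 1]."""
    u = 1.0 - t
    # Bernstein weights for the four control points.
    w0, w1, w2, w3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    return tuple(
        w0 * a + w1 * b + w2 * c + w3 * d
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def volume_at(volumes: Sequence[float], t: float) -> float:
    """Linearly interpolate the recorded volume samples at parameter t.

    Assumption: samples are evenly spaced along the curve parameter.
    """
    if not volumes:
        raise ValueError("at least one volume sample is required")
    if len(volumes) == 1:
        return volumes[0]
    pos = t * (len(volumes) - 1)
    i = min(int(pos), len(volumes) - 2)
    frac = pos - i
    return volumes[i] * (1.0 - frac) + volumes[i + 1] * frac


def sample_trajectory(
    control_points: Sequence[Vec3], volumes: Sequence[float], t: float
) -> Tuple[Vec3, float]:
    """Return (position, gain) for an audio object at parameter t."""
    t = min(max(t, 0.0), 1.0)  # clamp to the valid range
    return cubic_bezier(*control_points, t), volume_at(volumes, t)
```

An audio object travelling along the trajectory would call `sample_trajectory` once per update with its current `t`, moving to the returned position and applying the returned value as its gain.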

When using Voice Drawing collaboratively, the volume information from both collaborators is used to draw each line. This means two lines could be drawn with the same volume information in two different places. It also means that when you or your partner draws a line, the resulting trajectory is not totally controlled by the person drawing it.

Pen active before drawing
Voice drawing pre-trajectory render
Trajectory, a voice drawing converted to a Bézier curve. Colour is taken from the audio source that is attached to it.

Object Interaction

Traditionally, object selection and manipulation in VR is considered either direct or indirect. Direct interaction uses a natural metaphor: you grab a thing that is close enough to touch, like you would a cup or a ball. Indirect interaction uses a form of mediation to allow action at a distance, like picking up a car with a crane. Each of these methods gives a different sensation of embodiment, and each also changes how spaces for action need to be designed.

Direct Interaction
Indirect Interaction

Invoke uses both direct and indirect action; this allows precise control but also extended interaction spaces. For the user, laser-based object interaction sits on top of direct spatial selection and manipulation. What this means is you can either walk up to an object and grab it or, from a distance, aim and grab an object. When holding objects, you can pull them closer or push them further away using controls on the hardware VR controller.
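The push and pull behaviour for held objects can be pictured as moving the object along the laser ray. The sketch below is only an illustration of that idea and is not the VRGrabber API: the function names, the thumbstick-style `stick_input`, and the speed and distance limits are all assumptions.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def normalise(v: Vec3) -> Vec3:
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("direction must be non-zero")
    return tuple(c / length for c in v)


def held_position(origin: Vec3, direction: Vec3, distance: float) -> Vec3:
    """Place the held object on the controller ray at the given distance."""
    d = normalise(direction)
    return tuple(o + c * distance for o, c in zip(origin, d))


def push_pull(
    distance: float,
    stick_input: float,          # hypothetical input in [-1, 1]: pull .. push
    speed: float = 2.0,          # assumed metres per second at full deflection
    dt: float = 1.0 / 90.0,      # assumed frame time
    min_distance: float = 0.2,   # assumed limits of the interaction space
    max_distance: float = 30.0,
) -> float:
    """Update the hold distance from the controller input, clamped to limits."""
    new_distance = distance + stick_input * speed * dt
    return min(max(new_distance, min_distance), max_distance)
```

Each frame, the hold distance is updated with `push_pull` and the object is repositioned with `held_position`, so aiming the controller moves the object sideways while the stick moves it nearer or farther.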

The system is built on top of the VR control system by Hecomi, VRGrabber.


While it would be possible to put all functionality “in-the-world”, it was decided to use a set of Menus to manage the various options and abstractions. There are three main menu types:

  • Mixer – a timeline and audio mixer to control gain, solo, mute functions and spatial parameters like Doppler, Reverb Send, and Volumetric Radius.
  • Trajectory Manager – a means to overview, toggle the visibility of, and delete trajectories.
  • Hand Menu – a way to manage world space menus and other global settings.
Invoke VR audio mixer panel

This system is built on top of some great work by Aesthetic Interactive, Hover UI Kit.


Avatar used for each player

As a shared VR experience, the mapping of your body to the virtual space is an important feature. Using an Inverse Kinematics system, a VR puppet maps to your movements. This system uses the HMD, controllers and a tracking puck attached to your waist. Sometimes the mapping can go a bit funny though…

An early experiment with IK and avatar scaling.

Assorted Features


The picture highlights that the level of opacity on an object has meaning. For instance, when an audio source is muted, the object has a see-through quality. Also, to manage the complexity of the space, the trajectory lines can be made semi-transparent, which also removes access to their control points.

Transparency of objects and lines

Getting Around and staying in touch

As the interaction space provided is quite large, a teleport system was added. Also, as this is a shared experience, a spatialised voice communication system is available.

Non-realistic scaling

Given the size of the interaction space, objects change size depending on their distance from the user, getting bigger the farther away they get. This does introduce a subtle set of issues, but it improves usability for selection and manipulation from a distance. One issue is the perceptual confusion of pushing something into the distance and seeing it get bigger. The other issue is that each user has asymmetric perceptual information about space and objects.
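To make the scaling idea concrete, here is a small Python sketch. The post does not state the actual mapping used in Invoke, so this assumes the simplest one: scale grows linearly with distance beyond a reference distance, which keeps the object's apparent size roughly constant, up to an assumed cap. `distance_scale`, `reference_distance` and `max_scale` are hypothetical names.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def distance_scale(
    user_pos: Vec3,
    obj_pos: Vec3,
    reference_distance: float = 1.0,  # assumed: objects are life-size within 1 m
    max_scale: float = 10.0,          # assumed cap to stop far objects ballooning
) -> float:
    """Scale factor for an object as seen by one particular user."""
    distance = math.dist(user_pos, obj_pos)
    return min(max(1.0, distance / reference_distance), max_scale)


# The same object is rendered at a different size for each user,
# which is the perceptual asymmetry mentioned above.
obj = (0.0, 0.0, 4.0)
scale_for_user_a = distance_scale((0.0, 0.0, 0.0), obj)  # 4 m away
scale_for_user_b = distance_scale((0.0, 0.0, 3.5), obj)  # 0.5 m away
```

Because the scale depends on the viewer, two collaborators standing in different places see the same object at different sizes, so size cannot be relied on as a shared reference.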

Spatial Music Making in VR

Do you create electronic music or sound design? Or are you a student or professional in Audio / Music Technology? If so, I am running a study over the next few weeks (August and September) and it would be great to have your participation!

You will be asked to collaboratively mix a short track using a shared VR spatial audio app. You will then be asked to complete a survey about your experience.

The study will take two hours to complete. All studies will be done in the Media and Arts Technology studios in the Engineering and Materials Science building of Queen Mary University of London, Mile End Road, Tower Hamlets.

Study slots were available from 13/08/19 to 28/08/19.

If you are interested in the context of the research I have some resources here:

Looking back at people looking forward

In 1995, Heath, Luff, and Sellen lamented the uptake of video conferencing, indicating that it had not at the time reached its promise. Looking back at this projection, though, the ubiquity of video systems for social and work communication can now be seen. Subsequently, research has gone on to understand it further in a variety of HCI paradigms (CHI2010, CSCW2010, CHI2018). So, for my research, which makes projections on the use of VR for music collaboration, it might be that findings and insights do not reach fruition in a timely fashion, or in the domain of interest that they were investigated in, or ever! Though this could be touching on a form of hindsight bias.

Going back to the article that speculated on the unobtained promise of video conferencing technologies, Heath, Luff, and Sellen (1995) provide a piece of insight that can still be placed into perspective on design interventions for collaboration:

It becomes increasingly apparent, when you examine work and collaboration in more conventional environments, that the inflexible and restrictive views characteristic of even the most sophisticated media spaces, provide impoverished settings in which to work together. This is not to suggest that media space research should simply attempt to ‘replace’ co-present working environments, such ambitions are way beyond our current thinking and capabilities. Rather, we can learn a great deal concerning the requirements for the virtual office by considering how people work together and collaborate in more conventional settings. A more rigorous understanding of more conventional collaborative work, can not only provide resources with which to recognise how, in building technologies we are (inadvertently) changing the ways in which people work together, but also with ways in which demarcate what needs to be supported and what can be left to one side (at least for time being). Such understanding might also help us deploy these advanced technologies.

The bold section highlights the nub of what I’m interested in for VR music collaboration systems. I break this down into how I’ve tackled framing collaboration in my research:

  • conventional collaborative work – ethnographies of current and developing practice. Even if you pitch a radical agenda of VR workspace, basic features of the domain of interest need to be understood for their contextual and technical practices.
  • building technology is changing practice – observing the impact of design interventions on how people collaborate in media production. Not only does a technology suggest new ways of working, it can enforce them! Observing and understanding this in domain-specific ways is important.
  • what needs to be supported – basic interactional requirements: we have to be able to make sense of each other, and the work, together, in an efficient manner.
  • what can be left to one side – the exact models and metaphors of how work is constructed in reality. In VR we can create work setups and perspectives that cannot exist in reality. For instance, shared spatial perspectives, i.e. seeing the same thing from the same perspective, are impossible in reality as we each have to occupy a separate physical space. In repositioning basic features of spatial collaboration, the effects need to be understood in terms of interaction and domain requirements. But the value is in finding new ways of doing things that are not possible in face-to-face collaboration.

Overall, the key theme that should be taken away is that of humans’ need to communicate and collaborate. In this sense, any research that looks to make collaboration easier is provisioning for basic human understanding. That is quite nice to be a part of.

Polyadic update: changing hands

I managed to get the VR version of Polyadic scaled down. Instead of a massive panel that you have to stretch across to operate, the scaled-down version is roughly the width of an old MPC. This is important for visual pattern recognition in the music-making process, and the sizing also allows for alternative workspace configurations that are more ergonomic and can handle more toys being added!

To get the scaled-down features to work, a tool-morphing process has been designed. The problem is that the Oculus Rift and HTC Vive controllers are quite large, especially in comparison to a mouse pointer. So by using smaller hand models when you are in the proximity of the drum machine, you can have a higher ratio of control to display, as less of the hand model is able to physically touch features in the interface.

Control-display (C-D) ratio adaptation is an approach for facilitating target acquisition. For a mouse, the C-D ratio is a coefficient that maps the physical displacement of the pointing device to the resulting on-screen cursor movement (Blanch et al., 2004). For VR, it is the ratio between the amplitude of movements of the user’s real hand and the amplitude of movements of the virtual hand model. A low C-D ratio (high sensitivity) could save time when users are approaching a target, while a high C-D ratio (low sensitivity) could help a user with fine adjustment when they reach the target area. Adaptive control-display ratios, such as non-linear mappings, have been shown to benefit 3D rotation and 3D navigation tasks.
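A minimal Python sketch of an adaptive C-D ratio follows, using the VR definition given here (real hand movement divided by virtual hand movement). It is a generic illustration rather than the mapping used in Polyadic: the linear blend between a low and a high ratio, and all threshold and ratio values, are assumptions chosen for readability.

```python
def cd_ratio(
    distance_to_target: float,
    near: float = 0.05,       # assumed: within 5 cm, use the high ratio
    far: float = 0.5,         # assumed: beyond 50 cm, use the low ratio
    low_ratio: float = 0.5,   # high sensitivity while approaching
    high_ratio: float = 2.0,  # low sensitivity for fine adjustment
) -> float:
    """Pick a C-D ratio from the distance between the hand and its target."""
    if distance_to_target <= near:
        return high_ratio
    if distance_to_target >= far:
        return low_ratio
    # Blend linearly between the two ratios in the transition zone.
    blend = (distance_to_target - near) / (far - near)
    return high_ratio + (low_ratio - high_ratio) * blend


def virtual_displacement(real_displacement: float, distance_to_target: float) -> float:
    """Virtual hand movement produced by a given real hand movement.

    C-D ratio = real movement / virtual movement, so virtual = real / ratio.
    """
    return real_displacement / cd_ratio(distance_to_target)
```

With these assumed values, a 10 cm real movement far from the target moves the virtual hand 20 cm, while the same movement at the target moves it only 5 cm.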

But the consequence of this mapping change will be an expressive difference. In the original prototype, with its oversized wall of buttons and sliders, the experience of physical exertion might have been quite enjoyable. Reducing this down creates a very different body space, and the effects of this remain to be tested. Subjectively, it did feel more precise and coherent as a VR interface, less toy-like and comical. As mentioned in the introduction, the sizing can have implications for pattern recognition. The smaller size allows you to overview the whole pattern while working on it, whereas previously the size meant stepping back or craning your neck to take it all in. It would be interesting to know how much effect the gestalt principles of pattern recognition have on cognitive load in music-making situations, given the time-critical nature of the audiovisual interaction.

Blanch, R., Guiard, Y. & Beaudouin-Lafon, M., 2004. Semantic Pointing – Improving Target Acquisition with Control-display Ratio Adaptation. Proceedings of the International Conference on Human Factors in Computing Systems (CHI’04), 6(1), pp.519–526.

Media comparison for collaborative music making

Image credit Nicolas Ulloa

Do you create electronic music? Are you a musician, producer, artist or DJ? Or are you a student or professional in Music / Music Technology? If so, I am running a study over the next few weeks (July & August) and would love your participation!

You will be invited to use and compare two different interfaces, one in virtual reality and another screen-based. You will be asked to create some drum loops collaboratively with another person using the provided interfaces. You will then be asked to complete a survey about your experience.

The study will take two hours to complete, and you will be paid £25 for your participation. All studies will be done in the Computer Science building on the Mile End Campus of Queen Mary University of London.

Study slots are available from 25/07/17 to 18/08/17, Monday to Friday, with time slots at 10 am, 12.30 pm, 3.30 pm, and 6 pm. If none of these are suitable for you, alternative arrangements can easily be made.

Unfortunately, this study has ended and further appointments are not being made.

If you are interested in the context of the research I have some resources here:

  • Polyadic: design and research of an interface for collaborative music making on a desktop or in VR.
  • Design for collaborative music making: some previous work on the user-centred design cycle involved in the progress of my PhD.

Lessons learned in VR dev

Following is some pastoral advice gained from doing a project in a new field when the brief is quite open. As with all advice, it depends on your personality!

Define the concept as simply as possible – if the underlying concept isn’t communicated clearly, how can the implementation be clear?

If it’s a good idea, follow it – When populating a design space with early concepts, tangents and ideas abound. These may diverge significantly from the original concept you thought of, but in the creative process this is perhaps the nature of ideas. As you balance all the elements, hidden parameters and approaches appear. These are things you couldn’t perceive in your original constructs and perhaps hold a grain of something truly novel. If you don’t have a strict brief, let go and see where it leads.

Domain knowledge – when coming from a specialist field, such as audio, be wary of the perceived knowledge in users. Your domain knowledge and intellectual predispositions will guide your design space decisions. If you are not careful, your ability to communicate to a wider audience will be doomed from the start, because you rely on existing interface metaphors that do not communicate effectively to new users. But if you focus the application on specific domains, these nuances can make it through a design process and be of use to the field more generally.

Concept Development: Possible Futures

An important concept in early development was the use of VR as a lens into imagined worlds. The work of Dunne and Raby on speculative design was particularly persuasive. Their techniques include:

+ Fictional worlds

+ Cautionary tales

+ What if… scenarios

+ Counterfactual histories

+ Thought experiments

+ Reductio ad absurdum

+ Artefacts from the future

+ Pre-figurative futures

+ Small things big issues

+ Tell worlds rather than tell stories

These aspects are employed as alternative aesthetics that engage us in different ways, questioning technology, ideology, and technological vs social imagination.


The Metaverse, as traditionally imagined, would be an unfiltered firehose of humanity. The Metaverse that people are actually trying to build would be, in a meaningful sense, a social network. Most of its value is bringing people together socially, and letting them communicate with their friends and make new ones. Putting everyone together into the same chaotic chatroom has less value than intelligently providing spaces where friends can hang out, as web-based social networks have proven. This concept would be engaged with a speculative frame in VR by posing the question of how algorithms would mediate our interaction and communications with the people and machines that are sharing the space.

I found this data visualisation quite stimulating; the debug view is quite attractive too.

Jaap Drupsteen’s music visualisations are fantastic. The one below was of particular interest at around 3:40, where the concrete structure is morphed into a twitching, jittering mass of nodes. This was imagined as a transition of aesthetic to be employed in a VR experience to draw the user’s attention to the concepts of the experience.

Michael Chorost’s book World Wide Mind, increasing emotional communication.