Looking back at people looking forward

In 1995, Heath, Luff, and Sellen lamented the uptake of video conferencing, indicating that it had not at the time lived up to its promise. Looking back at that assessment today, the ubiquity of video systems for social and work communication is plain to see, and research has since gone on to examine it further across a variety of HCI paradigms (CHI2010, CSCW2010, CHI2018). So, for my own research, which makes projections on the use of VR for music collaboration, it might be that findings and insights do not come to fruition in a timely fashion, or in the domain in which they were investigated, or ever! Though this could be touching on a form of hindsight bias.

Going back to the article that speculated on the unfulfilled promise of video conferencing technologies, Heath, Luff, and Sellen (1995) provide a piece of insight that still offers perspective on design interventions for collaboration:

It becomes increasingly apparent, when you examine work and collaboration in more conventional environments, that the inflexible and restrictive views characteristic of even the most sophisticated media spaces, provide impoverished settings in which to work together. This is not to suggest that media space research should simply attempt to ‘replace’ co-present working environments, such ambitions are way beyond our current thinking and capabilities. Rather, we can learn a great deal concerning the requirements for the virtual office by considering how people work together and collaborate in more conventional settings. A more rigorous understanding of more conventional collaborative work, can not only provide resources with which to recognise how, in building technologies we are (inadvertently) changing the ways in which people work together, but also with ways in which demarcate what needs to be supported and what can be left to one side (at least for time being). Such understanding might also help us deploy these advanced technologies.

The bold section highlights the nub of what I’m interested in for VR music collaboration systems. I break this down into how I’ve tackled framing collaboration in my research:

  • conventional collaborative work – ethnographies of current and developing practice. Even if you pitch a radical agenda of VR workspace, basic features of the domain of interest need to be understood for their contextual and technical practices.
  • building technology is changing practice – observing the impact of design interventions on how people collaborate in media production. Not only does a technology suggest new ways of working, it can enforce them! Observing and understanding this in domain-specific ways is important.
  • what needs to be supported – basic interactional requirements: we have to be able to make sense of each other, and the work, together, in an efficient manner.
  • what can be left to one side – the exact models and metaphors of how work is constructed in reality. In VR we can create work setups and perspectives that cannot exist in reality. For instance, shared spatial perspectives, i.e. seeing the same thing from the same perspective, are impossible in reality as we each have to occupy a separate physical space. In repositioning basic features of spatial collaboration, the effects need to be understood in terms of interaction and domain requirements. But the value is in finding new ways of doing things that are not possible in face-to-face collaboration.

Overall, the key theme that should be taken away is that of humans’ need to communicate and collaborate. In this sense, any research that looks to make collaboration easier is provisioning for basic human understanding. That is quite nice to be a part of.

Software Architecture for Polyadic

The Polyadic interface enables two or more co-located participants to collaboratively compose 16-step drum loops to accompany backing tracks in four different genres of electronic music, using two user interface media: Virtual Reality (VR) and Desktop (DT).
To accommodate the cross-platform development of the system, I surveyed programming paradigms to settle on an architecture suited to quick prototyping. The architecture design goals were:
  • Ease of feature development for testing multiple approaches.
  • Deterministic network interaction with interfaces.
  • Modular code structure to allow a Git-Flow style of development with parallel feature development not causing merge nightmares.

The final shortlist of architectures was:

  • Entity-Component-System – the eventual winner; more on this below.
  • Dependency Injection / Inversion of Control – lots of supporters, but it all seemed a bit weird to set up and work with for this application. Initial tests were positive, but the style of the structure started to annoy me.
  • Model-view-controller – classic, solid design pattern. But scaling it to maintain a tidy feature set for the cross-platform network application felt dangerous. I saw it turning into a pseudo pattern, where best intentions are kept but the flexibility and my laziness would make me turn it into illogical spaghetti.
  • Hack away – what I have done previously, a lot. The speed of just bringing functions together however you like is always appealing in the short term, like a really fatty burger, but it will shorten your life somehow.


Here, here, and here are some good introductions/discussions by Maxim Zaks, a major contributor to the Entitas ECS framework. To summarise, ECS, and specifically Entitas, reduces everything down to data, groups of data, and systems that act on data. This is very different from classic OOP design and required a little retraining of my process and thinking. I made many mistakes and introduced a lot of pseudo-dependencies along the way. Now, while refactoring after doing some other projects with it, I am rooting out these pseudo-dependencies and reducing the reliance on wasteful LINQ operations. In the end, it mostly produced decoupled code that allowed very feature-driven development. As I’m working by myself on the project, I haven’t got into the unit testing possibilities, but these are said to be great.
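To make the "data, groups of data, and systems that act on data" idea a bit more concrete, here is a minimal sketch of the pattern. It is written in Python purely for illustration: it is not Entitas, not its API, and not the Polyadic code. The names Step, Highlighted, World and highlight_active_steps are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Type


@dataclass
class Step:
    """Component: one step of a drum pattern (pure data, no behaviour)."""
    index: int
    active: bool = False


@dataclass
class Highlighted:
    """Component: a flag marking an entity that should be drawn highlighted."""


class Entity:
    """An entity is just an id plus a bag of components keyed by type."""

    def __init__(self, entity_id: int) -> None:
        self.id = entity_id
        self.components: Dict[Type, Any] = {}

    def add(self, component: Any) -> "Entity":
        self.components[type(component)] = component
        return self

    def remove(self, component_type: Type) -> None:
        self.components.pop(component_type, None)

    def has(self, *component_types: Type) -> bool:
        return all(t in self.components for t in component_types)

    def get(self, component_type: Type) -> Any:
        return self.components[component_type]


class World:
    """Holds all entities and answers 'group' queries over them."""

    def __init__(self) -> None:
        self.entities: List[Entity] = []

    def create_entity(self) -> Entity:
        entity = Entity(len(self.entities))
        self.entities.append(entity)
        return entity

    def group(self, *component_types: Type) -> List[Entity]:
        # A group is every entity that carries all the requested components.
        return [e for e in self.entities if e.has(*component_types)]


def highlight_active_steps(world: World) -> None:
    """System: flags active steps as Highlighted and unflags inactive ones."""
    for entity in world.group(Step):
        if entity.get(Step).active:
            entity.add(Highlighted())
        else:
            entity.remove(Highlighted)


# Usage: a 16-step pattern with a hit on every fourth step.
world = World()
for i in range(16):
    world.create_entity().add(Step(index=i, active=(i % 4 == 0)))
highlight_active_steps(world)
highlighted = [e.get(Step).index for e in world.group(Step, Highlighted)]
```

The point of the sketch is that the system never holds a reference to a particular object; it only asks the world for whatever currently matches a set of components, which is where the decoupling comes from.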

So when not to use ECS?

When building frameworks for others, or purely computational systems; see this for discussion. Though I am toying with a fully ECS-driven audio signal processing idea, which might be folly…

Positive future

Also, the good news is that by choosing ECS, I have started to train myself in the direction that Unity is currently taking with multithreaded systems, so that’s nice! But as this is still in early development, I will stick with Entitas.

Polyadic update: changing hands

Managed to get the VR version of Polyadic scaled down. Instead of a massive panel you have to stretch across to operate, the scaled-down version is roughly the width of an old MPC. This is important for visual pattern recognition in the music-making process, but the sizing also allows for alternate workspace configurations that are more ergonomic and can handle more toys being added!

To get the scaled-down features to work, a tool morphing process has been designed. The problem is that the Oculus Rift and HTC Vive controllers are quite large, especially in comparison to a mouse pointer. By using smaller hand models when you are in the proximity of the drum machine, you can have a higher ratio of control to display, in the sense that less of the hand model can physically touch features in the interface.


Control-display (C-D) ratio adaptation is an approach for facilitating target acquisition. For a mouse, the C-D ratio is a coefficient that maps the physical displacement of the pointing device to the resulting on-screen cursor movement (Blanch et al., 2004); for VR, it is the ratio between the amplitude of movements of the user’s real hand and the amplitude of movements of the virtual hand model. A low C-D ratio (high sensitivity) could save time when users are approaching a target, while a high C-D ratio (low sensitivity) could help a user with fine adjustment once they reach the target area. Adaptive control-display ratios, such as non-linear mappings, have been shown to benefit 3D rotation and 3D navigation tasks.
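As a rough illustration of the idea, the sketch below adapts the C-D ratio to the distance from a target: a low ratio when far away, a high ratio when close, and a linear blend in between. This is not the Polyadic implementation or the method from Blanch et al.; the function names, the distance thresholds (in metres) and the ratio values are all assumptions chosen for the example.

```python
def cd_ratio(distance_to_target: float,
             near: float = 0.1,
             far: float = 0.5,
             low: float = 1.0,
             high: float = 3.0) -> float:
    """Return a C-D ratio (real movement / virtual movement) for a distance.

    Beyond `far` the ratio is `low` (high sensitivity, quick approach).
    Within `near` the ratio is `high` (low sensitivity, fine adjustment).
    In between, the ratio is linearly interpolated.
    All default values are illustrative assumptions.
    """
    if distance_to_target >= far:
        return low
    if distance_to_target <= near:
        return high
    t = (far - distance_to_target) / (far - near)  # 0 at `far`, 1 at `near`
    return low + t * (high - low)


def virtual_displacement(real_displacement: float,
                         distance_to_target: float) -> float:
    """Scale a real hand movement into a virtual hand movement."""
    return real_displacement / cd_ratio(distance_to_target)
```

With these assumed defaults, a 0.3 m hand movement made far from the interface moves the virtual hand 0.3 m, while the same movement made right next to it moves the virtual hand only 0.1 m.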

But the consequence of this mapping change will be an expressive difference. In the original prototype, with its oversized wall of buttons and sliders, the experience of physical exertion might have been quite enjoyable. Reducing this down creates a very different body space, and the effects of this remain to be tested. Subjectively, it did feel more precise and coherent as a VR interface, less toy-like and comical. As mentioned in the introduction, the sizing can have implications for pattern recognition. The smaller size allows you to overview the whole pattern while working on it, whereas previously the size meant stepping back or craning your neck to take it all in. It would be interesting to know how much effect the gestalt principles of pattern recognition have on cognitive load in music-making situations, given the time-critical nature of the audiovisual interaction.

Blanch, R., Guiard, Y. & Beaudouin-Lafon, M., 2004. Semantic Pointing – Improving Target Acquisition with Control-display Ratio Adaptation. Proceedings of the International Conference on Human Factors in Computing Systems (CHI’04), 6(1), pp.519–526. Available at: http://doi.acm.org/10.1145/985692.985758.

Adaptive landscapes

A little experiment on how to modulate a mesh using a video and a sound file.

I used the following steps to achieve the effect:


  • Add video file to your assets
  • Copy the Tesselation example from Wireframe shader samples
  • Remove the animated control script that controls the material values
  • Create a render texture for the video frames to go to
  • Replace the displacement texture of the material with the render texture
  • Using Klak, grab the RMS of an audio source in the scene, map this to the displacement height of the material/shader.
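The last step boils down to computing the root-mean-square level of a block of audio samples and scaling it into a displacement height. The sketch below shows that arithmetic on its own; it is not Klak's API or Unity code, and the function names and the `gain` and `max_height` parameters are assumptions for illustration.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a block of audio samples (0.0 if empty)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def displacement_height(samples: Sequence[float],
                        max_height: float = 1.0,
                        gain: float = 1.0) -> float:
    """Map the RMS of a sample block to a displacement height.

    The level is scaled by `gain`, clamped to 1.0, then multiplied by
    `max_height`, so the result always stays in the range 0..max_height.
    """
    level = min(1.0, rms(samples) * gain)
    return level * max_height
```

In a real scene this value would be recomputed for every audio buffer and written to the material's displacement property, usually with some smoothing so the mesh does not jitter.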


Media comparison for collaborative music making

Image credit Nicolas Ulloa

Do you create electronic music? Are you a musician, producer, artist or DJ? Or are you a student or professional in Music / Music Technology? If so, I am running a study over the next few weeks (July & August) and would love your participation!

You will be invited to use and compare two different interfaces, one in virtual reality and the other screen-based. You will be asked to create some drum loops collaboratively with another person using the provided interfaces. You will then be asked to complete a survey about your experience.

The study will take two hours to complete, and you will be paid £25 for your participation. All studies will be done in the Computer Science building on the Mile End Campus of Queen Mary University of London.

Study slots are available from 25/07/17 to 18/08/17, Monday to Friday, with time slots at 10 am, 12.30 pm, 3.30 pm, and 6 pm. If none of these are suitable for you, alternative arrangements can easily be made.

Unfortunately, this study has ended and further appointments are not being made.

If you are interested in the context of the research I have some resources here:

  • Polyadic: design and research of an interface for collaborative music making on a desktop or in VR.
  • Design for collaborative music making: some previous work on the user-centred design cycle involved in the progress of my PhD.