Beep. Boop. That’s the sound of my commute
How an impulse buy and a bit of playing around with Python led to an ‘almost’ musical experience of my daily commute.
In spare moments, I find myself strangely drawn to discount site Wish. On a whim, I forked out a few quid on a GoPro rip-off. It lasted less than a minute before it developed a fault. But it kind-of-works (and I got a full refund!). So I stuck my now-free camera on my helmet and filmed my cycle commute.
I was interested in how I might ‘see’ the commute in a way that didn’t mean sitting through 20 minutes or so of me huffing and blowing down various streets. I remembered a project by ace designer Brendan Dawes called CinemaRedux.
He created ‘visual fingerprints’ of well-known films by taking one frame every second of the film and laying them out in rows of 60; one row for every minute of running time. They are fascinating and give an interesting perspective on the film, especially the use of colour. I thought this would be a nice way to see my commute.
Brendan Dawes’ CinemaRedux of Taxi Driver.
A while ago a developer called Ben Sandofsky created an app called Thumber which creates them, but it didn’t work on my Mac (it was built for Leopard). So, having recently dipped my toes into Python programming (I’ve been scraping Twitter for some research), I thought why not see if I could do it using Python.
A lot of GiantCap development later and I got it to work. The result…
You can see it’s no Taxi Driver. But there’s the occasional splash of green in the grey of the road and Manchester sky. As a ‘fingerprint’ of my journey, I think it works well. The final Python code that makes them is available on GitHub.
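For anyone curious about the layout step, here is a rough sketch of the grid arithmetic behind a CinemaRedux-style sheet: one frame per second, 60 frames to a row. It is not the script on GitHub; the function names and the 8x6 pixel thumbnail size are made up for illustration.

```python
def grid_position(frame_index, per_row=60, thumb_w=8, thumb_h=6):
    """Return the (x, y) pixel offset of a thumbnail on the sheet.

    thumb_w and thumb_h are hypothetical thumbnail sizes, not values
    taken from the original script.
    """
    row, col = divmod(frame_index, per_row)
    return col * thumb_w, row * thumb_h


def sheet_size(frame_count, per_row=60, thumb_w=8, thumb_h=6):
    """Return the (width, height) of the finished sheet in pixels."""
    rows = -(-frame_count // per_row)  # ceiling division: a part row still needs a row
    return per_row * thumb_w, rows * thumb_h
```

A 20 minute ride sampled once a second gives 1,200 frames, so `sheet_size(1200)` describes a sheet 20 rows tall, one row per minute.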
It’s clunky and inefficient. But it works and I was inordinately pleased just by the fact that it doesn’t crash (much). So what could I do next with my newfound programming powers?
What does my commute sound like?
In my last job, one of the PhD students, Jack Davenport (he does some really cool stuff btw), was working on a project called the sound of colour which explored playful ways to make music that broke away from standard interfaces like keyboards etc. One experiment included constructing a large table that users could roll loads of coloured balls across. A camera tracked the balls and converted their position and colour into data to play sounds and loop. I loved the idea. Maybe it was there in the back of my mind when I thought it might be cool to work out what the cinemaredux of my commute sounded like.
Sonification of data
Making data audible is not a new concept. As well as a healthy and diverse electronic music scene, there’s a growing and scarily creative community of programmers and musicians experimenting with real-time coding of music. There’s also loads of interesting stuff around using it to explore research data. It’s even got a bit of a foothold in data journalism. Check out The Center for Investigative Reporting piece on creating an audio representation of earthquake data by Michael Corey @mikejcorey. There’s code and everything. On that note you should also check out Robert Kosara’s piece Sonification: The Power, The Problems. But I digress.
After some reading around I settled on the following basic idea:
- analyse each image generated by my cinemaredux script and work out what the dominant colour was in each. But I didn’t want one note per picture; the information was too rich for that. At the same time I didn’t want to create loads of notes from pixels in the image. I needed to filter the data somehow.
- convert the RGB values of each colour into a MIDI note. I chose MIDI because it gave me the most flexibility and I had a vague idea of how it worked left over from my distant past in music tech. It’s essentially a data file with what note to play, when and for how long. No sounds etc. I thought this would be easier: once I had data from the image it would just be a case of converting numbers. It would also give me more room to experiment with what the data ‘sounded’ like later on.
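As a rough illustration of the second step, here is one way an RGB colour could be turned into a MIDI note number. The mapping is an assumption for the sake of the example (average brightness scaled onto a range of notes), not necessarily the one used in the script, and `rgb_to_midi_note` and its `low`/`high` defaults are made-up names and values.

```python
def rgb_to_midi_note(rgb, low=36, high=96):
    """Map an (r, g, b) colour to a MIDI note number.

    Assumed mapping: darker colours give lower notes. MIDI note numbers
    run from 0 to 127; the default range here is a hypothetical choice.
    """
    r, g, b = rgb
    brightness = (r + g + b) / 3  # 0 (black) to 255 (white)
    return low + round(brightness / 255 * (high - low))
```

With these defaults black lands on note 36 and white on note 96, with everything else somewhere in between.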
Skipping over a good deal of frustrating cut-and-paste, I finally got a script together that would take each frame of the video and give me a range of the dominant colours or ‘clusters’. Converting those into notes and durations didn’t take too much messing around and, thankfully, there are some very easy-to-use MIDI libraries for Python out there!
I ended up with each image generating a kind of arpeggio from a cluster: each colour represents a note that plays for a duration equal to the ‘amount’ of that colour in the image analysis. I could have made them play at the same time for a chord, but I knew that would sound odd and the rise and fall of the notes seemed to suit the idea of motion more.
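The arpeggio idea can be sketched in a few lines. This is an illustration rather than the real script: it assumes each image has already been reduced to (note, share) pairs, where share is the fraction of the image that colour covers, and that every image gets a fixed number of beats. The name `arpeggio` and the four beats per image are assumptions.

```python
def arpeggio(clusters, start=0.0, beats_per_image=4.0):
    """Turn (note, share) pairs into (note, start_beat, duration_beats) events.

    Notes play one after another; each lasts in proportion to the share
    of the image its colour covers. beats_per_image is a hypothetical setting.
    """
    events = []
    time = start
    for note, share in clusters:
        duration = share * beats_per_image
        events.append((note, time, duration))
        time += duration  # the next note begins where this one ends
    return events
```

For example, `arpeggio([(60, 0.5), (64, 0.25), (67, 0.25)])` gives a two beat C followed by a one beat E and a one beat G. To make a chord instead, every note would simply share the same start time.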
Here’s the first test output from the script: a random image of my daughter messing with the camera, analysed for four clusters. The resulting MIDI file was run through GarageBand with a random instrument (chosen by my daughter) and looped over a few bars. It grows on you! (Note: the SoundCloud embed is a bit flaky on Chrome.)
Applying the same analysis to my cinemaredux images was just an exercise in time: more images take more time to analyse. But eventually I got a MIDI file and this is the result. (Note: the SoundCloud embed is a bit flaky on Chrome.)
Like my thumbnail experiment, I’m happy with the result because, well, it works. At some point I may do a more technical post* explaining what I did. For now though, if you want to see the code and see if you can get it to work, then head over to GitHub.
Some further work
It would be nice if the code was neater and faster, but it works. Where it falls down is in timing. The duration of the MIDI file is much longer than the actual journey. That means some experimenting with the ratio of notes, tempo and number of images. But I’m happy with the result so far. I’ve also a few more ideas to try:
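The timing problem comes down to simple arithmetic, sketched below. The figures in the example are hypothetical (1,200 images, four beats per image, 120 bpm) rather than the script’s real settings, and both function names are made up.

```python
def midi_length_seconds(image_count, beats_per_image, tempo_bpm):
    """Length of the generated music in seconds."""
    total_beats = image_count * beats_per_image
    return total_beats * 60 / tempo_bpm


def tempo_to_match(image_count, beats_per_image, journey_seconds):
    """Tempo (bpm) at which the music lasts exactly as long as the ride."""
    total_beats = image_count * beats_per_image
    return total_beats * 60 / journey_seconds
```

With those example numbers the music would run for 2,400 seconds, twice as long as a 1,200 second ride, and `tempo_to_match(1200, 4, 1200)` says the tempo would need to be 240 bpm to fit. Halving the beats per image or the number of images would do the same job.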
- It would be nice to have a version that was more ‘tuneful’ in the traditional sense. In tutorials I’ve read, like Michael Corey’s earthquake piece mentioned earlier, it’s common to tune the data by mapping the values to a key, e.g. moving all the notes so they are in the C major scale. That way, I guess, I could risk generating a chord for each image without it sounding like I’m constantly crashing my bike.
- It would also be nice to break colours up across musical tracks. Low-value RGB colours like black and grey could be used to play bass notes, with higher-value colours on another track to play melody.
- By using MIDI I’m not limited to playing ‘instruments’. I could, for example, use samples of the environment I cycle through and then ‘trigger’ them using the notes, e.g. red plays middle C, which triggers the sound of a car. It’s also possible to use data to filter sounds. So I could use the sound from the head camera itself and use the data to apply filters and other effects over its duration.
- Finally, it would be nice to create a cinemaredux style image just of the colours selected, like a colour based piano roll or musical score.
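Two of the ideas above are easy to sketch: snapping notes to C major and splitting colours between a bass and a melody track. This is only an illustration under assumptions; the function names, the brightness threshold of 128 and the "bass"/"melody" labels are all made up.

```python
C_MAJOR = (0, 2, 4, 5, 7, 9, 11)  # pitch classes of C D E F G A B


def snap_to_c_major(note):
    """Move a MIDI note onto the C major scale, trying the note itself,
    then a semitone down, then a semitone up."""
    for offset in (0, -1, 1):
        if (note + offset) % 12 in C_MAJOR:
            return note + offset
    return note  # not reached for C major; kept as a safe fallback


def track_for(rgb, threshold=128):
    """Send dark colours to a bass track and brighter ones to melody.

    threshold is a hypothetical cut-off on average brightness (0 to 255).
    """
    brightness = sum(rgb) / 3
    return "bass" if brightness < threshold else "melody"
```

So a C sharp (note 61) would be pulled down to C (note 60), and a dark grey like (20, 20, 20) would end up on the bass track.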
You might be reading this and thinking, why? You may listen to the ‘music’ and really think ‘MY GOD MAN! WHY?’ But the process of thinking about how data points can be ‘transformed’ was fun and I’m now a lot more confident using Python to structure and manage data.
There’s a lot of assumptions and work-arounds in this script. The process of making the content more musical alone means a level of engagement with music theory (and midi) that I’m not really up for right now. The more I dive into some of the areas I’ve skated over in the script, the more I become aware that there’s also similar work out there. But my approach was to see how quickly I could get a half-baked idea into a half-made product.
For now I’d be interested in what you think.
*Essentially when looking for scripts to average out the colour of an image I came across a method called k-means clustering for colour segmentation. That’s what is used to generate the stacked chart of colours. That gave me the idea for the arpeggio approach.
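To give a flavour of k-means clustering, here is a bare-bones version in plain Python. It is a teaching sketch, not the code used for the post (existing libraries do this far faster), and the function names, the fixed ten iterations and the seeded random start are all assumptions. It needs at least k distinct colours in the input.

```python
import random
from collections import Counter


def _nearest(pixel, centres):
    """Index of the centre closest to pixel (squared RGB distance)."""
    return min(
        range(len(centres)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(pixel, centres[i])),
    )


def dominant_colours(pixels, k=4, iterations=10, seed=0):
    """Cluster (r, g, b) tuples; return [(colour, share), ...], biggest first."""
    rng = random.Random(seed)
    centres = rng.sample(sorted(set(pixels)), k)  # k distinct starting colours
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in pixels:
            groups[_nearest(p, centres)].append(p)
        # Move each centre to the mean of its group; leave it alone if empty.
        centres = [
            tuple(sum(ch) / len(g) for ch in zip(*g)) if g else c
            for g, c in zip(groups, centres)
        ]
    counts = Counter(_nearest(p, centres) for p in pixels)
    result = [
        (tuple(round(v) for v in centres[i]), counts[i] / len(pixels))
        for i in range(k)
    ]
    return sorted(result, key=lambda item: item[1], reverse=True)
```

The share that comes back with each colour is the ‘amount’ that sets how long its note plays in the arpeggio.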