I am a research fellow, conducting research into automatic analysis of bird sounds using machine learning.
I live in Tower Hamlets, the London borough with the largest proportion of Muslims in the UK. I see plenty of women every day who wear a veil of one kind or another. I don't have any kind of Muslim background so what could I do to start understanding why they wear what they do?
I went on a book hunt and luckily I found a book that gives a really clear background: "A Quiet Revolution" by Leila Ahmed. It's a book that describes some of the twentieth-century back-and-forth of different Islamic traditions, trends and politics, and how they relate to veils. The book has a great mix of historical overview and individual voices.
So, while of course there's lots I still don't understand, this book gives a really great grounding in what's going on with Muslim women, veils, and Western society. It's compulsory reading before launching into any naive feminist critique of Islam and/or veils. I'm sure feminists within Islam still have a lot to work out, and I don't know what the balance of "progress" is like there - please don't mistake me for thinking all is rosy. (There are some obvious headline issues, such as those countries which legally enforce veiling. I think to some Western eyes those headlines can obscure the fact that there are feminist conversations happening within Islam, and good luck to them.)
A couple of things that the book didn't cover, that I'd still like to know more about:
The Long Mynd is a range of hills in Shropshire. Very beautiful area this time of year. Lots of birds too. People often comment on the birds of prey: the buzzard, red kite and kestrel, soaring silently above and occasionally plummeting to pounce on something. Of course I'm more interested in the birds making the sounds all around.
I was most taken by the meadow pipits - as you walk around on the Mynd, they often leap surprised out of the heather and flitter away making alarmed "peeppeeppeep" sounds (or maybe more whistly than that, "pssppssppssp"). I saw a skylark too, ascending from the ground about 20 metres in front of me. It's great to witness it when they do that: an unhurried circling ascent, all the while burbling out their famously complex melodious song, like a little enraptured fax machine going to heaven.
While hanging around in the forest I noticed how many non-vocal bird sounds you can hear. The most common example is wing flutter sounds: I heard them from lots of different species, and the sound can often be very deliberate. The most surprising sound of all was when I was walking past a tree and heard a knocking sound and I thought, "Oh, is that a woodpecker starting up?" - but it wasn't. I could see the little bird on a branch a few metres away and it was a coal tit, doing a bit of a woodpecker impression. It would peck at the branch hard, about four times in a row, repeatedly, giving me the impression it might have been trying to do some DIY of some sort. It also tried it on a second branch.
Lots of gangs of ravens around too - their curious adaptable calls reminding me of the ones I saw recently at Seewiesen. I often heard (from a distance) the song of the nuthatch - that nice simple ascending note that I first encountered when camping in Dorset. Now and again a jay, lovely orangey and cyan colouring contrasting with its raspy magpie-ish yell. The jays seem to be shy around here, unlike the one that used to hang around in our garden in London.
Of course all the usual gang was there too: lots of robins singing, jackdaw, magpie, wren, house sparrow, blackbird, stock pigeon, one wood pigeon, the occasional chiffchaff. I think I heard a goldcrest at one point but I'm unsure. One willow warbler down by the reservoir.
Our journal paper "Detection and Classification of Acoustic Scenes and Events" is now out in IEEE Transactions on Multimedia! It evaluates many different methods for detecting/classifying sounds in everyday audio recordings.
I'm highlighting this paper because it covers the whole process of the IEEE DCASE evaluation challenge that we ran a little while ago, with many international research teams submitting systems either for audio event detection or audio scene classification.
It was a big team effort, with various people putting many months of time in, from 2012 through to 2015 (even though it was essentially an unfunded initiative!). Specific thanks to Dimitrios and Emmanouil, who I know put lots of manual effort in, repeatedly, to get this right.
The International Bioacoustics Congress 2015 was a fantastic conference. Lots of fascinating research, in a great place (Murnau, Bavaria, Germany), and very well organised! In this note I want to capture some thoughts that it triggered, about the practical organisation of a conference.
The staff who facilitated the conference made it run very smoothly. There were helpful people in the downstairs office almost all week, to ask questions etc. I particularly appreciated the facilitation for conference speakers: downstairs, the organisers loaded our presentations onto the laptop and checked they worked; then upstairs, there was a sound engineer who very efficiently fitted us with the radio mic and opened the presentations. This kind of support was crucial to make it possible to have such a busy schedule: many sessions had only 15 minutes per speaker! So no time for messing around.
Various IBAC people said, and I agree, that it's vital to keep it as a single-track conference: that seems to be part of its friendly community atmosphere. This is tricky, as IBAC has grown so that the schedule is now tightly-packed, and one "easy" way to reduce the pressure would be to go multi-track. I suspect the biggest risk there is of splitting the community into taxa (birds, marine, anurans, etc). So if parallel sessions were to be used (not my preferred solution), it'd be better to do that with the "open" rather than themed sessions, as someone at the AGM suggested. (The mix of open and themed sessions was well-balanced here in 2015.)
Every day opened with a 60-minute keynote, which is a great and widely-used pattern. We then had 20-minute slots in the themed sessions, and 15-minute slots in the open sessions. In my home discipline I've never seen 15-minute talk slots, and I think that's too short. I think that 20-minute slots are good, as long as the chair insists on keeping some time for questions, since I personally believe that public discussion with conference speakers is a really important part of what conference presentations are for. The IBAC chairs didn't insist on this at all really, which is a shame. That aside, they were well hosted.
The poster sessions were lively and very interesting, but physically they were too full! It was often very difficult to even read the titles of posters, let alone talk to the person standing there, if one or two people were discussing a nearby poster. This could have been improved by having 4 separate sessions of 40 posters, rather than 2 sessions of 80 which were each repeated for two days.
So, as I've already implied, IBAC was very highly subscribed, with many talks and posters, and I've been suggesting it could be better if the programme was a bit less tightly-packed. How could this be done (without going multi-track)? One answer is to be more selective, i.e. to accept fewer abstracts. Immediately I want to highlight a risk of this: it's great at IBAC to have lots of student and early-postgrad presenters, so we would want to avoid a selection process that favoured big names or experienced abstract-submitters. (We'd also want to maintain a decent balance across taxa.) I'd suggest a simple quota: minimum 50% student or recently-graduated people, both for talks and for posters.
Being selective has a cost: interesting things get rejected. The quality of IBAC 2015 was high, so there's no need to be selective for quality purposes. IBAC is currently every two years. I wonder if the IBAC community would be interested in having IBAC every year? There's clearly enough content for that. Would it suit the rhythm of the community? Could the IBAC steering committee cope with the doubled workload?
I find a printed programme absolutely essential. The 2015 organisers decided that many people don't want it because they use electronic versions, so printing it would be wasteful. That's fine, but I and many others need something. I think the ideal would be simply to have a tick-box on the conference registration form, "Would you like a printed programme?" Simple to handle, and reduces unnecessary printing.
A few other miscellaneous thoughts:
Of course almost everything I've written is about general conference organisation, not just IBAC. These thoughts are spurred by conversations we had at IBAC, and spurred by the overall extremely good conference organisation. Massive thanks to the IBAC 2015 organisers and staff!
P.S. I previously blogged about the research at IBAC 2015.
The International Bioacoustics Congress 2015 was a fantastic conference. Lots of fascinating research, in a great place (Murnau, Bavaria, Germany), and very well organised. Here I'm making some notes on the interesting research topics I encountered. I can't list everything because almost everyone at the conference was doing something fascinating! What a niche this is ;)
This was my first IBAC. I'd say the majority of people were animal communication or animal behaviour researchers, plus ecologists, sound archivists, a composer or two, a couple of industry people and a couple of computer scientists. (I didn't spot any acousticians/physicists, I was wondering if I would.) Lots of great people talking about animal sounds.
My own presentations went down well, I'm pleased to say. I had a talk about our Warblr bird sound recogniser (here's the journal paper, Stowell and Plumbley (2014)), and a poster about inferring the communication network underlying the timing of animal calls. (From the latter, lots of good conversation about whether cross-correlation was a good tool for the job. My answer is that it's perfectly fine for pairs. For larger groups it's tolerable if you have enough data, but I have a better way... need to write it up.)
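For anyone wondering what cross-correlation looks like for a pair of callers, here is a minimal sketch. It is not the method from my poster, just the standard pairwise idea: count how often one individual's calls fall at each time lag relative to the other's. The function names, the integer-millisecond timestamps and the example call times are all invented for illustration.

```python
def cross_correlogram(times_a_ms, times_b_ms, max_lag_ms=1000, bin_ms=100):
    """Histogram of lags (B's call time minus A's call time), in bins of bin_ms.

    Times are integer milliseconds, so the binning is exact.
    Only lags in the range [-max_lag_ms, max_lag_ms) are counted.
    """
    n_bins = (2 * max_lag_ms) // bin_ms
    counts = [0] * n_bins
    for ta in times_a_ms:
        for tb in times_b_ms:
            lag = tb - ta
            if -max_lag_ms <= lag < max_lag_ms:
                counts[(lag + max_lag_ms) // bin_ms] += 1
    return counts


def peak_lag_ms(counts, max_lag_ms=1000, bin_ms=100):
    """Centre of the fullest bin: a rough estimate of B's typical delay after A."""
    best = max(range(len(counts)), key=counts.__getitem__)
    return -max_lag_ms + best * bin_ms + bin_ms // 2


# Invented example: A calls every two seconds, B answers 300 ms later each time.
calls_a = [0, 2000, 4000, 6000]
calls_b = [300, 2300, 4300, 6300]
counts = cross_correlogram(calls_a, calls_b)
estimated_delay = peak_lag_ms(counts)
```

A clear peak at a positive lag suggests B tends to answer A. With more than two individuals the pairwise histograms start to pick up indirect effects (B and C both answering A can look like B answering C), which is the difficulty with larger groups.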
My colleague Rob Lachlan presented his really neat work on vocal learning in chaffinches. Apparently chaffinch syllable transmission is one of the most precise cultural transmission processes that's yet been quantified. I'd imagine he could tell you more about how that might relate to questions of the birds' innate biases etc.
Now here are some good things that were new to me. (Note that I'm quite a bit biased towards birds rather than the other taxa.) I'll save all the zebra finch items until the end since they're interrelated and something I'm currently thinking about. First, miscellaneous highlights:
Stefan Schoeneich described tracing the exact set of neurons responsible for call detection in a cricket species. This was fascinating - it's only a handful of neurons, so how do the crickets do it? Stefan described how post-inhibitory rebound is a crucial piece of the puzzle since it's a very simple neural phenomenon that provides a "delay line" that the cricket uses to detect repetition. The important thing is that this delay line is the same mechanism in the caller and the listener. This enables co-adaptation: evolution can change the repeat rate without breaking the communication channel. (Rohini Balakrishnan told me afterwards that this is not a new idea, though it's novel to me - Stefan's contribution is to demonstrate the exact network that uses this mechanism.)
Diego Llusia presented a playback experiment to modify the timing of dawn choruses. Interesting to see this: playbacks often involve a single species, but this investigated the timing of a whole assemblage of chorusing bird species. The study raised lots of good questions - it'd be good to see more development of this line of inquiry.
There was a good session on female vocalisations (led by Michelle Hall). From a European-biased perspective we often think of birdsong as being largely a male preserve. Karen Odom talked about the patterns of usage in one species (troupials). The main thing I note is actually her finding published last year (Odom et al 2014, Nature Communications) that female song is highly likely to be ancestral in songbirds, i.e. the reason it's seen less often in the northern hemisphere is that it was dropped (multiple separate times) by evolution, as songbirds radiated north. Lauryn Benedict then discussed why this might be. Maybe we can find correlates in life history - i.e. maybe the songbirds that dropped female song concomitantly developed some other communication or behavioural pattern, and this might help us understand what happened? Lauryn's study found no correlation either with migration or dichromatism. She noted that studying this is tricky because although lots of songbirds are described as having no female song, in many cases this might be due to our own biases and failure to spot it (especially in non-dimorphic species). Lauryn showed that her lack of correlation was robust to this issue.
Coen Elemans showed his work on physical modelling of the songbird syrinx. He found that the "myoelastic aerodynamic" model (developed in the context of the human larynx) works well for the syrinx. This was a surprise to me, since many songbirds have two oscillators in the syrinx rather than one, and I would have suspected the model might noticeably fail to account for interactions between them. It seems his model is tested for bird species with relatively independent sets of vocal folds, so maybe this suspicion is yet to be fully tested.
Lots of interesting discussion around acoustic diversity indices during the ecoacoustics session (led by Jerome Sueur). I remain to be convinced that we have robust useful "acoustic index" measurements directly from the audio signal without heavy user configuration. In that context it was interesting to hear from the experience of others. For example Nadia Pieretti found the ACI useful and robust for her shallow marine soundscapes, while Gianni Pavan working with forest soundscapes found it too strongly affected by weather sound (wind, rain).
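To make "an index computed directly from the audio signal" a bit more concrete, here is a simplified sketch in the spirit of the ACI. It is an assumption-laden toy, not the published definition in full: it skips the splitting of the recording into temporal steps, and it takes a ready-made spectrogram as a plain list of lists. The function name and the example values are invented.

```python
def aci_simplified(spectrogram):
    """Simplified ACI-style index.

    spectrogram[f][t] is a non-negative intensity for frequency bin f and
    time frame t. For each frequency bin, sum the absolute frame-to-frame
    intensity changes and divide by the bin's total intensity, then add up
    the results over all bins.
    """
    total = 0.0
    for row in spectrogram:
        energy = sum(row)
        if energy == 0:
            continue  # silent bin: contributes nothing
        change = sum(abs(b - a) for a, b in zip(row, row[1:]))
        total += change / energy
    return total


# Invented toy spectrograms: two frequency bins, four time frames each.
steady = [[1, 1, 1, 1], [2, 2, 2, 2]]   # constant hum in both bins
pulsed = [[0, 4, 0, 4], [2, 2, 2, 2]]   # on/off pulsing in the first bin

steady_value = aci_simplified(steady)
pulsed_value = aci_simplified(pulsed)
```

The index rewards any intensity that fluctuates from frame to frame. It has no idea whether the fluctuation comes from a bird or from raindrops, which fits with weather sound being a problem in forest recordings.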
Karla Rivera-Caceres showed that when plain wrens develop a duet code - meaning a specific choice of syllables to combine into their duet - it's due to learned association between the syllables, and not a private code designated by individual ID.
Karen Rowe talked about automatic detection in practice - really interesting from my point of view, to see how people fare when they use automatic detectors for their immediate practical work. She had deployed Songmeters in the Grampians, using an occupancy framework, which means that they only need to know presence/absence not the whole set of calls - a single positive detection is all that's needed. They tested a two-pass approach with an initial detection pass, then a second pass using some of the already-detected syllables as templates. They found that the manual work involved (in checking false positives, tweaking the classifier etc) meant that the automation was not in fact more efficient! In their case it was approximately as efficient to do fully manual annotation.
Peter Slater gave an evocative talk on their study of many wren species. He noted various things about duetting, and male and female song, finding that these traits correlate with phylogeny. It seems wrens have, multiple times, developed introductory phrases to lead in to duets - that's an interesting fact, food for thought.
Andrea Thibault showed us the behaviour of foraging seabirds, and the calls they make just before diving - apparently to warn others of the impending dive.
Lisa Gill showed a poster on the jackdaw "addressing" call. We (with Rob too) had a good chat about how to computationally analyse corvid "caw"-like sounds - still very tricky and non-obvious. Lisa also told me about her paper just accepted for eLife about zebra finch social networks and call patterns - very pertinent to me! Look forward to reading it.
A nice session on comparative work with music, speech and language (led by Carel ten Cate). Marisa Hoeschele described how songbirds are - in general - sensitive to absolute pitch, not relative pitch. They're much easier to train to discriminate absolute pitch variation than relative. (This is notably unlike humans!) She then showed her experimental evidence that black-capped chickadees can do relative pitch discrimination, but they're much better at it when the stimuli are made of chickadee syllables rather than pure sinewaves. Particularly interesting since the chickadee syllables are fairly pure-tone, not harmonic stacks, so the difference might not be the presence of harmonics. It also shows that a simple pitch-following model is not sufficient to explain their good performance: there must be some other attribute that makes things accessible to them.
Vera Klimsova gave us all a lesson in how to listen like an impala, to alarm calls from other species (including other species that don't live near impalas). She also gave us all a lesson in how to do a talk when you're the last speaker of a 5-day conference - an entertaining and memorable talk!
Now the zebra-finch-based research:
Solveig Mouterde described her work on how zebra finch calls degrade as they propagate through the environment, and how that affects individual recognition, both for zebra finch listeners and for machines. I'd like to see more of this kind of work because I think there are still many issues that are not completely addressed by some of the older bioacoustic concepts. For example Solveig referred to "active space" - a useful concept, but one that needs to incorporate the complexities of perceptual and acoustic variation before it really gets to the issue of how far an animal can be heard. Solveig's work goes towards addressing that.
Pietro d'Amelio talked about duetting in zebra finch mate pairs, showing very consistent antiphonal calling patterns, some symmetrical, some asymmetrical.
Had a good chat with Manfred Gahr and Albertine Leitao about how to measure tutored vocal learning in zebra finches. I have an idea that it could be done usefully with feature learning, which would be good to study some time.
Nicole Geberzahn studied how individuality emerges in zf song, through an experiment with many tutors who were themselves all taught from the same song. She found that new phrases emerged by mechanisms such as repeating whole phrases or adding call-like syllables onto the end. In a recognition test, zf listeners heard individual identity to be encoded in syllable details, not in phrase structure.
Andries ter Maat presented the work of his student Hanneke Poot, finding that pupil syllables are often not shared with any of the tutors, and complete copies of tutor songs are very rare. (Unlike Nicole's test mentioned above, in this case the tutors were quite varied.) Also that zfs don't particularly choose their genetic or social father to learn from. He also noted that Tchernichovski's sound similarity measures (as calculated by Sound Analysis Pro, that is) can depend strongly on the syllable type, so you need to apply some kind of standardisation procedure if you want to make global similarity comparisons.
Marie S A Fernandez looked at the calling patterns of zf pairs when they are together, separated, and then reunited. She found that cross-correlation or Markov analysis revealed strong back-and-forth structure only while the birds were separated. (I wonder: if we could include all visual and other cues, would there in fact be a detectable structure in all cases? It would be a different structure with/without visual contact, presumably. Very hard to annotate all possible multimodal cues though.) Perez et al (2015)
Clementine Vignal studied zf negotiation over parental care, finding that the length of some zf conversations could predict the subsequent balance of parental care. (This correlation was over and above the obvious factors such as how much nest-time each parent had recently spent.)
Buddhamas Pralle Kriengwatana presented an experiment in which zebra finches were trained to discriminate very short audio clips of human "i"/"e" vowels. She showed that once trained, the zfs can generalise to clips from another language (with slightly different formant positions), which demonstrates a generalisation ability that is not just about formant frequencies, possibly some relative rather than absolute distinction. For me there's a niggling question: formants are not the only way that vowels differ - there's also aspiration etc - so I'd be interested to know how such confounds were avoided when using real speech recordings. Pralle's suggestion seems plausible, though, that the ability could be explained by a perceptual mechanism based on using the sound to infer some physical trait such as the volume of the mouth cavity.
To all at IBAC: my apologies if I misrepresent you here, missed you out, or misspelt your name! In particular I didn't manage to see much of the second poster session since I was myself presenting a poster.
At the end of the conference there was an organised visit to MPIO Seewiesen, where a lot of good bird studies are happening. I was most struck by the magnificent ravens, living in outdoor aviaries and showing off their awesome vocal skills.
What else? Well, lots more. A great hike organised in the wetlands around Murnau (Murnauer Moos). Bavarian beer and food. The mountains as a backdrop...
So, our Warblr bird sound recognition app has been out for almost a month, and we've had many thousands of people using it and submitting bird audio recordings (thanks!). We've also had lots of great reviews in the consumer press. (Listen to this evocative piece on BBC Radio Scotland, fast-forward to 1hr 43.)
One thing which we knew was going to happen was that some people would demo it by playing back sound recordings into the mic, rather than recording actual birds. After all, sound recordings are easier to grab... What I didn't realise, from my own perspective, is that people would think this was a good way to test the app.
Playing back recordings is usually a really bad way to test the app, or any sound recognition app really, because recorded sounds differ in many many ways:
All of these things make the audio drastically different from a genuine direct recording, even though our human ears are clever enough to understand the correspondence. Yes, ideally a system would be as clever as our human ears, but that's for the future. (Note the difference from a product like Shazam, which recognises recordings but does not recognise the real live musician... interesting eh!)
Plus there's yet another aspect to consider: we make use of your location to help determine what kind of bird is likely. This is thanks to the BTO whose amazing crowdsourced bird data helps us know which birds to expect where and when. So, if you're playing a sound file that isn't native to where you are, our system is doubtful that the bird is there... and quite rightly doubtful, perhaps.
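To give a flavour of how location can tip the balance, here is a toy sketch. It is emphatically not our actual code: it just shows the general idea of weighting a recogniser's scores by how likely each species is at your location and then renormalising. The function name, the `floor` parameter, the species and all the numbers are invented for illustration.

```python
def apply_location_prior(acoustic_scores, local_prior, floor=0.01):
    """Weight per-species acoustic scores by a local likelihood and renormalise.

    acoustic_scores: species -> score from the audio alone (hypothetical values).
    local_prior: species -> how likely that species is here and now.
    floor: small minimum weight, so an unexpected species is doubted but
           never ruled out completely.
    """
    weighted = {
        species: score * max(local_prior.get(species, 0.0), floor)
        for species, score in acoustic_scores.items()
    }
    total = sum(weighted.values())
    if total == 0:
        return dict(weighted)  # nothing scored at all: nothing to normalise
    return {species: w / total for species, w in weighted.items()}


# Invented example: the audio alone slightly favours the nightingale,
# but at this made-up location robins are far more likely to be around.
audio_only = {"robin": 0.4, "nightingale": 0.6}
expected_here = {"robin": 0.9, "nightingale": 0.1}
combined = apply_location_prior(audio_only, expected_here)
```

In this made-up case the combined result favours the robin even though the audio alone leaned the other way. The same thing happens when a played-back recording features a species that isn't expected where you are.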
I can't emphasise enough that playing back recorded sounds is not the best way to test. We can't prevent people from doing this, of course! That's fine, but always bear in mind that you didn't test it in proper field conditions, only at your desk. You're not testing a bird recognition app if you're not testing it against real wild birds...
A baked Germanic cheesecake with blackberries and lemon curd. Yes please. Makes a cheesecake for 12 slices.
Put the oven on at 180C. Line a round springform cake tin (7" diameter maybe) with greaseproof paper.
Crush the biscuits roughly in a bag, and melt the butter/marge in a pan or in a microwave. Mix the biscuits and butter/marge well then press it into the tin, forming an even base all the way to the edges. Put in the oven for 10 minutes, then take it out and leave it out to cool. If you have time, put it in the fridge for up to an hour to firm up.
Turn the oven down to 140C.
Beat the quark, cream cheese, icing sugar and two egg yolks together.
Whisk the egg whites to stiff peaks. Then fold them gently into the quark mixture, with a wooden or plastic spoon, taking care not to over-mix (which would take the air out).
Now to assemble the thing in layers:
Now bake this in the oven, for about 90 minutes. (Cooking slowly, at 140 rather than 180, is so that it doesn't brown on top, or at least not much.) Turn off the oven and let the cheesecake cool in the oven, with the door ajar (cooling it slowly helps prevent cracking, though when using the blackberries it's quite unlikely you'll avoid all cracking). Refrigerate.
Serve with some blackberry coulis if you have more blackberries! Not necessary though. It's great as-is - ideally you should get it out of the fridge a while before you eat it so it isn't too chilly.
What tracks would you take into a shop to test out a hifi?
FWIW here's what I'm thinking.