
SuperCollider running ultra-low-latency on Bela (Beaglebone)

OK, now here we've got lots of lovely good news. Not only have my colleague Andrew McPherson and his team created an ultra-low-latency Linux audio board called Bela. Not only can it do audio I/O latencies measured in microseconds (as opposed to the usual milliseconds). Not only did it just finish its Kickstarter campaign, receiving eleven times the funding they asked for.

The extra good news is that we've got SuperCollider running on Bela. So you can run your favourite crazy audio synthesis/processing ideas on a tiny little low-latency box, almost as easily as running it on a laptop.

Can everyone use it? Well, not just yet - the code to use Bela's audio driver isn't yet merged into the main SuperCollider codebase, so you need to compile my forked version of SC. This blog post is just a preview. But we've got the code, as well as instructions for compiling, in this fork over here, and two of the Bela crew (Andrew and Giulio) have helped get it to the point where I can now run it in low-latency mode with no audio glitching.

Where do we go from here? It'd be nice if other people could test it out. (All those Kickstarter backers who are receiving their boards sometime soon...) There are a couple of performance improvements that can hopefully be made. Then eventually I hope we can propose that it gets merged into the SC codebase, perhaps for SC 3.8 or suchlike.

Monday 4th April 2016 | IT | Permalink

A static site generator for a research group website using Poole and git

The whole idea of static site generators is interesting, especially for someone who has had to deal with the pain of content management systems for making websites. I've been dabbling with a static site generator for our research group website, and I think it's a good plan.

What's a static site generator?

Firstly here's a quick historical introduction to get you up to speed on what I'm talking about:

  1. When the web was invented it was largely based on "static" HTML webpages with no interaction, no user-customised content etc.
  2. If you wanted websites to remember user details, or generate content based on the weather, or based on some database, the server-side system had to run software to do the clever stuff. Eventually these evolved into widely-used "content management systems" (CMSes) - such as drupal, wordpress, plone, mediawiki.
  3. However, CMSes can be a major pain in the arse to look after. For example:
    • They're very heavily targeted by spammers these days. You can't just leave the site and forget about it, especially if you actually want to use CMS features such as user-edited content, or comments - you need to moderate the site.
    • You often have to keep updating the CMS software, for security patches, new versions of programming languages, etc.
    • They can be hard to move from one web host to another - they'll often have not-the-right version of PHP or whatever.
  4. More recently, HTML+CSS+JavaScript have developed to the point where they can do a lot of whizzy stuff themselves (such as loading and formatting data from some data source), so there's not quite as much need for the server-side cleverness. This led to the idea of a static site generator - why not do all the clever stuff at the point of authoring the site, rather than at the point of serving the site? The big big benefit there is that the server can be completely dumb, just serving files over the network as if it was 1994 again.
    • That gets rid of many security issues and compatibility issues.
    • It also frees you up a bit: you can use whatever software you like to generate the content, it doesn't have to be software that's designed for responding to HTTP requests.
    • It does prevent you from doing certain things - you can't really have a comments system (as in many blogs) if it's purely client-side, for example. There are workarounds but it's still a limitation.

It's not as if SSGs are poised to wipe out CMSes, not at all. But an SSG can be a really neat alternative for managing a website, if it suits your needs. There are lots of nice static site generators out there.

Static site generators for academic websites

So here in academia, we have loads of old websites everywhere. Some of them are plain HTML, some are CMSes set up by PhD students who left years ago, and some are big whizzy CMSes that the central university admin paid millions for and which don't quite do everything you want.

If you're setting up a new research group website, questions that come to mind are:

  • How much pain would it take to convince the IT department to install this specific version of python/PHP/ruby, plus all the weird little plugins that this software demands?
  • Who's going to maintain the website for years, applying security patches, dealing with hacks, etc?
  • If I go through this hassle of setting up a CMS, which of its whizzy features do I actually want to use? Often you don't really care about many core CMS features, and the features you do want (such as publications lists) are handled by some half-baked plugin that a half-distracted academic cobbled together years ago and now doesn't work properly.

So using a static site generator (SSG) might be a really handy idea, and that's what I've done. I used an SSG called Poole, which is written in Python and appealed to me because of how minimal it is.

It has one HTML template which you can make yourself, and then it takes content written in markdown syntax and puts the two together to produce your HTML website. It lets you embed bits of python code in the markdown too, if there's any whizzy stuff needed during page generation. And that's it, it doesn't do anything else. Fab!
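To make that concrete, here's a toy sketch of the core idea - this is NOT Poole's actual code or template syntax, just an illustration of how little a minimal SSG needs to do: one HTML template, plus content in a lightweight markup, combined at build time into static HTML.

```python
# A toy sketch of what a minimal SSG does -- NOT Poole's actual code or
# template syntax, just an illustration: one HTML template, plus content
# in a lightweight markup, combined at build time into static HTML.

TEMPLATE = "<html><head><title>{title}</title></head><body>{content}</body></html>"

def render_page(title, markup):
    """Convert a trivial markdown-like subset (# headings, plain paragraphs)."""
    parts = []
    for line in markup.strip().splitlines():
        if line.startswith("# "):
            parts.append("<h1>%s</h1>" % line[2:])
        elif line.strip():
            parts.append("<p>%s</p>" % line)
    return TEMPLATE.format(title=title, content="".join(parts))

page = render_page("About", "# Our group\nWe make noise.")
# page is now a complete, dumb-server-friendly HTML document
```

The real thing uses proper markdown parsing (the python-markdown module) and reads templates and pages from disk, but the shape is the same: template in, content in, static HTML out.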

But there's more: how do people in our research group edit the site? Do they need to understand this crazy little coding system? No! I plugged Poole together with github for editing the markdown pages. The markdown files are in a github project. As with any github project, anyone can propose a change to one of the textfiles. If they're not pre-authorised then it becomes a "Pull Request" which someone like me checks before approving. Then, I have a little script that regularly checks the github project and regenerates the site if the content has changed.

(This is edging a little bit more towards the CMS side of things, with the server actually having to do stuff. But the neat thing is firstly that this auto-update is optional - this paradigm would work even if the server couldn't regularly poll github, for example - and secondly, because Poole is minimal the server requirements are minimal. It just needs Python plus the python-markdown module.)
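The decide-whether-to-rebuild step is simple enough to sketch. The real script shells out to git to poll the github project; here the same idea is shown with a plain content hash over the source files (the function names are invented for the sketch):

```python
# A sketch of "regenerate only if the content changed". The real script
# polls github with git; here the check is a content hash instead.

import hashlib

def content_fingerprint(pages):
    """Hash all source pages together; any edit changes the digest."""
    h = hashlib.sha1()
    for name in sorted(pages):
        h.update(name.encode("utf8"))
        h.update(pages[name].encode("utf8"))
    return h.hexdigest()

def maybe_rebuild(pages, last_fingerprint, build):
    """Call build() only when the sources differ from the last poll."""
    fp = content_fingerprint(pages)
    if fp != last_fingerprint:
        build()
    return fp

built = []
fp = maybe_rebuild({"index.md": "# Hello"}, None, lambda: built.append("rebuilt"))
maybe_rebuild({"index.md": "# Hello"}, fp, lambda: built.append("rebuilt"))
# built == ["rebuilt"]: the second poll saw no change, so no rebuild
```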

We did need a couple of whizzy things for the research site: a publications list, and a listing of research group members. We wanted these to come from data such as a spreadsheet so it could be used in multiple pages and easily updated. This is achieved via the embedded bits of python code I mentioned: we have publications stored in bibtex files, and people stored in a CSV file, and the python loads the data and transforms it into HTML.
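As an illustration of the people-list step (the column names and markup here are invented, not our actual files), the embedded python just has to turn CSV rows into an HTML fragment:

```python
# Illustration only: render a CSV of group members as an HTML fragment.
# Column names ("name", "role") are made up for this sketch.

import csv
import io

PEOPLE_CSV = """name,role
Ada Lovelace,Professor
Alan Turing,PhD student
"""

def people_list_html(csv_text):
    """Render each CSV row as a list item, wrapped in a <ul>."""
    rows = csv.DictReader(io.StringIO(csv_text))
    items = ["<li>%s (%s)</li>" % (r["name"], r["role"]) for r in rows]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"

fragment = people_list_html(PEOPLE_CSV)
```

The publications side works the same way, with a bibtex parser in place of the csv module.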

It's really neat that the SSG means we have all our content stored in a really portable format: a single git repository containing some of the most widely-handled file formats: markdown, bibtex and CSV.

So where is this website? Here: http://c4dm.eecs.qmul.ac.uk/

Saturday 21st March 2015 | IT | Permalink

Python scipy gotcha: scoreatpercentile

Agh, I just got caught out by a "silent" change in the behaviour of scipy for Python. By "silent" I mean it doesn't seem to be in the scipy 0.12 changelog even though it should be. I'm documenting it here in case anyone else needs to know:

Here's the simple code example - using scoreatpercentile to find a percentile for some 2D array:

import numpy as np
from scipy.stats import scoreatpercentile
scoreatpercentile(np.eye(5), 50)

On my laptop with scipy 0.11.0 (and numpy 1.7.1) the answer is:

array([ 0.,  0.,  0.,  0.,  0.])

On our lab machine with scipy 0.13.3 (and numpy 1.7.0) the answer is:

0.0

In the first case, it calculates the percentile along one axis. In the second, it calculates the percentile of the flattened array, because in scipy 0.12 someone added a new "axis" argument to the function, whose default value "None" means to analyse the flattened array. Bah! Nice feature, but a shame about the compatibility. (P.S. I've logged it with the scipy team.)
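If it helps, the two behaviours can be illustrated with the stdlib alone (this shows the semantics, not scipy's implementation):

```python
# The two behaviours, illustrated with the stdlib rather than scipy:
# per-column percentile (the old behaviour) versus the percentile of
# the flattened array (the new axis=None default).

from statistics import median

eye5 = [[1.0 if i == j else 0.0 for j in range(5)] for i in range(5)]

per_column = [median(col) for col in zip(*eye5)]    # like the old behaviour / axis=0
flattened = median(v for row in eye5 for v in row)  # like the new default, axis=None

# per_column -> [0.0, 0.0, 0.0, 0.0, 0.0]; flattened -> 0.0
```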

Friday 14th February 2014 | IT | Permalink

An app for a conference - with a surprising set of features

I'm going to a conference next week, and the conference invites me to "Download the app!" Well, OK, you think, maybe a bit of overkill, but it would be useful to have an app with schedules etc. Here is the app listed on google play.

Oh and here's a list (abbreviated) of permissions that the app requires:

"""This application has access to the following:

  • Your precise location (GPS and network-based)
  • Full network access
  • Connect and disconnect from Wi-Fi - Allows the app to connect to and disconnect from Wi-Fi access points and to make changes to device configuration for Wi-Fi networks.
  • Read calendar events plus confidential information
  • Add or modify calendar events and send email to guests without owners' knowledge
  • Read phone status and identity
  • Camera - take pictures and videos. This permission allows the app to use the camera at any time without your confirmation.
  • Modify your contacts - Allows the app to modify the data about your contacts stored on your device, including the frequency with which you've called, emailed, or communicated in other ways with specific contacts. This permission allows apps to delete contact data.
  • Read your contacts - Allows the app to read data about your contacts stored on your device, including the frequency with which you've called, emailed, or communicated in other ways with specific individuals. This permission allows apps to save your contact data, and malicious apps may share contact data without your knowledge.
  • Read call log - Allows the app to read your device's call log, including data about incoming and outgoing calls. This permission allows apps to save your call log data, and malicious apps may share call log data without your knowledge.
  • Write call log - Allows the app to modify your device's call log, including data about incoming and outgoing calls. Malicious apps may use this to erase or modify your call log.
  • Run at start-up."""

Now tell me, what fraction of those permissions should a conference-information app legitimately use? (I've edited out some of the mundane ones.) Should ANYONE install this on their phone/tablet?

Monday 20th May 2013 | IT | Permalink

python: combining interpolation with heatmaps

I saw Brandon Mechtley's splmap which is for plotting sound-pressure measurements on a map. He mentioned a problem: the default "heatmap" rendering you get in google maps is really a density estimate which combines the density of the points with their values. "I need to find a way to average rather than add" he says.

Just playing with this, here's my take on the situation: you don't average the values; you create some kind of interpolated overall map, but separately you also use the density of datapoints to decide how confident you are in your estimate at various points on the map. Python code is here and here's an example plot:


Dataviz folks might already have a name for this...
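For the curious, the idea can be sketched in a few lines of pure Python (a sketch of the approach, not the actual code linked above): inverse-distance weighting gives an average-like interpolated value at each query point, while a separate kernel-density term measures how much data is nearby - i.e. how confident to be in the estimate there.

```python
# Sketch: interpolation (inverse-distance weighting) kept separate from
# confidence (kernel density of the datapoints). Not the linked code.

import math

def idw_estimate(points, qx, qy, power=2.0):
    """points: list of (x, y, value). Weighted average of values at (qx, qy)."""
    num = den = 0.0
    for x, y, v in points:
        d2 = (x - qx) ** 2 + (y - qy) ** 2
        if d2 == 0.0:
            return v  # query sits exactly on a data point
        w = 1.0 / (d2 ** (power / 2.0))
        num += w * v
        den += w
    return num / den

def density(points, qx, qy, bandwidth=1.0):
    """Gaussian kernel density around (qx, qy): the 'confidence' layer."""
    return sum(math.exp(-((x - qx) ** 2 + (y - qy) ** 2) / (2.0 * bandwidth ** 2))
               for x, y, _ in points)

pts = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
# halfway between the two points, the estimate is their plain average:
mid = idw_estimate(pts, 1.0, 0.0)   # -> 15.0
```

A rendering would then map the interpolated value to colour and the density to something like opacity, so sparsely-sampled regions visibly fade out.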

Tuesday 16th April 2013 | IT | Permalink

Amazon Kindle Fire HD limitations

Over Christmas I helped someone set up their brand new Kindle Fire HD. I hadn't realised quite how coercive Amazon have been: they're using Android as the basis for the system (for which there is a whole world of handy free stuff), but they've thrown various obstacles in your way if you want to do anything that doesn't involve buying stuff from Amazon.

Now, many of these obstacles can be circumvented if you are willing to do moderately techy things such as side-loading apps, but for the non-techy user those options simply won't appear to exist, and I'm sure Amazon uses this to railroad many users into just buying more stuff. It's rude to be so obstructive to their customers who have paid good money for the device.

The main symptoms of this attitude which I encountered:

  • You need to set up an Amazon one-click credit-card connection even before you can download FREE apps. It's not enough to have an Amazon account connected; you also need the one-click credit card thing.

  • One of the most vital resources for ebooks readers is Project Gutenberg, the free library of out-of-copyright books - but Amazon don't want you to go there. There's no easy way to read Project Gutenberg stuff on Kindle Fire. (Instructions here.) They will happily sell you their version of a book that you could easily get for zero money, of course.

  • You can't get Google Maps. This is just one result of the more general lockdown where Amazon doesn't want you to access the big wide Google Play world of apps, but it's a glaring absence since the Fire has no maps app installed. We installed Skobbler's ForeverMap 2 app, which is a nice alternative that can calculate routes for walking and for driving. In my opinion the app has too many text boxes ("Shall I type the postcode in here?" "No, that's the box to type a city name"), so the search could do with being streamlined. Other than that it seems pretty good.

So, unlike most tablet devices out there, if you have a Kindle Fire it's not straightforward to get free apps, free ebooks, or Google tools. This is disappointing, since the original black-and-white Kindle was such a nicely thought-through object, an innovative product, but now the Kindle Fire is just an Android tablet with things taken away. That seems to be why the Project Gutenberg webmaster recommends "don't get a Kindle Fire, get a Nexus 7".

There are good things about the device, though. It has a nice bright screen, good for viewing photos (though the photo viewer app has a couple of odd limitations: it doesn't rotate to landscape when you rotate the device - a very odd and kinda obvious omission, since almost everything else rotates; and it doesn't make it obvious whether you've reached the end of a set, so you end up swiping a few times before you're sure you've finished). There's a good responsive pinch zoom on photos, maps etc. And the home screen has a lovely and useful skeuomorph: the main feature is a "pile" of recently-used things, a scrollable pile of books and apps. A great way to visually skim what you recently did and jump back to it - biased towards books and Amazon things, but still, a nice touch. Shame about the overall coercive attitude.

Sunday 30th December 2012 | IT | Permalink

How to remove big old files from git history

I've been storing a lot of my files in a private git repository, for a long time now. Back when I started my PhD, I threw all kinds of things into it, including PDFs of handy slides, TIFF images I generated from data, journal-article PDFs... ugh. Mainly a lot of big bloaty files that I didn't really need to be long-term-archived (because I already had archived the nice small files that generated them - scripts, data tables, tex files).

So now I'm many years on, and I know FOR SURE that I don't need any trace of those darn PDFs in my archive, I want to delete them from the git history. Not just delete them from the current version, that's easy ("git rm"), but delete them from history so that my git repository could be nice and compact and easy to take around with me.

NOTE: Deleting things from the history is a very tricky operation! ALL of your commit IDs get changed, and if you're sharing the repos with anyone you're quite likely to muck them up. Don't do it casually!

But how can you search your git history for big files, inspect them, and then choose whether to dump them or not? There's a stackoverflow question about this exact issue, and I used a script from one of the answers, but to be honest it didn't get me very far. It was able to give me the names of many big files, but when I constructed a simple "git-filter-branch" command based on those filenames, it chugged through, rewriting history, and then failed to give me any helpful size difference. It's quite possible that it failed because of things like files moving location over time, and therefore not getting 100% deleted from the history.

Luckily, Roberto is a better git magician than I am, and he happened to be thinking about a similar issue. Through his git-and-shell skills I got my repository down to 60% of its previous size, and cleared out all those annoying PDFs. Roberto's tips came in some github gists and tweets - so I'm going to copy-and-link them here for posterity...

  1. Make a backup of your repository somewhere.

  2. Create a ramdisk on which to do the rewriting - rewriting history can be a slow process, and doing it in RAM makes it go MUCH faster. (For me it reduced two days to two hours.)

    mkdir repo-in-ram
    sudo mount -t tmpfs -o size=2048M tmpfs repo-in-ram
    cp -r myrepo.git repo-in-ram/
    cd repo-in-ram/myrepo.git/
  3. This command gets a list of all blobs ever contained in your repo, along with their associated filename (quite slow), so that we can check the filenames later:

    git verify-pack -v .git/objects/pack/*.idx | grep tree | cut -c1-40 | xargs -n1 -iX sh -c "git ls-tree X | cut -c8- | grep ^blob | cut -c6-" | sort | uniq > blobfilenames.txt
  4. This command gets the top 500 biggest blobs in your repo, ordered by size they occupy when compressed in the packfile:

    git verify-pack -v .git/objects/pack/*.idx | grep blob | cut -c1-40,48- | cut -d' ' -f1,3 | sort -n -r --key 2 | head -500 > top-500-biggest-blobs.txt
  5. Go through that "top-500-biggest-blobs.txt" file and inspect the filenames. Are there any you want to keep? If so DELETE the line - this file is going to be used as a list of things that will get deleted. What I actually did was I used Libreoffice Calc to cross-tabulate the filenames against the blob IDs.

  6. Create this file somewhere, with a name "replace-with-sha.sh", and make it executable:

    #!/usr/bin/env sh
    TREEDATA=$(git ls-tree -r $2 | grep ^.......blob | cut -c13-)
    while IFS= read -r line ; do
        echo "$TREEDATA" | grep ^$line | cut -c42- | xargs -n1 -iX sh -c "echo $line > 'X.REMOVED.sha' && rm 'X'" &
    done < $1
  7. Now we're ready for the big filter. This will invoke git-filter-branch, using the above script to trim down every single commit in your repos:

    git filter-branch --tree-filter '/home/dan/stuff/replace-with-sha.sh /home/dan/stuff/top-500-biggest-blobs.txt $GIT_COMMIT' -- --all
  8. (Two hours later...) Did it work? Check that nothing is crazy screwed up.

  9. Git probably has some of the old data hanging around from before the rewrite. Not sure if all three of these lines are needed but certainly the last one:

    rm -rf .git/refs/original/
    git reflog expire --all
    git gc --prune=now

After I'd run that last "gc" line, that was the point that I could tell that I'd successfully got the disk-space right down.

If everything's OK at this point, you can copy the repository from the ramdisk back to replace the one in its official location.

Now, when you next pull or push that repository, please be careful. You might need to rebase your latest work on top of the rewritten history.

Wednesday 5th December 2012 | IT | Permalink

Installing Cyanogenmod 7 on HTC Tattoo

Just notes for posterity: I am installing Cyanogenmod on my HTC Tattoo Android phone.

There are instructions here, which are perfectly understandable if you're comfortable with a command-line. But the instructions are incomplete. As has been discussed here, I needed to install "tattoo-hack.ko" in order to get the procedure to complete (at the flash_image step I got an "illegal instruction" error). Someone on the wiki argues that you don't need that step if you're upgrading from stock Tattoo - but I was upgrading from stock Tattoo and I needed it.

In the end, as advised in the forum thread linked above I followed bits of these instructions and used tattoo-hack.ko as well as the different versions of "su" and "flash_image" linked from that forum thread, NOT the versions in the wiki.

Also, there's an instruction in the wiki that says "reboot into recovery mode". It would have been nice if it had spelt that out for me. The command to run is

       adb reboot recovery
Wednesday 8th August 2012 | IT | Permalink

Some things I have learnt about optimising Python

I've been enjoying writing my research code in Python over the past couple of years. I haven't had to put much effort into optimising it, so I never bothered, but just recently I've been working on a graph-search algorithm which can get quite heavy - there was one script I ran which took about a week.

So I've been learning how to optimise my Python code. I've got it running roughly 10 or 20 times faster than it was doing, which is definitely worth it. Here are some things I've learnt:

  • The golden rule is to profile your code before optimising it; don't waste your effort. The cProfile module is surprisingly easy to use, and it shows you where your code is using the most CPU. Here's an example of what I do on the commandline:

    # run your heavy code for a good while, with the cProfile module logging it ('streammodels.py' is the script I'm profiling):
    python -m cProfile -o scriptprof streammodels.py
    # then to analyse it, in a python shell:
    import pstats
    p = pstats.Stats('scriptprof')
    p.sort_stats('cumulative').print_stats(20)
  • There are some lovely features in Python which are nice ways to code, but once you want your code to go fast you need to avoid them :( - boo hoo. It's a bit of a shame that you can't tell Python to act as an "optimising compiler" and automatically do this stuff for you. But here are two key things to avoid:

    • list-comprehensions are nice and pythonic but they're BAD for fast code if you don't actually need the resulting list, because they always construct one. Instead, use generator expressions or itertools' imap()/ifilter(), which process data lazily without building a new list. (Also use "xrange" rather than "range" if you just want to iterate over a range rather than keeping the resulting list.)
    • lambdas and locally-defined functions (by which I mean something like "def myfunc():" as a local thing inside a function or method) are lovely for flexible programming, but when your code is ready to run fast and solid, you will often need to replace these with more "ordinary" functions. The reason is that you don't want these functions constructed afresh every time you use them; you want them constructed once and then just used.
  • Shock for scientists and other ex-matlabbers: using numpy isn't necessarily a good idea. For example, I lazily used numpy's "exp" and "log" when I could have used the math module and avoided dragging in the heavy array-processing facilities that I didn't need. After I changed my code to not actually use numpy (I didn't need it - I wasn't really using array/matrix maths for this particular code), I went much faster.

  • Cython is easy to use and speeds up your Python code by turning it into C and compiling it for you - who could refuse? You can also add static-typing annotations to speed it up even more, but that makes it no longer pure Python code, so ignore that until you need it.

  • Name-lookups are apparently expensive in Python (though I don't think the profiler really shows the effect of this, so I can't tell how important it is). There's no harm in storing something in a local variable -- even a function, e.g. "detfunc = scipy.linalg.det".
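
As a quick sanity check of that last tip, here's the local-variable caching pattern in isolation (timings will vary, so time the two yourself with the timeit module; the results are identical either way):

```python
# Cache the attribute lookup in a local before a tight loop. Same
# result; on CPython the cached version skips a repeated lookup.

import math

def summed_sqrts(n):
    total = 0.0
    for i in range(n):
        total += math.sqrt(i)   # attribute lookup on every iteration
    return total

def summed_sqrts_cached(n):
    sqrt = math.sqrt            # one lookup, then a fast local
    total = 0.0
    for i in range(n):
        total += sqrt(i)
    return total
```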

So now I know these things, my Python runs much faster. I'm sure there are many more tricks of the trade. For me as a researcher, I need to balance the time saved by optimising against the flexibility to change the code on a whim and hack around with it, so I don't want to go too far down the optimisation rabbit-hole. The benefit of Python is its readability and hackability. It's particularly handy, for example, that Cython can speed up my code without me having to make any weird changes to it.

Any other top tips, please feel free to let me know...

Tuesday 17th July 2012 | IT | Permalink

How I made a nice map handout from OpenStreetMap

OpenStreetMap is a nice community-edited map of everything - and you can grab their data at any time. So in theory it should be the ideal thing to choose when you want to make a little map for an open-source conference or something like that.

screenshot of map handout. click for PDF

For our event this year I made these nice map handouts. It took a while! Quite tricky for a first-timer. But they're nice pretty vector PDF maps, with my own custom fonts, colour choices etc.

For anyone who fancies having a go, here's what I did:

  1. I followed the TileMill "30 minute tutorial" to install and set up TileMill on my Ubuntu laptop. It takes longer than 30 minutes - it's still a little bit tricky and there's a bit of a wait while it downloads a lump of data too.
  2. I started a new map project based on the example. I wanted to tweak it a bit - they use a CSS-like stylesheet language ("MSS") to specify what maps are supposed to look like, and it's nice that you can edit the stylesheets and see the changes immediately. However, I found it tricky to work out what to edit to have the effect I wanted. Here's what I managed to do:

    • I changed the font choice to match the visual style of our website. That bit is easy - find where there are some fonts specified, and put your preferred font at the FRONT of all the lists.
    • I wanted to direct people to specific buildings, but the default style doesn't show building names. However, I noticed that it does show names for cemeteries... in labels.mss on line 306 there was

          #area_label[type='cemetery'][zoom>=10] {

      and I can add buildings to that:

          #area_label[type='cemetery'][zoom>=10],
          #area_label[type='building'][zoom>=10] {
    • The underground train line was being painted on top of the buildings, which looks confusing and silly. To fix this I had to rearrange the layers listed in the Layers panel - drag the "buildings" layer higher up the list, above the "roads" ones.

  3. When I'd got the map looking pretty good, I exported it as an SVG image.
  4. Then I quit TileMill and started up Inkscape, a really nice vector graphics program, and loaded the SVG that I saved in the previous step.
  5. I edited the image to highlight specific items:
    • The neatest way to do this is to select all and put it all into a layer, then select the items you want to highlight and move them to a new layer above. Once they're in a separate layer, it's easier to use Inkscape's selection tools to select all these items and perform tweaks like thickening the line-style or darkening the fill colour.
    • Selecting a "word" on the map is not so easy because each letter is a separate "object", and so is the shadow underneath. If there's a single word or street-name you're working on, it's handy to select all the letters and group them into a group (Ctrl+G), so you can treat them as a single unit.
    • You can also add extra annotations of your own, of course. I had to add tube-station icons manually, cos I couldn't find any way of getting TileMill to show those "point-of-interest"-type icons. I think there's supposed to be a way to do it, but I couldn't work it out.
  6. The next job is to clip the map image - the map includes various objects trailing off to different distances, it's not a neat rectangle. In Inkscape you can do a neat clipping like this:
    • Select all the map objects. If you've been doing as I described you'll need to use "Select all in all layers" (Ctrl+Shift+A).
    • Group them together (Ctrl+G).
    • Now use the rectangle tool to draw a rectangle which matches the clipping area you want to use.
    • Select the two items - the rectangle and the map-item-group - then right-click and choose "Set clip". Inkscape unites the two objects, using the rectangle to create a clipped version of the other.
  7. Now with your neatly-cropped rectangle map, you can draw things round the outside (e.g. put a title on).
  8. If you ever need to edit inside the map, Inkscape has an option for that - right-click and choose "Enter group" and you go inside the group, where you can edit things without disturbing the neat clipping etc.
  9. Once you're finished, you can export the final image as a PDF or suchlike.
Sunday 22nd April 2012 | IT | Permalink

Learning prolog, eight queens

I'm following the "7 languages in 7 weeks" book. This week, PROLOG! However, I'm failing on this task: solve the eight queens puzzle in prolog. Why does this fail:

    queens(List) :-
            List = [Q1, Q2, Q3, Q4, Q5, Q6, Q7, Q8],

    valid([Head|Tail]) :-

    validone(One,[Head|[]]) :-
            pairok(One, Head).
    validone(One,[Head|Tail]) :-
            pairok(One, Head),
            validone(One, Tail).

    pairok((X1, Y1), (X2, Y2)) :-
            Range = [1,2,3,4,5,6,7,8],
            member(X1, Range),
            member(Y1, Range),
            member(X2, Range),
            member(Y2, Range),
            (X1 =\= X2),
            (Y1 =\= Y2),
            (X1+Y1 =\= X2+Y2),
            (X1-Y1 =\= X2-Y2).

I load it in gprolog using


then I ask it to find me the eight unknowns (A through to H) by executing this:


What it should do (I think) is suggest a set of values that the unknowns can take. What it does instead is say:

    no

(which means it thinks there are no possible solutions.) Anyone spot my error?

Thursday 19th January 2012 | IT | Permalink

isobar python pattern library

One of the nicest things about the SuperCollider language is the Patterns library, which is a very elegant way of doing generative music and other stuff where you need to generate event-patterns.

Dan Jones made a kind of copy of the Patterns library but for Python, called "isobar", and I've been meaning to try it out. So here are some initial notes from me trying it for the first time - there may be more blog articles to come, this is just first impressions.

OK so here's one difference straight away: in SuperCollider a Pattern is not a thing that generates values, it's a thing that generates Streams, which then generate values. In isobar, it's not like that: you create a pattern such as a PSeq (e.g. one to yield a sequence of values 6, 8, 7, 9, ...) and immediately you can call .next on it to return the values. Fine, cutting out the middle-man, but I'm not sure what we're meant to do if we want to generate multiple similar streams of data all coming from the same "cookie cutter".

For example in SuperCollider:

      p = Pseq([4, 5, 6, 7]);
      q = p.asStream;
      r = p.asStream;
      r.next;  // outputs 4
      r.next;  // outputs 5
      q.next;  // outputs 4
      q.next;  // outputs 5

and in isobar it looks like we'd have to do:

      q = PSeq([4, 5, 6, 7])
      r = PSeq([4, 5, 6, 7])
      r.next()  # outputs 4
      r.next()  # outputs 5
      q.next()  # outputs 4
      q.next()  # outputs 5

Note how I have to instantiate two "parent" patterns. (I could have cached the list in a variable, of course.) With such a simple example it looks pointless - who cares that there are two. But I wonder if this will inhibit the pattern-composition fun in isobar that you get in SuperCollider by putting patterns in patterns in patterns... who can say. Will dabble.
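One way to get the "cookie cutter" behaviour back in plain Python - a sketch of the idea, not isobar's API - is to keep the pattern as a factory that stamps out independent streams, much like SC's .asStream:

```python
# Sketch only, not isobar's API: a pattern as a stream factory.
# (itertools.cycle repeats forever, unlike SC's Pseq default of one pass.)

import itertools

def pseq(values):
    """Return a stream factory; each call yields a fresh, independent stream."""
    def as_stream():
        return itertools.cycle(values)
    return as_stream

p = pseq([4, 5, 6, 7])
q = p()   # like p.asStream in SuperCollider
r = p()
# advancing r does not disturb q: each stream keeps its own position
```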

The other thing that was missing is Pbind, the bit of magic that constructs SuperCollider's "Event"s (similar to Python "dict"s).

As a quick test of whether I understood Dan's code I added a PDict class. It seems to work:

      from isobar import *
      p = PDict({'parp': PSeq([4,5,6,7]), 'prep': PSeq(['a','b'])})

      p.next()   # outputs {'prep': 'a', 'parp': 4}
      p.next()   # outputs {'prep': 'b', 'parp': 5}
      p.next()   # outputs {'prep': 'a', 'parp': 6}
      p.next()   # outputs {'prep': 'b', 'parp': 7}
      p.next()   # outputs {'prep': 'a', 'parp': 4}

This should make things go further - as in SuperCollider, you should be able to use this to construct sequences with various parameters (pitch, filter cutoff, duration) all changing together, according to whatever patterns you give them.
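Roughly, such a PDict just calls .next() on each of its sub-patterns and collects the results into a dict. Here's a sketch of the shape (simplified, not necessarily the exact code in my fork, with a minimal looping PSeq stand-in):

```python
class PSeq:
    """Minimal stand-in for isobar's PSeq: loops over its list forever."""
    def __init__(self, values):
        self.values = values
        self.pos = 0

    def next(self):
        value = self.values[self.pos % len(self.values)]
        self.pos += 1
        return value

class PDict:
    """Yield a dict on each .next(), advancing every sub-pattern in step."""
    def __init__(self, patterns):
        self.patterns = patterns

    def next(self):
        return dict((key, pat.next()) for key, pat in self.patterns.items())

p = PDict({'parp': PSeq([4, 5, 6, 7]), 'prep': PSeq(['a', 'b'])})
first = p.next()   # {'parp': 4, 'prep': 'a'}
```

Because each sub-pattern keeps its own position, the shorter 'prep' sequence wraps around while 'parp' is still working through its four values, giving the interleaved output shown above.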

There's loads of stuff not done; for example in SuperCollider there's Pkey() which lets you cross the beams - you can use the current value of 'prep' to decide the value of 'parp' by looking up its current value in the dict, whereas here I'm not sure if that's even going to be possible.

Anyway my fork of Dan's code, specifically the branch with PDict added, is at:


Sunday 8th January 2012 | IT | Permalink

How can we help music education with FOSS?

Recently I've been doing a lot of work with secondary schools, in music lessons. I've seen a lot of interesting use of music software, music hardware, and web-based things.

One of the things that surprised me was that sequencers like Cubase, Pro Tools, Logic, are pretty solidly integrated into the curriculum, but this does have its problems: I saw various bits of confusion about dongles not working, licenses expiring, etc etc. Also, since school budgets are limited, it does worry me that we should be building an important part of the school music curriculum on top of some quite expensive software. Digital music is an important part of the modern re-democratisation of music; if schools or government see it as more expensive than it needs to be, then it's at risk.

Where is the free/open-source software (FOSS)? Why isn't it the ideal solution here? A few reasons spring to mind, not in any order:

  • These commercial programs are often the ones used in industry, though not exclusively. Learning these commercial programs makes a good transferable skill from school to industry. (But a real transferable skill is not a skill tied to one piece of software; it's a skill based on the underlying principles. Unfortunately, many ICT skills taught in schools and encouraged by government are of the knowing-where-to-click-in-Powerpoint variety, and that approach is not always unjustified.)
  • FOSS can often be harder to install/maintain - teachers' time is extremely precious, and their time is demanded from many angles already. They do not want or need extra maintenance burden.
  • FOSS often has no marketing, and doesn't lobby government.
  • Schools mostly use Windows computers, and Windows is not the main focus of the FOSS community.

There's nothing wrong with commercial software, don't get me wrong - but if we're going to give all our secondary schools a good complement of software to teach proper digital music education, it seems like a risky strategy to be tied to some fairly expensive software. Ideally, there would be FOSS available alongside commercial software, and teachers would choose one or the other according to the local situation, and the educational outcomes could be just as good with either.

So let's stick with the example of software sequencers. What FOSS programs are there that might spring to the rescue?

  • Ardour? Well, it's pretty heavy-duty and complex, and although it runs on Mac and Linux it doesn't run on Windows. Doesn't seem ideal.
  • Rosegarden, MusE, Qtractor? Darn, Linux-only.
  • Jokosher? Now this is possibly ideal - it deliberately aims to have a non-intimidating interface, and it runs on Windows. Jokosher is a fairly new kid on the block, which is perhaps why it isn't yet used more widely, but has a lot of potential and has backing from the Ubuntu community. My main concern about Jokosher is that it has such a deliberately "anti-pro" feel that I'd be surprised if any music studio was using it. So teachers might not be keen on the idea of teaching software that isn't used in the professional context.

So I don't think there's anything that 100% perfectly fits the bill. FOSS being FOSS, it won't necessarily emerge unbidden; most school music teachers are not FOSS programmers and wouldn't know where to start. We either need a community of people willing to develop it for idealistic purposes (kinda what's happening in Jokosher), or a government-funded initiative (there have been many of these, for example to improve online learning systems) - ideally, both.

If the Jokosher community is interested in helping out with this, and making quite an important impact on digital music education, basically there are two angles: firstly make it easy to install and maintain on Windows (yes Windows - one battle at a time folks); and secondly, make sure it has all the main features of what others might call a "proper" DAW used in industry, so that someone can teach with Jokosher and be confident the learning is transferable.

Thursday 14th April 2011 | IT | Permalink

Demon broadband fixed, security fix for Thomson TG585 v7

A while ago I had big problems with Demon broadband because they "upgraded" the service and made it incompatible with my router. After a bit of back-and-forth Demon kindly replaced the router with a newer one, a "Thomson TG585 v7". It works fine.

While trying to get our radio station back online and streaming, I discovered something dodgy about the router setup. So if you happen to have one of these routers, run through the check described below to make sure your router's admin page isn't exposed to the world. I have to thank the very helpful people on the portforward.com forums who spotted the issue (thread here, with more details).

(1) Connect to the router's admin interface using telnet. On my Mac I do this by launching Terminal and typing telnet followed by the router's address (then giving the username and password when prompted).

(2) Type config dump (and press return) and a massive massive screed of text will appear, listing all the config settings for the device.

(3) In that text, look for a subsection labelled [ servmgr.ini ] (for me it was near the bottom). Check to see if these lines are in that bit:

    ifadd name=HTTP group=wan
    ifadd name=TELNET group=wan

The important thing here is "wan". "lan" is OK, it means you can have local access to the admin, but "wan" is dodgy because it means you're providing an opportunity for the world to access your router.

(4) If you do have those lines then you can fix the situation by running the following commands (the final one will reboot your router):

    service system ifdelete name HTTP group wan
    service system ifdelete name TELNET group wan
    system reboot

Voila. After rebooting you may wish to go through the steps again to check that the config settings have been changed.
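If you save the config dump to a text file, a few lines of Python can do the check for you (a hypothetical helper I'm sketching here, nothing to do with the router firmware itself):

```python
# Hypothetical helper: scan a saved Thomson "config dump" for admin
# services exposed on the WAN side.
def wan_exposed_services(dump_text):
    hits = []
    for line in dump_text.splitlines():
        line = line.strip()
        if line.startswith("ifadd") and "group=wan" in line:
            hits.append(line)
    return hits

example = """
[ servmgr.ini ]
ifadd name=HTTP group=lan
ifadd name=TELNET group=wan
"""
exposed = wan_exposed_services(example)  # ['ifadd name=TELNET group=wan']
```

Anything it reports with group=wan is a candidate for the ifdelete commands above.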

Sunday 30th August 2009 | IT | Permalink

How to put an imagemap in the header of your wordpress template

Edit: THIS INFORMATION IS OUT OF DATE. Wordpress has changed a lot since this blog article was published. I don't have any updated info for you, sorry.

Someone I know is setting up a wordpress website, and wanted to use an imagemap to put links in the header image. It's tricky, because the default template uses an image as background not foreground, and you can't use an imagemap with that.

Here's a quick hack for making it possible: Find the file /wp-content/themes/default/header.php and open it in a text editor. Near the end of that file you'll find the h1 element where the heading is printed. The block looks like this:

<div id="header" role="banner">
    <div id="headerimg">
        <h1><a href="<?php echo get_option('home'); ?>/"><?php bloginfo('name'); ?></a></h1>
        <div class="description"><?php bloginfo('description'); ?></div>

The quick hack is to change it like this:

<div id="header" role="banner" style="background-image: none !important;">
    <div id="headerimg">
        <h1 style="margin: 0px; padding: 0px;">
            <img src="<?php bloginfo('stylesheet_directory'); ?>/images/mynewimage.png" alt="<?php bloginfo('name'); ?>" id="_Image-Maps_3200907261653144"  usemap="#Image-Maps_3200907261653144" />
        <map id="_Image-Maps_3200907261653144" name="Image-Maps_3200907261653144">
            <area shape="rect" coords="32,20,234,188" href="http://www.parpface.com/" alt="go to parpface" title="go to parpface"    />

There are four changes:

  • I added some CSS in that first line to stop the background image from showing.
  • I added an img tag inside the h1 tag to show my image
  • I added some CSS to the h1 tag to stop it having a big empty border
  • I added the image map - i.e. the map tag and all that

That's pretty hacky but it seems to work.

Sunday 26th July 2009 | IT | Permalink

Demon broadband big problems

Our Demon broadband service has been good for years, but over the past few weeks it's been really bad. We have had the same wifi router for ages (a D-Link) and it's been reliable, but in recent weeks the ADSL service has cut out completely, three or four times per day.

The wifi is still working (I can communicate from one local computer to another) but the connection to the outside (in either direction) is totally gone, and the "ADSL" light on the router is flashing indicating a problem. I can "solve" the problem by rebooting the router, but rebooting the router three or four times a day is completely impractical and a right pain - and, of course, 100% impossible if I'm away from home and trying to log in remotely.

(I'm actually having problems posting this blog article, since my connection only lasts about two or three minutes at a time this evening. Have rebooted router five times while writing this post. Cor blimey this is a bad service....)

I rang Demon customer service and I could tell I wasn't the only one having the problem - they've recently added a message saying "if you're having broadband problems try rebooting your router before ringing". I see from a recent news article that it is definitely affecting lots of people.

The first thing Demon told me to do was move my router so it was plugged into the main landline socket (not an extension) and try changing the microfilter. So eventually I found the kit for that and did it, but the problems were exactly the same.

Then I rang back, spent another 45 minutes waiting for an answer from the tech support, and after going through the same questions, the only thing they could tell me was "buy a new router". Hmm, so if the problem is that my router is crap, how come everyone's having the exact same problem as me, all at the same time? Everyone's router has broken at once? Not particularly likely. Apparently Demon have done some kind of "upgrade" to their broadband service, the details of which I don't know, but it looks like something might have gone badly wrong with that because lots of people seem to be having the same problem as me - a service that worked perfectly well and pretty much rock-solid for five years is now "upgraded" to a totally awful state.

IF YOU HAVE PROBLEMS: First reboot the router - turn it off, wait 20 seconds, turn it on. If you have the same as me (rebooting fixes it but not for long), try making sure your router is plugged into the primary landline wall socket (with no extension cables), and try changing the microfilter. Tech support will refuse to help you until you do those steps. Then if you ring tech support, they might say that your router must be faulty and try a different one. I certainly don't know of a wifi router I can borrow. But if everyone is having the same problem then it isn't our routers that are at fault but Demon's service.

Tuesday 16th June 2009 | IT | Permalink

Real-time audio software and multicore processing

We've been thinking about how best to incorporate multicore processing into SuperCollider's audio engine. A bit of background: the trend in computing is that although computers used to have one single CPU to do all the thinking, the latest computers tend to have multiple CPUs (each with access to shared memory). Furthermore, it's now even possible to make use of the number-crunching power lying unused on many graphics chips - although that doesn't use the same shared memory so it's a slightly different situation.

This all means that most software, which runs on an "old-fashioned" single-core model, might not be using the full power available. There are libraries available to help programmers easily move into this multicore world, such as the well-established and very easy-to-use OpenMP.

How does OpenMP work? It's very much like a traditional threading model, where if you want multiple things to happen at once, you launch as many separate "threads" as you need. OpenMP simplifies this by automatically creating the threads as needed (e.g. it can automatically parallelise the separate iterations of a for-loop), and also by automatically distributing the threads over the CPUs. It's often called a "fork-and-join" model: when the program reaches a block of code which could be parallelised, it divides itself up into many parallel threads - and then when the parallel bit is over, the program logic all joins back to the single thread that started it all.
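OpenMP itself lives in C/C++ (and Fortran), but the fork-and-join shape is easy to sketch in Python - purely an illustration of the pattern, with made-up function names:

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative fork-and-join: split one block of samples across worker
# threads ("fork"), then gather the results back in order ("join") before
# the single main thread carries on.
def process_block(samples, gain, pool, nworkers=2):
    chunk = (len(samples) + nworkers - 1) // nworkers
    def work(lo):
        return [s * gain for s in samples[lo:lo + chunk]]
    parts = pool.map(work, range(0, len(samples), chunk))  # fork
    return [s for part in parts for s in part]             # join

pool = ThreadPoolExecutor(max_workers=2)
out = process_block([1.0, 2.0, 3.0, 4.0], 0.5, pool)  # [0.5, 1.0, 1.5, 2.0]
```

The point is that the fork (dispatching tasks) and join (waiting and reassembling) happen around every block of work - do that around a hundred times a second and the overhead adds up.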

With real-time audio processing there's a complication. We want the software to take some chunks of input audio (if used), do some processing, and create some chunks of output audio, all within a very tight timeframe. This has a few implications:

  • Performing a fork-and-join procedure at every audio "block", typically around a hundred times a second, is expensive in computer effort. I know because I tried it, and a highly efficient sine-wave generator suddenly became extremely heavy...
  • Multicore programming libraries often don't guarantee how fast they will do their job. Plus, there's an overhead involved in dividing tasks up. Plus, there may be added overhead because of the very nature of parallel processing (e.g. transferring data from main memory to GPU memory). All of which means that certain interesting-looking APIs (e.g. GPGPU systems such as Nvidia's CUDA; Apple's Grand Central) are unlikely to be particularly helpful for realtime audio.
  • More prosaically, my experiments find that CoreAudio (the Mac audio infrastructure) and OpenMP don't play well together, which is a shame - it makes life harder for anyone trying to parallelise audio software on Mac. Luckily I didn't have this problem on Linux.

So the question remains. Do we want to make our realtime audio apps multicore, and if so, how? You don't always improve things by spreading them over more cores, because of the inherent overheads I mentioned. However, on an 8-core system it certainly seems a shame to be limited to a maximum of 1/8 of the computer's thinking power.

SuperCollider has a nice aspect which helps here. The audio engine ("scsynth") is a separate application, and you can have multiple instances. So you could quite easily launch multiple audio engines, and have each one of them handle different parts of your audio scene. Great - nice and easy - although with some limitations. The different audio engine instances wouldn't be able to share memory, so sharing data between them is a bit of a pain. Also, it seems that you can't really guarantee which CPU core is used to run which process (the "affinity") - typically they would tend to be distributed over the cores, but it'd be nicer if we could guarantee that.

So, an approach to within-process parallelisation? Maybe we need to launch a thread for each core, and have these threads do a kind of busy-waiting until the audio callback wants some work to be done. Busy-waiting would be hard to get right though, compromising between responsiveness and CPU cycles wasted on the active waiting.

Wednesday 10th June 2009 | IT | Permalink

Installing SuperCollider on Ubuntu Studio 9.04

I just installed Ubuntu Studio 9.04 as dual-boot on my Mac and it's fantastic. Pretty much everything works out of the box, soundcard support, low-latency realtime audio, etc. (The only problem I had was a mild annoyance with my wacom tablet, bug 375329.)

Since I want to use jack as my audio subsystem, I launched "JACK Control" (aka "qjackctl") from the applications menu and pressed its "Start" button. Then I was ready to make sound.

Installing SuperCollider was super-easy too: I installed it from the packages at http://launchpad.net/~supercollider/+archive/ppa and scvim worked straight from its little menu icon.

Darn it, this is easy

(A rough indication of performance: on this dual-core Intel Mac, 2GHz, I can generate 1000 sinewaves in SuperCollider (random freq between 100 and 1000 Hz), at 44.1kHz and with jack's buffer size left at its default of 1024, with a comfortable DSP load of 76% on one of the cores. There are no audio dropouts even if I do some big compiling tasks etc - they naturally get assigned to the other core, of course.)

Wednesday 13th May 2009 | IT | Permalink

Open-source 3D plotting on Mac

I've been working on Self-Organising Maps for timbre analysis, and I needed a good way to make an interactive 3D plot of the SOMs so that I could visually verify what was going on. ... Actually it took me a while trying out a few options, to get something decent in place. That's why I'm documenting it here.

  • Of course Matlab has some good scientific 3D plotting, and I did a couple of preliminary visualisations using that. However, it's not open-source, it doesn't dovetail particularly nicely with SuperCollider (which is where my data is coming from), and it is a bit of a behemoth and I don't want it installed on my tiny Linux Eee PC.
  • GNU Octave is pretty much an open-source clone of Matlab (it aims for language compatibility). It provides some basic matlabby plotting but lacks a lot of the advanced control, and also some features such as patch() are missing, which I needed.
  • My number 1 hunch was that Python, with its popular scientific modules, would have some powerful 3D plotting right there. However, the standard matlab-like plotting module matplotlib doesn't have any 3D support. There's also something called mayavi which I think does OpenGL-based fancy 3D graphics, but I couldn't get it installed on my Mac so I couldn't test it out. I was really surprised not to be able to get very far with python and scientific 3D.
  • Someone reminded me about scilab - silly of me to forget this one; it's a long-standing open-source science platform with visualisation tools, I bet it could have helped.

And here's the solution I finally settled on:

  • gnuplot, plus the GNUPlot quark to be able to use it directly within SuperCollider. Gnuplot has a nice diversity of plotting styles available from its scripting language, and in the end it was surprisingly simple to script it to build what I wanted: 3D surface plots with little lines sticking out, representing the mapping from datapoints onto the SOM. It took me a while to understand that it's not oriented towards inline data: it gets much easier if you drop your data into a text file (CSV or suchlike) and work from that.
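That file-based workflow is only a couple of files in practice. Here's a sketch of it in Python (the original was driven from SuperCollider via the GNUPlot quark; the exact gnuplot commands here are illustrative assumptions):

```python
# Sketch of the file-based gnuplot workflow: dump the data to a plain text
# file, write a small gnuplot script referring to it, and (optionally) hand
# the script to gnuplot, e.g. via subprocess.call(["gnuplot", "-p", "som.gp"]).
def write_gnuplot_surface(points, datafile="som.dat", scriptfile="som.gp"):
    """points is a list of (x, y, z) tuples; returns the script text."""
    with open(datafile, "w") as f:
        for x, y, z in points:
            f.write("%g %g %g\n" % (x, y, z))
    script = "set ticslevel 0\nsplot '%s' with lines\n" % datafile
    with open(scriptfile, "w") as f:
        f.write(script)
    return script

script = write_gnuplot_surface([(0, 0, 1.0), (0, 1, 2.0), (1, 0, 1.5)])
```

Once the data lives in a text file, it's easy to iterate on the plotting style without touching the data-generating code.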

Here's an example of some test data which I piped straight from SuperCollider into gnuplot:

[Graph: the resulting 3D plot of the test data - it works!]

Tuesday 12th May 2009 | IT | Permalink

Fridge vs laptop update: it's the amp

For those of you freaked out by yesterday's thing where my fridge could turn off my laptop's sound, here's an update. It seems it's somehow due to my amp.

With the exact same setup, but listening on headphones rather than the amp, the fridge has no more power over my computer. My amp is old (1970s? no idea) and so I'm not surprised, but it's still a mystery how the chain of events occurs. Somehow, the glitch needs to propagate through the chain, fridge->amp->audiointerface->laptop, in such a way that the laptop gets a bad file descriptor error which makes the audio fall over.

Here's the full error message, in case it matters to anyone:

ALSA: prepare error for playback on "hw:1,1" (File descriptor in bad state)
DRIVER NT: could not run driver cycle
jack caught main signal 12
no message buffer overruns
Tuesday 25th November 2008 | IT | Permalink

My fridge breaks my computer's audio...

Today I've been trying to work out why the sound on my Eee sometimes stops working. I've narrowed it down to one slightly surprising cause: the fridge! I can leave the audio (SuperCollider via jackd) running absolutely fine for three-quarters of an hour, fine... and then the fridge's thermostat kicks in, turning the fridge on - and at the exact same moment the audio stops!

I know that the fridge emits some kind of radio interference when the thermostat kicks in/out, since it always disrupts the Freeview TV signal for a fraction of a second. So how would that affect the sound on my Eee PC?

  1. First suspect: wifi activity. Maybe some kind of weird wifi reaction is triggered by the fridge's outburst, and the computer's response to that trips up the audio. But I can turn off the wifi and it still happens. (That doesn't completely rule it out - maybe the wireless card still does something weird, even if the system isn't trying to maintain any wireless connections.)
  2. Second suspect: a blip in the AC power supply. It's quite likely that the fridge kicking in/out warps the mains electricity in our little place, and maybe the computer reacts badly to that. No, doesn't seem so, since it happens even if the Eee is running on battery power.
  3. Third suspect: is it possible that the fridge's radio outburst does something to the USB connection between the Eee and the audio interface? I'm not sure exactly - it seems less likely than the other two candidates, to me. I haven't yet tried to make the glitch occur while using the system's built-in audio rather than the audio interface.

Well it's a strange case and I haven't solved it yet. Turning off the fridge while using my computer would be a bit of a hassle...

Sunday 23rd November 2008 | IT | Permalink

Efficiency geek 2: copying data in C/C++, optimisation

Having benchmarked different ways to zero an array, there's also the question of copying lumps of floating-point data from one place to another, which can be done in a similar range of different ways. Here I've benchmarked in the same way as in my first note, using the analogous approach in each case (except for method 9, which doesn't have an analogue here):

Method              Mac PPC    Linux Intel
1, sc3              21 %       69 %
2, for, array       40 %       75 %
3, for, post        38 %       51 %
4, for, pre         38 %       75 %
5, do-while         39 %       75 %
6, duff's, post     40 %       56 %
7, duff's, pre      40 %       75 %
8, memcpy           13 %       39 %
10, unrolled-for    39 %       47 %

(This shows results for copying aligned blocks of data. I also did a test using unaligned blocks; there were no differences worth reporting.)

For PPC Mac it's a very consistent story: all of the loopy methods basically take exactly the same amount of effort. JMC's crafty use of doubles is a clever optimisation here, but (as in the zeroing test) there's a definite outright winner, and it's simpler: memcpy.

For Intel Linux there's some variation in the results. For some reason postincremented pointers are better than their alternatives, and the unrolling in method 10 helps noticeably. But again, memcpy is the outright winner.

So it looks like the recommendation is a direct parallel of the first test: memcpy() please, in this kind of circumstance. YMMV.

Sunday 19th October 2008 | IT | Permalink

Efficiency geek: zeroing data in C/C++, optimisation

On the SuperCollider developer list we were discussing what was the most efficient way to set a block of floating-point data to zero. So I've run a test... here's the results.

I wrote a plugin which repeatedly clears a block of 512 floating-point values by calling SuperCollider's Clear() macro, and then ran 10 of these plugins in a synth. By changing what happens inside the macro I could test different approaches, and see the CPU load in each case.

This shows the different ways that I used to clear the data:

// (1) SuperCollider's old code, hand-optimised for powerpc.
//   requires 8-byte alignment, annoyingly.
if ((numSamples & 1) == 0) {
    double *outd = (double*)out - ZOFF;
    LOOP(numSamples >> 1, ZXP(outd) = 0.; );
} else {
    out -= ZOFF;
    LOOP(numSamples, ZXP(out) = 0.f; );
}

// (2) for-loop method using array indexing:
int i;
for(i = 0; i < numSamples; ++i)
    out[i] = 0.f;

// (3) for-loop method using pointer postincrement:
int i;
float *loc = out;
for(i = 0; i < numSamples; ++i)
    *(loc++) = 0.f;

// (4) for-loop method using pointer preincrement:
int i;
float *loc = out - 1;
for(i = 0; i < numSamples; ++i)
    *(++loc) = 0.f;

// (5) a do-while loop:
int i = numSamples;
do {
    out[--i] = 0.f;
} while (i != 0);

// (6) a duff's device using pointer postincrement:
float *loc = out;
DUFF_DEVICE_8(numSamples, *(loc++)=0.f;);

// (7) a duff's device using pointer preincrement:
float *loc = out - 1;
DUFF_DEVICE_8(numSamples, *(++loc)=0.f;);

// (8) memset'ing the data to a char value of zero:
memset(out, 0, numSamples * sizeof(float));

// (9) bzero, which sets character data to a value of zero:
bzero(out, numSamples * sizeof(float));

// (10) for-loop method using pointers and manual unrolling:
int i;
float *loc = out;
for(i = numSamples >> 2; i != 0; --i){ // Unroll into blocks of four
    *(loc++) = 0.f;
    *(loc++) = 0.f;
    *(loc++) = 0.f;
    *(loc++) = 0.f;
}
// These two "if"s handle the remainder, if not divisible exactly by four
if(numSamples & 1){
    *(loc++) = 0.f;
}
if(numSamples & 2){
    *(loc++) = 0.f;
    *(loc++) = 0.f;
}

These methods come in two groups: the semantically-correct ones (the for-loops and do-loops, plus the Duff's device) which treat a float as a float; and the hacky ones (memset, bzero, the double-trick) which make use of our knowledge that the binary representation will turn out to be the same. The C/C++ standards don't guarantee that the binary representation will be the same, so some would say these are dangerous approaches. However, the IEEE floating-point standards specify that all-zero-bits equates to zero, so on any machine using IEEE floating-point this is definitely OK.
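That IEEE fact is easy to verify directly - for example, in Python:

```python
import struct

# IEEE 754 single precision: (positive) zero is represented as all-zero
# bits, which is why memset/bzero-ing a float buffer to zero bytes is safe
# on IEEE machines.
packed = struct.pack('<f', 0.0)          # b'\x00\x00\x00\x00'
roundtrip = struct.unpack('<f', b'\x00' * 4)[0]   # 0.0
```

Note that the converse trick (memset to a nonzero byte value) does not give you a useful float, which is why this only works for zeroing.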

And these are the two systems I tested on:

  • Mac OSX 10.4.11, 1 GHz PPC G4
  • Asus Eee, Xandros Linux, 900 MHz Intel clocked at 630 MHz

I compiled the plugin using ordinary (non-debug) compiler settings from the SuperCollider build scripts (on Mac this uses -Os, on Linux -O3). Results:

Method              Mac PPC    Linux Intel
1, sc3              67 %       58 %
2, for, array       31 %       75 %
3, for, post        31 %       51 %
4, for, pre         73 %       75 %
5, do-while         32 %       74 %
6, duff's, post     35 %       56 %
7, duff's, pre      31 %       48 %
8, memset           11 %       44 %
9, bzero            11 %       44 %
10, unrolled for    32 %       51 %

Some surprises here. One is how much more efficient memset/bzero is than other standard methods. (The optimising compiler converts memset to bzero on my machines, which is why their results are the same.)

Also the strong inefficiency of the preincrementing for-loop method. It's possible that the compiler automatically converts some types of loop into a Duff's device, which would explain why there's a strange pattern of fastness and slowness in the standard loops with the Duff's device as a lower limit on their efficiency. The manual unrolling (method 10) is no help either!

SC's method seems poor on both systems, despite apparently being designed for ppc. (The original author has indeed said that the code in that header-file is getting out-of-date...)

I read that bzero is nonstandard, so one good and easy way to proceed would be to use memset() - but keep in mind that you might not be able to rely on it on systems with non-IEEE floating point.

These graphs show the results for different data sizes, within the range we most care about for SC:

[Graphs for three systems: Mac Intel Core Duo; Mac PPC G4; Intel Linux (Eee)]

Note that each graph is produced with a different number of parallel UGens - I had to push it up to 500 to get significant CPU usage on the Core Duo! The clear tendency over all three graphs is for memset to have the edge, although there's some interesting alternation in the Core Duo graph.

Reminder of the two important rules: (1) take all benchmarks with a big pinch of salt; (2) don't optimise until you can prove that you should do. These benchmarks are specific to UGen plugins running in SuperCollider, YMMV.

Sunday 19th October 2008 | IT | Permalink

New MCLD homepage

I've redesigned my homepage - it is now 100 times cooler. Let me know what you think...


Thanks to Jan Trutzschler von Falkenstein, Rain Rabbit, Samuel Craven and Gregorio Karman for the wicked photos.

Sunday 4th May 2008 | IT | Permalink

A one-line PDF merge command

There's a "pdftk" thing that provides a command-line "pdf merge" tool, but it won't install on my Mac for boring reasons. I found this tip which gives a commandline way to do it without installing anything:

gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=merged.pdf source1.pdf source2.pdf source3.pdf etc.pdf


Friday 2nd May 2008 | IT | Permalink

Asus Eee beer repair

Last weekend I had a mini-calamity when someone poured a bottle of ale all over my tiny little Eee laptop (while it was still plugged in, it was having a bit of a rest from music-making). Dried it off etc but found out the next day that three of the keys were half-stuck down, which made it really difficult to use. They'd presumably got some sticky dried-out beer deposit in the keyboard contacts.

After a bit of prompting I took the plunge and discovered that you can pop the keyboard off by flipping three little tabs (there's a video here). Then, online advice told me to give the keyboard a soak in either: (1) ethanol (70% or more), (2) isopropanol (70% or more), or (3) deionised water. (Never use meths or acetone. If water, it has to be deionised so that it doesn't leave deposits which may harm the electronics.)

I couldn't find any of the chemicals so I had to use de-ionised water in the end, which I got from Robert Dyas. I soaked the affected bit of the keyboard (the top) for about 8 hours, then dried it off thoroughly by pointing a cool hairdryer at it for a couple of hours, then left it out overnight to dry more.

And hey presto it seems to have worked. I'm writing this now from my revived Eee...

Saturday 15th March 2008 | IT | Permalink

Script to archive last.fm feeds

Last.fm provides data feeds of what you've been listening to recently. But they don't give you a feed for the full list of things you've ever listened to. So I wrote a shell script (a bash script) which should run on Mac/Linux, to archive the XML feed data for me:

To archive the data regularly, you need to set this up to run often, e.g. using cron.
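The script itself isn't reproduced here, but the idea is simple; a rough Python equivalent (the feed URL pattern and filename scheme are my assumptions, not the original script's) could be:

```python
import time
import urllib.request

# Hypothetical sketch of the idea: each run fetches the recent-tracks XML
# feed and saves it under a timestamped filename, so repeated cron runs
# gradually build up a complete listening archive.
FEED_URL = "http://ws.audioscrobbler.com/1.0/user/%s/recenttracks.xml"

def archive_filename(user, when=None):
    """Build a per-fetch filename, e.g. 'lastfm-someuser-20071127-0900.xml'."""
    when = when or time.gmtime()
    return "lastfm-%s-%s.xml" % (user, time.strftime("%Y%m%d-%H%M", when))

def archive_feed(user):
    """Fetch the feed and write it to a timestamped file; returns the path."""
    data = urllib.request.urlopen(FEED_URL % user).read()
    path = archive_filename(user)
    with open(path, "wb") as f:
        f.write(data)
    return path
```

A crontab line running it every hour or so would then accumulate the history that last.fm's own feeds don't give you.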

Tuesday 27th November 2007 | IT | Permalink

How to disable Adobe Reader Safari plugin

On Mac OSX, the default thing that happens when you click a PDF link in Safari is that it is displayed, quickly and efficiently, with the "Preview" tool.

Unfortunately, if you download and install Adobe's own "Reader" software, it changes Safari's behaviour to use the crap slow-loading slow-displaying Adobe tool instead. Granted, it can do some other things like PDF forms, but for most PDFs it just gets in the way.

I couldn't find out from the web how to remove this behaviour without uninstalling the Reader (which you need to have around sometimes), but I worked it out, so here's how - just delete this file from your hard drive: /Library/Internet Plug-Ins/AdobePDFViewer.plugin

Thursday 7th December 2006 | IT | Permalink

What to install on a Mac

I'm thinking about practicalities of setting myself up in my new PhD position, and one of the things to think about (since I'll be doing a lot of computer work) is how to set up the Mac I'll be using. So, largely for my own reference but for anyone else, here's my list of the really really useful software to install on a Mac. Most of these items are free so the whole lot costs very little.

Things to install on a Mac

The first lot allow you to install a whole range of excellent Unix software, so I install them before (almost) anything else. But skip over this bit if you're not into Unix software:

  • Apple's Developer Tools (comes on a separate CD along with your OSX installer discs)
  • X11 (I think it also comes on the same disc as the developer tools.) X11 is a way for open-source software to create graphical interfaces (windows, menus, etc)
  • Fink (depends on Dev Tools)
  • Darwinports (depends on Dev Tools)

Now the really essential software:

  • Audacity - the best audio editor.
  • BBEdit - a high-performance HTML and text editor for the Macintosh. Excellent for programming but also just for opening text files etc. (NB not free)
  • iDefrag - defragments your hard disk, improving your computer's performance. (NB not free)
  • JDiskReport - excellent graphical way to see what's on your hard disk, what's taking up all the space, etc.
  • Firefox - I like Safari a lot, but it's often helpful to have Firefox too.
  • Firefox Web Developer toolbar
  • Fetch - the nicest FTP software I ever did see. (NB not free)
  • Thunderbird - I don't like Apple's Mail software. Thunderbird does newsfeeds as well as email, and gives you lots and lots of control.

Then there's good software but not essential:

  • Chicken of the VNC - for accessing remote desktops
  • Gimp for image editing. Not Photoshop but near enough, and free
  • WriteRoom - minimalist writing environment for people who need to concentrate
  • Google Earth - sometimes you just need to look at the planet
  • Jreepad - store all the notes you ever think of in a sprawling tree-like structure
  • MenuCalendarClock for iCal - really valuable little tool which gives instant access to your calendar. (NB not free)
  • Growl - allows applications to send you unobtrusive notifications. You can get iTunes to flash up what tracks it's playing, among many many other applications.
  • Coriolis CDMaker - Comes free with iDefrag, and lets you create bootable CDs for rescuing your computer when (in two or three years' time) something goes completely wrong...

Audio things

These are essential for me. If you're into music/audio they may be essential for you too:

  • SuperCollider - programming language and environment for sound.
  • Audio Hijack Pro - allows you to grab the audio from any application and record it to disk. Useful in so many ways. (NB not free)
  • Tartini - a beautifully useable tool "designed as a practical analysis tool for singers and instrumentalists", giving highly detailed pitch contours.
  • VLC - media player.
  • MPlayer - media player.
  • SPEAR - spectral analysis of sounds.

The geeky section

Some useful unix tools I install from fink:

  • svn-client-ssl (this is needed to install SuperCollider from current source)

Useful command-line things I install from darwinports:

  • wireshark (neat tool for sniffing on network connections and seeing what's going on)
  • lame (library for creating MP3s)
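From the command line, installing from the two package systems looks roughly like this (assuming Fink and DarwinPorts are already set up as above - Fink keeps its tree under /sw, DarwinPorts under /opt/local):

```
fink install svn-client-ssl
sudo port install wireshark lame
```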

Wednesday 20th September 2006 | IT | Permalink

Internet Explorer 7

I searched the web for IE7 and found this helpful website about Internet Explorer 7. Nice...

Monday 11th September 2006 | IT | Permalink

Making the Internet Archive useful

The Internet Archive is a wonderful project, but I'm a little worried that they aren't learning the lesson of successes like Flickr.

The Internet Archive's mission is to preserve as much as possible of the internet (including images, movies, music) in a reliable long-term storage system. It's an excellent plan, run with a librarian's approach which is missing from many internet startups. These startups and their customers are generating lots of fantastic information, but I don't think there's any attempt to preserve this data for future generations.

Anyway. One lesson from Flickr is that their success is largely due to the ability for people to embed Flickr photo collections into their own websites, blogs, etc. They have features such as being able to access your Flickr collection as an RSS feed. The Internet Archive has nothing like this. There is the Ourmedia project, which aims to put a friendly and social interface onto the Internet Archive, but there's not much for the existing range of excellent content stored at archive.org.

Here's something which I hope might help: I've created a tool which transforms an archive.org search into an RSS feed. Feedback welcome. It's only a start, but I hope that tools like this could make it possible for archive.org's content to spread outwards throughout the internet. Let's turn the biggest online public archive into a lending library!

Saturday 14th January 2006 | IT | Permalink

Networking Macs over Firewire

Phew! I've just managed to work out how to get my PowerBook on the internet, by connecting over FireWire through my iMac. The iMac's ethernet port is taken (by my girlfriend's PC), and it has no wireless capability, so I needed to work out a simple way to get online. Mac OSX computers can network over a FireWire cable, so I connected a cable between the two, but I was encountering major headaches trying to get on the web - or even to do anything as simple as connecting from one computer to another using SSH or ping.

Here's what you (probably) need to do: on the computer with the internet connection, make sure internet sharing is enabled for "Built-in FireWire" (look in System Preferences > Sharing > Internet). Then go to the Network Preferences, set your FireWire network connection to use "DHCP with manual address", and type in a normal "local" IP address such as "". On the other computer, go to the Network Preferences, make sure "Built-in FireWire" is enabled, and make sure the connection uses "DHCP".

I couldn't seem to get the internet sharing to work without running a proxy server, so I also run tinyproxy on my iMac, and in the PowerBook's Network Preferences I make sure it's set up to use the proxy.
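For reference, a minimal tinyproxy configuration for this kind of sharing might look like the following - the port number and address range here are placeholders, not my actual setup:

```
# Hypothetical tinyproxy.conf fragment (values are placeholders):
Port 8888
# Only allow connections from the local FireWire network:
Allow 192.168.2.0/24
```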

From what I've read, I'm inferring that in order for the two Macs to talk to each other (using Bonjour/Rendezvous), they both need to be using DHCP. This may or may not be true, but it's my conclusion for today. On top of which, I need a fixed IP address for my main computer since otherwise the other computer won't be able to find its proxy host.

Saturday 10th September 2005 | IT | Permalink
Creative Commons License
Dan's blog articles may be re-used under the Creative Commons Attribution-Noncommercial-Share Alike 2.5 License. Click the link to see what that means...