MajorMiner music search
We’ve started using the data that we collected through the MajorMiner game. We’re using it in two ways: making it searchable directly, and training autotaggers with it. The human search finds all of the clips that have had a particular tag applied to them by at least two people, sorted by the number of times it’s been applied. You can type a search directly into the search box, or browse through the top few tags. People are pretty good at finding things in music, as it turns out: check out british, u2, tambourine, and scratch. This search also takes advantage of the newly introduced canonicalization of tags, so that funk matches funky. But there are always ambiguity issues, e.g. club as a lyric vs. club as a genre.
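For the curious, here is a minimal sketch of how a search index like this could be built. It is not our actual code: the (clip, user, tag) tuple format, the function name, and the hand-written canonicalization map are all assumptions made for illustration, and it assumes each player counts at most once per clip and tag.

```python
from collections import defaultdict

def build_tag_index(annotations, min_users=2, canonical=None):
    """Map each tag to the clips it describes, most-agreed-upon first.

    annotations: iterable of (clip_id, user_id, tag) tuples (hypothetical format).
    canonical:   optional dict mapping tag variants to a canonical form,
                 e.g. {"funky": "funk"} (hand-written here, purely illustrative).
    """
    canonical = canonical or {}
    users = defaultdict(set)  # (tag, clip_id) -> set of users who applied the tag
    for clip_id, user_id, tag in annotations:
        tag = canonical.get(tag, tag)
        users[(tag, clip_id)].add(user_id)

    index = defaultdict(list)
    for (tag, clip_id), who in users.items():
        if len(who) >= min_users:  # keep only tags confirmed by enough people
            index[tag].append((clip_id, len(who)))

    # Sort by number of applications, descending; break ties by clip id.
    for tag in index:
        index[tag].sort(key=lambda pair: (-pair[1], pair[0]))
    return dict(index)
```

Looking up index["funk"] would then return the clips tagged funk or funky by at least two players, with the most frequently tagged clips first.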
The machine search is a little more involved. We took all of the tags that had been applied to enough clips (35) and used them to train classifiers. Actually, we only used clips from half of the artists in our collection to train the classifiers, then we ranked all of the clips from the rest of the artists by each classifier’s output. This means we can look at all of those clips sorted by how much they appeal to the rap classifier, the saxophone classifier, the house classifier, and so on. I like how the guitar classifier catches Outkast’s acoustic guitar (!), but also the Jesus and Mary Chain’s fuzzed-out guitar. For those of you interested in the details, we have a couple of papers that we’ve submitted recently describing them, but the gist is that we’re using the features from last year’s MIREX and the usual SVM classifier.
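The artist-level split and the ranking step can be sketched roughly as follows. This is only an illustration under assumptions: the clip dictionaries, the function names, and the random choice of which artists land in the training half are hypothetical, and the scoring function stands in for whatever a trained classifier outputs.

```python
import random

def split_by_artist(clips, seed=0):
    """Split clips so that no artist appears in both halves.

    clips: list of dicts with "id" and "artist" keys (hypothetical format).
    """
    artists = sorted({c["artist"] for c in clips})
    rng = random.Random(seed)
    rng.shuffle(artists)
    train_artists = set(artists[: len(artists) // 2])  # half of the artists
    train = [c for c in clips if c["artist"] in train_artists]
    test = [c for c in clips if c["artist"] not in train_artists]
    return train, test

def rank_clips(test_clips, score):
    """Sort held-out clips by a classifier's output, highest first.

    score: any function mapping a clip to a number, standing in for the
    output of a trained per-tag classifier.
    """
    return sorted(test_clips, key=score, reverse=True)
```

Splitting by artist rather than by clip keeps a classifier from scoring well just because it has already heard other clips by the same artist.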
Some thought went into the ranking of the tags on the main search page as well. Since we know the answers for some of the clips in the test set, we ranked the tags by how well their classifiers were able to learn them. Actually, we used a Bayesian estimate of the classification accuracy from the beta-binomial model to do the ranking more intelligently. The basic idea is that test accuracy is measured more accurately for tags with a lot of test examples, and less accurately for tags with few test examples. The measured accuracies of the tags are then shrunk towards the overall mean accuracy, and the less well the model thinks a tag’s accuracy is estimated, the more it is shrunk. So even though club has a better raw accuracy than rap, it was tested on far fewer examples, so it ends up below rap in the final ranking: its raw accuracy is more likely a random fluctuation than a meaningful result.
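To make the shrinkage idea concrete, here is a small sketch. It assumes a Beta prior centered on the pooled accuracy with a hand-picked strength (prior_strength, a hypothetical parameter measured in pseudo-examples), and takes the posterior mean as the shrunk estimate. How the prior is actually fit in our ranking is not shown here, and the counts below are invented for illustration, not our real results.

```python
def shrunk_accuracy(correct, total, prior_mean, prior_strength):
    """Posterior mean accuracy under a Beta prior with binomial data.

    The prior is Beta(a, b) with a = prior_strength * prior_mean and
    b = prior_strength * (1 - prior_mean), so the posterior mean is
    (correct + a) / (total + a + b).
    """
    return (correct + prior_strength * prior_mean) / (total + prior_strength)

def rank_tags(results, prior_strength=50.0):
    """Rank tags by shrunk accuracy, best first.

    results: dict mapping tag -> (correct, total) on the test clips.
    prior_strength: assumed value; larger means more shrinkage.
    """
    pooled_correct = sum(c for c, _ in results.values())
    pooled_total = sum(n for _, n in results.values())
    prior_mean = pooled_correct / pooled_total  # overall mean accuracy
    scored = [
        (tag, shrunk_accuracy(c, n, prior_mean, prior_strength))
        for tag, (c, n) in results.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Invented counts: club has the higher raw accuracy but far fewer test clips.
example = {"club": (18, 20), "rap": (170, 200), "weird": (300, 500)}
ranking = rank_tags(example)
```

With these made-up numbers, club’s raw accuracy (18 of 20) beats rap’s (170 of 200), but club is pulled much further towards the pooled mean, so rap comes out ahead in the ranking.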
So go check out some of the creative ways our players have found to describe music, and describe some music yourself!
April 28th, 2008 at 2:22 pm
Michael: this is really cool - some of the autotags are doing an excellent job. I was surprised at how well you did on a tag like ‘repetitive’ - are you using some longer term features to capture this?
April 28th, 2008 at 2:33 pm
Cool results. You should also check out the latest idea of Luis von Ahn: Two users each listen to a song. They may be listening to the same song, or to different ones. They can also communicate through a text interface, describing what they are listening to. Based on the communication, they have to decide whether they are listening to the same song or not, and they get points if they guess correctly. Communication that leads to successful inference is deemed a good description of the song. I am not sure if the game is publicly available or not.
April 28th, 2008 at 2:34 pm
I’m using both timbral and rhythmic features. The timbral features are the MFCC mean and covariance. The temporal features are the sort of modulation spectrum features we used at MIREX last year. Basically, it’s the DCT (along time) of the low frequency modulation spectra. They do a good job of capturing the sorts of things you’re talking about: repetition, rhythms, and tempos. With ‘repetitive’, I think we get good precision, but it’s less clear how good the recall is.