Beat detection from MIDI data

19.01.15

Lately I've been working on giving Whole-Play beat detection capabilities. There's heaps of information online about beat detection algorithms, but it generally deals with audio signals, whereas I'm dealing with a stream of MIDI messages. In particular I'm concerned with extracting tempo from an improviser, typically on guitar or piano, which makes it quite a different problem from detecting the beat in an audio source with drums, for example.

The problems

There are multiple issues that make this difficult. Here are some of them:

  • Simple vs. compound meters.
    Detecting the beat is not very useful if I can't distinguish between binary and ternary subdivision. Two sub-problems are:
    • What if it's mixed? (e.g. using triplets in a simple meter)
    • What's the difference between a fast 3/4 and a slow 6/8? This one doesn't really have an answer in my opinion, and it touches directly on rhythmic perception, which is not only tricky to evaluate but can also differ between listeners. So I'm not aiming for universal 'perfect' answers here, I just want to make sure that standard scenarios are detected correctly.
  • Syncopation, off-beat accents, rubato...
  • Odd time signatures.
    So, what's the beat on a 7/8? And what happens at different tempos?
  • Hardware imprecision.
    When I use a MIDI keyboard there's virtually no lag between the keys being pressed and the MIDI being sent, but when using a MIDI pickup for a guitar this is not the case, and to make it more interesting, the lag is not constant.
  • Human imprecision.
    Even if I find a really good algorithm, human-introduced imprecisions can throw results off quite a lot. This is tricky to adjust for, but I'm happy if I can get good results with reasonably accurate performances.

With all of that, finding a really good solution seems a bit hopeless. But hey, let's give this a go!

The not-perfect-but-not-too-shabby solution so far

First I tried to detect the beat and then guess the subdivision from there, but this is very tricky, especially if the source is a freely improvised bit of music on a keyboard or guitar. After struggling with the problem of beat vs. subdivision for a while, I decided to approach it bottom-up: first find the speed of the subdivisions, which might be a bit easier than finding the beat, and then try to work out whether it's a simple meter, a compound meter, or a mixture (such as an odd time signature, or triplets in a simple meter).

1. Detecting the speed of the subdivisions

This seems much easier than detecting the beat, in most cases. The general approach is to collect the durations of notes during a certain timeframe, and compare them all against each other. Looking for simple ratios between them (2:1, 3:1...) can lead to a reasonably accurate estimation of the speed of subdivisions. There are cases where this won't be easy, but these are often cases where even human perception would have difficulty or ambiguity. I've managed to get reasonably accurate results with this so I'm happy with it, at least for now.
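
To make the idea concrete, here is a minimal Python sketch of the ratio comparison described above. It is not Whole-Play's actual code: the function name, the 12% tolerance and the limit of 4 on the multiples are all assumptions chosen for illustration. Each observed duration is tried as a candidate subdivision, and the candidate that explains the most other durations as small integer multiples wins.

```python
def estimate_subdivision(durations, max_multiple=4, tolerance=0.12):
    """Estimate the subdivision length (seconds) from note durations.

    Hypothetical helper: `max_multiple` and `tolerance` are illustrative
    values, not tuned parameters from Whole-Play.
    """
    durations = [d for d in durations if d > 0]
    if not durations:
        return None

    def implied_units(unit):
        # For every duration that is close to 1x, 2x, 3x... the candidate,
        # record the subdivision length it implies (duration / multiple).
        units = []
        for d in durations:
            ratio = d / unit
            nearest = round(ratio)
            if 1 <= nearest <= max_multiple and abs(ratio - nearest) <= tolerance * nearest:
                units.append(d / nearest)
        return units

    best_unit, best_support = None, 0
    # Shortest candidates first, so ties favour the smaller subdivision.
    for candidate in sorted(durations):
        units = implied_units(candidate)
        if len(units) > best_support:
            best_unit = sum(units) / len(units)  # average out the jitter
            best_support = len(units)
    return best_unit


# Durations played roughly as eighths, quarters and a dotted quarter.
played = [0.33, 0.34, 0.66, 0.32, 1.0, 0.67, 0.33]
unit = estimate_subdivision(played)
```

With the sample data above, `unit` comes out close to a third of a second. Averaging the implied units rather than returning the raw candidate is one simple way to smooth out hardware and human imprecision.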

2. Simple, compound or 'mixed' meter

This is a bit less straightforward. It's quite interesting to explore why we perceive things as one or the other. In most cases the key is the accents in the music fragment, of course. But it gets more interesting than that, because motifs and harmony can often be important factors. My initial attempts deal only with accents, and the strategy is to look only at the louder notes in the fragment, and see which provides a better match, binary or ternary subdivision. This works well in very basic cases, reasonably well in 'standard' cases, but has trouble under various circumstances.
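
The accent-only strategy could be sketched roughly as follows in Python. Again, this is an illustration under assumptions rather than the real implementation: "louder" is taken to mean "above the mean velocity", the scoring formula and the `margin` value are made up for the example, and the subdivision length is assumed to be already known from step 1.

```python
from collections import Counter


def grouping_score(indices, group):
    """How strongly accents pile up on one position when grouped by `group`.

    Returns 0 when accents are spread evenly (chance level) and 1 when
    every accent falls on the same position within the group.
    """
    counts = Counter(i % group for i in indices)
    best = max(counts.values()) / len(indices)
    chance = 1 / group
    return (best - chance) / (1 - chance)


def classify_subdivision(onsets, velocities, unit, margin=0.2):
    """Guess 'binary', 'ternary' or 'ambiguous' from accented notes.

    Hypothetical helper: `onsets` are note start times in seconds,
    `velocities` are MIDI velocities, `unit` is the subdivision length.
    """
    if not onsets or len(onsets) != len(velocities) or unit <= 0:
        return "ambiguous"
    mean_velocity = sum(velocities) / len(velocities)
    start = onsets[0]
    # Position of each louder-than-average note, counted in subdivisions.
    accented = [round((t - start) / unit)
                for t, v in zip(onsets, velocities) if v > mean_velocity]
    if not accented:
        return "ambiguous"
    binary = grouping_score(accented, 2)
    ternary = grouping_score(accented, 3)
    if abs(binary - ternary) < margin:
        return "ambiguous"
    return "binary" if binary > ternary else "ternary"


# Twelve even notes with an accent on every third one.
onsets = [i / 3 for i in range(12)]
velocities = [100 if i % 3 == 0 else 60 for i in range(12)]
meter = classify_subdivision(onsets, velocities, unit=1 / 3)
```

Subtracting the chance level matters because grouping in twos would otherwise always look better than grouping in threes. Accents that land every six subdivisions fit both groupings equally well, which is why the sketch has an 'ambiguous' outcome.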

The main 'false positive' is failing to detect a compound meter, and reporting a beat that's 1.5x the actual tempo. For example, in 6/8 at 60 bpm each beat lasts 1s, so each eighth note is 0.33s. Whole-Play will pick up the speed of the eighth notes, but might fail to detect that it's a compound meter. Grouping the eighth notes in twos instead of threes gives a beat of 0.67s, so it reports a beat of 90 bpm (with binary subdivision). Which is not terrible, I use that kind of transformation often. :) But I'm aiming to polish the algorithm to make this kind of error the exception. Maybe when I work on motif detection I'll have an added tool to infer beat information (and possibly time signature).
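
The arithmetic behind that 1.5x error is easy to check with a few lines of Python. The function name is just an illustrative choice, not anything from Whole-Play.

```python
def bpm_from_subdivision(unit_seconds, subdivisions_per_beat):
    """Tempo implied by a subdivision length and a grouping (2 or 3)."""
    return 60.0 / (unit_seconds * subdivisions_per_beat)


eighth = 1.0 / 3  # 6/8 at 60 bpm: three eighth notes per 1 s beat

compound_bpm = bpm_from_subdivision(eighth, 3)  # about 60 bpm (correct)
binary_bpm = bpm_from_subdivision(eighth, 2)    # about 90 bpm (the error)
ratio = binary_bpm / compound_bpm               # about 1.5
```

The same detected subdivision gives two different tempos depending only on the grouping, so the error is always a factor of 3/2 regardless of the actual tempo.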
