Question

Well, this is my first question on Stack Overflow, so I'm kinda excited about it :) Here it is: my input is a wave file. For now, I have recorded a piece using my guitar, so the wave file contains this instrumental recording. What I want to do is get the musical notation (A, B, C, and so on) of each note that is being played. I have heard about techniques like the FFT, but considering my poor knowledge of how to use the FFT, I thought of using the aubio library.

So aubio provides: aubiopitch, which extracts pitch candidates, and aubiocut, which extracts onsets.

Where I am stuck is: how do I get the frequency at the particular time of the note played using aubio? As I understand it, aubiopitch and aubiocut would help, but I don't understand how to do the mapping between them. Any help would be greatly appreciated :)


Hi piem: Thanks for your answer. Could you please analyse this output?

aubiopitch -i Reverse_Open.wav

1.408 68.9486465454
1.536 81.7372512817
1.664 164.290893555
1.792 164.464691162
1.92 82.6862487793
2.048 328.539306641
2.176 218.885116577
2.304 219.06237793
2.432 219.042160034
2.56 219.133621216
2.688 145.751785278
2.816 146.437744141
2.944 146.199829102
3.072 195.059829712
3.2 194.912689209
3.328 195.724975586
3.456 195.517547607
3.584 247.317428589
3.712 246.764221191
3.84 246.857452393
3.968 145.454727173
4.096 328.569610596
4.224 329.625823975
4.352 329.16619873
4.48 328.906402588
4.608 328.96786499
4.736 329.187835693
4.864 145.741394043

My notes, with frequencies, are E (approx. 82), A (110), D (147), G (197), B (247), and E (329.2), which are played at 1.344, 1.888, 2.4, 2.88, 3.36, and 3.872 respectively (according to aubiocut, which I suppose is correct). Any idea how I can extract these 6 notes and their times from the above output?


Solution

aubiopitch outputs a list of tuples. Each tuple contains two floats:

  • a timestamp in seconds
  • a fundamental frequency in Hertz
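This two-column text output is easy to turn back into such tuples in Python. A minimal sketch (the `parse_aubiopitch` helper is my own name for illustration, not part of aubio):

```python
def parse_aubiopitch(text):
    """Parse aubiopitch's two-column output into (time, frequency) tuples."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            pairs.append((float(fields[0]), float(fields[1])))
    return pairs

output = "0.000000 0.000000\n0.005805 293.884338\n0.011610 386.387207"
print(parse_aubiopitch(output))
# → [(0.0, 0.0), (0.005805, 293.884338), (0.01161, 386.387207)]
```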

Here is an example on a guitar sound:

$ aubiopitch -i guitar_Cold_Blood_-_Baby_I_Love_You.wav | head
0.000000 0.000000
0.005805 293.884338
0.011610 386.387207
0.017415 0.000000
0.023220 551.689758
0.029025 3608.569336
0.034830 3588.231201
0.040635 416.824066
0.046440 3606.715576
0.052245 417.116425
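To get from these frequencies to the note names the question asks about, each frequency can be snapped to the nearest equal-temperament pitch. This is a generic post-processing sketch, not part of aubio itself; it assumes standard A4 = 440 Hz tuning:

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def freq_to_note(freq, a4=440.0):
    """Map a frequency in Hz to the nearest equal-temperament note name."""
    if freq <= 0:
        return None  # aubiopitch outputs 0.0 when no pitch is found
    midi = int(round(69 + 12 * math.log2(freq / a4)))
    return NOTE_NAMES[midi % 12] + str(midi // 12 - 1)

# The open strings of a guitar, roughly as listed in the question:
for f in (82.0, 110.0, 147.0, 196.0, 247.0, 329.2):
    print(freq_to_note(f))
# → E2, A2, D3, G3, B3, E4
```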

If you are curious (please be), you can get the latest git version and try the demo script demo_pitch.py:

$ ./python/demos/demo_pitch.py bass_Don_Ellis_-_Conquistador.wav

You would get the following plot:

[figure: aubio pitch demo plot]

  • The first row represents the waveform.
  • The second row, the extracted pitch track, in MIDI note numbers.
  • The third, the confidence of these pitch candidates (using the yinfft algorithm).

In this sample of a bass line, extracting the pitch during the transient attacks is more challenging than in the steady state. Pitch candidates whose confidence falls below an arbitrary threshold (here 0.8) can be discarded (dashed green line), while the others can be kept (solid blue line).
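As for mapping the two tools onto each other: once you have the onset times from aubiocut and the (time, frequency) pairs from aubiopitch, one simple approach is to take, for each onset, the median of the non-zero pitch candidates between it and the next onset. A sketch in plain Python (the function name and the sample data are made up for illustration):

```python
from statistics import median

def notes_from_onsets(onsets, pitch_pairs, total_duration):
    """For each onset, summarise the pitch track until the next onset
    by the median of the non-zero frequency candidates in that window."""
    notes = []
    bounds = list(onsets) + [total_duration]
    for start, end in zip(bounds, bounds[1:]):
        freqs = [f for t, f in pitch_pairs if start <= t < end and f > 0]
        if freqs:
            notes.append((start, median(freqs)))
    return notes

# Hypothetical data in the same shape as aubiocut / aubiopitch output:
onsets = [0.0, 0.5]
pitches = [(0.1, 440.2), (0.2, 439.8), (0.3, 0.0), (0.6, 329.6), (0.7, 329.8)]
print(notes_from_onsets(onsets, pitches, 1.0))
```

Using the median rather than the mean makes the per-note estimate robust against the occasional octave error or spurious candidate during the attack.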

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow