Question

I'm new to the audio world, and I've been assigned a task I'm not sure how to approach. I need to display a graph representing the pitch of a WAV file. The WAV file can be polyphonic, in which case I need to display the pitch graph of the dominant (read: loudest) instrument or singing voice. I'm quite familiar with .NET, and know next to nothing about C, C++, or Java.

I did some research on the web, and from what I understand, pitch detection for polyphonic WAV files is an unsolved problem. However, I don't need the exact pitch; I just need to know whether each note is higher or lower than the next one, and how long it lasts.

My questions are:

  1. Where should I start learning the theory behind this task? Are there any recommended books?

  2. Is there any API/tool that can do that?

Thanks,

ML


Solution

You can use Fourier analysis to extract the constituent frequencies of the waveform. Here's a related SO question: Fast Fourier Transform in C#
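To make the idea concrete, here is a minimal sketch of finding the strongest frequency in a single frame of audio. It is written in plain Python purely for illustration (the same steps carry over to a C# FFT library). The names `fft`, `dominant_frequency`, `SAMPLE_RATE`, and `FRAME_SIZE` are my own, and the input is a synthetic two-tone signal rather than a real WAV file. Note that the strongest spectral peak is not always the perceived pitch, since a harmonic can be louder than the fundamental, so treat this as a starting point only.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    result = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle
        result[k + n // 2] = even[k] - twiddle
    return result


def dominant_frequency(frame, sample_rate):
    """Return the centre frequency (Hz) of the strongest non-DC bin."""
    n = len(frame)
    # A Hann window reduces the spectral leakage caused by cutting
    # the signal into finite frames.
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(frame)
    ]
    spectrum = fft(windowed)
    # Skip bin 0 (DC) and keep only the positive-frequency half.
    magnitudes = [abs(c) for c in spectrum[1 : n // 2]]
    peak_bin = magnitudes.index(max(magnitudes)) + 1
    return peak_bin * sample_rate / n


# Synthetic frame (an assumption for the demo): a loud 440 Hz tone
# mixed with a quieter 660 Hz tone.
SAMPLE_RATE = 8000
FRAME_SIZE = 1024
frame = [
    math.sin(2 * math.pi * 440 * i / SAMPLE_RATE)
    + 0.3 * math.sin(2 * math.pi * 660 * i / SAMPLE_RATE)
    for i in range(FRAME_SIZE)
]
estimate = dominant_frequency(frame, SAMPLE_RATE)
print(f"Strongest frequency: {estimate:.1f} Hz")
```

The estimate can only be as precise as one bin width (`SAMPLE_RATE / FRAME_SIZE`, about 7.8 Hz here), so it lands near 440 Hz rather than exactly on it. Running this on successive frames of the file would give a rough pitch-over-time curve.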

OTHER TIPS

I started on signal processing with Digital Signal Processing. You can buy the printed book for around $30, or download the PDF for free. It's written for non-scientists and contains a great deal of information to get you up and running with various signal processing techniques.

An FFT (fast Fourier transform) converts a time-domain signal (amplitude vs. time) into a frequency-domain one (frequency bins vs. intensity, i.e. energy). You can then go from a frequency to a note using the standard charts found on many websites.
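Instead of looking a frequency up in a chart, you can also compute the note directly. The sketch below assumes twelve-tone equal temperament with A4 tuned to 440 Hz and uses MIDI note numbers (69 = A4) as the intermediate step; the function name `frequency_to_note` and the sharp-only note spelling are my own choices.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def frequency_to_note(freq_hz, a4_hz=440.0):
    """Map a frequency to the nearest equal-tempered note name, e.g. 'A4'."""
    # MIDI note 69 is A4; each semitone is a factor of 2**(1/12) in frequency,
    # so the distance from A4 in semitones is 12 * log2(f / A4).
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    octave = midi // 12 - 1
    return f"{NOTE_NAMES[midi % 12]}{octave}"


print(frequency_to_note(440.0))   # A4
print(frequency_to_note(261.63))  # C4
```

Since the question only asks whether one note is higher than the next, comparing the rounded MIDI numbers of consecutive frames is enough; the note names are just for display.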

I've been working on a somewhat related project. You'll want to investigate fast Fourier transforms. I can recommend this project written in C#: it was written as a guitar tuner, but it can easily be adapted to your purpose.

This, however, won't handle multiple simultaneous pitches. Celemony's Melodyne claims to have this ability through its "Direct Note Access (DNA)" technology. The technology is understandably closed source, but you may be able to license it for the right price.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow