Description
Hello,
I would not be surprised to hear this is entirely user error, but after cobbling together some code from various examples, both from this project and from some other issues, the pitch values I am getting back seem strangely high, certainly outside of human vocal ranges.
0 hz
4407.88 hz
4559.07 hz
0 hz
4573.29 hz
4533.14 hz
4533.14 hz
4677.87 hz
0 hz
0 hz
4252.05 hz
I am using miniaudio to capture from the microphone and feeding that into the input buffer. Here are some code snippets, but if a more complete file is required I can probably make a gist or something. It's a horrible cobbled-together mess since this is very much a learning C++ project, so sorry if it's quite sub-optimal!
// find audio devices using miniaudio
ma_device_config config = ma_device_config_init(ma_device_type_capture);
config.capture.format = ma_format_f32;
config.capture.channels = 1;
config.dataCallback = ma_audio_data_callback;
ma_device device;
ma_result result = ma_device_init(nullptr, &config, &device);
if (result != MA_SUCCESS) {
    printf("Failed to initialize capture device.\n");
    return -2;
}
result = ma_device_start(&device);
if (result != MA_SUCCESS) {
    ma_device_uninit(&device);
    printf("Failed to start device.\n");
    return -3;
}
aubio_pitch_t *pitch_object = new_aubio_pitch("default", BUF_SIZE, HOP_SIZE, device.sampleRate);
This is the snippet for getting the audio device from miniaudio and then creating the pitch object. BUF_SIZE is currently 2048 and HOP_SIZE is 512, but I've tried a few other values; all of the examples I saw used a HOP_SIZE of BUF_SIZE/4. The microphone is technically a stereo device, but it didn't seem to matter whether I used mono or stereo channels: the same crazy values get reported. ma_format_f32 puts the samples in the range [-1, 1], which from what I could tell is what aubio expects.
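On the stereo point: with two channels, miniaudio delivers interleaved samples (left, right, left, right, ...), so the buffer holds frameCount * channels floats rather than frameCount. A minimal sketch of averaging the channels down to mono before pitch detection, assuming interleaved input; downmix_to_mono is a hypothetical helper, not part of miniaudio or aubio:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a miniaudio or aubio function): averages the
// channels of an interleaved buffer into one mono sample per frame.
std::vector<float> downmix_to_mono(const float* interleaved,
                                   std::size_t frame_count,
                                   std::size_t channels) {
    assert(channels > 0);
    std::vector<float> mono(frame_count);
    for (std::size_t frame = 0; frame < frame_count; ++frame) {
        float sum = 0.0f;
        // Samples of one frame sit next to each other in memory.
        for (std::size_t ch = 0; ch < channels; ++ch) {
            sum += interleaved[frame * channels + ch];
        }
        mono[frame] = sum / static_cast<float>(channels);
    }
    return mono;
}
```

With channels set to 1 this is just a copy, so the same call could be used for both the mono and the stereo configuration.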
void ma_audio_data_callback(ma_device *pDevice, void *pOutput, const void *pInput, ma_uint32 frameCount){
    const float* floatBuffer = (float*)(pInput);
    std::vector vecBuffer(floatBuffer, floatBuffer + (frameCount * pDevice->capture.channels));
    VoiceData* vd = (VoiceData*)(pDevice->pUserData);
    vd->IngestAudio(vecBuffer);
}
This is the miniaudio callback that feeds data into the input buffer. VoiceData is a class I have made.
void IngestAudio(const std::vector<float> &audioInput) const{
    for(int i = 0; i < HOP_SIZE; i++){
        fvec_set_sample(input_buffer, audioInput[i], i);
    }
}
This is the actual function for adding samples to the input_buffer, which is defined as an fvec_t of length BUF_SIZE, like the example from this repo and a few others.
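One thing this function takes for granted is that every callback delivers at least HOP_SIZE samples; the frameCount passed to the callback may not match HOP_SIZE. A minimal sketch of regrouping chunks of arbitrary size into blocks of exactly one hop, using only the standard library; HopAccumulator, push and pop_hop are hypothetical names, and the sketch is not thread-safe, so sharing it between the audio callback and the main loop would need a lock or a ring buffer:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: collects chunks of any size and hands them back
// in blocks of exactly hop_size samples.
class HopAccumulator {
public:
    explicit HopAccumulator(std::size_t hop_size) : hop_size_(hop_size) {
        assert(hop_size > 0);
    }

    // Append one callback's worth of samples to the pending queue.
    void push(const float* samples, std::size_t count) {
        pending_.insert(pending_.end(), samples, samples + count);
    }

    // If a full hop is queued, copy it into `hop`, drop it from the
    // queue and return true; otherwise leave `hop` alone and return false.
    bool pop_hop(std::vector<float>& hop) {
        if (pending_.size() < hop_size_) {
            return false;
        }
        const auto end = pending_.begin() + static_cast<std::ptrdiff_t>(hop_size_);
        hop.assign(pending_.begin(), end);
        pending_.erase(pending_.begin(), end);
        return true;
    }

private:
    std::size_t hop_size_;
    std::vector<float> pending_;
};
```

Each block returned by pop_hop could then be copied into the input vector sample by sample, and the pitch detection run once per block rather than once per rendered frame.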
Every "frame" (a very basic main loop, built from an imgui OpenGL example template), this function is called to actually perform the pitch detection:
void UpdateVoiceData(VoiceData &voiceData, aubio_pitch_t &pitch_object){
    aubio_pitch_do(&pitch_object, voiceData.input_buffer, voiceData.out);
    float confidence = aubio_pitch_get_confidence(&pitch_object);
    if(confidence < 0.001)
        return;
    float frequency = voiceData.out->data[0];
    std::cout << frequency << " hz\n";
}
This is the function that is reporting the crazy high values.
I must have misunderstood something, since aside from a single comment I found, no one else seems to be getting crazy values like this. I'd love any help or advice you can offer!